Ja multiple choice
jcommonsenseqa_mc
JCommonsenseQA is a Japanese version of CommonsenseQA, a multiple-choice question answering dataset that requires commonsense reasoning ability. The dataset is built by crowdsourcing, with seeds extracted from the knowledge base ConceptNet. This setup treats the task as multiple choice: the model's answer is selected by comparing the log-probabilities it assigns to each of the five choices.
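As a rough illustration of log-probability-based selection, the sketch below sums hypothetical per-token log-probabilities for each choice and picks the largest total. The numbers, the function names, and the use of a plain (non-length-normalized) sum are assumptions made for this example; the source does not say how the scores are aggregated.

```python
def total_logprob(token_logprobs):
    """Sum per-token log-probabilities into one score for a choice."""
    return sum(token_logprobs)


def pick_choice(scores):
    """Return the index of the highest-scoring choice."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical per-token log-probabilities for five choices (not real model output).
token_logprobs_per_choice = [
    [-2.1, -0.7],
    [-0.4, -0.3],
    [-3.0],
    [-1.5, -1.2, -0.9],
    [-2.8, -0.1],
]

scores = [total_logprob(t) for t in token_logprobs_per_choice]
predicted = pick_choice(scores)

label = 1  # hypothetical gold index, playing the role of `label` in the dataset
is_correct = predicted == label
```

A plain sum tends to favor shorter choices; dividing by the token count is a common alternative, but which variant applies here is not stated in the source.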
References:
- Hugging Face Dataset
- Original Repository
- JGLUE: Japanese General Language Understanding Evaluation
- JGLUE: 日本語言語理解ベンチマーク
local dataset_base_args = {
  path: 'llm-book/JGLUE',
  subset: 'JCommonsenseQA',
  choices_templates: ['{{ choice0 }}', '{{ choice1 }}', '{{ choice2 }}', '{{ choice3 }}', '{{ choice4 }}'],
  answer_index_template: '{{ label }}',
};

{
  class_path: 'MultipleChoice',
  init_args: {
    eval_dataset: {
      class_path: 'HFMultipleChoiceDataset',
      init_args: dataset_base_args { split: 'validation' },
    },
    few_shot_generator: {
      class_path: 'RandomFewShotGenerator',
      init_args: {
        dataset: {
          class_path: 'HFMultipleChoiceDataset',
          init_args: dataset_base_args { split: 'train' },
        },
        num_shots: 0,
      },
    },
    prompt_template: |||
      {% for item in few_shot_data %}
      問題:{{ item.question }}
      回答:「{{ item.choices[item.answer_index] }}」
      {% endfor %}
      問題:{{question}}
    ||| + '回答:「',
  },
}
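To make the prompt layout in this config easier to picture, here is a minimal plain-Python sketch that approximates what the template produces. It is not the library's rendering code: `build_prompt` and the sample question are hypothetical, and the blank lines that Jinja may leave around the `{% for %}` tags are omitted.

```python
def build_prompt(question, few_shot_items=()):
    """Approximate the prompt: optional solved examples, then the open question."""
    lines = []
    for item in few_shot_items:
        lines.append(f"問題:{item['question']}")
        # Each few-shot answer is shown closed, inside 「」 brackets.
        lines.append(f"回答:「{item['choices'][item['answer_index']]}」")
    lines.append(f"問題:{question}")
    # The prompt ends with an opening bracket; each choice is scored as its continuation.
    return "\n".join(lines) + "\n" + "回答:「"


# Hypothetical few-shot item, invented for this example (not taken from the dataset).
shot = {"question": "雨の日に差すものは?", "choices": ["傘", "靴", "本"], "answer_index": 0}

zero_shot_prompt = build_prompt("空を飛ぶ乗り物は?")
one_shot_prompt = build_prompt("空を飛ぶ乗り物は?", [shot])
```

With `num_shots: 0`, as in the config, the few-shot loop contributes nothing and the prompt is just the question followed by `回答:「`.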
xwinograd_ja
XWinograd is a multilingual collection of Winograd Schemas in six languages that can be used to evaluate cross-lingual commonsense reasoning capabilities. This is the Japanese subset of the dataset.
References:
- Hugging Face Dataset
- It’s All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning
{
  class_path: 'MultipleChoice',
  init_args: {
    eval_dataset: {
      class_path: 'HFMultipleChoiceDataset',
      init_args: {
        path: 'Muennighoff/xwinograd',
        subset: 'jp',
        split: 'test',
        choices_templates: [
          '{{ option1 }}{{ sentence.split("_")[1] }}',
          '{{ option2 }}{{ sentence.split("_")[1] }}',
        ],
        answer_index_template: '{{ answer | int - 1 }}',
        input_templates: { context: '{{ sentence.split("_")[0] }}' },
      },
    },
    prompt_template: '{{ context }}',
  },
}
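The templates in this config split each sentence at the `_` placeholder: the text before it becomes the prompt context, and each option is joined with the text after it to form a candidate continuation. The sketch below mirrors that logic in plain Python. The `build_example` helper and the English sample row are hypothetical and used only for readability; the actual subset is Japanese.

```python
def build_example(row):
    """Mirror the config's templates for one dataset row."""
    parts = row["sentence"].split("_")
    before, after = parts[0], parts[1]
    return {
        "context": before,  # '{{ sentence.split("_")[0] }}'
        "choices": [row["option1"] + after, row["option2"] + after],
        "answer_index": int(row["answer"]) - 1,  # 1-based answer to 0-based index
    }


# Hypothetical row in the shape the config expects (illustrative, not from the dataset).
row = {
    "sentence": "The trophy did not fit in the suitcase because _ was too big.",
    "option1": "the trophy",
    "option2": "the suitcase",
    "answer": "1",
}

example = build_example(row)
```

Because both candidates share the same suffix after the placeholder, they differ only in the option text, and the comparison is between full continuations of the same context.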