ChatDataset
ChatDataset ¶
A dataset holding ChatInstance.
Source code in flexeval/core/chat_dataset/base.py
__len__ abstractmethod ¶
__len__() -> int
Returns the number of chat instances in the dataset.
Source code in flexeval/core/chat_dataset/base.py
__getitem__ abstractmethod ¶
__getitem__(i: int) -> ChatInstance
Returns the i-th chat instance.
Source code in flexeval/core/chat_dataset/base.py
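The two abstract methods can be illustrated with a minimal sketch. It assumes simplified stand-ins rather than the real flexeval classes: `ChatInstance` here carries only `messages`, and `InMemoryChatDataset` is a hypothetical subclass that is not part of the library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class ChatInstance:
    """Simplified stand-in: the documented class has more fields."""

    messages: list[dict[str, Any]]


class ChatDataset(ABC):
    """Stand-in for the documented base class (abstract methods only)."""

    @abstractmethod
    def __len__(self) -> int:
        """Return the number of chat instances in the dataset."""

    @abstractmethod
    def __getitem__(self, i: int) -> ChatInstance:
        """Return the i-th chat instance."""


class InMemoryChatDataset(ChatDataset):
    """Hypothetical subclass backed by a plain Python list of chats."""

    def __init__(self, chats: list[list[dict[str, Any]]]) -> None:
        self._chats = chats

    def __len__(self) -> int:
        return len(self._chats)

    def __getitem__(self, i: int) -> ChatInstance:
        return ChatInstance(messages=self._chats[i])


dataset = InMemoryChatDataset(
    [
        [{"role": "user", "content": "Hello"}],
        [{"role": "user", "content": "What is 2 + 2?"}],
    ]
)
first = dataset[0]
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated.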
require_incremental_response ¶
require_incremental_response() -> bool
If true, the inputs consist of multiple user utterances and the model should generate responses for each utterance incrementally.
Otherwise, the model just has to continue the conversation from the last user utterance.
Source code in flexeval/core/chat_dataset/base.py
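The difference between the two modes described for `require_incremental_response` can be sketched as follows. This is an illustration only: `run_chat` and the `generate` callable are hypothetical names, not flexeval APIs, and the real evaluation loop may differ.

```python
from typing import Callable

Message = dict[str, str]


def run_chat(
    messages: list[Message],
    generate: Callable[[list[Message]], str],
    incremental: bool,
) -> list[Message]:
    """Return the conversation with model responses added."""
    if not incremental:
        # Single continuation: respond once, after the last message.
        reply = generate(messages)
        return messages + [{"role": "assistant", "content": reply}]

    # Incremental: respond after every user utterance, feeding the
    # growing history (including earlier responses) back to the model.
    history: list[Message] = []
    for message in messages:
        history.append(message)
        if message["role"] == "user":
            reply = generate(history)
            history.append({"role": "assistant", "content": reply})
    return history


def echo_model(history: list[Message]) -> str:
    """Toy model used only to make the sketch runnable."""
    return f"reply to: {history[-1]['content']}"


turns = [
    {"role": "user", "content": "Hi"},
    {"role": "user", "content": "Bye"},
]
incremental_chat = run_chat(turns, echo_model, incremental=True)
single_chat = run_chat(turns, echo_model, incremental=False)
```

In the incremental case the result interleaves one assistant message after each user utterance; in the other case a single assistant message is appended at the end.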
__repr__ ¶
__repr__() -> str
Source code in flexeval/core/chat_dataset/base.py
ChatInstance dataclass ¶
A dataclass representing a single chat that will be fed to a chat language model.
Source code in flexeval/core/chat_dataset/base.py
messages instance-attribute ¶
messages: list[dict[str, Any]]
A list of messages in the chat. The format of messages typically follows OpenAI's Chat Completions API.
[
{
"role": "assistant",
"content": "Hello! How can I help you today?"
},
{
"role": "user",
"content": "I'd like to book a flight to Paris."
}
]
references class-attribute instance-attribute ¶
references: list[str] = field(default_factory=list)
A list of reference responses to the user's last message. The model's response will be evaluated against these references.
extra_info class-attribute instance-attribute ¶
extra_info: dict[str, Any] = field(default_factory=dict)
Extra information that can be passed to a Metric.
inputs property ¶
inputs: list[dict[str, str]]
Alias for messages.
This is used in FewShotGenerator so that it can access the inputs with the same attribute name as GenerationInstance and MultipleChoiceInstance.
__init__ ¶
__init__(
messages: list[dict[str, Any]],
references: list[str] = list(),
extra_info: dict[str, Any] = dict(),
) -> None
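The fields and the `inputs` alias can be illustrated with a small stand-in dataclass. This is a sketch that mirrors the documented attributes, not the flexeval class itself; in particular it omits whatever `__post_init__` does.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ChatInstance:
    """Stand-in mirroring the documented fields (not the real class)."""

    messages: list[dict[str, Any]]
    references: list[str] = field(default_factory=list)
    extra_info: dict[str, Any] = field(default_factory=dict)

    @property
    def inputs(self) -> list[dict[str, Any]]:
        # Alias so callers can use the same attribute name as on
        # other instance types.
        return self.messages


a = ChatInstance(
    messages=[{"role": "user", "content": "I'd like to book a flight to Paris."}]
)
b = ChatInstance(
    messages=[{"role": "user", "content": "Hi"}],
    references=["Hello!"],
)

# default_factory gives each instance its own list, so mutating one
# instance's references does not affect another.
a.references.append("Sure, which dates?")
```

Using `field(default_factory=...)` rather than a shared mutable default is what keeps `a.references` and `b.references` independent.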
__post_init__ ¶
__post_init__() -> None
Source code in flexeval/core/chat_dataset/base.py
HFChatDataset ¶
Load ChatInstances from a Hugging Face dataset.
Parameters:
- path (str) – The path to the Hugging Face dataset.
- split (str) – The split of the dataset.
- subset (str | None, default: None) – The subset of the dataset.
- dataset_kwargs (dict[str, Any] | None, default: None) – The keyword arguments to pass to the Hugging Face dataset.
Source code in flexeval/core/chat_dataset/template_based.py
__init__ ¶
__init__(
path: str,
split: str,
input_template: str,
subset: str | None = None,
dataset_kwargs: dict[str, Any] | None = None,
reference_template: str | None = None,
reference_list_template: str | None = None,
require_incremental_response: bool = False,
extra_info_templates: dict[str, str] | None = None,
system_message_template: str | None = None,
data_range: tuple[int, int] | None = None,
keep_conditions: dict[str, str] | None = None,
remove_conditions: dict[str, str] | None = None,
) -> None
Source code in flexeval/core/chat_dataset/template_based.py
JsonlChatDataset ¶
Load ChatInstances from a JSONL file.
Parameters:
- path (str) – The path to the JSONL file.
Source code in flexeval/core/chat_dataset/template_based.py
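As a rough sketch of what loading from a JSONL file involves, the snippet below reads one JSON object per line into a list of dicts, the shape that `TemplateChatDataset` takes as `items`. The helper `read_jsonl` and the sample fields are hypothetical; the library's own loader may differ in details such as blank-line handling.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def read_jsonl(path: str) -> list[dict[str, Any]]:
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write a small sample file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.jsonl"
    sample_path.write_text(
        '{"question": "What is 2 + 2?", "answer": "4"}\n'
        '{"question": "Name a primary color.", "answer": "red"}\n',
        encoding="utf-8",
    )
    items = read_jsonl(str(sample_path))
```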
__init__ ¶
__init__(
path: str,
input_template: str,
reference_template: str | None = None,
reference_list_template: str | None = None,
require_incremental_response: bool = False,
extra_info_templates: dict[str, str] | None = None,
system_message_template: str | None = None,
data_range: tuple[int, int] | None = None,
keep_conditions: dict[str, str] | None = None,
remove_conditions: dict[str, str] | None = None,
) -> None
Source code in flexeval/core/chat_dataset/template_based.py
TemplateChatDataset ¶
This class only supports single-turn chat.
Parameters:
- items (list[dict[str, Any]]) – A list of items in a dict format.
- input_template (str) – A Jinja2 template for the user input.
- reference_template (str | None, default: None) – Specify the Jinja2 template to render the reference string if the dataset has a single reference.
- reference_list_template (str | None, default: None) – Specify the Jinja2 template to render a list of reference strings if the dataset has multiple references.
- require_incremental_response (bool, default: False) – Whether the dataset requires incremental response.
- extra_info_templates (dict[str, str] | None, default: None) – A dictionary of Jinja2 templates for extra information.
- system_message_template (str | None, default: None) – A Jinja2 template for the system message.
- data_range (tuple[int, int] | None, default: None) – The range of data to use.
- keep_conditions (dict[str, str] | None, default: None) – A dictionary to indicate the condition to filter certain items. The key is a Jinja2 template string to embed the item into a string, and the value is the value to keep.
- remove_conditions (dict[str, str] | None, default: None) – A dictionary to indicate the condition to remove certain items. The key is a Jinja2 template string to embed the item into a string, and the value is the value to remove.
Source code in flexeval/core/chat_dataset/template_based.py
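The template and condition parameters above can be sketched with plain Jinja2. This is an illustration under stated assumptions, not the class's implementation: it assumes each item's keys are exposed as template variables, that all `keep_conditions` must match, and that any matching `remove_conditions` entry drops the item. The helper names are hypothetical.

```python
from typing import Any, Optional

from jinja2 import Template


def render(template: str, item: dict[str, Any]) -> str:
    # Assumption: item keys become template variables.
    return Template(template).render(**item)


def passes_filters(
    item: dict[str, Any],
    keep_conditions: Optional[dict[str, str]] = None,
    remove_conditions: Optional[dict[str, str]] = None,
) -> bool:
    """Keep an item only if every keep condition matches and no remove condition does."""
    for template, value in (keep_conditions or {}).items():
        if render(template, item) != value:
            return False
    for template, value in (remove_conditions or {}).items():
        if render(template, item) == value:
            return False
    return True


def build_messages(
    item: dict[str, Any],
    input_template: str,
    system_message_template: Optional[str] = None,
) -> list[dict[str, str]]:
    """Build a single-turn chat: optional system message, then one user message."""
    messages = []
    if system_message_template:
        messages.append(
            {"role": "system", "content": render(system_message_template, item)}
        )
    messages.append({"role": "user", "content": render(input_template, item)})
    return messages


items = [
    {"question": "What is 2 + 2?", "answer": "4", "lang": "en"},
    {"question": "Quelle est la capitale de la France ?", "answer": "Paris", "lang": "fr"},
]
kept = [item for item in items if passes_filters(item, keep_conditions={"{{ lang }}": "en"})]
messages = build_messages(kept[0], "Q: {{ question }}", "Answer briefly.")
reference = render("{{ answer }}", kept[0])
```

Note that rendered values are always strings, which is consistent with the conditions being typed as `dict[str, str]`.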
reference_template instance-attribute ¶
reference_template = (
from_string(reference_template)
if reference_template
else None
)
reference_list_template instance-attribute ¶
reference_list_template = (
from_string(reference_list_template)
if reference_list_template
else None
)
__init__ ¶
__init__(
items: list[dict[str, Any]],
input_template: str,
reference_template: str | None = None,
reference_list_template: str | None = None,
require_incremental_response: bool = False,
extra_info_templates: dict[str, str] | None = None,
system_message_template: str | None = None,
data_range: tuple[int, int] | None = None,
keep_conditions: dict[str, str] | None = None,
remove_conditions: dict[str, str] | None = None,
) -> None
Source code in flexeval/core/chat_dataset/template_based.py
require_incremental_response ¶
require_incremental_response() -> bool
Source code in flexeval/core/chat_dataset/template_based.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/template_based.py
__getitem__ ¶
__getitem__(i: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/template_based.py
ChatbotBench ¶
This class loads data in the JSONL format used by chat evaluation benchmarks such as MT-Bench (Multi-turn Benchmark) or Vicuna QA Benchmark.
Example of a line from a JSONL file:
{ "question_id": 00, "category": "writing", "turns": [ "Compose an engaging travel blog post about a recent trip to Hawaii.", "Rewrite your previous response. Start every sentence with the letter A." ] }
Source code in flexeval/core/chat_dataset/chatbot_bench.py
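A line in this format can be turned into chat messages roughly as follows. This is a sketch: the mapping of each entry in `turns` to a `user` message is an assumption about how the class behaves, and the sample line uses `0` as a placeholder `question_id`.

```python
import json

line = (
    '{"question_id": 0, "category": "writing", "turns": ['
    '"Compose an engaging travel blog post about a recent trip to Hawaii.", '
    '"Rewrite your previous response. Start every sentence with the letter A."]}'
)

record = json.loads(line)

# Assumption: every turn becomes one user utterance, and the model is
# expected to answer each of them in order.
messages = [{"role": "user", "content": turn} for turn in record["turns"]]
category = record["category"]
```

Since the second turn refers to "your previous response", a dataset like this only makes sense when the model answers each utterance before seeing the next one.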
need_ref_categories instance-attribute ¶
need_ref_categories = need_ref_categories or [
"math",
"coding",
"reasoning",
]
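The `or` fallback shown above has one consequence worth spelling out, sketched here with a hypothetical helper `resolve_need_ref_categories`: because an empty list is falsy in Python, passing `[]` yields the default categories rather than "no categories".

```python
from typing import Optional

DEFAULT_NEED_REF_CATEGORIES = ["math", "coding", "reasoning"]


def resolve_need_ref_categories(
    need_ref_categories: Optional[list[str]] = None,
) -> list[str]:
    # Mirrors the `value or default` expression: None and [] both
    # fall back to the default list.
    return need_ref_categories or list(DEFAULT_NEED_REF_CATEGORIES)


from_none = resolve_need_ref_categories(None)
from_empty = resolve_need_ref_categories([])
from_custom = resolve_need_ref_categories(["writing"])
```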
__init__ ¶
__init__(
path_or_name: str,
ref_path_or_name: str | None = None,
need_ref_categories: list[str] | None = None,
load_only_first_n: int | None = None,
) -> None
Source code in flexeval/core/chat_dataset/chatbot_bench.py
require_incremental_response ¶
require_incremental_response() -> bool
Source code in flexeval/core/chat_dataset/chatbot_bench.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/chatbot_bench.py
__getitem__ ¶
__getitem__(i: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/chatbot_bench.py
SacreBleuChatDataset ¶
Load datasets from the sacrebleu library. The available datasets are defined in sacrebleu.DATASETS.
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
__init__ ¶
__init__(name: str, langpair: str) -> None
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
require_incremental_response ¶
require_incremental_response() -> bool
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
__getitem__ ¶
__getitem__(i: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py