ChatDataset ¶
A dataset holding ChatInstance.
Source code in flexeval/core/chat_dataset/base.py
__len__ abstractmethod ¶
__len__() -> int
Returns the number of chat instances in the dataset.
Source code in flexeval/core/chat_dataset/base.py
__getitem__ abstractmethod ¶
__getitem__(i: int) -> ChatInstance
Returns the i-th chat instance.
Source code in flexeval/core/chat_dataset/base.py
require_incremental_response ¶
require_incremental_response() -> bool
If true, the inputs consist of multiple user utterances and the model should generate responses for each utterance incrementally.
Otherwise, the model just has to continue the conversation from the last user utterance.
Source code in flexeval/core/chat_dataset/base.py
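The difference between the two modes can be sketched as follows. This is a minimal illustration, not flexeval's implementation: `run_chat` and `echo_model` are hypothetical names, and the toy model only echoes the last message it sees.

```python
from typing import Callable

Message = dict[str, str]


def echo_model(history: list[Message]) -> str:
    """Toy stand-in for a chat model (hypothetical): echoes the last message."""
    return f"reply to: {history[-1]['content']}"


def run_chat(
    messages: list[Message],
    incremental: bool,
    generate: Callable[[list[Message]], str] = echo_model,
) -> list[Message]:
    """Return the conversation after the model has responded."""
    if not incremental:
        # Continue the conversation once, from the last user utterance.
        return messages + [{"role": "assistant", "content": generate(messages)}]

    history: list[Message] = []
    for message in messages:
        history.append(message)
        if message["role"] == "user":
            # Respond to each user utterance before seeing the next one.
            history.append({"role": "assistant", "content": generate(history)})
    return history


turns = [
    {"role": "user", "content": "Write a haiku."},
    {"role": "user", "content": "Now translate it."},
]
incremental_result = run_chat(turns, incremental=True)
single_result = run_chat(turns, incremental=False)
```

In the incremental case the conversation grows to four messages (user, assistant, user, assistant); otherwise a single assistant message is appended at the end.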
__repr__ ¶
__repr__() -> str
Source code in flexeval/core/chat_dataset/base.py
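A custom dataset only needs to provide the two abstract methods listed above. The sketch below uses local stand-ins for `ChatDataset` and `ChatInstance` that mirror the documented signatures so that it runs without flexeval installed; `InMemoryChatDataset` is a hypothetical name, and the `False` default for `require_incremental_response` is an assumption.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ChatInstance:
    """Stand-in mirroring the documented fields (not the real class)."""

    messages: list[dict[str, Any]]
    tools: Optional[list[dict[str, Any]]] = None
    references: list[str] = field(default_factory=list)
    extra_info: dict[str, Any] = field(default_factory=dict)


class ChatDataset(ABC):
    """Stand-in mirroring the documented interface (not the real class)."""

    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, i: int) -> ChatInstance: ...

    def require_incremental_response(self) -> bool:
        return False  # assumed default for this sketch


class InMemoryChatDataset(ChatDataset):
    """Hypothetical dataset that serves instances from a Python list."""

    def __init__(self, instances: list[ChatInstance]) -> None:
        self._instances = list(instances)

    def __len__(self) -> int:
        return len(self._instances)

    def __getitem__(self, i: int) -> ChatInstance:
        return self._instances[i]


dataset = InMemoryChatDataset(
    [
        ChatInstance(messages=[{"role": "user", "content": "Hello"}]),
        ChatInstance(messages=[{"role": "user", "content": "Goodbye"}]),
    ]
)
```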
ChatInstance dataclass ¶
A dataclass representing a single chat that will be fed to a chat language model.
Source code in flexeval/core/chat_dataset/base.py
messages instance-attribute ¶
messages: list[dict[str, Any]]
A list of messages in the chat. The format of messages typically follows OpenAI's Chat Completions API.
[
{
"role": "assistant",
"content": "Hello! How can I help you today?"
},
{
"role": "user",
"content": "I'd like to book a flight to Paris."
}
]
tools class-attribute instance-attribute ¶
tools: list[dict[str, Any]] | None = None
A list of definitions of tools in the chat. The format of tools typically follows OpenAI's Chat Completions API. Currently, only function calling (tools with type="function") is supported.
references class-attribute instance-attribute ¶
references: list[str] = field(default_factory=list)
A list of reference responses to the user's last message. The model's response will be evaluated against these references.
extra_info class-attribute instance-attribute ¶
extra_info: dict[str, Any] = field(default_factory=dict)
Extra information that can be passed to Metric.
inputs property ¶
inputs: list[dict[str, str]]
Alias for messages.
This is used in FewShotGenerator so that it can access the inputs with the same attribute name as GenerationInstance and MultipleChoiceInstance.
__init__ ¶
__init__(
messages: list[dict[str, Any]],
tools: list[dict[str, Any]] | None = None,
references: list[str] = list(),
extra_info: dict[str, Any] = dict(),
) -> None
__post_init__ ¶
__post_init__() -> None
Source code in flexeval/core/chat_dataset/base.py
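To make the fields concrete, here is a minimal sketch that builds an instance with every documented field. The dataclass is a local stand-in that mirrors the documented attributes, so it omits whatever `__post_init__` does in the real class. The example values are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ChatInstance:
    """Stand-in mirroring the documented fields (not the real class)."""

    messages: list[dict[str, Any]]
    tools: Optional[list[dict[str, Any]]] = None
    references: list[str] = field(default_factory=list)
    extra_info: dict[str, Any] = field(default_factory=dict)

    @property
    def inputs(self) -> list[dict[str, Any]]:
        # Alias for `messages`, as described above.
        return self.messages


instance = ChatInstance(
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current temperature for a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"],
                },
            },
        }
    ],
    references=["It is sunny in Paris."],  # illustrative reference answer
    extra_info={"category": "weather"},  # illustrative metadata
)

# Only `messages` is required; the other fields fall back to their defaults.
bare = ChatInstance(messages=[{"role": "user", "content": "Hi"}])
```

Because `references` and `extra_info` use `default_factory`, each instance gets its own fresh list and dict rather than sharing one mutable default.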
HFChatDataset ¶
Load ChatInstances from a Hugging Face dataset.
Parameters:
- path (str) – The path to the Hugging Face dataset.
- split (str) – The split of the dataset.
- subset (str | None, default: None) – The subset of the dataset.
- dataset_kwargs (dict[str, Any] | None, default: None) – The keyword arguments to pass to the Hugging Face dataset.
Source code in flexeval/core/chat_dataset/template_based.py
__init__ ¶
__init__(
path: str,
split: str,
input_template: str,
subset: str | None = None,
dataset_kwargs: dict[str, Any] | None = None,
reference_template: str | None = None,
reference_list_template: str | None = None,
require_incremental_response: bool = False,
extra_info_templates: dict[str, str] | None = None,
system_message_template: str | None = None,
data_range: tuple[int, int] | None = None,
keep_conditions: dict[str, str] | None = None,
remove_conditions: dict[str, str] | None = None,
) -> None
Source code in flexeval/core/chat_dataset/template_based.py
JsonlChatDataset ¶
Load ChatInstances from a JSONL file.
Parameters:
- path (str) – The path to the JSONL file.
Source code in flexeval/core/chat_dataset/template_based.py
__init__ ¶
__init__(
path: str,
input_template: str,
reference_template: str | None = None,
reference_list_template: str | None = None,
require_incremental_response: bool = False,
extra_info_templates: dict[str, str] | None = None,
system_message_template: str | None = None,
data_range: tuple[int, int] | None = None,
keep_conditions: dict[str, str] | None = None,
remove_conditions: dict[str, str] | None = None,
) -> None
Source code in flexeval/core/chat_dataset/template_based.py
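Since the constructor takes the same template parameters as TemplateChatDataset, each line of the file presumably becomes one item dict. The sketch below shows that reading step under this assumption; `read_jsonl` is a hypothetical helper, and the field names `question` and `answer` are made up for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def read_jsonl(path: Path) -> list[dict[str, Any]]:
    """Parse one JSON object per non-empty line."""
    items = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                items.append(json.loads(line))
    return items


with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.jsonl"
    sample_path.write_text(
        '{"question": "What is 2 + 2?", "answer": "4"}\n'
        '{"question": "What is the capital of France?", "answer": "Paris"}\n',
        encoding="utf-8",
    )
    items = read_jsonl(sample_path)
```

With items shaped like this, an `input_template` such as `"{{ question }}"` and a `reference_template` such as `"{{ answer }}"` would be natural choices.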
TemplateChatDataset ¶
This class only supports single-turn chat.
Parameters:
- items (list[dict[str, Any]]) – A list of items in a dict format. The "tools" key for each item can contain the list of function definitions. They should be in JSON Schema format as in the OpenAI Chat Completion API. https://platform.openai.com/docs/guides/function-calling?api-mode=chat#defining-functions
- input_template (str) – A Jinja2 template for the user input.
- reference_template (str | None, default: None) – Specify the Jinja2 template to render the reference string if the dataset has a single reference.
- reference_list_template (str | None, default: None) – Specify the Jinja2 template to render a list of reference strings if the dataset has multiple references.
- require_incremental_response (bool, default: False) – Whether the dataset requires incremental response.
- extra_info_templates (dict[str, str] | None, default: None) – A dictionary of Jinja2 templates for extra information.
- system_message_template (str | None, default: None) – A Jinja2 template for the system message.
- data_range (tuple[int, int] | None, default: None) – The range of data to use.
- keep_conditions (dict[str, str] | None, default: None) – A dictionary to indicate the condition to filter certain items. The key is a Jinja2 template string to embed the item into a string, and the value is the value to keep.
- remove_conditions (dict[str, str] | None, default: None) – A dictionary to indicate the condition to remove certain items. The key is a Jinja2 template string to embed the item into a string, and the value is the value to remove.
Source code in flexeval/core/chat_dataset/template_based.py
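How the templates might turn one item into a single-turn chat can be sketched as below. To stay dependency-free, a tiny regex-based `render` function stands in for Jinja2 and only handles plain `{{ name }}` placeholders. `build_messages`, the item fields, and the placement of the system message as the first message are assumptions for illustration.

```python
import re
from typing import Any, Optional


def render(template: str, item: dict[str, Any]) -> str:
    """Tiny stand-in for Jinja2: replaces `{{ name }}` with item[name]."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}", lambda m: str(item[m.group(1)]), template
    )


def build_messages(
    item: dict[str, Any],
    input_template: str,
    system_message_template: Optional[str] = None,
) -> list[dict[str, str]]:
    """Render one item into a single-turn message list."""
    messages = []
    if system_message_template:
        # Assumption: the system message goes first.
        content = render(system_message_template, item)
        messages.append({"role": "system", "content": content})
    messages.append({"role": "user", "content": render(input_template, item)})
    return messages


item = {
    "question": "What is the capital of France?",
    "answer": "Paris",
    "domain": "geography",
}
messages = build_messages(
    item,
    input_template="Q: {{ question }}",
    system_message_template="You are a {{ domain }} tutor.",
)
# A single reference rendered in the spirit of `reference_template`.
references = [render("{{ answer }}", item)]
```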
reference_template instance-attribute ¶
reference_template = (
from_string(reference_template)
if reference_template
else None
)
reference_list_template instance-attribute ¶
reference_list_template = (
from_string(reference_list_template)
if reference_list_template
else None
)
__init__ ¶
__init__(
items: list[dict[str, Any]],
input_template: str,
reference_template: str | None = None,
reference_list_template: str | None = None,
require_incremental_response: bool = False,
extra_info_templates: dict[str, str] | None = None,
system_message_template: str | None = None,
data_range: tuple[int, int] | None = None,
keep_conditions: dict[str, str] | None = None,
remove_conditions: dict[str, str] | None = None,
) -> None
Source code in flexeval/core/chat_dataset/template_based.py
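The `keep_conditions` and `remove_conditions` parameters pair a template with an expected rendered value. A minimal sketch of that filtering follows, with two assumptions labeled in the code: an item must satisfy every keep condition, and matching any remove condition drops it. The regex-based `render` is a stand-in for Jinja2, and `filter_items` is a hypothetical helper.

```python
import re
from typing import Any, Optional


def render(template: str, item: dict[str, Any]) -> str:
    """Tiny stand-in for Jinja2: replaces `{{ name }}` with item[name]."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}", lambda m: str(item[m.group(1)]), template
    )


def filter_items(
    items: list[dict[str, Any]],
    keep_conditions: Optional[dict[str, str]] = None,
    remove_conditions: Optional[dict[str, str]] = None,
) -> list[dict[str, Any]]:
    kept = []
    for item in items:
        # Assumption: every keep condition must match.
        if keep_conditions and any(
            render(template, item) != value
            for template, value in keep_conditions.items()
        ):
            continue
        # Assumption: a single matching remove condition drops the item.
        if remove_conditions and any(
            render(template, item) == value
            for template, value in remove_conditions.items()
        ):
            continue
        kept.append(item)
    return kept


items = [
    {"id": 1, "lang": "en", "category": "math"},
    {"id": 2, "lang": "ja", "category": "math"},
    {"id": 3, "lang": "en", "category": "spam"},
]
filtered = filter_items(
    items,
    keep_conditions={"{{ lang }}": "en"},
    remove_conditions={"{{ category }}": "spam"},
)
```

Note that the comparison is between strings, so a numeric field such as `id` would need a string value like `"1"` in the condition.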
require_incremental_response ¶
require_incremental_response() -> bool
Source code in flexeval/core/chat_dataset/template_based.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/template_based.py
__getitem__ ¶
__getitem__(i: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/template_based.py
ChatbotBench ¶
This class loads data in the JSONL format used by chat evaluation benchmarks such as MT-Bench (Multi-turn Benchmark) and the Vicuna QA Benchmark.
Example of a line from a JSONL file:
{
  "question_id": 00,
  "category": "writing",
  "turns": [
    "Compose an engaging travel blog post about a recent trip to Hawaii.",
    "Rewrite your previous response. Start every sentence with the letter A."
  ]
  # 'tools' key is optional.
  # It should be in the same format as FunctionCalling in the OpenAI ChatCompletion API.
  # https://platform.openai.com/docs/guides/function-calling?api-mode=chat#defining-functions
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City and country e.g. Bogotá, Colombia"},
          },
          "required": ["location"],
          "additionalProperties": False
        },
        "strict": True
      },
    },
  ],
  # 'system_message' key is optional.
  # If set, it will be inserted in the first turn as a system prompt
  "system_message": "You are a helpful assistant."
}
Source code in flexeval/core/chat_dataset/chatbot_bench.py
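The mapping from such a record to chat messages can be sketched as below. This is an illustration under stated assumptions, not the class's actual code: `record_to_messages` is a hypothetical helper, each entry in `turns` is treated as one user utterance, and the optional system message is placed first.

```python
from typing import Any


def record_to_messages(record: dict[str, Any]) -> list[dict[str, str]]:
    """Turn one benchmark record into a list of chat messages."""
    messages = []
    if "system_message" in record:
        # The optional system prompt goes before the first turn.
        messages.append({"role": "system", "content": record["system_message"]})
    # Assumption: every entry in "turns" is a user utterance.
    messages.extend({"role": "user", "content": turn} for turn in record["turns"])
    return messages


record = {
    "question_id": 0,
    "category": "writing",
    "turns": [
        "Compose an engaging travel blog post about a recent trip to Hawaii.",
        "Rewrite your previous response. Start every sentence with the letter A.",
    ],
    "system_message": "You are a helpful assistant.",
}
messages = record_to_messages(record)

# Categories that need reference answers by default, per `need_ref_categories`.
needs_reference = record["category"] in ["math", "coding", "reasoning"]
```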
need_ref_categories instance-attribute ¶
need_ref_categories = need_ref_categories or [
"math",
"coding",
"reasoning",
]
__init__ ¶
__init__(
path_or_name: str,
ref_path_or_name: str | None = None,
need_ref_categories: list[str] | None = None,
load_only_first_n: int | None = None,
) -> None
Source code in flexeval/core/chat_dataset/chatbot_bench.py
require_incremental_response ¶
require_incremental_response() -> bool
Source code in flexeval/core/chat_dataset/chatbot_bench.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/chatbot_bench.py
__getitem__ ¶
__getitem__(i: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/chatbot_bench.py
OpenAIMessagesDataset ¶
This class loads data in an OpenAI-like format from a JSONL file. The difference is that this class has a 'tool_definition' field, which lists the available tools.
Parameters:
- file_path (str | list[str] | None, default: None) – Path or list of paths to .jsonl file(s).
- message_key (str, default: 'messages') – Key used to extract the list of messages from each JSON object.
- tool_definitions_key (str | None, default: None) – Key used to extract the list of tool definitions from each JSON object. Set to None (default) for data without tool_calls.
- drop_if_last_from_assistant (bool, default: False) – If true, when the last utterance is given by assistant, drop it.
Each line of the JSONL file must have the following structure:
{
'
Source code in flexeval/core/chat_dataset/openai_messages.py
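The roles of the three keyword parameters can be sketched as below. `parse_line` is a hypothetical helper written for this illustration, the record layout is assumed from the parameter descriptions above, and `"tool_definition"` is used only as an example key name.

```python
import json
from typing import Any, Optional


def parse_line(
    line: str,
    message_key: str = "messages",
    tool_definitions_key: Optional[str] = None,
    drop_if_last_from_assistant: bool = False,
) -> dict[str, Any]:
    """Extract messages and optional tool definitions from one JSONL line."""
    record = json.loads(line)
    messages = list(record[message_key])
    if (
        drop_if_last_from_assistant
        and messages
        and messages[-1]["role"] == "assistant"
    ):
        # Drop a trailing assistant utterance so the chat ends on the user.
        messages = messages[:-1]
    tools = record.get(tool_definitions_key) if tool_definitions_key else None
    return {"messages": messages, "tools": tools}


line = json.dumps(
    {
        "messages": [
            {"role": "user", "content": "What is the weather in Paris?"},
            {"role": "assistant", "content": "Let me check."},
        ],
        "tool_definition": [
            {"type": "function", "function": {"name": "get_weather"}}
        ],
    }
)
parsed = parse_line(
    line,
    tool_definitions_key="tool_definition",
    drop_if_last_from_assistant=True,
)
```

With the default `tool_definitions_key=None`, the sketch ignores any tool definitions in the record and returns `None` for `tools`.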
__init__ ¶
__init__(
file_path: str | None = None,
message_key: str = "messages",
tool_definitions_key: str | None = None,
drop_if_last_from_assistant: bool = False,
) -> None
Source code in flexeval/core/chat_dataset/openai_messages.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/openai_messages.py
__getitem__ ¶
__getitem__(idx: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/openai_messages.py
SacreBleuChatDataset ¶
Load datasets from the sacrebleu library. The available datasets are defined in sacrebleu.DATASETS.
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
__init__ ¶
__init__(name: str, langpair: str) -> None
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
require_incremental_response ¶
require_incremental_response() -> bool
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
__len__ ¶
__len__() -> int
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py
__getitem__ ¶
__getitem__(i: int) -> ChatInstance
Source code in flexeval/core/chat_dataset/sacrebleu_dataset.py