ResultRecorder
An abstract base class for recording experiment results, including configuration, metrics, and model outputs.
This class defines the interface for different result recording implementations, such as saving to a local directory, uploading to wandb, or integrating with MLflow.
Source code in flexeval/core/result_recorder/base.py
record_config (abstractmethod)
record_config(
config: dict[str, Any], group: str | None = None
) -> None
Record the configuration parameters of the experiment.
Parameters:
- config (dict[str, Any]) – A dictionary containing the configuration parameters of the evaluation.
- group (str | None, default: None) – An optional group name to organize the configuration.
Source code in flexeval/core/result_recorder/base.py
record_metrics (abstractmethod)
record_metrics(
metrics: dict[str, Any], group: str | None = None
) -> None
Record the evaluation metrics of the experiment.
Parameters:
- metrics (dict[str, Any]) – A dictionary containing the evaluation metrics, where keys are metric names and values are the corresponding results.
- group (str | None, default: None) – An optional group name to organize the metrics.
Source code in flexeval/core/result_recorder/base.py
record_model_outputs (abstractmethod)
record_model_outputs(
model_outputs: list[dict[str, Any]],
group: str | None = None,
) -> None
Record the outputs generated by the model during evaluation.
Parameters:
- model_outputs (list[dict[str, Any]]) – A list of dictionaries, where each dictionary represents a single model output. The structure of these dictionaries may vary depending on the specific model and task.
- group (str | None, default: None) – An optional group name to organize the model outputs.
Source code in flexeval/core/result_recorder/base.py
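As a rough illustration of the interface described above, the sketch below implements the three abstract methods with an in-memory store. The `ResultRecorder` class here is a local stand-in that only mirrors the documented signatures so the example runs without flexeval installed; in real code you would presumably import the base class from `flexeval.core.result_recorder.base` instead. `InMemoryRecorder` and its attributes are hypothetical names, not part of the library.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class ResultRecorder(ABC):
    """Local stand-in that mirrors the three documented abstract methods."""

    @abstractmethod
    def record_config(self, config: dict[str, Any], group: str | None = None) -> None: ...

    @abstractmethod
    def record_metrics(self, metrics: dict[str, Any], group: str | None = None) -> None: ...

    @abstractmethod
    def record_model_outputs(
        self, model_outputs: list[dict[str, Any]], group: str | None = None
    ) -> None: ...


class InMemoryRecorder(ResultRecorder):
    """Hypothetical recorder that keeps results in dicts keyed by group name."""

    def __init__(self) -> None:
        self.configs: dict[str | None, dict[str, Any]] = {}
        self.metrics: dict[str | None, dict[str, Any]] = {}
        self.model_outputs: dict[str | None, list[dict[str, Any]]] = {}

    def record_config(self, config: dict[str, Any], group: str | None = None) -> None:
        # Copy so later mutation by the caller does not change the record.
        self.configs[group] = dict(config)

    def record_metrics(self, metrics: dict[str, Any], group: str | None = None) -> None:
        self.metrics[group] = dict(metrics)

    def record_model_outputs(
        self, model_outputs: list[dict[str, Any]], group: str | None = None
    ) -> None:
        self.model_outputs[group] = [dict(row) for row in model_outputs]


recorder = InMemoryRecorder()
recorder.record_config({"model": "dummy"})
recorder.record_metrics({"accuracy": 0.5}, group="dev")
recorder.record_model_outputs([{"output": "hello"}], group="dev")
```

Keying the stores by `group` (with `None` for ungrouped results) is one possible reading of "an optional group name to organize" the data; other recorders may organize groups differently.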
LocalRecorder
A class to record the results in JSON format.
Parameters:
- output_dir (str) – The directory to save the results.
Source code in flexeval/core/result_recorder/local_recorder.py
__init__
__init__(output_dir: str, force: bool = False) -> None
Source code in flexeval/core/result_recorder/local_recorder.py
record_config
record_config(
config: dict[str, Any], group: str | None = None
) -> None
Source code in flexeval/core/result_recorder/local_recorder.py
record_metrics
record_metrics(
metrics: dict[str, Any], group: str | None = None
) -> None
Source code in flexeval/core/result_recorder/local_recorder.py
record_model_outputs
record_model_outputs(
model_outputs: list[dict[str, Any]],
group: str | None = None,
) -> None
Source code in flexeval/core/result_recorder/local_recorder.py
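To make the idea of a JSON-writing recorder concrete, here is a minimal sketch of a class with the same constructor and method signatures. It is not flexeval's `LocalRecorder` implementation. The class name `JsonDirRecorder`, the file names (`config.json`, `metrics.json`, `outputs.jsonl`), the mapping of `group` to a subdirectory, and the interpretation of `force` as "allow overwriting existing files" are all assumptions made for illustration.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


class JsonDirRecorder:
    """Hypothetical stand-in; file layout and `force` semantics are assumed."""

    def __init__(self, output_dir: str, force: bool = False) -> None:
        self.output_dir = Path(output_dir)
        # Assumption: force=True permits overwriting files that already exist.
        self.force = force

    def _target(self, filename: str, group: str | None) -> Path:
        # Assumption: a group name maps to a subdirectory of output_dir.
        directory = self.output_dir / group if group else self.output_dir
        directory.mkdir(parents=True, exist_ok=True)
        path = directory / filename
        if path.exists() and not self.force:
            raise FileExistsError(f"{path} already exists; pass force=True to overwrite")
        return path

    def _write_json(self, data: dict[str, Any], filename: str, group: str | None) -> None:
        path = self._target(filename, group)
        path.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")

    def record_config(self, config: dict[str, Any], group: str | None = None) -> None:
        self._write_json(config, "config.json", group)

    def record_metrics(self, metrics: dict[str, Any], group: str | None = None) -> None:
        self._write_json(metrics, "metrics.json", group)

    def record_model_outputs(
        self, model_outputs: list[dict[str, Any]], group: str | None = None
    ) -> None:
        # One JSON object per line keeps large output lists easy to stream.
        path = self._target("outputs.jsonl", group)
        with path.open("w", encoding="utf-8") as f:
            for row in model_outputs:
                f.write(json.dumps(row, ensure_ascii=False) + "\n")
```

Refusing to overwrite by default is a design choice in this sketch that protects earlier results from accidental reruns; check the library source for the behavior `LocalRecorder` actually implements.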
WandBRecorder
A class to record the results to Weights & Biases.
Parameters:
- init_kwargs (dict[str, Any] | None, default: None) – The arguments for the wandb.init function. Please refer to the official documentation for the details.
Source code in flexeval/core/result_recorder/wandb_recorder.py
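The `init_kwargs` dictionary might be built as in the sketch below. `project`, `name`, and `tags` are standard `wandb.init` keyword arguments, but the values are placeholders. The commented-out usage assumes the class can be imported from the module path implied by the source file location above, and it needs both flexeval and wandb installed plus a logged-in Weights & Biases account, so it is left as comments here.

```python
from typing import Any

# Keyword arguments forwarded to `wandb.init`; all values are placeholders.
init_kwargs: dict[str, Any] = {
    "project": "my-eval-project",
    "name": "baseline-run",
    "tags": ["flexeval"],
}

# Assumed usage (not executed here):
# from flexeval.core.result_recorder.wandb_recorder import WandBRecorder
# recorder = WandBRecorder(init_kwargs=init_kwargs)
# recorder.record_config({"model": "dummy"})
# recorder.record_metrics({"accuracy": 0.5})
```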
__init__
__init__(init_kwargs: dict[str, Any] | None = None) -> None
Source code in flexeval/core/result_recorder/wandb_recorder.py
record_config
record_config(
config: dict[str, Any], group: str | None = None
) -> None
Source code in flexeval/core/result_recorder/wandb_recorder.py
record_metrics
record_metrics(
metrics: dict[str, Any], group: str | None = None
) -> None
Source code in flexeval/core/result_recorder/wandb_recorder.py
record_model_outputs
record_model_outputs(
model_outputs: list[dict[str, Any]],
group: str | None = None,
) -> None
Source code in flexeval/core/result_recorder/wandb_recorder.py
__del__
__del__() -> None
Source code in flexeval/core/result_recorder/wandb_recorder.py