ForecastingBenchmark
- class ForecastingBenchmark(id_format: str | None = None, backend=None, backend_params=None, return_data=False)
Forecasting benchmark.
Runs a series of forecasters against a series of tasks, each defined by a dataset loader, a cross-validation splitting strategy and performance metrics, and returns the results as a DataFrame (optionally also saving them to file).
- Parameters:
- id_format: str, optional (default=None)
A regex that task and estimator IDs are required to match.
- backend: string, by default “None”.
Parallelization backend to use for runs.
  - “None”: executes loop sequentially, simple list comprehension
  - “loky”, “multiprocessing” and “threading”: uses joblib.Parallel loops
  - “joblib”: custom and 3rd party joblib backends, e.g., spark
  - “dask”: uses dask, requires dask package in environment
  - “dask_lazy”: same as “dask”, but changes the return to (lazy) dask.dataframe.DataFrame.
  - “ray”: uses ray, requires ray package in environment
Recommendation: Use “dask” or “loky” for parallel evaluate. “threading” is unlikely to see speed ups due to the GIL, and the serialization backend (cloudpickle) for “dask” and “loky” is generally more robust than the standard pickle library used in “multiprocessing”.
- backend_params: dict, optional
Additional parameters passed to the backend as config. Directly passed to utils.parallel.parallelize. Valid keys depend on the value of backend:
  - “None”: no additional parameters, backend_params is ignored
  - “loky”, “multiprocessing” and “threading”: default joblib backends. Any valid keys for joblib.Parallel can be passed here, e.g., n_jobs, with the exception of backend which is directly controlled by backend. If n_jobs is not passed, it will default to -1, other parameters will default to joblib defaults.
  - “joblib”: custom and 3rd party joblib backends, e.g., spark. Any valid keys for joblib.Parallel can be passed here, e.g., n_jobs; backend must be passed as a key of backend_params in this case. If n_jobs is not passed, it will default to -1, other parameters will default to joblib defaults.
  - “dask”: any valid keys for dask.compute can be passed, e.g., scheduler
  - “ray”: The following keys can be passed:
    - “ray_remote_args”: dictionary of valid keys for ray.init
    - “shutdown_ray”: bool, default=True; False prevents ray from shutting down after parallelization.
    - “logger_name”: str, default=“ray”; name of the logger to use.
    - “mute_warnings”: bool, default=False; if True, suppresses warnings
- return_data: bool, optional (default=False)
Whether to return the prediction and the ground truth data in the results.
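As a rough illustration of the constructor parameters above, the sketch below spells out a few backend / backend_params combinations as plain keyword dictionaries. Only the keys are taken from the description above; the concrete values (n_jobs=2, scheduler="threads", num_cpus=2) and the ID pattern are assumptions chosen for the example. Whether IDs are checked with a full match or a search is not stated here, so the re.fullmatch check in the hypothetical helper matches_id_format is likewise an assumption.

```python
import re

# Keyword arguments for ForecastingBenchmark(...); all values are illustrative.
sequential = {"backend": "None"}  # backend_params would be ignored here

joblib_loky = {
    "backend": "loky",
    "backend_params": {"n_jobs": 2},  # any joblib.Parallel key except "backend"
}

joblib_custom = {
    "backend": "joblib",
    # in this case the joblib backend name is passed inside backend_params
    "backend_params": {"backend": "spark", "n_jobs": -1},
}

dask_threads = {
    "backend": "dask",
    "backend_params": {"scheduler": "threads"},  # forwarded to dask.compute
}

ray_config = {
    "backend": "ray",
    "backend_params": {
        "ray_remote_args": {"num_cpus": 2},  # keys valid for ray.init
        "shutdown_ray": True,
        "logger_name": "ray",
        "mute_warnings": False,
    },
}

# Hypothetical ID convention: "<name>-v<number>", e.g. "NaiveForecaster-v1".
id_format = r"[A-Za-z0-9_]+-v\d+"


def matches_id_format(identifier: str, pattern: str = id_format) -> bool:
    """Return True if the whole identifier matches the pattern (assumed semantics)."""
    return re.fullmatch(pattern, identifier) is not None


print(matches_id_format("NaiveForecaster-v1"))  # True
print(matches_id_format("naive forecaster"))    # False
```

Any of these dictionaries could then be unpacked into the constructor, for example ForecastingBenchmark(id_format=id_format, **joblib_loky), provided the corresponding package is installed.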
Methods
add_estimator(estimator[, estimator_id])  Register an estimator to the benchmark.
add_task(dataset_loader, cv_splitter, scorers)  Register a forecasting task to the benchmark.
run([output_file, force_rerun])  Run the benchmarking for all tasks and estimators.
- add_task(dataset_loader: Callable | tuple, cv_splitter: BaseSplitter, scorers: list[BaseMetric], task_id: str | None = None, cv_global: BaseSplitter | None = None, error_score: str = 'raise', strategy: str = 'refit', cv_global_temporal: SingleWindowSplitter | None = None)
Register a forecasting task to the benchmark.
- Parameters:
- dataset_loader: Union[Callable, tuple]
Can be
  - a function which returns a dataset, like from sktime.datasets.
  - a tuple containing two data containers that are sktime compatible.
  - a single data container that is sktime compatible (only endogenous data).
- cv_splitter: BaseSplitter object
Splitter used for generating validation folds.
- scorers: a list of BaseMetric objects
Each BaseMetric output will be included in the results.
- task_id: str, optional (default=None)
Identifier for the benchmark task. If None, the dataset loader name combined with the cv_splitter class name is used.
- cv_global: sklearn splitter, or sktime instance splitter, default=None
If cv_global is passed, then global benchmarking is applied, as follows:
  1. The cv_global splitter is used to split data at instance level, into a global training set y_train, and a global test set y_test_global.
  2. The estimator is fitted to the global training set y_train.
  3. cv_splitter then splits the global test set y_test_global temporally, to obtain temporal splits y_past, y_true.
Overall, with y_train, y_past, y_true as above, the following evaluation will be applied:
  forecaster.fit(y=y_train, fh=cv.fh)
  y_pred = forecaster.predict(y=y_past)
  metric(y_true, y_pred)
- error_score: “raise” or numeric, default=“raise”
Value to assign to the score if an exception occurs in estimator fitting. If set to “raise”, the exception is raised. If a numeric value is given, FitFailedWarning is raised.
- strategy: {“refit”, “update”, “no-update_params”}, optional, default=“refit”
Defines the ingestion mode when the forecaster is updated with new data:
  - “refit” = forecaster is refitted to each training window
  - “update” = forecaster is updated with training window data, in sequence provided
  - “no-update_params” = fit to first training window, re-used without fit or update
- cv_global_temporal: SingleWindowSplitter, default=None
Ignored if cv_global is None. If passed, it splits the Panel temporally before the instance split from cv_global is applied. This avoids temporal leakage in the global evaluation across time series. Has to be a SingleWindowSplitter. cv_splitter is applied on the test set resulting from the combined application of cv_global and cv_global_temporal.
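The three global benchmarking steps described for cv_global can be sketched in plain Python. This is a minimal illustration, not sktime code: GlobalDriftForecaster, mean_absolute_error and the toy panel are all hypothetical, the instance split and the single temporal fold are hard-coded rather than produced by real splitters, and cv_global_temporal is left out.

```python
from statistics import mean


class GlobalDriftForecaster:
    """Toy global model: learns one average step-to-step change from all training series."""

    def fit(self, y, fh):
        self.fh_ = list(fh)
        diffs = [b - a for series in y.values() for a, b in zip(series, series[1:])]
        self.drift_ = mean(diffs)
        return self

    def predict(self, y):
        # forecast each series from its own last observed value
        return {
            name: [series[-1] + self.drift_ * h for h in self.fh_]
            for name, series in y.items()
        }


def mean_absolute_error(y_true, y_pred):
    errors = [abs(t - p) for name in y_true for t, p in zip(y_true[name], y_pred[name])]
    return mean(errors)


panel = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    "c": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "d": [20.0, 21.0, 22.0, 23.0, 24.0, 25.0],
}
fh = [1, 2]

# 1. instance-level split (the role of cv_global)
train_names, test_names = ["a", "b"], ["c", "d"]
y_train = {name: panel[name] for name in train_names}
y_test_global = {name: panel[name] for name in test_names}

# 2. fit the estimator to the global training set
forecaster = GlobalDriftForecaster().fit(y=y_train, fh=fh)

# 3. temporal split of the global test set (the role of cv_splitter, one fold only)
y_past = {name: series[: -len(fh)] for name, series in y_test_global.items()}
y_true = {name: series[-len(fh):] for name, series in y_test_global.items()}

y_pred = forecaster.predict(y=y_past)
score = mean_absolute_error(y_true, y_pred)
print(score)  # 0.0, since every toy series increases by exactly 1 per step
```

The point of the sketch is that the model never sees series "c" or "d" during fitting; it only receives their past values at prediction time.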
- Returns:
- A dictionary of benchmark results for that forecaster
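To make the three strategy options of add_task more concrete, here is a small sketch with a toy forecaster. LastValueForecaster and run_strategy are hypothetical names used only for illustration, and the handling of “no-update_params” follows the wording above (fit once, then re-use) rather than any particular library implementation.

```python
class LastValueForecaster:
    """Toy forecaster that predicts the last value it was given (not an sktime class)."""

    def __init__(self):
        self.calls = []  # records which methods were invoked, for illustration

    def fit(self, y):
        self.calls.append("fit")
        self.last_ = y[-1]
        return self

    def update(self, y):
        self.calls.append("update")
        self.last_ = y[-1]
        return self

    def predict(self):
        return self.last_


def run_strategy(forecaster, train_windows, strategy="refit"):
    """Return one prediction per training window under the given strategy."""
    if strategy not in ("refit", "update", "no-update_params"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    predictions = []
    for i, y_train in enumerate(train_windows):
        if i == 0 or strategy == "refit":
            forecaster.fit(y_train)
        elif strategy == "update":
            forecaster.update(y_train)
        # "no-update_params": keep the model from the first window as it is
        predictions.append(forecaster.predict())
    return predictions


series = [1, 2, 3, 4, 5, 6]
train_windows = [series[:3], series[:4], series[:5]]  # expanding windows

for strategy in ("refit", "update", "no-update_params"):
    forecaster = LastValueForecaster()
    predictions = run_strategy(forecaster, train_windows, strategy)
    print(strategy, predictions, forecaster.calls)
# refit [3, 4, 5] ['fit', 'fit', 'fit']
# update [3, 4, 5] ['fit', 'update', 'update']
# no-update_params [3, 3, 3] ['fit']
```

For this toy model “refit” and “update” give the same predictions; with a real forecaster they generally differ in cost and possibly in the fitted parameters.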
- add_estimator(estimator: BaseEstimator, estimator_id: str | None = None)
Register an estimator to the benchmark.
- Parameters:
- estimator: dict, list or BaseEstimator object
Estimator to add to the benchmark.
  - If BaseEstimator, single estimator. estimator_id is generated as the estimator’s class name if not provided.
  - If dict, keys are the estimator_id values used to customise the identifier, and values are estimators.
  - If list, each element is an estimator. The estimator_id values are generated automatically using the estimator’s class name.
- estimator_id: str, optional (default=None)
Identifier for estimator. If none given then uses estimator’s class name.
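The three accepted forms of the estimator argument can be pictured as a mapping from estimator_id to estimator. The helper below, to_id_mapping, is hypothetical and only mirrors the rules stated above; NaiveModel and TrendModel are placeholder classes standing in for real estimators.

```python
def to_id_mapping(estimator, estimator_id=None):
    """Hypothetical helper: map the three accepted input forms to {estimator_id: estimator}."""
    if isinstance(estimator, dict):
        # keys are already the identifiers
        return dict(estimator)
    if isinstance(estimator, list):
        # identifiers are derived from each element's class name
        return {type(est).__name__: est for est in estimator}
    if estimator_id is None:
        estimator_id = type(estimator).__name__
    return {estimator_id: estimator}


class NaiveModel:
    """Placeholder for an estimator."""


class TrendModel:
    """Placeholder for an estimator."""


print(list(to_id_mapping(NaiveModel())))                     # ['NaiveModel']
print(list(to_id_mapping(NaiveModel(), "naive-v1")))         # ['naive-v1']
print(list(to_id_mapping({"trend-v2": TrendModel()})))       # ['trend-v2']
print(list(to_id_mapping([NaiveModel(), TrendModel()])))     # ['NaiveModel', 'TrendModel']
```

In this sketch, two list elements of the same class would collide on one key; how the benchmark itself handles that case is not stated here, so the dict form is the safer choice when comparing several configurations of one estimator class.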
- run(output_file: str = None, force_rerun: str | list[str] = 'none')
Run the benchmarking for all tasks and estimators.
If output_file is provided, results will be saved to a file or location, in a format inferred from the file extension.
The exact format is determined by the storage backend used, see documentation on storage handlers in sktime.benchmarking._storage_handlers.get_storage_backend.
- Parameters:
- output_file: str or None (default)
Path to save the results to. If None, results will not be saved.
- force_rerun: Union[str, list[str]], optional (default=“none”)
  - If “none”, will skip validation if results already exist.
  - If “all”, will run validation for all tasks and models.
  - If list of str, will run validation for tasks and models in list.