BaseBenchmark

class BaseBenchmark(id_format: str | None = None, backend=None, backend_params=None, return_data=False)

Base class for benchmarks.

A benchmark consists of a set of tasks and a set of estimators.

Parameters:
id_format: str, optional (default=None)

A regex that task and estimator IDs must match. If None, no format is enforced on task/estimator IDs.
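As an illustration of what an ID format regex can express, the sketch below checks IDs against a pattern using the standard `re` module. The pattern and the helper `id_is_valid` are hypothetical, and the use of `re.fullmatch` is an assumption; the library may apply the regex differently.

```python
import re

# Hypothetical ID format: lowercase name, a hyphen, then a version such as "v1".
id_format = r"[a-z]+-v[0-9]+"


def id_is_valid(entity_id, id_format=None):
    """Return True if entity_id satisfies id_format, or if no format is set."""
    if id_format is None:
        return True  # no format is enforced
    # Assumption: the whole ID must match the pattern.
    return re.fullmatch(id_format, entity_id) is not None


print(id_is_valid("naive-v1", id_format))
print(id_is_valid("Naive", id_format))
print(id_is_valid("Naive"))
```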

backend: string, by default “None”

Parallelization backend to use for runs.

  • “None”: executes loop sequentially, simple list comprehension

  • “loky”, “multiprocessing” and “threading”: uses joblib.Parallel loops

  • “joblib”: custom and 3rd party joblib backends, e.g., spark

  • “dask”: uses dask, requires dask package in environment

  • “dask_lazy”: same as “dask”, but changes the return to a (lazy) dask.dataframe.DataFrame.

  • “ray”: uses ray, requires ray package in environment

Recommendation: use “dask” or “loky” for parallel evaluation. “threading” is unlikely to yield speed-ups because of the GIL, and the serialization backend (cloudpickle) used by “dask” and “loky” is generally more robust than the standard pickle library used by “multiprocessing”.

backend_params: dict, optional

Additional parameters passed to the backend as config. Passed directly to utils.parallel.parallelize. Valid keys depend on the value of backend:

  • “None”: no additional parameters, backend_params is ignored

  • “loky”, “multiprocessing” and “threading”: default joblib backends. Any valid key for joblib.Parallel can be passed here, e.g., n_jobs, with the exception of backend, which is controlled directly by the backend parameter. If n_jobs is not passed, it defaults to -1; other parameters default to the joblib defaults.

  • “joblib”: custom and 3rd party joblib backends, e.g., spark. Any valid key for joblib.Parallel can be passed here, e.g., n_jobs. In this case, backend must be passed as a key of backend_params. If n_jobs is not passed, it defaults to -1; other parameters default to the joblib defaults.

  • “dask”: any valid keys for dask.compute can be passed, e.g., scheduler

  • “ray”: The following keys can be passed:

    • “ray_remote_args”: dictionary of valid keys for ray.init

    • “shutdown_ray”: bool, default=True; False prevents ray from shutting down after parallelization.

    • “logger_name”: str, default=“ray”; name of the logger to use.

    • “mute_warnings”: bool, default=False; if True, suppresses warnings
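The key sets listed above can be summarized as plain dictionaries. The following is a minimal sketch, not part of the library: the helper `example_backend_params` is hypothetical, and the concrete values (`n_jobs=2`, `scheduler="threads"`, `num_cpus`, and the `"spark"` joblib backend, which needs a third-party package to be registered) are illustrative assumptions.

```python
def example_backend_params(backend, n_jobs=2):
    """Return an illustrative backend_params dict for a given backend value."""
    if backend is None or backend == "None":
        # Sequential execution: backend_params is ignored.
        return None
    if backend in ("loky", "multiprocessing", "threading"):
        # Keys for joblib.Parallel; "backend" itself must not be passed here.
        return {"n_jobs": n_jobs}
    if backend == "joblib":
        # Custom joblib backend: "backend" must be a key of backend_params.
        return {"backend": "spark", "n_jobs": n_jobs}
    if backend == "dask":
        # Keys for dask.compute.
        return {"scheduler": "threads"}
    if backend == "ray":
        return {
            "ray_remote_args": {"num_cpus": n_jobs},  # keys for ray.init
            "shutdown_ray": True,
            "logger_name": "ray",
            "mute_warnings": False,
        }
    raise ValueError(f"backend not covered by this sketch: {backend!r}")


for name in (None, "loky", "joblib", "dask", "ray"):
    print(name, example_backend_params(name))
```

A dictionary built this way would be passed as `backend_params` together with the matching `backend` value.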

return_data: bool, optional (default=False)

Whether to return the prediction and the ground truth data in the results.

Methods

add_estimator(estimator[, estimator_id])

Register an estimator to the benchmark.

add_task(*args, **kwargs)

Register a task to the benchmark.

run([output_file, force_rerun])

Run the benchmarking for all tasks and estimators.

add_estimator(estimator: BaseEstimator, estimator_id: str | None = None)

Register an estimator to the benchmark.

Parameters:
estimator: dict, list or BaseEstimator object

Estimator to add to the benchmark.

  • If BaseEstimator, a single estimator. If estimator_id is not provided, it is generated from the estimator’s class name.

  • If dict, keys are the estimator_id values used as custom identifiers, and values are estimators.

  • If list, each element is an estimator. Each estimator_id is generated automatically from the estimator’s class name.

estimator_id: str, optional (default=None)

Identifier for the estimator. If not given, the estimator’s class name is used.
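The three accepted input forms can be pictured as all reducing to a mapping from estimator_id to estimator. The sketch below illustrates the rules described above with placeholder classes; `normalize_estimators`, `NaiveModel` and `TrendModel` are hypothetical names, and this is not the library's implementation. How duplicate class names in a list are handled is not stated above and is not covered here.

```python
class NaiveModel:
    """Placeholder standing in for an estimator class."""


class TrendModel:
    """Second placeholder estimator class."""


def normalize_estimators(estimator, estimator_id=None):
    """Map the dict / list / single-object forms to {estimator_id: estimator}."""
    if isinstance(estimator, dict):
        # Keys are already the custom identifiers.
        return dict(estimator)
    if isinstance(estimator, list):
        # IDs are derived from each element's class name.
        return {type(est).__name__: est for est in estimator}
    # Single estimator: use the given ID, else fall back to the class name.
    return {estimator_id or type(estimator).__name__: estimator}


print(list(normalize_estimators(NaiveModel())))
print(list(normalize_estimators(NaiveModel(), "baseline")))
print(list(normalize_estimators([NaiveModel(), TrendModel()])))
print(list(normalize_estimators({"my-trend": TrendModel()})))
```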

add_task(*args, **kwargs)

Register a task to the benchmark.

run(output_file: str = None, force_rerun: str | list[str] = 'none')

Run the benchmarking for all tasks and estimators.

If output_file is provided, results will be saved to a file or location, in a format inferred from the file extension.

The exact format is determined by the storage backend used. See the documentation on storage handlers in sktime.benchmarking._storage_handlers.get_storage_backend.

Parameters:
output_file: str or None (default)

Path to save the results to. If None, results will not be saved.

force_rerun: Union[str, list[str]], optional (default=“none”)

  • If “none”, will skip validation if results already exist.

  • If “all”, will run validation for all tasks and models.

  • If list of str, will run validation for tasks and models in list.
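The three force_rerun modes can be read as a filter over (task, model) pairs. The sketch below is one plausible interpretation, not the library's code: `pairs_to_run` is a hypothetical helper, and it assumes that entries of a force_rerun list are task IDs or model IDs and that pairs without stored results always run.

```python
def pairs_to_run(all_pairs, existing, force_rerun="none"):
    """Select the (task_id, model_id) pairs that need to be (re)run.

    all_pairs: list of (task_id, model_id) tuples in the benchmark.
    existing: set of pairs that already have stored results.
    """
    if force_rerun == "none":
        # Skip anything that already has results.
        return [pair for pair in all_pairs if pair not in existing]
    if force_rerun == "all":
        # Rerun everything, regardless of stored results.
        return list(all_pairs)
    if isinstance(force_rerun, str):
        raise ValueError(f"unexpected force_rerun value: {force_rerun!r}")
    # Assumption: list entries name tasks or models to rerun.
    forced = set(force_rerun)
    return [
        (task, model)
        for task, model in all_pairs
        if (task, model) not in existing or task in forced or model in forced
    ]


all_pairs = [("t1", "m1"), ("t1", "m2"), ("t2", "m1")]
existing = {("t1", "m1"), ("t2", "m1")}
print(pairs_to_run(all_pairs, existing, "none"))
print(pairs_to_run(all_pairs, existing, "all"))
print(pairs_to_run(all_pairs, existing, ["t2"]))
```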