Adding a New Dataset
To add a new dataset to sktime's internal dataset repository, follow these steps:
1. From the root of your local sktime repository, create a <dataset-name> folder:

       mkdir ./datasets/data/<dataset-name>

2. In the above directory, add your dataset file <dataset-name>.<EXT>, where <EXT> is the file extension:

   - The list of supported file formats is available in the sktime/MANIFEST.in file (e.g., .csv, .txt).
   - If your file format <EXT> does not appear in the list, add it to the sktime/MANIFEST.in file:

         "sktime/MANIFEST.in"
         ...
         recursive-include sktime/datasets *.csv ... *.<EXT>
         ...

3. In sktime/datasets/_single_problem_loaders.py, declare a load_<dataset-name>(...) function. Feel free to use any of the loader functions already declared there as a template, for either classification or regression datasets.

4. In sktime/datasets/__init__.py, append "load_<dataset-name>" to the list __all__.

5. In sktime/datasets/setup.py, append "<dataset-name>" to the tuple included_datasets.
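
To make the loader step more concrete, here is a minimal, standalone sketch of what a load_<dataset-name>(...) function might look like. It is not sktime's actual implementation: the dataset name my_dataset, the data_root parameter, and the (header, rows) return value are all assumptions made for illustration, and it uses only the Python standard library. Real loaders in _single_problem_loaders.py typically return pandas or numpy objects and rely on helpers in that module, so use an existing loader there as the authoritative template.

```python
import csv
from pathlib import Path

# Hypothetical default location, mirroring the folder created in step 1.
DEFAULT_DATA_ROOT = Path("datasets") / "data"


def load_my_dataset(data_root=DEFAULT_DATA_ROOT):
    """Load the hypothetical my_dataset.csv file.

    Assumes a CSV file with one header row followed by numeric rows,
    stored at <data_root>/my_dataset/my_dataset.csv.

    Returns a (header, rows) tuple, where header is a list of column
    names and rows is a list of lists of floats.
    """
    path = Path(data_root) / "my_dataset" / "my_dataset.csv"
    with path.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader)  # first line holds the column names
        rows = [[float(value) for value in row] for row in reader]
    return header, rows
```

Once a function like this exists, steps 4 and 5 amount to registering its name: "load_my_dataset" in __all__ and "my_dataset" in included_datasets.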