Adding a New Dataset
To add a new dataset to sktime's internal dataset repository, please proceed with the following steps:
From the root of your local sktime repository, create a <dataset-name> folder:
mkdir ./datasets/data/<dataset-name>
In the above directory, add your dataset file <dataset-name>.<EXT>, where <EXT> is the file extension.
The list of supported file formats is available in the sktime/MANIFEST.in file (e.g., .csv, .txt).
If your file format <EXT> does not appear in that list, add it to the sktime/MANIFEST.in file:
"sktime/MANIFEST.in"
...
recursive-include sktime/datasets *.csv ... *.<EXT>
...
In sktime/datasets/_single_problem_loaders.py, declare a load_<dataset-name>(...) function. Feel free to use any of the other declared functions as templates for either classification or regression datasets.
In sktime/datasets/__init__.py, append "load_<dataset-name>" to the list __all__.
In sktime/datasets/setup.py, append "<dataset-name>" to the tuple included_datasets.
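
For the file-format step, a quick way to check whether an extension is already covered by MANIFEST.in is sketched below. This is only an illustration, not part of sktime: the helper names packaged_patterns and is_packaged, the file name my_dataset, and the example manifest text are all hypothetical. It assumes that the relevant entries are written as a single recursive-include line for sktime/datasets, as in the excerpt in the steps.

```python
from fnmatch import fnmatch


def packaged_patterns(manifest_text, package_dir="sktime/datasets"):
    """Collect the glob patterns on recursive-include lines for package_dir."""
    patterns = []
    for line in manifest_text.splitlines():
        parts = line.split()
        # A matching line looks like:
        #   recursive-include <dir> <pattern> <pattern> ...
        if (
            len(parts) >= 3
            and parts[0] == "recursive-include"
            and parts[1] == package_dir
        ):
            patterns.extend(parts[2:])
    return patterns


def is_packaged(filename, manifest_text, package_dir="sktime/datasets"):
    """Return True if filename matches one of the patterns for package_dir."""
    patterns = packaged_patterns(manifest_text, package_dir)
    return any(fnmatch(filename, pattern) for pattern in patterns)


# Hypothetical manifest content, for illustration only (not the real file).
example_manifest = "recursive-include sktime/datasets *.csv *.txt\n"

print(is_packaged("my_dataset.csv", example_manifest))
print(is_packaged("my_dataset.parquet", example_manifest))
```

To run the check against your own checkout, read the text of your MANIFEST.in file and pass it in place of example_manifest. If the result is negative, the extension still needs to be added as described in the steps.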