Adding a New Dataset

To add a new dataset to sktime's internal dataset repository, follow these steps:

  1. From the root of your local sktime repository, create a <dataset-name> folder:

    mkdir ./sktime/datasets/data/<dataset-name>

  2. In the above directory, add your dataset file <dataset-name>.<EXT>, where <EXT> is the file extension:

    • The list of supported file formats is available in the sktime/ file (e.g., .csv, .txt).

    • If your file format <EXT> is not in the list, add it to the sktime/ file:

    recursive-include sktime/datasets *.csv ... *.<EXT>

  3. In sktime/datasets/, declare a load_<dataset-name>(...) function. You can use any of the existing loader functions as a template, for either classification or regression datasets.

  4. In sktime/datasets/, append "load_<dataset-name>" to the list __all__.

  5. In sktime/datasets/, append "<dataset-name>" to the tuple included_datasets.