Data Format Specifications

This section provides specifications for:

  • Python in-memory data containers used in sktime (e.g., time series, panel data, etc.)

  • serialized file formats used by sktime (e.g., the .ts format)

For utilities to check and convert data formats, see the API reference on Utility functions.

In-memory Data Specifications

sktime uses a variety of in-memory data containers to represent time series data.

The in-memory specifications are listed by abstract data type, also referred to as data scitype (scientific type) throughout the documentation.

The core scitypes in sktime are:

  • Series: a single time series

  • Panel: a flat collection of time series, also called panel of time series, or panel data

  • Hierarchical: a hierarchical collection of time series

  • Table: a data frame table, as implemented for instance by pandas.DataFrame

Each scitype has subtypes defined by property fields, such as is_univariate (whether the time series is univariate) and n_instances (the number of instances in a panel or hierarchical collection).
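
As a rough illustration of what such property fields describe, the sketch below computes two of them by hand with pandas. The functions `is_univariate` and `n_instances` borrow the field names above but are hypothetical helpers written for this example, not sktime API; the column names and values are made up.

```python
import pandas as pd


def is_univariate(series_df: pd.DataFrame) -> bool:
    """Return True if the frame holds exactly one variable (one column)."""
    return series_df.shape[1] == 1


def n_instances(panel: list) -> int:
    """Return the number of time series in a list-of-DataFrame collection."""
    return len(panel)


# One time series with two variables observed at three time points.
multivariate = pd.DataFrame(
    {"temp": [20.1, 20.5, 21.0], "humidity": [0.41, 0.40, 0.38]}
)
# Keeping a single column gives a univariate series.
univariate = multivariate[["temp"]]

# A flat collection (panel) of two univariate series.
panel = [univariate, univariate * 2]

print(is_univariate(univariate))    # True
print(is_univariate(multivariate))  # False
print(n_instances(panel))           # 2
```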

Concrete data types in sktime are implementations of these abstract data types and are referred to as data mtypes (machine types) throughout the documentation.

Full specifications of the abstract data types, with their subtypes, can be accessed below:

ScitypeSeries([is_univariate, ...])

Series data type.

ScitypePanel([is_univariate, ...])

Panel data type.

ScitypeHierarchical([is_univariate, ...])

Hierarchical data type.

ScitypeTable([is_univariate, is_empty, ...])

Data Frame or Table data type.

Series mtype specifications

Each Series mtype represents a single time series.

SeriesPdDataFrame([is_univariate, ...])

Data type: pandas.DataFrame based specification of single time series.

SeriesPdSeries([is_univariate, ...])

Data type: pandas.Series based specification of single time series.

SeriesNp2D([is_univariate, ...])

Data type: 2D np.ndarray based specification of single time series.

SeriesDask([is_univariate, ...])

Data type: dask.DataFrame based specification of single time series.

SeriesPolarsEager([is_univariate, ...])

Data type: polars.DataFrame based specification of single time series.

SeriesGluontsList([is_univariate, ...])

Data type: gluonts ListDataset based specification of single time series.

SeriesGluontsPandas([is_univariate, ...])

Data type: gluonts PandasDataset based specification of single time series.
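
To make the first three containers concrete, here is a minimal sketch that holds the same univariate series as a pandas.Series, a pandas.DataFrame, and a 2D np.ndarray. It assumes the convention that rows are time points and columns are variables; the variable name `y` and the values are invented for illustration, and the full requirements for each container are in the linked specifications.

```python
import numpy as np
import pandas as pd

# Four daily observations of one variable (illustrative values).
index = pd.date_range("2024-01-01", periods=4, freq="D")
values = [1.0, 2.0, 4.0, 8.0]

# pandas.Series: the index carries the time points.
as_pd_series = pd.Series(values, index=index, name="y")

# pandas.DataFrame: rows are time points, columns are variables (assumed layout).
as_pd_frame = as_pd_series.to_frame()

# 2D np.ndarray: same layout, but the time index is dropped.
as_np_2d = as_pd_frame.to_numpy()

print(as_pd_series.shape)  # (4,)
print(as_pd_frame.shape)   # (4, 1)
print(as_np_2d.shape)      # (4, 1)
```

Note that the array form carries no time index, so converting to it and back loses the timestamps.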

Panel mtype specifications

Each Panel mtype represents a flat collection of time series.

PanelPdMultiIndex([is_univariate, ...])

Data type: MultiIndex-ed pd.DataFrame specification of panel of time series.

PanelNp3D([is_univariate, ...])

Data type: 3D np.ndarray based specification of panel of time series.

PanelDfList([is_univariate, ...])

Data type: list-of-pandas.DataFrame based specification of panel of time series.

PanelDask([is_univariate, ...])

Data type: dask.DataFrame based specification of panel of time series.

PanelPolarsEager([is_univariate, ...])

Data type: polars.DataFrame based specification of panel of time series.

PanelGluontsList([is_univariate, ...])

Data type: gluonts ListDataset based specification of panel of time series.

PanelGluontsPandas([is_univariate, ...])

Data type: gluonts PandasDataset based specification of panel of time series.
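
The sketch below builds one small panel in three of the layouts listed above: a list of pandas.DataFrame, a MultiIndex-ed pandas.DataFrame, and a 3D np.ndarray. It assumes a two-level (instance, time) index for the MultiIndex form and an (instance, variable, time) axis order for the array form; the level names and values are illustrative, so check the linked specifications for the exact requirements.

```python
import numpy as np
import pandas as pd

# List of DataFrames: one frame per instance, rows are time points.
df_list = [
    pd.DataFrame({"x": [1.0, 2.0, 3.0]}),
    pd.DataFrame({"x": [10.0, 20.0, 30.0]}),
]

# MultiIndex-ed DataFrame: stack the frames under an outer instance level.
multiindex_df = pd.concat(df_list, keys=[0, 1])
multiindex_df.index.names = ["instance", "time"]  # illustrative level names

# 3D array with assumed axis order (instance, variable, time).
array_3d = np.stack([df.to_numpy().T for df in df_list])

print(multiindex_df.index.nlevels)  # 2
print(multiindex_df.shape)          # (6, 1)
print(array_3d.shape)               # (2, 1, 3)
```

The array form requires all instances to have the same number of time points, whereas the list and MultiIndex forms can hold series of different lengths.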

Hierarchical mtype specifications

Each Hierarchical mtype represents a hierarchical collection of time series.

HierarchicalPdMultiIndex([is_univariate, ...])

Data type: pandas.DataFrame based specification of hierarchical series.

HierarchicalDask([is_univariate, ...])

Data type: dask.DataFrame based specification of hierarchical series.

HierarchicalPolarsEager([is_univariate, ...])

Data type: polars.DataFrame based specification of hierarchical series.
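
A hierarchical collection can be pictured as a pandas.DataFrame whose MultiIndex has several grouping levels followed by a time level. The sketch below assumes that layout, with time as the last level; the level names (`region`, `store`, `time`) and the data are hypothetical.

```python
import pandas as pd

# Two grouping levels (region, store) followed by an integer time level.
index = pd.MultiIndex.from_product(
    [["north", "south"], ["store_a", "store_b"], [0, 1, 2]],
    names=["region", "store", "time"],
)
sales = pd.DataFrame({"sales": list(range(len(index)))}, index=index)

# Each (region, store) pair identifies one time series at the bottom level.
n_series = sales.groupby(level=["region", "store"]).ngroups

# Summing over stores gives one series per region, one level up the hierarchy.
by_region = sales.groupby(level=["region", "time"]).sum()

print(n_series)         # 4
print(by_region.shape)  # (6, 1)
```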

Table mtype specifications

Each Table mtype represents a (non-temporal) data frame table.

TablePdDataFrame([is_univariate, is_empty, ...])

Data type: pandas.DataFrame based specification of tabular data.

TablePdSeries([is_univariate, is_empty, ...])

Data type: pandas.Series based specification of tabular data.

TableNp1D([is_univariate, is_empty, ...])

Data type: 1D np.ndarray based specification of data frame table.

TableNp2D([is_univariate, is_empty, ...])

Data type: 2D np.ndarray based specification of data frame table.

TableListOfDict([is_univariate, is_empty, ...])

Data type: list of dict based specification of data frame table.

TablePolarsEager([is_univariate, is_empty, ...])

Data type: eager polars DataFrame based specification of data frame table.
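
For comparison, the following sketch holds the same small table as a list of dicts, a pandas.DataFrame, a 2D np.ndarray, and (for a single column) a 1D np.ndarray. It assumes rows are instances and columns are features; the column names and values are made up for the example.

```python
import pandas as pd

# List of dicts: one dict per row, keys are column names.
records = [
    {"height_cm": 170.0, "weight_kg": 65.0},
    {"height_cm": 182.0, "weight_kg": 80.0},
    {"height_cm": 158.0, "weight_kg": 52.0},
]

# pandas.DataFrame: rows are instances, columns are features.
table_df = pd.DataFrame(records)

# 2D np.ndarray: same layout, column names are dropped.
table_2d = table_df.to_numpy()

# 1D np.ndarray: a single column, one value per row.
table_1d = table_df["height_cm"].to_numpy()

print(table_df.shape)  # (3, 2)
print(table_2d.shape)  # (3, 2)
print(table_1d.shape)  # (3,)
```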

Serialized File Format Specifications

sktime supports several serialized file formats designed for storing time series data.

Specifications for file formats specific to sktime are provided below: