User Guide
Welcome to sktime’s user guide!
The user guide consists of introductory notebooks, ordered by learning task.
For guided tutorials with videos, see our Tutorials page.
To run the user guide notebooks interactively, you can launch them on Binder without installing anything.
We assume basic familiarity with scikit-learn. If you haven’t worked with scikit-learn before, check out their getting-started guide.
The notebook files can be found here.
- Forecasting with sktime
- Table of Contents
- 1. Basic forecasting workflows
- Step 1 - Preparation of the data
- Step 2 - Specifying the forecasting horizon
- Step 3 - Specifying the forecasting algorithm
- Step 4 - Fitting the forecaster to the seen data
- Step 5 - Requesting forecasts
- 1.2.1 The basic deployment workflow in a nutshell
- 1.2.2 Forecasters that require the horizon already in fit
- 1.2.3 Forecasters that can make use of exogenous data
- 1.2.4 Multivariate forecasting
- 1.2.5 Probabilistic forecasting: prediction intervals, quantile, variance, and distributional forecasts
- 1.2.6 Panel forecasts and hierarchical forecasts
- Step 1 - Splitting a historical data set into a temporal train and test batch
- Step 2 - Making forecasts for y_test from y_train
- Steps 3 and 4 - Specifying a forecasting metric, evaluating on the test set
- Step 5 - Testing performance against benchmarks
- 1.3.1 The basic batch forecast evaluation workflow in a nutshell - function metric interface
- 1.3.2 The basic batch forecast evaluation workflow in a nutshell - metric class interface
- 2. Forecasters in sktime - lookup, properties, main families
- 3. Advanced composition patterns - pipelines, reduction, autoML, and more
- 4. Extension guide - implementing your own forecaster
- 5. Summary
- Useful resources
- Forecasting with sktime - appendix: forecasting, supervised regression, and pitfalls in confusing the two
- The pitfalls of mis-diagnosing forecasting as supervised regression
- Pitfall 1: over-optimism in performance evaluation, false confidence in “broken” forecasters
- Pitfall 2: obscure data manipulations, brittle boilerplate code to apply regressors
- Pitfall 3: Given a fitted regression algorithm, how can we generate forecasts?
- How does sktime help avoid the above pitfalls?
- Probabilistic Forecasting with sktime
- Overview of this notebook
- Quick Start - Probabilistic Forecasting with sktime
- What is probabilistic forecasting?
- Probabilistic forecasting interfaces in sktime
- Metrics for probabilistic forecasts and evaluation
- Advanced composition: pipelines, tuning, reduction, adding proba forecasts to any estimator
- Useful resources
- Credits
- 2.1.1 preferred format 1 - pd-multiindex specification
- 2.1.2 preferred format 2 - numpy3D specification
- 2.2.3 Time Series Classification - deployment vignette
- 2.2.4 Time Series Classification - simple evaluation vignette
- 2.2.5 Time Series Regression - basic vignettes
- 2.2.6 Time Series Clustering - basic vignettes
- 2.4.1 Primer on sktime transformers for feature extraction
- 2.4.2 Pipelines for time series panel tasks
- 2.4.3 Using transformers to deal with unequal length or missing values
- 2.4.4 Tuning and model selection
- 2.4.5 Advanced Composition cheat sheet - AutoML, bagging, ensembles
- Benchmarking with sktime
- In-memory data representations and data loading
- Section 1: in-memory data containers
- Section 1.1: Time series - the "Series" scitype
- Section 1.1.1: Time series - the "pd.DataFrame" mtype
- Section 1.1.2: Time series - the "pd.Series" mtype
- Section 1.1.3: Time series - the "np.ndarray" mtype
- Section 1.2: Time series panels - the "Panel" scitype
- Section 1.2.1: Time series panels - the "pd-multiindex" mtype
- Section 1.2.2: Time series panels - the "numpy3D" mtype
- Section 1.2.3: Time series panels - the "df-list" mtype
- Section 1.3: Hierarchical time series - the "Hierarchical" scitype
- Section 2: validity checking and mtype conversion
- Section 3: loading pre-defined data sets
- Section 4: loading data from csv files