# Get Started

The following information is designed to get users up and running with `sktime` quickly. For more detailed information, see the links in each of the subsections.

## Installation

`sktime` currently supports:

- environments with Python version 3.8, 3.9, 3.10, 3.11, or 3.12
- operating systems Mac OS X, Unix-like OS, Windows 8.1 and higher
- installation via `PyPI` or `conda`

See the installation guide for step-by-step instructions on installing the package.

## Key Concepts

`sktime` seeks to provide a unified framework for multiple time series machine learning tasks. This (hopefully) makes `sktime`'s functionality intuitive for users and lets developers extend the framework more easily. But time series data and the related scientific use cases can each take multiple forms. Therefore, a key set of common concepts and terminology is important.

### Data Types

`sktime` is designed for time series machine learning. Time series data refers to data where the variables are ordered over time or by an index indicating the position of an observation in the sequence of values.

In `sktime`, time series data can be univariate, multivariate or panel. The difference relates to the number and interrelation of the time series variables, as well as the number of instances for which each variable is observed.

- Univariate time series data refers to data where a single variable is tracked over time.
- Multivariate time series data refers to data where multiple variables are tracked over time for the same instance. For example, multiple quarterly economic indicators for a country or multiple sensor readings from the same machine.
- Panel time series data refers to data where the variables (univariate or multivariate) are tracked for multiple instances. For example, multiple quarterly economic indicators for several countries or multiple sensor readings for multiple machines.
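As an illustration of the three data types, the sketch below builds one small example of each using plain `pandas`. The variable names, values, and the choice of a two-level row index for the panel are assumptions made for this example; `sktime` supports several in-memory representations, and this sketch does not attempt to cover them.

```python
import pandas as pd

# Quarterly time index shared by all examples (4 observations).
index = pd.period_range("2020Q1", periods=4, freq="Q")

# Univariate: a single variable tracked over time.
univariate = pd.Series([1.0, 1.2, 1.1, 1.4], index=index, name="gdp_growth")

# Multivariate: several variables for the same instance, one column each.
multivariate = pd.DataFrame(
    {"gdp_growth": [1.0, 1.2, 1.1, 1.4], "inflation": [2.0, 2.1, 2.3, 2.2]},
    index=index,
)

# Panel: the same variables observed for several instances (here, two
# made-up countries), stored with an (instance, time) row index.
panel = pd.concat(
    {"country_a": multivariate, "country_b": multivariate + 0.5},
    names=["country", "period"],
)

print(univariate.shape)     # (4,)
print(multivariate.shape)   # (4, 2)
print(panel.shape)          # (8, 2)
print(panel.index.nlevels)  # 2
```

The panel has twice as many rows as the multivariate frame because each of the two instances contributes its own four time points.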

### Learning Tasks

`sktime`'s functionality for each learning task is centered around providing a set of code artifacts that match a common interface to a given scientific purpose (i.e. scientific type or scitype). For example, `sktime` includes a common interface for “forecaster” classes designed to predict future values of a time series.
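To make the idea of a common interface concrete, here is a minimal sketch in plain Python. The classes `NaiveLastForecaster` and `MeanForecaster` and the helper `evaluate` are hypothetical and are not part of `sktime`. The sketch only assumes that every forecaster exposes `fit` and `predict`, and it simplifies the forecasting horizon to a number of future steps.

```python
class NaiveLastForecaster:
    """Hypothetical forecaster: repeats the last observed value."""

    def fit(self, y):
        self.last_ = y[-1]
        return self

    def predict(self, fh):
        # fh is the number of future steps (a simplification for this sketch).
        return [self.last_] * fh


class MeanForecaster:
    """Hypothetical forecaster: repeats the mean of the training series."""

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, fh):
        return [self.mean_] * fh


def evaluate(forecaster, y_train, y_test):
    """Mean absolute error of any object that exposes fit and predict."""
    y_pred = forecaster.fit(y_train).predict(len(y_test))
    return sum(abs(a - b) for a, b in zip(y_test, y_pred)) / len(y_test)


y_train, y_test = [10.0, 12.0, 14.0, 16.0], [18.0, 20.0]
for forecaster in (NaiveLastForecaster(), MeanForecaster()):
    print(type(forecaster).__name__, evaluate(forecaster, y_train, y_test))
# NaiveLastForecaster 3.0
# MeanForecaster 6.0
```

Because `evaluate` relies only on the shared interface, it works unchanged for either class. This is the practical benefit of grouping estimators by scientific type.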

`sktime`'s interface currently supports:

- Time series classification, where the time series data for a given instance are used to predict a categorical target class.
- Time series regression, where the time series data for a given instance are used to predict a continuous target value.
- Time series clustering, where the goal is to discover groups consisting of instances with similar time series.
- Forecasting, where the goal is to predict future values of the input series.
- Time series annotation, which is focused on outlier detection, anomaly detection, change point detection and segmentation.

### Reduction

While the list above presents each learning task separately, in many cases it is possible to adapt one learning task to help solve another related learning task. For example, one approach to forecasting would be to use a regression model that explicitly accounts for the data’s time dimension. However, another approach is to reduce the forecasting problem to cross-sectional regression, where the input data are tabularized and lags of the data are treated as independent features in scikit-learn style tabular regression algorithms. Likewise, one approach to a time series annotation task such as anomaly detection is to reduce the problem to forecasting: use a forecaster to predict future values and flag observations that are too far from these predictions as anomalies. `sktime` typically incorporates these types of reductions through composable classes that let users adapt one learning task to solve another related one.
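The reduction from forecasting to tabular regression can be sketched by hand with NumPy. The helper `make_lag_table`, the toy series, and the choice of ordinary least squares as the tabular regressor are all assumptions made for this example. It illustrates the idea only and is not how `sktime` implements its reduction classes.

```python
import numpy as np


def make_lag_table(y, window_length):
    """Tabularize a 1D series: each row of X holds `window_length`
    consecutive values and the target is the value that follows them."""
    y = np.asarray(y, dtype=float)
    n_rows = len(y) - window_length
    X = np.stack([y[i : i + window_length] for i in range(n_rows)])
    target = y[window_length:]
    return X, target


# Toy series with a linear trend: 0, 2, 4, ..., 18
y = np.arange(0, 20, 2, dtype=float)
X, target = make_lag_table(y, window_length=3)

# Any tabular regressor could be used at this point. Least squares with an
# intercept column keeps the sketch free of dependencies beyond NumPy.
design = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(design, target, rcond=None)

# One-step-ahead forecast from the last observed window.
last_window = np.append(y[-3:], 1.0)
next_value = float(last_window @ coef)

print(X.shape)     # (7, 3)
print(next_value)  # close to 20.0, the next value of the linear trend
```

Once the series is in tabular form, the regressor no longer needs to know that the columns are lags. That is what allows scikit-learn style estimators to be reused for forecasting.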

For more information on `sktime`'s terminology and functionality, see the Glossary of Common Terms and the user guide.

## Quickstart

The code snippets below are designed to introduce `sktime`'s functionality so you can start using it quickly. For more detailed information, see the Tutorials, User Guide and API Reference in `sktime`'s documentation.

### Forecasting

```
>>> from sktime.datasets import load_airline
>>> from sktime.forecasting.base import ForecastingHorizon
>>> from sktime.forecasting.model_selection import temporal_train_test_split
>>> from sktime.forecasting.theta import ThetaForecaster
>>> from sktime.performance_metrics.forecasting import mean_absolute_percentage_error
>>> y = load_airline()
>>> y_train, y_test = temporal_train_test_split(y)
>>> fh = ForecastingHorizon(y_test.index, is_relative=False)
>>> forecaster = ThetaForecaster(sp=12) # monthly seasonal periodicity
>>> forecaster.fit(y_train)
>>> y_pred = forecaster.predict(fh)
>>> mean_absolute_percentage_error(y_test, y_pred)
0.08661467738190656
```

### Time Series Classification

```
>>> from sktime.classification.interval_based import TimeSeriesForestClassifier
>>> from sktime.datasets import load_arrow_head
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.metrics import accuracy_score
>>> X, y = load_arrow_head()
>>> X_train, X_test, y_train, y_test = train_test_split(X, y)
>>> classifier = TimeSeriesForestClassifier()
>>> classifier.fit(X_train, y_train)
>>> y_pred = classifier.predict(X_test)
>>> accuracy_score(y_test, y_pred)
0.8679245283018868
```

### Time Series Regression

```
>>> from sktime.datasets import load_covid_3month
>>> from sktime.regression.distance_based import KNeighborsTimeSeriesRegressor
>>> from sklearn.metrics import mean_squared_error
>>> X_train, y_train = load_covid_3month(split="train")
>>> y_train = y_train.astype("float")
>>> X_test, y_test = load_covid_3month(split="test")
>>> y_test = y_test.astype("float")
>>> regressor = KNeighborsTimeSeriesRegressor()
>>> regressor.fit(X_train, y_train)
>>> y_pred = regressor.predict(X_test)
>>> mean_squared_error(y_test, y_pred)
```

### Time Series Clustering

```
>>> from sklearn.model_selection import train_test_split
>>> from sktime.clustering.k_means import TimeSeriesKMeans
>>> from sktime.clustering.utils.plotting._plot_partitions import plot_cluster_algorithm
>>> from sktime.datasets import load_arrow_head
>>> X, y = load_arrow_head()
>>> X_train, X_test, y_train, y_test = train_test_split(X, y)
>>> k_means = TimeSeriesKMeans(n_clusters=5, init_algorithm="forgy", metric="dtw")
>>> k_means.fit(X_train)
>>> plot_cluster_algorithm(k_means, X_test, k_means.n_clusters)
```

### Time Series Annotation

Warning: The time series annotation API is experimental and may change in future releases.

```
>>> from sktime.annotation.adapters import PyODAnnotator
>>> from pyod.models.iforest import IForest
>>> from sktime.datasets import load_airline
>>> y = load_airline()
>>> pyod_model = IForest()
>>> pyod_sktime_annotator = PyODAnnotator(pyod_model)
>>> pyod_sktime_annotator.fit(y)
>>> annotated_series = pyod_sktime_annotator.predict(y)
```