binder

4 Forecasting Pipelines, Tuning, and AutoML#

This notebook is about pipelining and tuning (grid search) for time series forecasting with sktime

[1]:
import warnings

import numpy as np

warnings.filterwarnings("ignore")

4.1 Using exogeneous data X in forecasting#

The exogenous data X is a pd.DataFrame with the same index as y. It can be given optionally to a forecaster in the fit and predict methods. If X is given in fit, it also has to be given in predict, this means the future values of X have to be available at the time of prediction.

[ ]:
from sktime.datasets import load_longley
from sktime.forecasting.base import ForecastingHorizon
from sktime.split import temporal_train_test_split
from sktime.utils.plotting import plot_series

y, X = load_longley()
y_train, y_test, X_train, X_test = temporal_train_test_split(y=y, X=X, test_size=6)
fh = ForecastingHorizon(y_test.index, is_relative=False)
[3]:
plot_series(y_train, y_test, labels=["y_train", "y_test"]);
../_images/examples_03b_forecasting_transformers_pipelines_tuning_4_0.png
[4]:
X.head()
[4]:
GNPDEFL GNP UNEMP ARMED POP
Period
1947 83.0 234289.0 2356.0 1590.0 107608.0
1948 88.5 259426.0 2325.0 1456.0 108632.0
1949 88.2 258054.0 3682.0 1616.0 109773.0
1950 89.5 284599.0 3351.0 1650.0 110929.0
1951 96.2 328975.0 2099.0 3099.0 112075.0
[5]:
from sktime.forecasting.arima import AutoARIMA

forecaster = AutoARIMA(suppress_warnings=True)
forecaster.fit(y=y_train, X=X_train)
y_pred = forecaster.predict(fh=fh, X=X_test)

In case we dont have future X values to call predict(...), we can forecast X separately by means of the ForecastX composition class. This class takes two separate forecasters, one to forecast y and another one to forecast X. This means that when we call predict(...) first the X is forecasted and then given to the y forecaster to forecast y.

[6]:
from sktime.forecasting.compose import ForecastX
from sktime.forecasting.var import VAR

forecaster_X = ForecastX(
    forecaster_y=AutoARIMA(sp=1, suppress_warnings=True),
    forecaster_X=VAR(),
)
forecaster_X.fit(y=y, X=X, fh=fh)
# now in predict() we don't need to pass X
y_pred = forecaster_X.predict(fh=fh)

4.2 Simple Forecasting Pipelines#

We hae seen: pipelines combine multiple estimators into one.

Forecasters methods have two key data arguments, both time series:

  • endogeneous data, y

  • exogeneous data X

Pipelines exist for both

Pipeline transformers on endogenous data - y argument#

Sequence of transforms without pipeline object:

[7]:
from sktime.forecasting.arima import AutoARIMA
from sktime.transformations.series.detrend import Deseasonalizer, Detrender

detrender = Detrender()
deseasonalizer = Deseasonalizer()
forecaster = AutoARIMA(sp=1, suppress_warnings=True)

# fit_transform
y_train_1 = detrender.fit_transform(y_train)
y_train_2 = deseasonalizer.fit_transform(y_train_1)

# fit
forecaster.fit(y_train_2)

# predidct
y_pred_2 = forecaster.predict(fh)

# inverse_transform
y_pred_1 = detrender.inverse_transform(y_pred_2)
y_pred_isolated = deseasonalizer.inverse_transform(y_pred_1)

As one end-to-end forecaster, using TransformedTargetForecaster class:

[8]:
from sktime.forecasting.compose import TransformedTargetForecaster

pipe_y = TransformedTargetForecaster(
    steps=[
        ("detrend", Detrender()),
        ("deseasonalize", Deseasonalizer()),
        ("forecaster", AutoARIMA(sp=1, suppress_warnings=True)),
    ]
)
pipe_y.fit(y=y_train)
y_pred_pipe = pipe_y.predict(fh=fh)

from pandas.testing import assert_series_equal

# make sure that outputs are the same
assert_series_equal(y_pred_pipe, y_pred_isolated)

Pipeline transformers on exogenous data - X argument#

Applies transformers to X argument before passed to forecaster.

Here: Detrender then Deseasonalizer, then passed to AutoARIMA’s X

[9]:
from sktime.forecasting.compose import ForecastingPipeline

pipe_X = ForecastingPipeline(
    steps=[
        ("detrend", Detrender()),
        ("deseasonalize", Deseasonalizer()),
        ("forecaster", AutoARIMA(sp=1, suppress_warnings=True)),
    ]
)
pipe_X.fit(y=y_train, X=X_train)
y_pred = pipe_X.predict(fh=fh, X=X_test)

Pipeline on y and X data by means of nesting#

[10]:
pipe_y = TransformedTargetForecaster(
    steps=[
        ("detrend", Detrender()),
        ("deseasonalize", Deseasonalizer()),
        ("forecaster", AutoARIMA(sp=1, suppress_warnings=True)),
    ]
)

pipe_X = ForecastingPipeline(
    steps=[
        ("detrend", Detrender()),
        ("deseasonalize", Deseasonalizer()),
        ("forecaster", pipe_y),
    ]
)
pipe_X.fit(y=y_train, X=X_train)
y_pred = pipe_X.predict(fh=fh, X=X_test)

sktime provides a dunder methods for estimators to chain them into pipelines and other compositions. To create pipelines we can use * for creation of a TransformedTargetForecaster and ** for creation of a ForecastingPipeline. Further dunder methods are given in the appendinx section at the end of this notebook.

[11]:
pipe_y = Detrender() * Deseasonalizer() * AutoARIMA()
pipe_y
[11]:
TransformedTargetForecaster(steps=[Detrender(), Deseasonalizer(), AutoARIMA()])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
[12]:
pipe_X = Detrender() ** Deseasonalizer() ** AutoARIMA()
pipe_X
[12]:
ForecastingPipeline(steps=[Detrender(), Deseasonalizer(), AutoARIMA()])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

nesting with dunders, both X and y:

[13]:
pipe_nested = (
    Detrender() * Deseasonalizer() * (Detrender() ** Deseasonalizer() ** AutoARIMA())
)

4.3 Tuning#

Temporal cross-validation#

In sktime there are three different types of temporal cross-validation splitters available: - SingleWindowSplitter, which is equivalent to a single train-test-split - SlidingWindowSplitter, which is using a rolling window approach and “forgets” the oldest observations as we move more into the future - ExpandingWindowSplitter, which is using a expanding window approach and keep all observations in the training set as we move more into the future

[14]:
from sktime.datasets import load_shampoo_sales

y = load_shampoo_sales()
y_train, y_test = temporal_train_test_split(y=y, test_size=6)
plot_series(y_train, y_test, labels=["y_train", "y_test"]);
../_images/examples_03b_forecasting_transformers_pipelines_tuning_25_0.png
[15]:
from sktime.forecasting.base import ForecastingHorizon
from sktime.split import (
    ExpandingWindowSplitter,
    SingleWindowSplitter,
    SlidingWindowSplitter,
)
from sktime.utils.plotting import plot_windows

fh = ForecastingHorizon(y_test.index, is_relative=False).to_relative(
    cutoff=y_train.index[-1]
)
[16]:
cv = SingleWindowSplitter(fh=fh, window_length=len(y_train) - 6)
plot_windows(cv=cv, y=y_train)
../_images/examples_03b_forecasting_transformers_pipelines_tuning_27_0.png
[17]:
cv = SlidingWindowSplitter(fh=fh, window_length=12, step_length=1)
plot_windows(cv=cv, y=y_train)
../_images/examples_03b_forecasting_transformers_pipelines_tuning_28_0.png
[18]:
cv = ExpandingWindowSplitter(fh=fh, initial_window=12, step_length=1)
plot_windows(cv=cv, y=y_train)
../_images/examples_03b_forecasting_transformers_pipelines_tuning_29_0.png
[19]:
# get number of total splits (folds)
cv.get_n_splits(y=y_train)
[19]:
13

Grid search with pipeline#

For tuning parameters with compositions such as pipelines, we can use the <estimator>__<parameter> syntax known from scikit-learn. For multiple levels of nesting, we can use the same syntax with two underscores, e.g. forecaster__transformer__parameter.

[25]:
from sklearn.preprocessing import MinMaxScaler, PowerTransformer, RobustScaler

from sktime.forecasting.compose import TransformedTargetForecaster
from sktime.transformations.series.adapt import TabularToSeriesAdaptor
from sktime.transformations.series.detrend import Deseasonalizer, Detrender

forecaster = TransformedTargetForecaster(
    steps=[
        ("detrender", Detrender()),
        ("deseasonalizer", Deseasonalizer()),
        ("minmax", TabularToSeriesAdaptor(MinMaxScaler((1, 10)))),
        ("power", TabularToSeriesAdaptor(PowerTransformer())),
        ("scaler", TabularToSeriesAdaptor(RobustScaler())),
        ("forecaster", ExponentialSmoothing()),
    ]
)

# using dunder notation to access inner objects/params as in sklearn
param_grid = {
    # deseasonalizer
    "deseasonalizer__model": ["multiplicative", "additive"],
    # power
    "power__transformer__method": ["yeo-johnson", "box-cox"],
    "power__transformer__standardize": [True, False],
    # forecaster
    "forecaster__sp": [4, 6, 12],
    "forecaster__seasonal": ["add", "mul"],
    "forecaster__trend": ["add", "mul"],
    "forecaster__damped_trend": [True, False],
}

gscv = ForecastingGridSearchCV(
    forecaster=forecaster,
    param_grid=param_grid,
    cv=cv,
    verbose=1,
    scoring=MeanSquaredError(square_root=True),  # set custom scoring function
)
gscv.fit(y_train)
y_pred = gscv.predict(fh=fh)
Fitting 13 folds for each of 192 candidates, totalling 2496 fits
[26]:
gscv.best_params_
[26]:
{'deseasonalizer__model': 'additive',
 'forecaster__damped_trend': False,
 'forecaster__seasonal': 'add',
 'forecaster__sp': 4,
 'forecaster__trend': 'add',
 'power__transformer__method': 'yeo-johnson',
 'power__transformer__standardize': False}

Model selection#

There are three ways ways to select a model out of multiple models: 1) ForecastingGridSearchCV 2) MultiplexForecaster 3) relative_loss or RelativeLoss

1) ForecastingGridSearchCV#

We can use ForecastingGridSearchCV to fit multiple forecasters and find the best one using equivalent notations as in sklearn. We can either do: param_grid: List[Dict] or we just have a list of forecasters in the grid: forecaster": [NaiveForecaster(), STLForecaster()]. The advanteage doing this together tuning the other parameters is that we can use the same cross-validation as for the other parameters to find the overall best forecasters and parameters.

[27]:
from sktime.forecasting.naive import NaiveForecaster
from sktime.forecasting.theta import ThetaForecaster
from sktime.forecasting.trend import STLForecaster

forecaster = TransformedTargetForecaster(
    steps=[
        ("detrender", Detrender()),
        ("deseasonalizer", Deseasonalizer()),
        ("scaler", TabularToSeriesAdaptor(RobustScaler())),
        ("minmax2", TabularToSeriesAdaptor(MinMaxScaler((1, 10)))),
        ("forecaster", NaiveForecaster()),
    ]
)

gscv = ForecastingGridSearchCV(
    forecaster=forecaster,
    param_grid=[
        {
            "scaler__transformer__with_scaling": [True, False],
            "forecaster": [NaiveForecaster()],
            "forecaster__strategy": ["drift", "last", "mean"],
            "forecaster__sp": [4, 6, 12],
        },
        {
            "scaler__transformer__with_scaling": [True, False],
            "forecaster": [STLForecaster(), ThetaForecaster()],
            "forecaster__sp": [4, 6, 12],
        },
    ],
    cv=cv,
)
gscv.fit(y)
gscv.best_params_
[27]:
{'forecaster': NaiveForecaster(sp=4),
 'forecaster__sp': 4,
 'forecaster__strategy': 'last',
 'scaler__transformer__with_scaling': True}

2) MultiplexForecaster#

We can use the MultiplexForecaster to compare the performance of different forecasters. This approach might be useful if we want to compare the performance of different forecasters that have been tuned and fitted already separately. The MultiplexForecaster is just a forecaster compostition that provides a parameters selected_forecaster: List[str] that can be tuned with a grid search. The other parameters of the forecasters are not tuned.

[28]:
from sktime.forecasting.compose import MultiplexForecaster

forecaster = MultiplexForecaster(
    forecasters=[
        ("naive", NaiveForecaster()),
        ("stl", STLForecaster()),
        ("theta", ThetaForecaster()),
    ]
)
gscv = ForecastingGridSearchCV(
    forecaster=forecaster,
    param_grid={"selected_forecaster": ["naive", "stl", "theta"]},
    cv=cv,
)
gscv.fit(y)
gscv.best_params_
[28]:
{'selected_forecaster': 'theta'}

3) relative_loss or RelativeLoss#

We can compare two models on a given test data by means of a relative loss calculation. This is however not doing any cross-validation compared to the above model selection approaches 1) and 2).

The relative loss function applies a forecasting performance metric to a set of forecasts and benchmark forecasts and reports the ratio of the metric from the forecasts to the the metric from the benchmark forecasts. Relative loss output is non-negative floating point. The best value is 0.0. The relative loss function uses the mean_absolute_error by default as error metric. If the score is >1, the forecast given as y_pred is worse than the given benchmark forecast in y_pred_benchmark. If we want to customize the relative loss, we can use the RelativeLoss scorer class and e.g. provide a custom loss function.

[29]:
from sktime.performance_metrics.forecasting import mean_squared_error, relative_loss

relative_loss(y_true=y_test, y_pred=y_pred_1, y_pred_benchmark=y_pred_2)
[29]:
123.00771294252276
[30]:
from sktime.performance_metrics.forecasting import RelativeLoss

relative_mse = RelativeLoss(relative_loss_function=mean_squared_error)
relative_mse(y_true=y_test, y_pred=y_pred_1, y_pred_benchmark=y_pred_2)
[30]:
14720.015012346896

Tuning pipeline structure as a hyper-parameter#

Pipeline structure choices influence performance

sktime allows to expose these choices via structural compositors:

  • switch between transform/forecast: MultiplexTransformer, MultiplexForecaster

  • transformer on/off: OptionalPassthrough

  • sequence of transformers: Permute

Combine with pipelines and FeatureUnion for rich structure space

Combination of transformers#

Given a set of four different transformers, we would like to know which combination of the four transformers is having the best error. So in total there are 4² = 16 different combinations. We can use the GridSearchCV to find the best combination.

[31]:
steps = [
    ("detrender", Detrender()),
    ("deseasonalizer", Deseasonalizer()),
    ("power", TabularToSeriesAdaptor(PowerTransformer())),
    ("scaler", TabularToSeriesAdaptor(RobustScaler())),
    ("forecaster", ExponentialSmoothing()),
]

In sktime there is a transformer composition called OptionalPassthrough() which gets a transformer as an argument and a param passthrough: bool. Setting passthrough=True will return an identity transformation for the given data. Setting passthrough=False will apply the given inner transformer on the data.

[32]:
from sktime.transformations.compose import OptionalPassthrough

transformer = OptionalPassthrough(transformer=Detrender(), passthrough=True)
transformer.fit_transform(y_train).head()
[32]:
1991-01    266.0
1991-02    145.9
1991-03    183.1
1991-04    119.3
1991-05    180.3
Freq: M, Name: Number of shampoo sales, dtype: float64
[33]:
y_train.head()
[33]:
1991-01    266.0
1991-02    145.9
1991-03    183.1
1991-04    119.3
1991-05    180.3
Freq: M, Name: Number of shampoo sales, dtype: float64
[34]:
transformer = OptionalPassthrough(transformer=Detrender(), passthrough=False)
transformer.fit_transform(y_train).head()
[34]:
1991-01    130.376344
1991-02      1.503263
1991-03     29.930182
1991-04    -42.642900
1991-05      9.584019
Freq: M, dtype: float64
[35]:
forecaster = TransformedTargetForecaster(
    steps=[
        ("detrender", OptionalPassthrough(Detrender())),
        ("deseasonalizer", OptionalPassthrough(Deseasonalizer())),
        ("power", OptionalPassthrough(TabularToSeriesAdaptor(PowerTransformer()))),
        ("scaler", OptionalPassthrough(TabularToSeriesAdaptor(RobustScaler()))),
        ("forecaster", ExponentialSmoothing()),
    ]
)

param_grid = {
    "detrender__passthrough": [True, False],
    "deseasonalizer__passthrough": [True, False],
    "power__passthrough": [True, False],
    "scaler__passthrough": [True, False],
}

gscv = ForecastingGridSearchCV(
    forecaster=forecaster,
    param_grid=param_grid,
    cv=cv,
    verbose=1,
    scoring=MeanSquaredError(square_root=True),
)
gscv.fit(y_train)
Fitting 13 folds for each of 16 candidates, totalling 208 fits
[35]:
ForecastingGridSearchCV(cv=ExpandingWindowSplitter(fh=ForecastingHorizon([1, 2, 3, 4, 5, 6], dtype='int64', is_relative=True),
                                                   initial_window=12),
                        forecaster=TransformedTargetForecaster(steps=[('detrender',
                                                                       OptionalPassthrough(transformer=Detrender())),
                                                                      ('deseasonalizer',
                                                                       OptionalPassthrough(transformer=Deseasonalizer())),
                                                                      ('power',
                                                                       OptionalPassthrough(transform...
                                                                       OptionalPassthrough(transformer=TabularToSeriesAdaptor(transformer=RobustScaler()))),
                                                                      ('forecaster',
                                                                       ExponentialSmoothing())]),
                        n_jobs=-1,
                        param_grid={'deseasonalizer__passthrough': [True,
                                                                    False],
                                    'detrender__passthrough': [True, False],
                                    'power__passthrough': [True, False],
                                    'scaler__passthrough': [True, False]},
                        scoring=MeanSquaredError(square_root=True), verbose=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Best performing combination of transformers:

[36]:
gscv.best_params_
[36]:
{'deseasonalizer__passthrough': True,
 'detrender__passthrough': False,
 'power__passthrough': True,
 'scaler__passthrough': True}

Worst performing combination of transformers:

[37]:
# worst params
gscv.cv_results_.sort_values(by="mean_test_MeanSquaredError", ascending=True).iloc[-1][
    "params"
]
[37]:
{'deseasonalizer__passthrough': False,
 'detrender__passthrough': True,
 'power__passthrough': False,
 'scaler__passthrough': True}

Permutation of transformers#

Given a set of four different transformers, we would like to know which permutation (ordering) of the four transformers is having the best error. In total there are 4! = 24 different permutations. We can use the GridSearchCV to find the best permutation.

[38]:
from sktime.forecasting.compose import Permute
[39]:
forecaster = TransformedTargetForecaster(
    steps=[
        ("detrender", Detrender()),
        ("deseasonalizer", Deseasonalizer()),
        ("power", TabularToSeriesAdaptor(PowerTransformer())),
        ("scaler", TabularToSeriesAdaptor(RobustScaler())),
        ("forecaster", ExponentialSmoothing()),
    ]
)

param_grid = {
    "permutation": [
        ["detrender", "deseasonalizer", "power", "scaler", "forecaster"],
        ["power", "scaler", "detrender", "deseasonalizer", "forecaster"],
        ["scaler", "deseasonalizer", "power", "detrender", "forecaster"],
        ["deseasonalizer", "power", "scaler", "detrender", "forecaster"],
    ]
}
permuted = Permute(estimator=forecaster, permutation=None)

gscv = ForecastingGridSearchCV(
    forecaster=permuted,
    param_grid=param_grid,
    cv=cv,
    verbose=1,
    scoring=MeanSquaredError(square_root=True),
)

permuted = gscv.fit(y, fh=fh)
Fitting 19 folds for each of 4 candidates, totalling 76 fits
[40]:
permuted.cv_results_
[40]:
mean_test_MeanSquaredError mean_fit_time mean_pred_time params rank_test_MeanSquaredError
0 104.529303 0.119823 0.032105 {'permutation': ['detrender', 'deseasonalizer'... 4.0
1 95.172818 0.118840 0.033758 {'permutation': ['power', 'scaler', 'detrender... 1.5
2 95.243942 0.114363 0.033743 {'permutation': ['scaler', 'deseasonalizer', '... 3.0
3 95.172818 0.119377 0.030739 {'permutation': ['deseasonalizer', 'power', 's... 1.5
[41]:
gscv.best_params_
[41]:
{'permutation': ['power',
  'scaler',
  'detrender',
  'deseasonalizer',
  'forecaster']}
[42]:
# worst params
gscv.cv_results_.sort_values(by="mean_test_MeanSquaredError", ascending=True).iloc[-1][
    "params"
]
[42]:
{'permutation': ['detrender',
  'deseasonalizer',
  'power',
  'scaler',
  'forecaster']}

4.4 Putting it all together: AutoML-like pipeline and tuning#

Taking all incredients from above examples, we can build a forecaster that comes close to what is usually called AutoML. With AutoML we aim to automate as many steps of an ML model creation as possible. The main compositions from sktime that we can use for this are: - TransformedTargetForecaster - ForecastingPipeline - ForecastingGridSearchCV - OptionalPassthrough - Permute

Univariate example#

Please see appendix section for an example with exogenous data

[43]:
pipe_y = TransformedTargetForecaster(
    steps=[
        ("detrender", OptionalPassthrough(Detrender())),
        ("deseasonalizer", OptionalPassthrough(Deseasonalizer())),
        ("scaler", OptionalPassthrough(TabularToSeriesAdaptor(RobustScaler()))),
        ("forecaster", STLForecaster()),
    ]
)
permuted_y = Permute(estimator=pipe_y, permutation=None)

param_grid = {
    "permutation": [
        ["detrender", "deseasonalizer", "scaler", "forecaster"],
        ["scaler", "deseasonalizer", "detrender", "forecaster"],
    ],
    "estimator__detrender__passthrough": [True, False],
    "estimator__deseasonalizer__passthrough": [True, False],
    "estimator__scaler__passthrough": [True, False],
    "estimator__scaler__transformer__transformer__with_scaling": [True, False],
    "estimator__scaler__transformer__transformer__with_centering": [True, False],
    "estimator__forecaster__sp": [4, 8, 12],
}

gscv = ForecastingGridSearchCV(
    forecaster=permuted_y,
    param_grid=param_grid,
    cv=cv,
    verbose=1,
    scoring=MeanSquaredError(square_root=True),
    error_score="raise",
)

gscv.fit(y=y_train, fh=fh)
Fitting 13 folds for each of 192 candidates, totalling 2496 fits
[43]:
ForecastingGridSearchCV(cv=ExpandingWindowSplitter(fh=ForecastingHorizon([1, 2, 3, 4, 5, 6], dtype='int64', is_relative=True),
                                                   initial_window=12),
                        error_score='raise',
                        forecaster=Permute(estimator=TransformedTargetForecaster(steps=[('detrender',
                                                                                         OptionalPassthrough(transformer=Detrender())),
                                                                                        ('deseasonalizer',
                                                                                         OptionalPassthrough(transformer=Deseasonalizer())),...
                                    'estimator__scaler__passthrough': [True,
                                                                       False],
                                    'estimator__scaler__transformer__transformer__with_centering': [True,
                                                                                                    False],
                                    'estimator__scaler__transformer__transformer__with_scaling': [True,
                                                                                                  False],
                                    'permutation': [['detrender',
                                                     'deseasonalizer', 'scaler',
                                                     'forecaster'],
                                                    ['scaler', 'deseasonalizer',
                                                     'detrender',
                                                     'forecaster']]},
                        scoring=MeanSquaredError(square_root=True), verbose=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
[44]:
gscv.cv_results_["mean_test_MeanSquaredError"].min()
[44]:
83.44136911735615
[45]:
gscv.cv_results_["mean_test_MeanSquaredError"].max()
[45]:
125.54124792614672

4.5 Model backtesting#

After fitting a model, we can evaluate the model error in the past similar to a cross-validation. For this we can use the evaluate function. This approach is often also called backtesting as we want to test the forecasters performance in the past.

[46]:
from sktime.forecasting.model_evaluation import evaluate

y = load_shampoo_sales()
results = evaluate(
    forecaster=STLForecaster(sp=12),
    y=y,
    cv=ExpandingWindowSplitter(fh=[1, 2, 3, 4, 5, 6], initial_window=12, step_length=6),
    scoring=MeanSquaredError(square_root=True),
    return_data=True,
)
results
[46]:
test_MeanSquaredError fit_time pred_time len_train_window cutoff y_train y_test y_pred
0 85.405220 0.008634 0.009597 12 1991-12 Period 1991-01 266.0 1991-02 145.9 1991-... Period 1992-01 194.3 1992-02 149.5 1992-... 1992-01 266.0 1992-02 145.9 1992-03 1...
1 129.383456 0.010045 0.007728 18 1992-06 Period 1991-01 266.0 1991-02 145.9 1991-... Period 1992-07 226.0 1992-08 303.6 1992-... 1992-07 270.953646 1992-08 263.653646 19...
2 126.393156 0.009373 0.008245 24 1992-12 Period 1991-01 266.0 1991-02 145.9 1991-... Period 1993-01 339.7 1993-02 440.4 1993-... 1993-01 260.633333 1993-02 215.833333 19...
3 171.142305 0.012287 0.009526 30 1993-06 Period 1991-01 266.0 1991-02 145.9 1991-... Period 1993-07 575.5 1993-08 407.6 1993-... 1993-07 378.659609 1993-08 450.927848 19...
[47]:
plot_series(
    y,
    results["y_pred"].iloc[0],
    results["y_pred"].iloc[1],
    results["y_pred"].iloc[2],
    results["y_pred"].iloc[3],
    markers=["o", "", "", "", ""],
    labels=["y_true"] + ["y_pred (Backtest " + str(x) + ")" for x in range(4)],
);
../_images/examples_03b_forecasting_transformers_pipelines_tuning_74_0.png

4.6 Appendix#

AutoML-like pipeline with exog data#

[48]:
from sktime.datasets import load_macroeconomic

data = load_macroeconomic()
y = data["unemp"]
X = data.drop(columns=["unemp"])

y_train, y_test, X_train, X_test = temporal_train_test_split(y, X, test_size=40)

fh = ForecastingHorizon(np.arange(1, 41), is_relative=True)
[49]:
cv = ExpandingWindowSplitter(fh=fh, initial_window=80, step_length=4)
plot_windows(cv=cv, y=y_train)
../_images/examples_03b_forecasting_transformers_pipelines_tuning_78_0.png
[50]:
from sktime.datasets import load_macroeconomic
from sktime.forecasting.statsforecast import StatsForecastAutoARIMA
from sktime.split import temporal_train_test_split

data = load_macroeconomic()
y = data["unemp"]
X = data.drop(columns=["unemp"])

y_train, y_test, X_train, X_test = temporal_train_test_split(y, X, test_size=40)
fh = ForecastingHorizon(np.arange(1, 41), is_relative=True)

pipe_y = TransformedTargetForecaster(
    steps=[
        ("detrender", OptionalPassthrough(Detrender())),
        ("deseasonalizer", OptionalPassthrough(Deseasonalizer())),
        ("scaler", OptionalPassthrough(TabularToSeriesAdaptor(RobustScaler()))),
        ("forecaster", StatsForecastAutoARIMA(sp=4)),
    ]
)
permuted_y = Permute(pipe_y, permutation=None)

pipe_X = TransformedTargetForecaster(
    steps=[
        ("detrender", OptionalPassthrough(Detrender())),
        ("deseasonalizer", OptionalPassthrough(Deseasonalizer())),
        ("scaler", OptionalPassthrough(TabularToSeriesAdaptor(RobustScaler()))),
        ("forecaster", permuted_y),
    ]
)
permuted_X = Permute(pipe_X, permutation=None)
[51]:
gscv = ForecastingGridSearchCV(
    forecaster=permuted_X,
    param_grid={
        "estimator__forecaster__permutation": [
            ["detrender", "deseasonalizer", "scaler", "forecaster"],
            ["scaler", "detrender", "deseasonalizer", "forecaster"],
        ],
        "permutation": [
            ["detrender", "deseasonalizer", "scaler", "forecaster"],
            ["scaler", "detrender", "deseasonalizer", "forecaster"],
        ],
        "estimator__forecaster__estimator__scaler__passthrough": [True, False],
        "estimator__scaler__passthrough": [True, False],
    },
    cv=cv,
    error_score="raise",
    scoring=MeanSquaredError(square_root=True),
)

gscv.fit(y=y_train, X=X_train)
[51]:
ForecastingGridSearchCV(cv=ExpandingWindowSplitter(fh=ForecastingHorizon([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
            18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
            35, 36, 37, 38, 39, 40],
           dtype='int64', is_relative=True),
                                                   initial_window=80,
                                                   step_length=4),
                        error_score='raise',
                        forecaster=Permute(estimator=TransformedTargetForecaster(steps=[('detrender',
                                                                                         Op...
                                    'estimator__forecaster__permutation': [['detrender',
                                                                            'deseasonalizer',
                                                                            'scaler',
                                                                            'forecaster'],
                                                                           ['scaler',
                                                                            'detrender',
                                                                            'deseasonalizer',
                                                                            'forecaster']],
                                    'estimator__scaler__passthrough': [True,
                                                                       False],
                                    'permutation': [['detrender',
                                                     'deseasonalizer', 'scaler',
                                                     'forecaster'],
                                                    ['scaler', 'detrender',
                                                     'deseasonalizer',
                                                     'forecaster']]},
                        scoring=MeanSquaredError(square_root=True))
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
[52]:
gscv.cv_results_["mean_test_MeanSquaredError"].min()
[52]:
1.9856591203311482
[53]:
gscv.cv_results_["mean_test_MeanSquaredError"].max()
[53]:
2.1207628182179863
[54]:
gscv.best_params_
[54]:
{'estimator__forecaster__estimator__scaler__passthrough': True,
 'estimator__forecaster__permutation': ['detrender',
  'deseasonalizer',
  'scaler',
  'forecaster'],
 'estimator__scaler__passthrough': False,
 'permutation': ['detrender', 'deseasonalizer', 'scaler', 'forecaster']}
[55]:
# worst params
gscv.cv_results_.sort_values(by="mean_test_MeanSquaredError", ascending=True).iloc[-1][
    "params"
]
[55]:
{'estimator__forecaster__estimator__scaler__passthrough': True,
 'estimator__forecaster__permutation': ['scaler',
  'detrender',
  'deseasonalizer',
  'forecaster'],
 'estimator__scaler__passthrough': True,
 'permutation': ['scaler', 'detrender', 'deseasonalizer', 'forecaster']}

Credits: notebook 4 - pipelines#

notebook creation: aiwalter

forecaster pipelines: fkiraly, aiwalter
transformer pipelines & compositors: fkiraly, mloning, miraep8
dunder interface: fkiraly, miraep8
tuning, autoML: mloning, fkiraly, aiwalter
CV and splitters: mloning, kkoralturk, khrapovs
forecasting metrics: mloning, aiwalter, rnkuhns, fkiraly
backtesting, evaluation: aiwalter, mloning, fkiraly, topher-lo

Generated using nbsphinx. The Jupyter notebook can be found here.