YfromX
- YfromX(estimator, pooling='local')
Simple reduction forecaster, predicting the endogenous variable from concurrent exogenous variables.

Tabulates all seen X and y by time index and applies tabular supervised regression.

In fit, given endogenous time series y and exogenous X, fits estimator to feature-label pairs defined as follows: features = \(X(t)\), labels = \(y(t)\), ranging over all \(t\) where both have been observed (are in the index).
In predict, at a time \(t\) in the forecasting horizon, uses estimator to predict \(y(t)\) from features \(X(t)\).

If estimator is an skpro probabilistic regressor and provides predict_interval etc., uses estimator to predict \(y(t)\) from features \(X(t)\), passing on the predict_interval etc. arguments.

If no exogenous data is provided, predicts the mean of y seen in fit.
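As an illustration of the reduction (a minimal sketch, not part of the class API): with default settings and a single time series, the point forecasts should coincide with fitting the same sklearn regressor directly on the tabulated \(X(t)\), \(y(t)\) pairs, and without exogenous data the forecast falls back to the training mean of y. The setup mirrors the Examples section below; variable names are illustrative.

import numpy as np
from sklearn.linear_model import LinearRegression

from sktime.datasets import load_longley
from sktime.forecasting.compose import YfromX
from sktime.split import temporal_train_test_split

y, X = load_longley()
y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)
fh = y_test.index

# reduction: fit on pairs (X(t), y(t)), predict y(t) from X(t) at horizon times
f = YfromX(LinearRegression())
f.fit(y=y_train, X=X_train, fh=fh)
y_pred = f.predict(X=X_test)

# the same tabular regression done by hand should give the same point forecasts
reg = LinearRegression().fit(X_train, y_train)
print(np.allclose(y_pred, reg.predict(X_test)))  # expected: True

# without exogenous data, the forecast is the mean of y seen in fit
f_mean = YfromX(LinearRegression())
f_mean.fit(y=y_train, fh=fh)
print(np.allclose(f_mean.predict(), y_train.mean()))  # expected: True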
In order to use a fit not based on the entire historical data and update it periodically, combine this with UpdateRefitsEvery. In order to deal with missing data, combine this with Imputer. To construct a custom direct reducer, combine with YtoX, Lag, or ReducerTransform. Sketches of such compositions are given after the Examples section below.
- Parameters:
  - estimator : sklearn regressor or skpro probabilistic regressor
    must be compatible with the sklearn or skpro interface;
    tabular regression algorithm used in the reduction algorithm;
    if an skpro regressor, the resulting forecaster will have probabilistic capability
  - pooling : str, one of ["local", "global", "panel"], optional, default="local"
    level on which data are pooled to fit the supervised regression model:
    - "local" = unit/instance level, one reduced model per lowest hierarchy level
    - "global" = top level, one reduced model overall, fitted on pooled data ignoring levels
    - "panel" = second lowest level, one reduced model per panel level (-2)
    if there are 2 or fewer levels, "global" and "panel" result in the same model;
    if there is only 1 level (single time series), all three settings agree
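As an illustrative sketch of the pooling options on hierarchical data (the toy panel below and its variable names are made up; it assumes the pd-multiindex panel format with instances in the first index level):

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

from sktime.forecasting.compose import YfromX

# toy panel: 2 instances, 10 monthly time points each, y depends linearly on x1
instances = ["A", "B"]
periods = pd.period_range("2000-01", periods=10, freq="M")
idx = pd.MultiIndex.from_product([instances, periods], names=["instance", "time"])
X = pd.DataFrame({"x1": np.random.default_rng(0).normal(size=len(idx))}, index=idx)
y = pd.DataFrame({"y": 2 * X["x1"].to_numpy() + 1}, index=idx)

# future exogenous values for a 2-step horizon, per instance
fut_periods = pd.period_range("2000-11", periods=2, freq="M")
fut_idx = pd.MultiIndex.from_product([instances, fut_periods], names=["instance", "time"])
X_future = pd.DataFrame({"x1": np.random.default_rng(1).normal(size=len(fut_idx))}, index=fut_idx)

# "local" (default): one regression per instance
f_local = YfromX(LinearRegression(), pooling="local")
f_local.fit(y=y, X=X, fh=[1, 2])
y_pred_local = f_local.predict(X=X_future)

# "global": a single regression fitted on the pooled data of both instances
f_global = YfromX(LinearRegression(), pooling="global")
f_global.fit(y=y, X=X, fh=[1, 2])
y_pred_global = f_global.predict(X=X_future)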
Examples
>>> from sktime.datasets import load_longley
>>> from sktime.split import temporal_train_test_split
>>> from sktime.forecasting.compose import YfromX
>>> from sklearn.linear_model import LinearRegression
>>>
>>> y, X = load_longley()
>>> y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)
>>> fh = y_test.index
>>>
>>> f = YfromX(LinearRegression())
>>> f.fit(y=y_train, X=X_train, fh=fh)
YfromX(...)
>>> y_pred = f.predict(X=X_test)
YfromX can also be used with skpro probabilistic regressors, in which case the resulting forecaster is capable of probabilistic forecasts:

>>> from skpro.regression.residual import ResidualDouble  # doctest: +SKIP
>>> reg_proba = ResidualDouble(LinearRegression())  # doctest: +SKIP
>>> f = YfromX(reg_proba)  # doctest: +SKIP
>>> f.fit(y=y_train, X=X_train, fh=fh)  # doctest: +SKIP
YfromX(...)
>>> y_pred = f.predict_interval(X=X_test)  # doctest: +SKIP
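A sketch of the periodic-refit composition mentioned above, wrapping YfromX in UpdateRefitsEvery; the refit_interval argument and its unit (periods of the time index) are assumptions to check against the UpdateRefitsEvery documentation:

from sklearn.linear_model import LinearRegression

from sktime.datasets import load_longley
from sktime.forecasting.compose import YfromX
from sktime.forecasting.stream import UpdateRefitsEvery
from sktime.split import temporal_train_test_split

y, X = load_longley()
y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)

# refit the inner YfromX at most every 2 periods when update is called;
# in between, update only advances the data cutoff without refitting
# (refit_interval is an assumed parameter name, see the UpdateRefitsEvery docs)
f = UpdateRefitsEvery(YfromX(LinearRegression()), refit_interval=2)
f.fit(y=y_train, X=X_train, fh=[1])

f.update(y=y_test.iloc[:1], X=X_test.iloc[:1])
f.update(y=y_test.iloc[1:2], X=X_test.iloc[1:2])
y_pred = f.predict(X=X_test.iloc[2:3])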
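A sketch of handling missing exogenous data, piping an Imputer in front of YfromX inside a ForecastingPipeline, which applies its transformers to X before it reaches the forecaster; the artificially inserted NaN and the choice method="mean" are illustrative:

import numpy as np
from sklearn.linear_model import LinearRegression

from sktime.datasets import load_longley
from sktime.forecasting.compose import ForecastingPipeline, YfromX
from sktime.split import temporal_train_test_split
from sktime.transformations.series.impute import Imputer

y, X = load_longley()
X = X.copy()
X.iloc[2, 0] = np.nan  # pretend one exogenous value is missing

y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)

# impute X before it reaches the tabular regression
f = ForecastingPipeline(steps=[Imputer(method="mean"), YfromX(LinearRegression())])
f.fit(y=y_train, X=X_train, fh=y_test.index)
y_pred = f.predict(X=X_test)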
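For the custom direct reduction mentioned above, the building blocks can also be assembled by hand. The following sketch uses Lag to turn lagged values of y into exogenous features for a one-step-ahead YfromX; the single lag, the index_out="extend" behavior, and the manual index bookkeeping are assumptions for illustration, while YtoX and ReducerTransform automate this construction inside pipelines:

from sklearn.linear_model import LinearRegression

from sktime.datasets import load_airline
from sktime.forecasting.compose import YfromX
from sktime.transformations.series.lag import Lag

y = load_airline()

# lagged endogenous values as exogenous features; index_out="extend" is assumed
# to also produce a row one step past the end of y, usable as X for the forecast
lagged = Lag([1], index_out="extend").fit_transform(y)

X_train = lagged.loc[y.index].dropna()                    # drop the leading NaN row
y_train = y.loc[X_train.index]
X_future = lagged.loc[lagged.index.difference(y.index)]   # the extended row

f = YfromX(LinearRegression())
f.fit(y=y_train, X=X_train, fh=X_future.index)
y_pred = f.predict(X=X_future)  # one-step-ahead forecast from y's last value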