CutoffSplitter

class CutoffSplitter(cutoffs: Union[numpy.ndarray, pandas.core.indexes.base.Index], fh: Union[int, list, numpy.ndarray, pandas.core.indexes.base.Index, sktime.forecasting.base._fh.ForecastingHorizon] = 1, window_length: Union[int, float, pandas._libs.tslibs.timedeltas.Timedelta, datetime.timedelta, numpy.timedelta64, pandas._libs.tslibs.offsets.DateOffset] = 10)

Cutoff window splitter.

Split time series at given cutoff points into a fixed-length training and test set.

The user provides a set of cutoffs (training set endpoints) which, using the notation of BaseSplitter, can be written as \(\{k_1,\ldots,k_n\}\) for integer-based indexing, or \(\{t(k_1),\ldots,t(k_n)\}\) for datetime-based indexing. The last point of each training window is equal to the cutoff, and the test window starts from the next observation in y.

The number of splits returned by .get_n_splits is then trivially equal to \(n\).

The sorted array of cutoffs returned by .get_cutoffs is then equal to \(\{t(k_1),\ldots,t(k_n)\}\) with \(k_i<k_{i+1}\).
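The windowing described above can be sketched with plain NumPy. This is an illustrative sketch for integer cutoffs only, not sktime's implementation: the helper name `cutoff_split` is hypothetical, and it assumes that each training window consists of the `window_length` positions ending at the cutoff and that the test positions are the cutoff plus each step of the forecasting horizon.

```python
import numpy as np


def cutoff_split(cutoffs, fh, window_length):
    """Yield (train, test) integer positions for each cutoff.

    Hypothetical helper for illustration; handles integer cutoffs only.
    """
    fh = np.atleast_1d(np.asarray(fh))
    for cutoff in np.sort(np.asarray(cutoffs)):
        # Training window: the `window_length` positions ending at the cutoff.
        train = np.arange(cutoff - window_length + 1, cutoff + 1)
        # Test window: the cutoff shifted by each forecasting horizon step.
        test = cutoff + fh
        yield train, test


splits = list(cutoff_split(cutoffs=[3, 5], fh=[2, 4], window_length=3))
for train, test in splits:
    print(train, test)
# [1 2 3] [5 7]
# [3 4 5] [7 9]
```

With two cutoffs the sketch produces two splits, matching the statement that the number of splits equals \(n\).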

Parameters
cutoffs : np.array or pd.Index

Cutoff points. Must be positive and either integer-like or datetime-like. The type should match the type of the fh input.

fh : int, timedelta, list or np.ndarray of ints or timedeltas

The type should match the type of the cutoffs input.

window_length : int or timedelta or pd.DateOffset
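One way to read the type-matching requirement between cutoffs and fh, shown with plain NumPy rather than sktime's own conversion logic: an integer cutoff combines with integer horizon steps, and a datetime cutoff combines with timedelta steps. The variable names below are illustrative assumptions.

```python
import numpy as np

# Integer cutoff with integer horizon steps gives integer test positions.
int_cutoff = 3
int_fh = np.array([1, 2])
int_test = int_cutoff + int_fh

# Datetime cutoff with timedelta horizon steps gives datetime test points.
dt_cutoff = np.datetime64("2021-01-04")
dt_fh = np.array([1, 2], dtype="timedelta64[D]")
dt_test = dt_cutoff + dt_fh

print(int_test)  # [4 5]
print(dt_test)   # ['2021-01-05' '2021-01-06']
```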

Methods

get_cutoffs([y])

Return the cutoff points in .iloc[] context.

get_fh()

Return the forecasting horizon.

get_n_splits([y])

Return the number of splits.

split(y)

Split y into training and test windows.

get_n_splits(y: Optional[Union[pandas.core.series.Series, pandas.core.frame.DataFrame, numpy.ndarray, pandas.core.indexes.base.Index]] = None) -> int

Return the number of splits.

For this splitter, the number of splits is equal to the number of cutoffs given during instance initialization.

Parameters
y : pd.Series or pd.Index, optional (default=None)

Time series to split

Returns
n_splits : int

The number of splits.

get_cutoffs(y: Optional[Union[pandas.core.series.Series, pandas.core.frame.DataFrame, numpy.ndarray, pandas.core.indexes.base.Index]] = None) -> numpy.ndarray

Return the cutoff points in .iloc[] context.

If the cutoffs given during instance initialization are integer indices usable with .iloc[], this method returns them unchanged except that they are sorted from smallest to largest. If the given cutoffs are datetime-like, this method returns the corresponding integer indices.
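The mapping from datetime-like cutoffs to integer positions can be sketched with NumPy alone. This is not sktime's code: it assumes a sorted daily index (a hypothetical stand-in for the index of y) and that every cutoff is present in that index.

```python
import numpy as np

# Hypothetical daily index standing in for the index of y (10 days).
index = np.arange("2021-01-01", "2021-01-11", dtype="datetime64[D]")

# Datetime-like cutoffs, deliberately given out of order.
cutoffs = np.array(["2021-01-06", "2021-01-04"], dtype="datetime64[D]")

# Sort the cutoffs, then look up their positions in the sorted index.
sorted_cutoffs = np.sort(cutoffs)
positions = np.searchsorted(index, sorted_cutoffs)

# searchsorted returns an insertion point even for absent values,
# so confirm that each cutoff really occurs in the index.
assert np.array_equal(index[positions], sorted_cutoffs)

print(positions)  # [3 5]
```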

Parameters
y : pd.Series or pd.Index, optional (default=None)

Time series to split

Returns
cutoffs : 1D np.ndarray of int

Integer (iloc) positions of the cutoffs, relative to y.

get_fh() -> sktime.forecasting.base._fh.ForecastingHorizon

Return the forecasting horizon.

Returns
fh : ForecastingHorizon

The forecasting horizon

split(y: Union[pandas.core.series.Series, pandas.core.frame.DataFrame, numpy.ndarray, pandas.core.indexes.base.Index]) -> Generator[Tuple[numpy.ndarray, numpy.ndarray], None, None]

Split y into training and test windows.

Parameters
y : pd.Series or pd.Index

Time series to split

Yields
train : np.ndarray

Training window indices

test : np.ndarray

Test window indices