CutoffSplitter

class CutoffSplitter(cutoffs: Union[numpy.ndarray, pandas.core.indexes.base.Index], fh: Union[int, list, numpy.ndarray, pandas.core.indexes.base.Index, sktime.forecasting.base._fh.ForecastingHorizon] = 1, window_length: Union[int, float, pandas._libs.tslibs.timedeltas.Timedelta, datetime.timedelta, numpy.timedelta64, pandas._libs.tslibs.offsets.DateOffset] = 10)

Cutoff window splitter.

Split time series at given cutoff points into a fixed-length training and test set.

The user is expected to provide a set of cutoffs (train set endpoints). Using the notation of BaseSplitter, these can be written as \(\{k_1,\ldots,k_n\}\) for integer-based indexing, or \(\{t(k_1),\ldots,t(k_n)\}\) for datetime-based indexing. The last point of each training window is equal to the cutoff, while the test window starts from the next observation in y.

The number of splits returned by .get_n_splits is then trivially equal to \(n\).

The sorted array of cutoffs returned by .get_cutoffs is then equal to \(\{t(k_1),\ldots,t(k_n)\}\) with \(k_i<k_{i+1}\).
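The window construction described above can be sketched in plain Python for integer-based indexing. This is a minimal illustration, not the library's implementation: `cutoff_split` is a hypothetical helper, `fh` is assumed to be a sequence of positive integer steps, and each cutoff is assumed to be at least `window_length - 1` so that the training window does not run off the start of the series.

```python
from typing import Iterator, List, Sequence, Tuple


def cutoff_split(
    cutoffs: Sequence[int],
    fh: Sequence[int] = (1,),
    window_length: int = 10,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) integer positions per cutoff (hypothetical helper)."""
    # Cutoffs are processed in ascending order, so k_i < k_{i+1}.
    for cutoff in sorted(cutoffs):
        # The training window has `window_length` points and ends at the cutoff.
        train = list(range(cutoff - window_length + 1, cutoff + 1))
        # The test window holds the positions `fh` steps after the cutoff.
        test = [cutoff + step for step in fh]
        yield train, test


splits = list(cutoff_split(cutoffs=[12, 8], fh=[1, 2], window_length=3))
for train, test in splits:
    print(train, test)
# [6, 7, 8] [9, 10]
# [10, 11, 12] [13, 14]
```

Two cutoffs produce two splits, and the splits come out in sorted cutoff order even though the cutoffs were passed unsorted.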

Parameters

cutoffs : np.array or pd.Index
    Cutoff points, positive and integer- or datetime-index like. Type should match the type of fh input.
fh : int, timedelta, list or np.ndarray of ints or timedeltas, default=1
    Forecasting horizon. Type should match the type of cutoffs input.
window_length : int or timedelta or pd.DateOffset, default=10
    Length of the training window.

Methods

`get_cutoffs`([y])	Return the cutoff points in .iloc[] context.
`get_fh`()	Return the forecasting horizon.
`get_n_splits`([y])	Return the number of splits.
`split`(y)	Split y into training and test windows.

get_n_splits(y: Optional[Union[pandas.core.series.Series, pandas.core.frame.DataFrame, numpy.ndarray, pandas.core.indexes.base.Index]] = None) → int

Return the number of splits.

For this splitter, the number of splits is equal to the number of cutoffs given during instance initialization.

Parameters

y : pd.Series or pd.Index, optional (default=None)
    Time series to split.

Returns

n_splits : int
    The number of splits.

get_cutoffs(y: Optional[Union[pandas.core.series.Series, pandas.core.frame.DataFrame, numpy.ndarray, pandas.core.indexes.base.Index]] = None) → numpy.ndarray

Return the cutoff points in .iloc[] context.

If the cutoffs given during instance initialization are already integer, .iloc[]-friendly indices, this method returns them unchanged except that they are sorted from smallest to largest. If the given cutoffs are datetime-like, this method returns the corresponding integer indices.

Parameters

y : pd.Series or pd.Index, optional (default=None)
    Time series to split.

Returns

cutoffs : 1D np.ndarray of int
    iloc location indices of the cutoffs, in reference to y.
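The datetime case can be illustrated with a small standard-library sketch. It is not the library's code: `cutoffs_to_iloc` is a hypothetical helper, the index is assumed to contain unique timestamps, and every cutoff is assumed to be present in the index.

```python
from datetime import date
from typing import List, Sequence


def cutoffs_to_iloc(index: Sequence[date], cutoffs: Sequence[date]) -> List[int]:
    """Map datetime-like cutoffs to sorted integer positions (hypothetical helper)."""
    # Look up the integer position of every timestamp in the index.
    position = {timestamp: i for i, timestamp in enumerate(index)}
    # Translate each cutoff to its position and sort from smallest to largest.
    return sorted(position[cutoff] for cutoff in cutoffs)


index = [date(2021, 1, day) for day in range(1, 11)]  # 10 daily observations
iloc_cutoffs = cutoffs_to_iloc(index, [date(2021, 1, 8), date(2021, 1, 5)])
print(iloc_cutoffs)  # [4, 7]
```

Integer cutoffs would need no lookup at all; only the sorting step applies to them.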

get_fh() → sktime.forecasting.base._fh.ForecastingHorizon

Return the forecasting horizon.

Returns

fh : ForecastingHorizon
    The forecasting horizon.

split(y: Union[pandas.core.series.Series, pandas.core.frame.DataFrame, numpy.ndarray, pandas.core.indexes.base.Index]) → Generator[Tuple[numpy.ndarray, numpy.ndarray], None, None]

Split y into training and test windows.

Parameters

y : pd.Series or pd.Index
    Time series to split.

Yields

train : np.ndarray
    Training window indices.
test : np.ndarray
    Test window indices.