Performance metrics

The sktime.performance_metrics module contains metrics for evaluating and tuning time series models.

All metrics in sktime can be listed using the sktime.registry.all_estimators utility, using estimator_types="metric", optionally filtered by tags. Valid tags can be listed using sktime.registry.all_tags.

A full table with tag-based search is also available on the Estimator Search Page (select “metric” in the “Estimator type” dropdown).

Metrics for assessing model performance.

Forecasting

Point forecasts - classes

Average losses

`MeanAbsoluteError`([multioutput, multilevel, ...])	Mean absolute error (MAE).
`MeanSquaredError`([multioutput, multilevel, ...])	Mean squared error (MSE) or root mean squared error (RMSE).
`MedianAbsoluteError`([multioutput, ...])	Median absolute error (MdAE).
`MedianSquaredError`([multioutput, ...])	Median squared error (MdSE) or root median squared error (RMdSE).

Percentage errors

`MeanAbsolutePercentageError`([multioutput, ...])	Mean absolute percentage error (MAPE) or symmetric MAPE.
`MedianAbsolutePercentageError`([multioutput, ...])	Median absolute percentage error (MdAPE) or symmetric version.
`MeanSquaredPercentageError`([multioutput, ...])	Mean squared percentage error (MSPE), or RMSPE, or symmetric MSPE, RMSPE.
`MedianSquaredPercentageError`([multioutput, ...])	Median squared percentage error (MdSPE), or RMdSPE, or symmetric MdSPE, RMdSPE.
`MeanSquaredErrorPercentage`([multioutput, ...])	Mean Squared Error Percentage (MSE%) and root-MSE% forecasting error metrics.
`MeanArctangentAbsolutePercentageError`([...])	Mean Arctangent Absolute Percentage Error (MAAPE).
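To make the difference between the plain and symmetric percentage errors concrete, here is a small numpy sketch. The function `mape_sketch` is a hypothetical helper written for this example, not part of sktime; it uses one common convention for the symmetric denominator and does not guard against division by zero, so it may differ in detail from the library implementation.

```python
import numpy as np


def mape_sketch(y_true, y_pred, symmetric=False):
    """Mean absolute percentage error, optionally the symmetric variant."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    if symmetric:
        # symmetric variant: scale by the average magnitude of truth and forecast
        denom = (np.abs(y_true) + np.abs(y_pred)) / 2
    else:
        # plain variant: scale by the magnitude of the true value only
        denom = np.abs(y_true)
    return float(np.mean(abs_err / denom))


y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

mape_value = mape_sketch(y_true, y_pred)
smape_value = mape_sketch(y_true, y_pred, symmetric=True)
print(mape_value, smape_value)
```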

Scaled errors

`MeanAbsoluteScaledError`([multioutput, ...])	Mean absolute scaled error (MASE).
`MedianAbsoluteScaledError`([multioutput, ...])	Median absolute scaled error (MdASE).
`MeanSquaredScaledError`([multioutput, ...])	Mean squared scaled error (MSSE) or root mean squared scaled error (RMSSE).
`MedianSquaredScaledError`([multioutput, ...])	Median squared scaled error (MdSSE) or root median squared scaled error (RMdSSE).
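Scaled errors divide the forecast error by the in-sample error of a naive forecast, which is why they need the training series in addition to the true and predicted values. The numpy sketch below illustrates the idea for MASE; `mase_sketch` and its argument names are assumptions made for this example and not the sktime implementation.

```python
import numpy as np


def mase_sketch(y_true, y_pred, y_train, sp=1):
    """MAE of the forecast, scaled by the in-sample MAE of a naive forecast.

    The naive forecast predicts the value observed sp steps earlier.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    forecast_mae = np.mean(np.abs(y_true - y_pred))
    naive_mae = np.mean(np.abs(y_train[sp:] - y_train[:-sp]))
    return float(forecast_mae / naive_mae)


y_train = [1.0, 3.0, 2.0, 4.0, 3.0]
y_true = [5.0, 4.0]
y_pred = [4.0, 4.0]

mase_value = mase_sketch(y_true, y_pred, y_train)
print(mase_value)  # values below 1 mean the forecast beats the naive benchmark
```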

Relative errors

`MeanRelativeAbsoluteError`([multioutput, ...])	Mean relative absolute error (MRAE).
`MedianRelativeAbsoluteError`([multioutput, ...])	Median relative absolute error (MdRAE).
`RelativeLoss`([multioutput, multilevel, ...])	Calculate relative loss of forecast versus benchmark forecast.

Geometric errors

`GeometricMeanAbsoluteError`([multioutput, ...])	Geometric mean absolute error (GMAE).
`GeometricMeanSquaredError`([multioutput, ...])	Geometric mean squared error (GMSE) or root geometric mean squared error (RGMSE).
`GeometricMeanRelativeAbsoluteError`([...])	Geometric mean relative absolute error (GMRAE).
`GeometricMeanRelativeSquaredError`([...])	Geometric mean relative squared error (GMRSE).

Benchmark errors

`OverallWeightedAverage`([sp, multioutput, ...])	Overall Weighted Average (OWA) metric as used in the M4 competition.

Under- and over-prediction errors

`MeanAsymmetricError`([multioutput, ...])	Calculate mean of asymmetric loss function.
`MeanLinexError`([a, b, multioutput, ...])	Mean Linear Exponential (LinEx) error.
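The following numpy sketch illustrates how a LinEx-type loss treats under- and over-prediction differently. It assumes the usual LinEx form b * (exp(a * e) - a * e - 1) with e = y_true - y_pred; `linex_sketch` is a hypothetical helper for this example, and the exact conventions of the sktime class should be checked in its docstring.

```python
import numpy as np


def linex_sketch(y_true, y_pred, a=1.0, b=1.0):
    """Mean of the LinEx loss b * (exp(a * e) - a * e - 1), e = y_true - y_pred."""
    error = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(b * (np.exp(a * error) - a * error - 1)))


y_true = [10.0, 10.0]

# With a > 0, positive errors (forecast too low) grow roughly exponentially,
# negative errors (forecast too high) grow roughly linearly.
under = linex_sketch(y_true, [9.0, 9.0])   # error = +1
over = linex_sketch(y_true, [11.0, 11.0])  # error = -1
print(under, over)
```

Flipping the sign of `a` reverses which direction is penalized more heavily.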

Point forecasts - functions

`make_forecasting_scorer`(func[, name, ...])	Create a metric class from a metric function.
`mean_absolute_scaled_error`(y_true, y_pred[, ...])	Mean absolute scaled error (MASE).
`median_absolute_scaled_error`(y_true, y_pred)	Median absolute scaled error (MdASE).
`mean_squared_scaled_error`(y_true, y_pred[, ...])	Mean squared scaled error (MSSE) or root mean squared scaled error (RMSSE).
`median_squared_scaled_error`(y_true, y_pred)	Median squared scaled error (MdSSE) or root median squared scaled error (RMdSSE).
`mean_absolute_error`(y_true, y_pred[, ...])	Mean absolute error (MAE).
`mean_squared_error`(y_true, y_pred[, ...])	Mean squared error (MSE) or root mean squared error (RMSE).
`median_absolute_error`(y_true, y_pred[, ...])	Median absolute error (MdAE).
`median_squared_error`(y_true, y_pred[, ...])	Median squared error (MdSE) or root median squared error (RMdSE).
`geometric_mean_absolute_error`(y_true, y_pred)	Geometric mean absolute error (GMAE).
`geometric_mean_squared_error`(y_true, y_pred)	Geometric mean squared error (GMSE) or root geometric mean squared error (RGMSE).
`mean_absolute_percentage_error`(y_true, y_pred)	Mean absolute percentage error (MAPE) or symmetric MAPE (sMAPE).
`median_absolute_percentage_error`(y_true, y_pred)	Median absolute percentage error (MdAPE) or symmetric version.
`mean_squared_percentage_error`(y_true, y_pred)	Mean squared percentage error (MSPE) or square root version.
`median_squared_percentage_error`(y_true, y_pred)	Median squared percentage error (MdSPE) or square root version.
`mean_relative_absolute_error`(y_true, y_pred)	Mean relative absolute error (MRAE).
`median_relative_absolute_error`(y_true, y_pred)	Median relative absolute error (MdRAE).
`geometric_mean_relative_absolute_error`(...)	Geometric mean relative absolute error (GMRAE).
`geometric_mean_relative_squared_error`(...[, ...])	Geometric mean relative squared error (GMRSE).
`mean_asymmetric_error`(y_true, y_pred[, ...])	Calculate mean of asymmetric loss function.
`mean_linex_error`(y_true, y_pred[, a, b, ...])	Calculate mean linex error.
`relative_loss`(y_true, y_pred[, ...])	Relative loss of forecast versus benchmark forecast for a given metric.

Quantile and interval forecasts

`PinballLoss`([multioutput, score_average, alpha])	Pinball loss aka quantile loss for quantile/interval predictions.
`EmpiricalCoverage`([multioutput, ...])	Empirical coverage percentage for interval predictions.
`ConstraintViolation`([multioutput, ...])	Average absolute constraint violations for interval predictions.
`IntervalWidth`([multioutput, score_average, ...])	Interval width for interval predictions, sometimes also known as sharpness.
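The quantities behind these metrics can be sketched in plain numpy. The helpers below are hypothetical and operate on bare arrays of bounds, whereas the sktime classes expect the library's own quantile and interval prediction formats; treating the lower bound as a 0.1 quantile is an assumption made for the example.

```python
import numpy as np


def pinball_loss_sketch(y_true, y_quantile, alpha):
    """Mean pinball (quantile) loss of a predicted alpha-quantile."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_quantile, dtype=float)
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1) * diff)))


def empirical_coverage_sketch(y_true, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(np.mean(inside))


def interval_width_sketch(lower, upper):
    """Mean width of the predicted intervals."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))


y_true = np.array([1.0, 2.0, 3.0, 4.0])
lower = np.array([0.5, 2.5, 2.0, 3.0])
upper = np.array([1.5, 3.5, 4.0, 5.0])

coverage = empirical_coverage_sketch(y_true, lower, upper)
width = interval_width_sketch(lower, upper)
pinball = pinball_loss_sketch(y_true, lower, alpha=0.1)
print(coverage, width, pinball)
```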

Distribution forecasts

`AUCalibration`([multioutput, multivariate])	Area under the calibration curve for distributional predictions.
`CRPS`([multioutput, multivariate])	Continuous ranked probability score for distributional predictions.
`LogLoss`([multioutput, multivariate])	Logarithmic loss for distributional predictions.
`SquaredDistrLoss`([multioutput, multivariate])	Squared loss for distributional predictions.

Detection tasks

Detection metrics can be applied to compare ground truth events with detected events, and ground truth segments with detected segments.

Detection metrics are typically designed for either:

point events, i.e., annotated time stamps, or
segments, i.e., annotated time intervals.

The metrics in sktime can be used for both types of detection tasks:

segmentation metrics interpret point events as segment boundaries, separating consecutive segments
point event metrics are applied to segments by considering their boundaries as point events
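The two interpretations above can be sketched with a pair of small conversion helpers. Representing segments as (start, end) tuples and the names `points_to_segments` and `segments_to_points` are assumptions made for this illustration; they are not sktime's internal data format or API.

```python
def points_to_segments(change_points, start, end):
    """Interpret point events as boundaries between consecutive segments."""
    bounds = [start] + sorted(change_points) + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def segments_to_points(segments):
    """Treat the boundaries of each segment as point events."""
    points = set()
    for seg_start, seg_end in segments:
        points.update((seg_start, seg_end))
    return sorted(points)


# Two change points split the range 0..10 into three consecutive segments
segments = points_to_segments([3, 7], start=0, end=10)
print(segments)

# Two annotated intervals yield four boundary events
points = segments_to_points([(2, 4), (6, 9)])
print(points)
```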

Event detection - anomalies, outliers

`DirectedChamfer`([normalize])	Directed Chamfer distance between event points.
`DirectedHausdorff`()	Directed Hausdorff distance between event points.
`DetectionCount`([target, excess_only])	Count of detections, possibly as excess over or deviation from a target count.
`WindowedF1Score`([margin])	F1-score for event detection, using a margin-based match criterion.
`TimeSeriesAUPRC`([integration, ...])	Area under the precision-recall curve for time series (TimeSeriesAUPRC).
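The directed distances in this table are not symmetric: they measure how far each event in one set is from its nearest event in the other set. The numpy sketch below shows the idea with hypothetical helpers; which set plays the "from" role and how normalization is defined in sktime should be checked in the class docstrings.

```python
import numpy as np


def nearest_distances(events_from, events_to):
    """Distance from each event in events_from to its nearest event in events_to."""
    a = np.asarray(events_from, dtype=float)[:, None]
    b = np.asarray(events_to, dtype=float)[None, :]
    return np.min(np.abs(a - b), axis=1)


def directed_hausdorff_sketch(events_from, events_to):
    """Largest nearest-event distance."""
    return float(np.max(nearest_distances(events_from, events_to)))


def directed_chamfer_sketch(events_from, events_to, normalize=False):
    """Sum (or mean, if normalize=True) of nearest-event distances."""
    d = nearest_distances(events_from, events_to)
    return float(np.mean(d) if normalize else np.sum(d))


true_events = [10, 50, 90]
detected_events = [12, 48]  # the event at 90 was missed

# From true to detected: the missed event dominates the distance
h_true_to_det = directed_hausdorff_sketch(true_events, detected_events)
c_true_to_det = directed_chamfer_sketch(true_events, detected_events)

# From detected to true: every detection is close to a true event
h_det_to_true = directed_hausdorff_sketch(detected_events, true_events)
print(h_true_to_det, c_true_to_det, h_det_to_true)
```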

Segment detection

`RandIndex`([use_loc])	Segmentation Rand Index metric.
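A Rand index for segmentations can be understood as the fraction of pairs of time points on which two segmentations agree, that is, both place the pair in the same segment or both place it in different segments. The sketch below is a plain Python illustration under that definition; the helper names are hypothetical, and it does not reproduce the `use_loc` option of the sktime class.

```python
import bisect
from itertools import combinations


def labels_from_change_points(change_points, n_timepoints):
    """Assign each time index the number of change points at or before it."""
    cps = sorted(change_points)
    return [bisect.bisect_right(cps, t) for t in range(n_timepoints)]


def rand_index_sketch(labels_a, labels_b):
    """Fraction of index pairs on which two labelings agree."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total


true_labels = labels_from_change_points([3], n_timepoints=6)
pred_labels = labels_from_change_points([2], n_timepoints=6)

ri = rand_index_sketch(true_labels, pred_labels)
print(true_labels, pred_labels, ri)
```

The pairwise loop is quadratic in the number of time points, which is fine for illustration but not for long series.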