sktime testing framework overview
sktime uses pytest for testing interface compliance of estimators, and correctness of code.
This page gives an overview of the tests, and instructions on how to add tests or extend the testing framework.
Test module architecture
sktime testing happens on three layers, roughly corresponding to the inheritance layers of estimators.
“package level”: testing interface compliance with the BaseObject and BaseEstimator specifications, in tests/test_all_estimators.py
“module level”: testing interface compliance of concrete estimators with their scitype base class, for instance forecasting/tests/test_all_forecasters.py
“low level”: testing individual functionality of estimators or other code, in individual files in tests folders.
Module conventions are as follows:
Each module contains a tests folder, which contains tests specific to that module. Sub-modules may also contain tests folders.
tests folders may contain _config.py files to collect test configuration settings for that module.
Generic utilities for tests are located in the module utils._testing. Tests for these utilities should be contained in the utils._testing.tests folder.
Each test module corresponding to a learning task and estimator scitype should contain module level tests in a test_all_[name_of_scitype].py file that tests interface compliance of all estimators adhering to the scitype. For instance, forecasting/tests/test_all_forecasters.py, or distances/tests/test_all_dist_kernels.py.
Learning task specific tests should not duplicate package level, generic estimator tests in test_all_estimators.py
Test code architecture
sktime test files should use best pytest practice such as fixtures or test parameterization where possible,
instead of custom logic, see pytest documentation.
Estimator tests use sktime’s framework plug-in to pytest_generate_tests,
which parameterizes estimator fixtures and data input scenarios.
An illustrative example
Starting with an example:
def test_fit_returns_self(estimator_instance, scenario):
    """Check that fit returns self."""
    fit_return = scenario.run(estimator_instance, method_sequence=["fit"])
    assert (
        fit_return is estimator_instance
    ), f"Estimator: {estimator_instance} does not return self when calling fit"
This test constitutes a loop over estimator_instance and scenario fixtures,
where the loop is orchestrated by pytest parameterization in
pytest_generate_tests, which automatically decorates the test with a suitable loop.
Notably, loops in the test do not need to be written by the developer,
if they use a fixture name (such as estimator_instance) which already has a loop defined.
See below for more details, or the pytest documentation on the topic.
The sktime plug-in for pytest generates the tuples of fixture values for this.
In the above example, we loop over the following fixture lists:
estimator_instance over estimator instances, obtained from all sktime estimators via create_test_instances_and_names
scenario over scenario objects, which encode data inputs and method call sequences to estimator_instance (explained in further detail below).
The sktime plug-in ensures that only those scenarios are retrieved that are
applicable to the estimator_instance.
In the example, the scenario.run command is equivalent to calling estimator_instance.fit(**scenario_kwargs),
where the scenario_kwargs are generated by the scenario.
Note that the test is not decorated with fixture parameterization;
the fixtures are instead generated by pytest_generate_tests.
The reason for this is that the applicable scenarios (fixture values of scenario) depend on the estimator_instance fixture,
since inputs to fit of a classifier differ from inputs to fit of a forecaster.
Parameterized fixtures
sktime uses pytest fixture parameterization to execute tests in a loop over fixtures,
for instance running all interface compatibility tests for all estimators.
See the pytest documentation on fixture parameterization in general for an explanation of fixture parameterization.
Implementation-wise, loops over fixtures are orchestrated by pytest parameterization in
pytest_generate_tests, which automatically decorates every test with
a pytest.mark.parametrize based on the test arguments (estimator_instance and scenario in the above example).
This is in line with standard use of pytest_generate_tests, see the section in the pytest
documentation on advanced fixture parameterization using pytest_generate_tests.
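As a rough illustration of this mechanism, the sketch below shows a pytest_generate_tests hook that parameterizes estimator_instance and scenario jointly, so that the scenario values can depend on the estimator. The helpers _toy_instances and _toy_scenarios_for and the string stand-ins are hypothetical placeholders, not sktime's actual implementation; only metafunc.fixturenames and metafunc.parametrize are standard pytest API.

```python
# conftest.py-style sketch; all names starting with _toy are hypothetical.


def _toy_instances():
    """Return stand-ins for estimator test instances, and their names."""
    return ["forecaster_a", "classifier_b"], ["ForecasterA", "ClassifierB"]


def _toy_scenarios_for(instance):
    """Return only the scenarios applicable to the given instance."""
    if instance.startswith("forecaster"):
        return ["fit_predict_univariate", "fit_predict_with_X"]
    return ["fit_predict_panel"]


def pytest_generate_tests(metafunc):
    """Parameterize tests that request estimator_instance and scenario."""
    wanted = {"estimator_instance", "scenario"}
    if not wanted.issubset(metafunc.fixturenames):
        return  # the test does not use both fixtures, leave it alone

    instances, names = _toy_instances()
    pairs, ids = [], []
    for instance, name in zip(instances, names):
        # scenario values depend on the instance, so tuples are built jointly
        for scen in _toy_scenarios_for(instance):
            pairs.append((instance, scen))
            ids.append(f"{name}-{scen}")

    metafunc.parametrize("estimator_instance,scenario", pairs, ids=ids)
```

Building the (estimator_instance, scenario) tuples in one place is what makes a dependency between the two fixtures possible; two independent decorators would instead produce the full cross product.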
Currently, the sktime testing framework provides automated fixture parameterization
via pytest.mark.parametrize for the following fixtures, in module level tests:
estimator: all estimator classes, inheriting from the base class of the given module. In the package level tests test_all_estimators, that base class is BaseEstimator.
estimator_instance: all estimator test instances, obtained from all sktime estimators via create_test_instances_and_names
scenario: test scenarios, applicable to estimator or estimator_instance. The scenarios are specified in utils/_testing/scenarios_[estimator_scitype].
Further parameterization may happen for individual tests; the scope is usually explained in the test docstrings.
Scenarios
The scenario fixtures contain arguments for method calls, and a sequence for method calls.
An example scenario specification, from utils/_testing/scenarios_forecasting:
class ForecasterFitPredictUnivariateNoXLateFh(ForecasterTestScenario):
    """Fit/predict only, univariate y, no X, no fh in predict."""

    _tags = {"univariate_y": True, "fh_passed_in_fit": False}

    args = {
        "fit": {"y": _make_series(n_timepoints=20, random_state=RAND_SEED)},
        "predict": {"fh": 1},
    }
    default_method_sequence = ["fit", "predict"]
The scenario class ForecasterFitPredictUnivariateNoXLateFh encodes instructions
that are applied to an estimator_instance via instances of the class, here called scenario.
A call result = scenario.run(estimator_instance) will:
first, call estimator_instance.fit(y=_make_series(n_timepoints=20, random_state=RAND_SEED))
then, call estimator_instance.predict(fh=1) and return the output to result.
The abstraction of “scenario” allows specifying multiple argument combinations across multiple methods.
The method run also has arguments (method_sequence and arg_sequence)
that allow overriding the default method sequence, e.g.,
to run the methods in a different order, or to run only a subset of them.
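The behaviour of run described above can be sketched with toy classes. ToyScenario and ToyForecaster are hypothetical stand-ins written for illustration only; they are not sktime's scenario or estimator classes, and arg_sequence handling is left out for brevity.

```python
class ToyScenario:
    """Toy stand-in for a test scenario; not sktime's implementation."""

    args = {"fit": {"y": [1, 2, 3]}, "predict": {"fh": 1}}
    default_method_sequence = ["fit", "predict"]

    def run(self, obj, method_sequence=None):
        """Call the methods in sequence, return the output of the last call."""
        if method_sequence is None:
            method_sequence = self.default_method_sequence
        result = None
        for method_name in method_sequence:
            kwargs = self.args[method_name]
            result = getattr(obj, method_name)(**kwargs)
        return result


class ToyForecaster:
    """Toy estimator that predicts the last observed value."""

    def fit(self, y):
        self._last = y[-1]
        return self

    def predict(self, fh):
        return [self._last] * fh


scenario = ToyScenario()
forecaster = ToyForecaster()

# default sequence: fit, then predict; the predict output is returned
prediction = scenario.run(forecaster)

# overridden sequence: only fit is called, so the return value of fit comes back
fit_return = scenario.run(forecaster, method_sequence=["fit"])
```

In this toy version, the second call mirrors the test_fit_returns_self example: restricting the sequence to fit makes the scenario return whatever fit returns.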
Scenarios also provide a method scenario.is_applicable(estimator), which returns a boolean indicating whether
scenario is applicable to estimator. For instance, scenarios with univariate data are not applicable
to multivariate forecasters, and will cause exceptions in a fit method call.
Non-applicable scenarios can be filtered out in positive tests, and filtered in for negative tests.
By default, the pytest_generate_tests implemented in sktime passes only applicable scenarios.
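A minimal sketch of such filtering is given below. The classes and the attributes univariate_y and requires_multivariate are hypothetical and chosen for illustration only; sktime's own is_applicable logic is based on tags and may differ.

```python
class ToyScenario:
    """Toy scenario carrying a name and a flag for univariate data."""

    def __init__(self, name, univariate_y):
        self.name = name
        self.univariate_y = univariate_y

    def is_applicable(self, estimator):
        """Return False if univariate data meets a strictly multivariate estimator."""
        if self.univariate_y and estimator.requires_multivariate:
            return False
        return True


class ToyEstimator:
    """Toy estimator with a flag saying that it needs multivariate data."""

    def __init__(self, requires_multivariate):
        self.requires_multivariate = requires_multivariate


scenarios = [
    ToyScenario("univariate", univariate_y=True),
    ToyScenario("multivariate", univariate_y=False),
]
estimator = ToyEstimator(requires_multivariate=True)

# positive tests: keep applicable scenarios only
applicable = [s.name for s in scenarios if s.is_applicable(estimator)]

# negative tests: keep the rest, e.g., to check that an informative error is raised
not_applicable = [s.name for s in scenarios if not s.is_applicable(estimator)]
```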
Further, scenarios inherit from BaseObject, which allows using the sktime tag system with scenarios.
For further details on scenarios, inspect the docstring of BaseScenario.
Remote CI set-up
The remote CI runs all package level tests, module level tests, and low-level tests for all combinations of supported operating systems (OS) and python versions.
The package and module level estimator tests are distributed across OS and python version combinations so that:
only about a third of estimators are run per combination
a given estimator runs at least once for a given OS
a given estimator runs at least once for a python version
This is for reducing runtime and memory requirements for each CI element.
The precise logic maps estimators, OS and python versions to integers, and matches estimators with the sum of the OS and python version integers, modulo 3.
This logic is located in subsample_by_version_os in tests.test_all_estimators,
which is called in pytest_generate_tests of BaseFixtureGenerator, which
is inherited by all the TestAll[estimator_type] classes.
By default, the subsetting by OS and python version is switched off,
but it can be turned on by setting the pytest flag matrixdesign to True
(see conftest.py).
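The modulo 3 matching can be sketched as follows. This is an illustration under stated assumptions, not the code of subsample_by_version_os: the function toy_subsample, the OS and python version lists, and the choice of list positions as integers are all hypothetical.

```python
OS_LIST = ["Linux", "Darwin", "Windows"]  # assumed OS names
PY_LIST = ["3.8", "3.9", "3.10"]  # assumed python versions


def toy_subsample(estimator_names, os_name, py_version):
    """Keep estimators whose position matches (OS index + version index) mod 3."""
    key = (OS_LIST.index(os_name) + PY_LIST.index(py_version)) % 3
    ordered = sorted(estimator_names)  # fixed order, so positions are stable
    return [name for i, name in enumerate(ordered) if i % 3 == key]


names = [f"Est{i}" for i in range(7)]
subset = toy_subsample(names, "Linux", "3.9")
```

With three OS and three python versions, as assumed here, fixing the OS and varying the python version cycles the key through 0, 1 and 2, so every estimator is selected once per OS; the same holds for a fixed python version. Each combination runs roughly a third of the estimators.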
Extending the testing module
This section explains how to extend the testing module. Depending on the primary change that is tested, the changes to the testing module will be shallow or deep. In decreasing order of commonality:
When adding new estimators or utility functionality, write low level tests that check correctness of the estimator. These typically use only the simplest idioms in pytest (e.g., fixture parameterization). New estimators are also automatically discovered and looped over by the existing module and package level tests.
Introducing or changing base class level interface points will typically require addition of module level tests, and addition of, or modification to, scenarios with functionality specific to these interface points. Rarely, this may require changes to package level tests.
Major interface changes or addition of modules may require writing of entire test suites, and changes or additions to package level tests.
Adding low level tests
Low level tests are “free-form” and should follow best pytest practice.
pytest tests should be located in the appropriate tests folder of the module where a change is made.
Examples should be located in the docstring of the class or function added.
For an added estimator of name estimator_name, the test file should be called test_estimator_name.py.
Useful functionality to write tests:
example fixture generation, via datatypes.get_examples
data format checkers in datatypes: check_is_mtype, check_is_scitype, check_raise
miscellaneous utilities in utils, especially in _testing
Escaping tests
On occasion, it may make sense to escape individual estimators from individual tests.
This can be done (currently, as of 0.9.0) in two ways:
adding the estimator or test/estimator combination to EXCLUDED_TESTS or EXCLUDE_ESTIMATORS in the appropriate _config file.
adding a check condition in the is_excluded method used in pytest_generate_fixtures, provided the testing module supports this.
Escaping tests directly in the tests, e.g., via if isinstance(estimator_instance, MyClass) should be avoided where possible.
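The first option could look roughly like the sketch below. The estimator names are hypothetical, and both the dictionary layout of EXCLUDED_TESTS (estimator name mapped to a list of test names) and the signature of is_excluded are assumptions made for illustration; the actual _config file and method should be checked before relying on them.

```python
# _config.py-style sketch; estimator names below are hypothetical

# estimators excluded from all tests
EXCLUDE_ESTIMATORS = ["MyBrokenEstimator"]

# individual test/estimator combinations that are excluded
EXCLUDED_TESTS = {
    "MySlowEstimator": ["test_fit_returns_self"],
}


def is_excluded(test_name, est_name):
    """Return True if the test should be skipped for the named estimator."""
    if est_name in EXCLUDE_ESTIMATORS:
        return True
    return test_name in EXCLUDED_TESTS.get(est_name, [])
```

Keeping exclusions in one configuration file makes them easy to audit, which is harder when escapes are scattered across test bodies.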
Adding package or module level tests
Module level tests use pytest_generate_tests to define fixtures.
The available fixtures vary per module, and are listed in the docstring of pytest_generate_tests.
A new test should use these fixtures where possible, but can also add new fixtures via basic pytest fixture functionality.
If new fixture variables are to be used throughout the module, or depend on existing fixtures, instructions in the next section should be followed.
Where possible, scenarios should be used to simulate generic method calls (see above), instead of creating and passing arguments directly. Scenarios will ensure consistent coverage of input argument cases.
Adding fixture variables
One-off fixture variables (localized to one or a few tests)
should be added using pytest basic functionality, such as immutable constants,
pytest.fixture or pytest.mark.parametrize. Extending pytest_generate_tests
can also be considered in this case, if it makes the tests more (and not less) readable.
In contrast, fixtures used throughout module or package level tests should typically be added to the
fixture generation process called by pytest_generate_tests.
This requires:
adding a function _generate_[variablename](test_name, **kwargs), as described below
assigning the function to generator_dict["variablename"]
adding the new variable in the fixture_sequence list in pytest_generate_tests
The function _generate_[variable_name](test_name, **kwargs) should return two objects:
a list of fixtures to loop over, to substitute for variable_name when appearing in a test signature
a list of names of equal length, i-th element used as a name for the i-th fixture in test logs
The function has access to:
test_name, the name of the test the variable is called in. This can be used to customize the list of fixtures for specific tests, although this is meant mainly for generic behaviour. One-off escapes and similar should be avoided here, and instead dealt with via xfail and similar.
the values of the fixture variables that appear earlier in fixture_sequence, in kwargs. For instance, the value of estimator_instance, if this is a variable used in the test. This can be used to make the list of fixtures for variable_name dependent on the value of other fixture variables.
Adding or extending scenarios
Scenarios can be added or modified if a new combination of method/input values should be tested. The two main options are:
adding a new scenario, similar to existing scenarios for an estimator scitype. This is the common case when a new input condition should be covered.
adding a method or argument key to existing scenarios. This is the common case when a new method or method sequence should be covered. For this, the arguments should be added to the args of an existing scenario.
Scenarios for a specific estimator scitype are found in utils/_testing/scenarios_[estimator_scitype].
All scenarios inherit from a base class for that scitype, e.g., ForecasterTestScenario.
This base class defines generics such as is_applicable, or tag handling, for all scenarios of the same type.
Scenarios should usually define:
an args parameter: a dictionary, with arbitrary keys (usually names of methods). The args parameter may be set as a class variable, or set by the constructor.
optionally, a default_method_sequence and a default_arg_sequence, lists of strings. These define the sequence in which methods are called, and with which argument set, if run is called. Both may be class variables, or object variables set in the constructor.
side note: a method_sequence and arg_sequence can also be specified in run. If not passed, defaulting will take place (first to each other, then to the default_method_sequence and default_arg_sequence variables).
optionally, a _tags dictionary, which is a BaseObject tags dictionary and behaves exactly like that of estimators.
optionally, a get_args method which allows overriding key retrieval from args. For instance, to specify rules such as “if the key starts with predict_, always return …”
optionally, an is_applicable method which allows comparing the scenario with estimators. For instance, comparing whether both scenario and estimator are multivariate.
For further details and expected signature, consult the docstring of TestScenario,
and/or inspect any of the scenarios base classes, e.g., ForecasterTestScenario.
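As an illustration of the optional get_args override, the sketch below reuses the predict arguments for every key starting with predict_. ToyBaseScenario is a hypothetical stand-in for a scitype scenario base class, and the get_args signature shown is an assumption; the docstring of the actual base class gives the expected signature.

```python
class ToyBaseScenario:
    """Minimal stand-in for a scenario base class (hypothetical)."""

    args = {}

    def get_args(self, key, obj=None):
        """Return the keyword arguments stored under key."""
        return self.args[key]


class ToyPredictVariantsScenario(ToyBaseScenario):
    """Scenario in which all predict-like methods share one argument set."""

    _tags = {"univariate_y": True}

    args = {
        "fit": {"y": [1.0, 2.0, 3.0]},
        "predict": {"fh": [1, 2]},
    }
    default_method_sequence = ["fit", "predict", "predict_quantiles"]

    def get_args(self, key, obj=None):
        # rule: any key starting with "predict_" reuses the "predict" arguments
        if key.startswith("predict_"):
            key = "predict"
        return super().get_args(key, obj=obj)


scenario = ToyPredictVariantsScenario()
quantile_kwargs = scenario.get_args("predict_quantiles")
```

With such a rule, new predict-like methods can be added to the method sequence without repeating their arguments in args.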
Creating tests for a new estimator type
If a module for a new estimator type is added, multiple things need to be created for module level tests:
scenarios to cover the specified base class interface behaviour, in utils/_testing/scenarios_[estimator_scitype]. This can be modelled on utils/_testing/scenarios_forecasting, or the other scenarios files.
a line in the dispatch dictionary in utils/_testing/scenarios_getter which links the scenarios to the scenario retrieval function, e.g., scenarios["forecaster"] = scenarios_forecasting
a tests/test_all_[estimator_scitype].py, from the root of the module.
in this file, appropriate fixture generation via pytest_generate_fixtures. This can be modelled on test_all_estimators or test_all_forecasters.
and, a collection of tests for interface compliance with the base class of the estimator type. The tests should cover positive cases, as well as test that informative error messages are raised in negative cases.