ProximityTree

class ProximityTree(random_state=None, get_exemplars=<function get_one_exemplar_per_class_proximity>, distance_measure=None, get_distance_measure=None, setup_distance_measure=<function setup_all_distance_measure_getter>, get_gain=<function gini_gain>, max_depth=inf, is_leaf=<function pure>, verbosity=0, n_jobs=1, n_stump_evaluations=5, find_stump=None)

Proximity Tree class.

A decision tree which uses distance measures to partition data.
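The idea of partitioning by distance can be illustrated with a minimal sketch. It assumes one exemplar per class and plain Euclidean distance; the helper names `euclidean` and `split_by_proximity` are hypothetical and are not part of sktime, whose distance measures and exemplar selection may differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_by_proximity(series, exemplars, distance=euclidean):
    """Assign each series to the branch of its closest exemplar.

    `exemplars` maps a class label to one exemplar series (an assumption
    made for this sketch). Returns a dict mapping each label to the list
    of indices of the series that fall into that branch.
    """
    branches = {label: [] for label in exemplars}
    for i, s in enumerate(series):
        closest = min(exemplars, key=lambda label: distance(s, exemplars[label]))
        branches[closest].append(i)
    return branches


# Toy data: two series near zero, two series near five.
series = [[0.0, 0.1, 0.0], [0.2, 0.0, 0.1], [5.0, 5.1, 4.9], [4.8, 5.0, 5.2]]
exemplars = {"low": series[0], "high": series[2]}
branches = split_by_proximity(series, exemplars)
```

Each branch would then be split again in the same way until a stopping rule (such as `is_leaf` or `max_depth`) applies.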

Attributes

random_state: the random state
get_exemplars: function to extract exemplars from a given dataframe and list of class labels
distance_measure: distance measure to use
get_distance_measure: function to get the distance measure (distance measure getter)
setup_distance_measure: function to set up the distance measure getters based upon the given dataframe and list of class labels
get_gain: function to score the quality (gain) of a data split
max_depth: maximum depth of the tree
is_leaf: function to decide when to mark a node as a leaf node
verbosity: number reflecting the verbosity of logging
n_jobs: number of jobs to run in parallel across threads while building
find_stump: function to find the best split of data (stump) at a node
n_stump_evaluations: number of stump evaluations to do if find_stump is None
depth: current depth of the tree; each node is itself a tree, so the depth can be >= 0
stump: the stump used to split data at this node
branches: the partitions of data produced by the stump
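To make the role of `get_gain` concrete, here is a minimal sketch of the standard Gini gain of a split: the impurity of the parent minus the size-weighted impurity of the children. The functions `gini` and `gini_gain_sketch` are hypothetical illustrations; the `gini_gain` function shipped with sktime may differ in signature and detail.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain_sketch(parent, children):
    """Parent impurity minus the size-weighted impurity of the child partitions."""
    n = len(parent)
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


parent = ["a", "a", "b", "b"]
# A split that separates the classes perfectly.
pure_gain = gini_gain_sketch(parent, [["a", "a"], ["b", "b"]])
# A split that leaves both children as mixed as the parent.
mixed_gain = gini_gain_sketch(parent, [["a", "b"], ["a", "b"]])
```

A higher gain means purer branches, so a stump search would keep the candidate split with the largest gain.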

Examples

>>> from sktime.classification.distance_based import ProximityTree
>>> from sktime.datasets import load_unit_test
>>> X_train, y_train = load_unit_test(split="train", return_X_y=True)
>>> X_test, y_test = load_unit_test(split="test", return_X_y=True)
>>> clf = ProximityTree(max_depth=2, n_stump_evaluations=1)
>>> clf.fit(X_train, y_train)
ProximityTree(...)
>>> y_pred = clf.predict(X_test)

Methods

`check_is_fitted`()	Check if the estimator has been fitted.
`clone_tags`(estimator[, tag_names])	Clone/mirror tags from another estimator as dynamic override.
`create_test_instance`([parameter_set])	Construct Estimator instance if possible.
`create_test_instances_and_names`([parameter_set])	Create list of all test instances and a list of names for them.
`fit`(X, y)	Fit time series classifier to training data.
`get_class_tag`(tag_name[, tag_value_default])	Get tag value from estimator class (only class tags).
`get_class_tags`()	Get class tags from estimator class and all its parent classes.
`get_params`([deep])	Get parameters for this estimator.
`get_tag`(tag_name[, tag_value_default, …])	Get tag value from estimator class and dynamic tag overrides.
`get_tags`()	Get tags from estimator class and dynamic tag overrides.
`get_test_params`([parameter_set])	Return testing parameter settings for the estimator.
`is_composite`()	Check if the object is composite.
`predict`(X)	Predicts labels for sequences in X.
`predict_proba`(X)	Predicts label probabilities for sequences in X.
`reset`()	Reset the object to a clean post-init state.
`score`(X, y)	Scores predicted labels against ground truth labels on X.
`set_params`(**params)	Set the parameters of this estimator.
`set_tags`(**tag_dict)	Set dynamic tags to given values.

classmethod get_test_params(parameter_set='default')

Return testing parameter settings for the estimator.

Parameters

parameter_set (str, default="default"): Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the "default" set is returned. For classifiers, a "default" set of parameters should be provided for general testing, and a "results_comparison" set for comparing against previously recorded results if the general set does not produce suitable probabilities to compare against.

Returns

params (dict or list of dict, default={}): Parameters to create testing instances of the class. Each dict contains parameters to construct an "interesting" test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.

check_is_fitted()

Check if the estimator has been fitted.

Raises

NotFittedError: If the estimator has not been fitted yet.

clone_tags(estimator, tag_names=None)

Clone/mirror tags from another estimator as dynamic override.

Parameters

estimator: estimator inheriting from BaseEstimator
tag_names (str or list of str, default=None): Names of tags to clone. If None, all tags in estimator are used as tag_names.

Returns

Self: Reference to self.

Notes

Changes object state by setting the tag values named in tag_names from estimator as dynamic tags in self.

classmethod create_test_instance(parameter_set='default')

Construct Estimator instance if possible.

Parameters

parameter_set (str, default="default"): Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the "default" set is returned.

Returns

instance: instance of the class with default parameters

Notes

get_test_params can return a dict or a list of dicts. This function takes the first (or only) dict that get_test_params returns and constructs the object with it.
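The relationship between get_test_params and create_test_instance described in the note above can be sketched as follows. `ToyEstimator` is a hypothetical class written only for illustration; it does not inherit from any sktime base class and only mimics the documented behaviour.

```python
class ToyEstimator:
    """Hypothetical estimator used only to illustrate the test-params contract."""

    def __init__(self, max_depth=3, n_stump_evaluations=5):
        self.max_depth = max_depth
        self.n_stump_evaluations = n_stump_evaluations

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        # May return a single dict or a list of dicts; here, a list of two sets.
        return [
            {"max_depth": 1, "n_stump_evaluations": 1},
            {"max_depth": 2},
        ]

    @classmethod
    def create_test_instance(cls, parameter_set="default"):
        params = cls.get_test_params(parameter_set)
        # Use the first dict if a list was returned, otherwise the only dict.
        if isinstance(params, list):
            params = params[0]
        return cls(**params)


instance = ToyEstimator.create_test_instance()
```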

classmethod create_test_instances_and_names(parameter_set='default')

Create list of all test instances and a list of names for them.

Parameters

parameter_set (str, default="default"): Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the "default" set is returned.

Returns

objs (list of instances of cls): The i-th instance is cls(**cls.get_test_params()[i]).
names (list of str, same length as objs): The i-th element is the name of the i-th instance in objs, as used in tests. The convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.
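The naming convention for test instances can be sketched with a small helper. `make_names` is a hypothetical function, and the zero-based index is an assumption inferred from the i-th instance being built from cls.get_test_params()[i].

```python
def make_names(cls_name, n_instances):
    """Build test-instance names following the {cls.__name__}-{i} convention."""
    if n_instances > 1:
        # Several instances: append the (assumed zero-based) index to the class name.
        return [f"{cls_name}-{i}" for i in range(n_instances)]
    # A single instance is named after the class alone.
    return [cls_name]


single = make_names("ProximityTree", 1)
several = make_names("ProximityTree", 2)
```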

fit(X, y)

Fit time series classifier to training data.

Parameters

X: 3D np.array (any number of dimensions, equal length series) of shape [n_instances, n_dimensions, series_length],
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length],
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series),
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb.

y: 1D np.array of int, of shape [n_instances]; class labels for fitting. Indices correspond to instance indices in X.

Returns

self: Reference to self.

Notes

Changes state by creating a fitted model that updates attributes ending in “_” and sets is_fitted flag to True.

classmethod get_class_tag(tag_name, tag_value_default=None)

Get tag value from estimator class (only class tags).

Parameters

tag_name (str): Name of tag value.
tag_value_default (any type): Default/fallback value if tag is not found.

Returns

tag_value: Value of the tag_name tag in self. If not found, returns tag_value_default.

classmethod get_class_tags()

Get class tags from estimator class and all its parent classes.

Returns

collected_tags (dict): Dictionary of tag name : tag value pairs. Collected from the _tags class attribute via nested inheritance. NOT overridden by dynamic tags set by set_tags or clone_tags.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params (dict): Parameter names mapped to their values.

get_tag(tag_name, tag_value_default=None, raise_error=True)

Get tag value from estimator class and dynamic tag overrides.

Parameters

tag_name (str): Name of tag to be retrieved.
tag_value_default (any type, optional; default=None): Default/fallback value if tag is not found.
raise_error (bool): Whether a ValueError is raised when the tag is not found.

Returns

tag_value: Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.

Raises

ValueError: If raise_error is True and tag_name is not in self.get_tags().keys().
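The lookup rules for get_tag (class tags, dynamic overrides, and the raise_error switch) can be sketched with plain dictionaries. `lookup_tag` is a hypothetical stand-alone function, not the sktime method, and the tag name used here is only illustrative.

```python
def lookup_tag(class_tags, dynamic_tags, tag_name, tag_value_default=None, raise_error=True):
    """Resolve a tag, letting dynamic tags override class tags."""
    collected = {**class_tags, **dynamic_tags}  # later dict wins on conflicts
    if tag_name in collected:
        return collected[tag_name]
    if raise_error:
        raise ValueError(f"Tag with name {tag_name!r} could not be found.")
    return tag_value_default


class_tags = {"capability:multivariate": False}
dynamic_tags = {"capability:multivariate": True}

# The dynamic override takes precedence over the class-level value.
overridden = lookup_tag(class_tags, dynamic_tags, "capability:multivariate")
# A missing tag falls back to the default when raise_error is False.
fallback = lookup_tag(class_tags, dynamic_tags, "missing", "n/a", raise_error=False)
```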

get_tags()

Get tags from estimator class and dynamic tag overrides.

Returns

collected_tags (dict): Dictionary of tag name : tag value pairs. Collected from the _tags class attribute via nested inheritance, then any overrides and new tags from the _tags_dynamic object attribute.

is_composite()

Check if the object is composite.

A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.

Returns

composite (bool): Whether self contains a parameter which is a BaseObject.

property is_fitted: Whether fit has been called.

predict(X) → numpy.ndarray

Predicts labels for sequences in X.

Parameters

X: 3D np.array (any number of dimensions, equal length series) of shape [n_instances, n_dimensions, series_length],
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length],
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series),
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb.

Returns

y: 1D np.array of int, of shape [n_instances]; predicted class labels. Indices correspond to instance indices in X.

predict_proba(X) → numpy.ndarray

Predicts label probabilities for sequences in X.

Parameters

X: 3D np.array (any number of dimensions, equal length series) of shape [n_instances, n_dimensions, series_length],
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length],
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series),
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb.

Returns

y: 2D array of shape [n_instances, n_classes]; predicted class probabilities. 1st dimension indices correspond to instance indices in X; 2nd dimension indices correspond to possible labels (integers). The (i, j)-th entry is the predictive probability that the i-th instance is of class j.
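The layout of the probability matrix can be illustrated by turning each row back into a label. This is a minimal sketch on plain lists; `labels_from_proba` is a hypothetical helper, and breaking ties in favour of the first class is an assumption that need not match how predict resolves ties.

```python
def labels_from_proba(proba, classes):
    """Pick, for each instance (row), the class with the highest probability."""
    labels = []
    for row in proba:
        # Index j of the largest entry in the row; ties go to the lowest index.
        best_j = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best_j])
    return labels


# Rows are instances, columns are classes, so entry (i, j) is P(instance i is class j).
proba = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
classes = ["1", "2"]
predicted = labels_from_proba(proba, classes)
```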

reset()

Reset the object to a clean post-init state.

Equivalent to sklearn.clone but overwrites self. After a self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False)).

Detailed behaviour: removes any object attributes, except hyper-parameters (arguments of __init__) and object attributes containing double-underscores, i.e., the string "__". It then runs __init__ with the current values of the hyper-parameters (result of get_params).

Not affected by the reset are: object attributes containing double-underscores; class and object methods; class attributes.
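The reset behaviour described above can be sketched on a toy class. `ToyObject` is hypothetical and only mimics the documented rules (keep hyper-parameters and double-underscore attributes, drop everything else, rerun __init__); it is not the sktime implementation.

```python
import inspect


class ToyObject:
    """Hypothetical object illustrating the documented reset rules."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth

    def get_params(self, deep=True):
        # Hyper-parameters are taken to be the arguments of __init__.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def reset(self):
        params = self.get_params(deep=False)
        # Attributes whose name contains "__" survive the reset.
        keep = {k: v for k, v in vars(self).items() if "__" in k}
        self.__dict__.clear()
        self.__init__(**params)
        self.__dict__.update(keep)
        return self


obj = ToyObject(max_depth=2)
obj.fitted_state_ = "something learned"  # state that reset should remove
obj.config__note = "kept"                # double-underscore attribute, retained
obj.reset()
```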

score(X, y) → float

Scores predicted labels against ground truth labels on X.

Parameters

X: 3D np.array (any number of dimensions, equal length series) of shape [n_instances, n_dimensions, series_length],
or 2D np.array (univariate, equal length series) of shape [n_instances, series_length],
or pd.DataFrame with each column a dimension, each cell a pd.Series (any number of dimensions, equal or unequal length series),
or of any other supported Panel mtype. For the list of mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb.

y: 1D np.ndarray of int, of shape [n_instances]; class labels (ground truth). Indices correspond to instance indices in X.

Returns

float, accuracy score of predict(X) vs y
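The accuracy score is the fraction of instances whose predicted label equals the ground truth. A minimal sketch on plain lists, with `accuracy` as a hypothetical helper standing in for the comparison of predict(X) with y:

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = ["1", "2", "2", "1"]
y_pred = ["1", "2", "1", "1"]  # third prediction is wrong
acc = accuracy(y_true, y_pred)
```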

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict): Estimator parameters.

Returns

self (estimator instance): Estimator instance.
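The <component>__<parameter> naming used for nested objects can be sketched by splitting each key at its first double underscore. `split_nested_params` is a hypothetical helper and the component name `classifier` is illustrative; the sketch shows only the naming convention, not how set_params applies the values.

```python
def split_nested_params(params):
    """Separate an estimator's own parameters from those addressed to components."""
    own, nested = {}, {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            # "classifier__n_jobs" -> component "classifier", parameter "n_jobs"
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"max_depth": 2, "classifier__n_jobs": 4, "classifier__verbosity": 1}
)
```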

set_tags(**tag_dict)

Set dynamic tags to given values.

Parameters

tag_dict (dict): Dictionary of tag name : tag value pairs.

Returns

Self: Reference to self.

Notes

Changes object state by setting tag values in tag_dict as dynamic tags in self.