TimeSeriesKMedoids

class TimeSeriesKMedoids(n_clusters: int = 8, init_algorithm: Union[str, Callable] = 'random', metric: Union[str, Callable] = 'dtw', n_init: int = 10, max_iter: int = 300, tol: float = 1e-06, verbose: bool = False, random_state: Optional[Union[int, numpy.random.mtrand.RandomState]] = None, distance_params: Optional[dict] = None)

Time series K-medoids implementation.

Parameters

n_clusters: int, defaults = 8: The number of clusters to form as well as the number of cluster centers (medoids) to generate.
init_algorithm: str or Callable, defaults = ‘random’: Method for initializing cluster centers. Any of the following are valid: [‘kmeans++’, ‘random’, ‘forgy’]
metric: str or Callable, defaults = ‘dtw’: Distance metric to compute similarity between time series. Any of the following are valid: [‘dtw’, ‘euclidean’, ‘erp’, ‘edr’, ‘lcss’, ‘squared’, ‘ddtw’, ‘wdtw’, ‘wddtw’]
n_init: int, defaults = 10: Number of times the k-medoids algorithm will be run with different initial cluster centers. The final result will be the best output of n_init consecutive runs in terms of inertia.
max_iter: int, defaults = 300: Maximum number of iterations of the k-medoids algorithm for a single run.
tol: float, defaults = 1e-6: Relative tolerance with regard to the Frobenius norm of the difference in the cluster centers of two consecutive iterations to declare convergence.
verbose: bool, defaults = False: Verbosity mode.
random_state: int or np.random.RandomState instance or None, defaults = None: Determines random number generation for cluster center initialization.
distance_params: dict, defaults = None: Dictionary containing kwargs for the distance metric being used.

Attributes

cluster_centers_: np.ndarray (3d array of shape (n_clusters, n_dimensions, series_length)): Time series that represent each of the cluster centers. If the algorithm stops before fully converging, these will not be consistent with labels_.
labels_: np.ndarray (1d array of shape (n_instances,)): Index of the cluster each time series belongs to.
inertia_: float: Sum of squared distances of samples to their closest cluster center, weighted by the sample weights if provided.
n_iter_: int: Number of iterations run.
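
The parameters and attributes above map onto an alternating assign/update loop. The following is a minimal NumPy sketch of that loop, not the library's implementation: it assumes 2D input of shape (n_instances, series_length), uses plain Euclidean distance in place of the configurable metric, and the function name k_medoids_sketch is hypothetical.

```python
import numpy as np


def k_medoids_sketch(X, n_clusters=2, max_iter=300, random_state=None):
    """Toy alternating k-medoids on X of shape (n_instances, series_length)."""
    rng = np.random.RandomState(random_state)
    # Pairwise Euclidean distances between all series (stand-in for the metric).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Random initialisation: pick distinct training series as the first medoids.
    medoids = rng.choice(len(X), size=n_clusters, replace=False)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Assignment step: each series joins its closest medoid.
        labels = dist[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for k in range(n_clusters):
            members = np.flatnonzero(labels == k)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # Update step: the member with the smallest total distance
            # to the other members becomes the new medoid.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[k] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break  # medoids stopped moving
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    # Sum of squared distances of each series to its closest medoid.
    inertia = float((dist[np.arange(len(X)), medoids[labels]] ** 2).sum())
    return X[medoids], labels, inertia, n_iter


X = np.array(
    [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
     [10.0, 10.0, 10.0], [10.1, 10.0, 10.0], [10.0, 10.1, 10.0]]
)
centers, labels, inertia, n_iter = k_medoids_sketch(X, n_clusters=2, random_state=0)
```

Unlike a k-means centroid, each returned center is one of the input series, which is why any pairwise distance can be substituted for the Euclidean one used here.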

Methods

`check_is_fitted`()	Check if the estimator has been fitted.
`clone`()	Obtain a clone of the object with same hyper-parameters.
`clone_tags`(estimator[, tag_names])	Clone tags from another estimator as dynamic override.
`create_test_instance`([parameter_set])	Construct Estimator instance if possible.
`create_test_instances_and_names`([parameter_set])	Create list of all test instances and a list of names for them.
`fit`(X[, y])	Fit time series clusterer to training data.
`fit_predict`(X[, y])	Compute cluster centers and predict cluster index for each time series.
`get_class_tag`(tag_name[, tag_value_default])	Get a class tag’s value.
`get_class_tags`()	Get class tags from the class and all its parent classes.
`get_config`()	Get config flags for self.
`get_fitted_params`([deep])	Get fitted parameters.
`get_param_defaults`()	Get object’s parameter defaults.
`get_param_names`()	Get object’s parameter names.
`get_params`([deep])	Get a dict of parameter values for this object.
`get_tag`(tag_name[, tag_value_default, …])	Get tag value from estimator class and dynamic tag overrides.
`get_tags`()	Get tags from estimator class and dynamic tag overrides.
`get_test_params`([parameter_set])	Return testing parameter settings for the estimator.
`is_composite`()	Check if the object is composed of other BaseObjects.
`load_from_path`(serial)	Load object from file location.
`load_from_serial`(serial)	Load object from serialized memory container.
`predict`(X[, y])	Predict the closest cluster each sample in X belongs to.
`predict_proba`(X)	Predict label probabilities for sequences in X.
`reset`()	Reset the object to a clean post-init state.
`save`([path])	Save serialized self to bytes-like object or to (.zip) file.
`score`(X[, y])	Score the quality of the clusterer.
`set_config`(**config_dict)	Set config flags to given values.
`set_params`(**params)	Set the parameters of this object.
`set_tags`(**tag_dict)	Set dynamic tags to given values.

classmethod get_test_params(parameter_set='default')

Return testing parameter settings for the estimator.

Parameters

parameter_set: str, default = “default”: Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the “default” set is returned.

Returns

params: dict or list of dict, default = {}: Parameters to create testing instances of the class. Each dict contains parameters to construct an “interesting” test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.

check_is_fitted()

Check if the estimator has been fitted.

Raises

NotFittedError: If the estimator has not been fitted yet.

clone()

Obtain a clone of the object with same hyper-parameters.

A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self.

Raises

RuntimeError: If the clone is non-conforming, due to faulty __init__.

Notes

If successful, equal in value to type(self)(**self.get_params(deep=False)).
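
The equivalence in the note above can be illustrated with a small stand-in class. ToyEstimator below is hypothetical and is not part of sktime; the sketch only shows that rebuilding from get_params(deep=False) carries over hyper-parameters and drops fitted state. It does not copy parameter values, so unlike a real clone it may share references to mutable parameters.

```python
import inspect


class ToyEstimator:
    """Hypothetical stand-in for an estimator; not part of sktime."""

    def __init__(self, n_clusters=8, metric="dtw"):
        self.n_clusters = n_clusters
        self.metric = metric

    def get_params(self, deep=False):
        # Hyper-parameter names are read from the __init__ signature.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}

    def clone(self):
        # Rebuild from hyper-parameters only; fitted attributes are not copied.
        return type(self)(**self.get_params(deep=False))


est = ToyEstimator(n_clusters=3)
est.labels_ = [0, 1, 2]  # pretend the estimator has been fitted
est_copy = est.clone()
```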

clone_tags(estimator, tag_names=None)

Clone tags from another estimator as dynamic override.

Parameters

estimator: estimator inheriting from BaseEstimator
tag_names: str or list of str, default = None: Names of tags to clone. If None then all tags in estimator are used as tag_names.

Returns

Self: Reference to self.

Notes

Changes object state by setting the values of the tags named in tag_names, taken from estimator, as dynamic tags in self.

classmethod create_test_instance(parameter_set='default')

Construct Estimator instance if possible.

Parameters

parameter_set: str, default = “default”: Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the “default” set is returned.

Returns

instance: instance of the class with default parameters

Notes

get_test_params can return dict or list of dict. This function takes the first or single dict that get_test_params returns, and constructs the object with that.

classmethod create_test_instances_and_names(parameter_set='default')

Create list of all test instances and a list of names for them.

Parameters

parameter_set: str, default = “default”: Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the “default” set is returned.

Returns

objs: list of instances of cls: i-th instance is cls(**cls.get_test_params()[i])
names: list of str, same length as objs: i-th element is the name of the i-th instance of obj in tests. The convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.

fit(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y=None) → sktime.base._base.BaseEstimator

Fit time series clusterer to training data.

Parameters

X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series)): Training time series instances to cluster. Converted to type _tags[“X_inner_mtype”].
y: ignored, exists for API consistency reasons.
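
The two accepted NumPy layouts differ only by the dimension axis. As an illustration of the shapes involved, the sketch below lifts a 2D array to the 3D layout by inserting a dimension axis of length 1. The helper name to_numpy3d is hypothetical, and the actual conversion performed by the library is governed by the X_inner_mtype tag rather than by this function.

```python
import numpy as np


def to_numpy3d(X):
    """Return X with shape (n_instances, n_dimensions, series_length)."""
    X = np.asarray(X)
    if X.ndim == 2:
        # Univariate input: (n_instances, series_length) -> (n_instances, 1, series_length).
        return X[:, np.newaxis, :]
    if X.ndim == 3:
        return X
    raise ValueError(f"expected a 2D or 3D array, got {X.ndim}D")


X_2d = np.zeros((5, 20))     # 5 univariate series of length 20
X_3d = np.zeros((5, 3, 20))  # 5 series with 3 dimensions each
shape_from_2d = to_numpy3d(X_2d).shape
shape_from_3d = to_numpy3d(X_3d).shape
```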

Returns

self: Fitted estimator.

fit_predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y=None) → numpy.ndarray

Compute cluster centers and predict cluster index for each time series.

Convenience method; equivalent to calling fit(X) followed by predict(X).

Parameters

X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series)): Time series instances to train the clusterer on and then assign cluster indexes to.
y: ignored, exists for API consistency reasons.

Returns

np.ndarray (1d array of shape (n_instances,)): Index of the cluster each time series in X belongs to.

classmethod get_class_tag(tag_name, tag_value_default=None)

Get a class tag’s value.

Does not return information from dynamic tags (set via set_tags or clone_tags) that are defined on instances.

Parameters

tag_name: str: Name of tag value.
tag_value_default: any: Default/fallback value if tag is not found.

Returns

tag_value: Value of the tag_name tag in self. If not found, returns tag_value_default.

classmethod get_class_tags()

Get class tags from the class and all its parent classes.

Retrieves tag: value pairs from _tags class attribute. Does not return information from dynamic tags (set via set_tags or clone_tags) that are defined on instances.

Returns

collected_tags: dict: Dictionary of class tag name: tag value pairs. Collected from _tags class attribute via nested inheritance.

get_config()

Get config flags for self.

Returns

config_dict: dict: Dictionary of config name : config value pairs. Collected from _config class attribute via nested inheritance and then any overrides and new tags from _config_dynamic object attribute.

get_fitted_params(deep=True)

Get fitted parameters.

State required: Requires state to be “fitted”.

Parameters

deep: bool, default = True: Whether to return fitted parameters of components.

If True, will return a dict of parameter name : value for this object, including fitted parameters of fittable components (= BaseEstimator-valued parameters).
If False, will return a dict of parameter name : value for this object, but not include fitted parameters of components.

Returns

fitted_params: dict with str-valued keys: Dictionary of fitted parameters, paramname : paramvalue key-value pairs include:

always: all fitted parameters of this object, as via get_param_names; values are the fitted parameter value for that key, of this object
if deep=True, also contains key/value pairs of component parameters; parameters of components are indexed as [componentname]__[paramname]; all parameters of componentname appear as paramname with its value
if deep=True, also contains arbitrary levels of component recursion, e.g., [componentname]__[componentcomponentname]__[paramname], etc.

classmethod get_param_defaults()

Get object’s parameter defaults.

Returns

default_dict: dict[str, Any]: Keys are all parameters of cls that have a default defined in __init__; values are the defaults, as defined in __init__.

classmethod get_param_names()

Get object’s parameter names.

Returns

param_names: list[str]: Alphabetically sorted list of parameter names of cls.

get_params(deep=True)

Get a dict of parameter values for this object.

Parameters

deep: bool, default = True: Whether to return parameters of components.

If True, will return a dict of parameter name : value for this object, including parameters of components (= BaseObject-valued parameters).
If False, will return a dict of parameter name : value for this object, but not include parameters of components.

Returns

params: dict with str-valued keys: Dictionary of parameters, paramname : paramvalue key-value pairs include:

always: all parameters of this object, as via get_param_names; values are the parameter value for that key, of this object; values are always identical to values passed at construction
if deep=True, also contains key/value pairs of component parameters; parameters of components are indexed as [componentname]__[paramname]; all parameters of componentname appear as paramname with its value
if deep=True, also contains arbitrary levels of component recursion, e.g., [componentname]__[componentcomponentname]__[paramname], etc.
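
The key convention described above can be sketched with a toy object that nests other toy objects as parameters. The class Toy and the parameter names used here are hypothetical; the sketch illustrates only the [componentname]__[paramname] flattening, not the library's own code.

```python
class Toy:
    """Hypothetical object whose constructor arguments are stored as attributes."""

    def __init__(self, **params):
        self._param_names = sorted(params)
        for name, value in params.items():
            setattr(self, name, value)

    def get_params(self, deep=True):
        out = {name: getattr(self, name) for name in self._param_names}
        if deep:
            for name in self._param_names:
                value = getattr(self, name)
                if isinstance(value, Toy):
                    # Component parameters are keyed as [componentname]__[paramname];
                    # recursion produces the deeper [a]__[b]__[c] keys.
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        out[f"{name}__{sub_name}"] = sub_value
        return out


inner = Toy(window=0.2)
middle = Toy(distance=inner, tol=1e-6)
outer = Toy(clusterer=middle, n_init=10)

deep_keys = set(outer.get_params(deep=True))
shallow_keys = set(outer.get_params(deep=False))
```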

get_tag(tag_name, tag_value_default=None, raise_error=True)

Get tag value from estimator class and dynamic tag overrides.

Parameters

tag_name: str: Name of tag to be retrieved.
tag_value_default: any type, optional; default = None: Default/fallback value if tag is not found.
raise_error: bool: Whether a ValueError is raised when the tag is not found.

Returns

tag_value: Any: Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.

Raises

ValueError: If raise_error is True and tag_name is not in self.get_tags().keys().

get_tags()

Get tags from estimator class and dynamic tag overrides.

Returns

collected_tags: dict: Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance and then any overrides and new tags from _tags_dynamic object attribute.
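
The lookup order described for get_class_tags, get_tags and get_tag can be sketched as follows. The classes ToyBase and ToyClusterer and the tag values are hypothetical; only the attribute names _tags and _tags_dynamic are taken from the text above.

```python
class ToyBase:
    _tags = {"X_inner_mtype": "numpy3D", "capability:multivariate": False}

    def __init__(self):
        self._tags_dynamic = {}

    @classmethod
    def get_class_tags(cls):
        collected = {}
        # Walk from the most basic class to cls so subclasses override parents.
        for klass in reversed(cls.__mro__):
            collected.update(vars(klass).get("_tags", {}))
        return collected

    def set_tags(self, **tag_dict):
        self._tags_dynamic.update(tag_dict)
        return self

    def get_tags(self):
        # Dynamic (instance-level) tags take precedence over class tags.
        return {**self.get_class_tags(), **self._tags_dynamic}

    def get_tag(self, tag_name, tag_value_default=None, raise_error=True):
        tags = self.get_tags()
        if tag_name not in tags:
            if raise_error:
                raise ValueError(f"Tag {tag_name!r} not found")
            return tag_value_default
        return tags[tag_name]


class ToyClusterer(ToyBase):
    _tags = {"capability:multivariate": True}


est = ToyClusterer().set_tags(X_inner_mtype="nested_univ")
```

Class-level lookups ignore the override: ToyClusterer.get_class_tags() still reports "numpy3D" for X_inner_mtype, while est.get_tag("X_inner_mtype") returns the dynamic value.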

is_composite()

Check if the object is composed of other BaseObjects.

A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.

Returns

composite: bool: Whether an object has any parameters whose values are BaseObjects.

property is_fitted: Whether fit has been called.

classmethod load_from_path(serial)

Load object from file location.

Parameters

serial: result of ZipFile(path).open(“object”)

Returns

deserialized self resulting in output at path, of cls.save(path)

classmethod load_from_serial(serial)

Load object from serialized memory container.

Parameters

serial: 1st element of output of cls.save(None)

Returns

deserialized self resulting in output serial, of cls.save(None)

predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y=None) → numpy.ndarray

Predict the closest cluster each sample in X belongs to.

Parameters

X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series)): Time series instances to predict the cluster indexes of.
y: ignored, exists for API consistency reasons.

Returns

np.ndarray (1d array of shape (n_instances,)): Index of the cluster each time series in X belongs to.

predict_proba(X)

Predict label probabilities for sequences in X.

Default behaviour is to call _predict and set the predicted class probability to 1, other class probabilities to 0. Override if better estimates are obtainable.

Parameters

X: guaranteed to be of a type in self.get_tag(“X_inner_mtype”)

if self.get_tag(“X_inner_mtype”) = “numpy3D”: 3D np.ndarray of shape = [n_instances, n_dimensions, series_length]
if self.get_tag(“X_inner_mtype”) = “nested_univ”: pd.DataFrame with each column a dimension, each cell a pd.Series

For a list of other mtypes, see datatypes.SCITYPE_REGISTER; for specifications, see examples/AA_datatypes_and_datasets.ipynb.

Returns

y: 2D array of shape [n_instances, n_classes], predicted class probabilities: 1st dimension indices correspond to instance indices in X; 2nd dimension indices correspond to possible labels (integers); the (i, j)-th entry is the predictive probability that the i-th instance is of class j.
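
The default behaviour described above (probability 1 for the predicted label, 0 elsewhere) amounts to one-hot encoding the predicted labels. A minimal NumPy sketch, with the hypothetical helper name one_hot_proba and the assumption that labels are integers in the range 0 to n_clusters - 1:

```python
import numpy as np


def one_hot_proba(labels, n_clusters):
    """Turn hard cluster labels into a (n_instances, n_clusters) probability array."""
    labels = np.asarray(labels)
    proba = np.zeros((labels.shape[0], n_clusters))
    # Row i gets probability 1 in the column of its predicted label.
    proba[np.arange(labels.shape[0]), labels] = 1.0
    return proba


proba = one_hot_proba([2, 0, 1, 2], n_clusters=3)
```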

reset()

Reset the object to a clean post-init state.

Using reset runs __init__ with the current values of hyper-parameters (result of get_params). This removes any object attributes, except:

hyper-parameters = arguments of __init__

object attributes containing double-underscores, i.e., the string “__”

Class and object methods, and class attributes are also unaffected.

Returns

self: Instance of class reset to a clean post-init state but retaining the current hyper-parameter values.

Notes

Equivalent to sklearn.clone but overwrites self. After a self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False)).
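
The attribute rules above can be made concrete with a toy class. ToyResettable is hypothetical and handles a single hyper-parameter; it only sketches which attributes survive a reset, under the assumption that everything else is discarded before __init__ is re-run.

```python
class ToyResettable:
    """Hypothetical estimator-like object with one hyper-parameter."""

    def __init__(self, n_clusters=8):
        self.n_clusters = n_clusters

    def reset(self):
        params = {"n_clusters": self.n_clusters}
        # Attributes whose names contain a double underscore are retained.
        keep = {name: value for name, value in vars(self).items() if "__" in name}
        self.__dict__.clear()
        self.__init__(**params)  # re-run __init__ with current hyper-parameters
        self.__dict__.update(keep)
        return self


est = ToyResettable(n_clusters=3)
est.labels_ = [0, 1, 2]       # fitted state, removed by reset
est.my__cache = "kept"        # contains "__", so it survives
result = est.reset()
```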

save(path=None)

Save serialized self to bytes-like object or to (.zip) file.

Behaviour: if path is None, returns an in-memory serialized self; if path is a file location, stores self at that location as a zip file.

Saved files are zip files with the following contents: _metadata, which contains the class of self, i.e., type(self), and _obj, the serialized self. This class uses the default serialization (pickle).

Parameters

path: None or file location (str or Path): If None, self is saved to an in-memory object; if a file location, self is saved to that file location.

For example, if path=“estimator”, a zip file estimator.zip is created in the current working directory; if path=“/home/stored/estimator”, a zip file estimator.zip is stored in /home/stored/.

Returns

if path is None - in-memory serialized self
if path is file location - ZipFile with reference to the file
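
The zip layout described above can be reproduced with the standard library alone. The sketch below is an assumption-laden illustration, not the library's serialization code: the helper names save_sketch and load_sketch are hypothetical, a plain dict stands in for the estimator, and only the entry names _metadata and _obj come from the text.

```python
import pickle
import tempfile
from pathlib import Path
from zipfile import ZipFile


def save_sketch(obj, path):
    """Write obj to <path>.zip with _metadata and _obj entries."""
    zip_path = Path(str(path) + ".zip")
    with ZipFile(zip_path, "w") as zf:
        zf.writestr("_metadata", pickle.dumps(type(obj)))  # class of the object
        zf.writestr("_obj", pickle.dumps(obj))              # the object itself
    return zip_path


def load_sketch(zip_path):
    """Read the object back; only unpickle files from a trusted source."""
    with ZipFile(zip_path) as zf:
        cls = pickle.loads(zf.read("_metadata"))
        obj = pickle.loads(zf.read("_obj"))
    if not isinstance(obj, cls):
        raise TypeError("stored object does not match stored class")
    return obj


with tempfile.TemporaryDirectory() as tmp:
    zip_path = save_sketch({"n_clusters": 3}, Path(tmp) / "estimator")
    saved_name = zip_path.name
    with ZipFile(zip_path) as zf:
        entries = sorted(zf.namelist())
    restored = load_sketch(zip_path)
```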

score(X, y=None) → float

Score the quality of the clusterer.

Parameters

X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series)): Time series instances to score the clusterer on.
y: ignored, exists for API consistency reasons.

Returns

score: float: Score of the clusterer.

set_config(**config_dict)

Set config flags to given values.

Parameters

config_dict: dict: Dictionary of config name : config value pairs.

Returns

self: Reference to self.

Notes

Changes object state, copies configs in config_dict to self._config_dynamic.

set_params(**params)

Set the parameters of this object.

The method works on simple estimators as well as on nested objects. The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params: dict: BaseObject parameters.

Returns

self: Reference to self (after parameters have been set).
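
The <component>__<parameter> routing can be sketched by splitting each key on its first double underscore and forwarding the remainder to the named component. ToyNode is a hypothetical class used only for illustration; unlike a real estimator it performs no validation of parameter names.

```python
class ToyNode:
    """Hypothetical object that stores its parameters as attributes."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split on the first "__" only; the rest is forwarded to the component.
            name, sep, rest = key.partition("__")
            if sep:
                getattr(self, name).set_params(**{rest: value})
            else:
                setattr(self, name, value)
        return self


composite = ToyNode(clusterer=ToyNode(n_clusters=8, metric="dtw"), verbose=False)
returned = composite.set_params(clusterer__n_clusters=3, verbose=True)
```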

set_tags(**tag_dict)

Set dynamic tags to given values.

Parameters

**tag_dict: dict: Dictionary of tag name: tag value pairs.

Returns

Self: Reference to self.

Notes

Changes object state by setting tag values in tag_dict as dynamic tags in self.