run_clustering_experiment

run_clustering_experiment(trainX, clusterer, results_path, trainY=None, testX=None, testY=None, cls_name=None, dataset_name=None, resample_id=0, overwrite=True)

Run a clustering experiment and save the results to file.

Runs a basic experiment and writes the results to files called testFold<resampleID>.csv and, if required, trainFold<resampleID>.csv. This version loads the data from file based on a path. The clusterer is always trained on the required input data trainX. The predicted clusters of trainX are written to trainResample<resampleID>.csv. If trainY is also passed, the true labels are written to file as well. If the clusterer makes probabilistic predictions, these are also written to file. See write_results_to_uea_format for more on the output. Be warned: this method always overwrites existing results. Check before calling, or use load_and_run_clustering_experiment instead.
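The workflow described above (fit on trainX, predict clusters, write them next to any supplied labels) can be sketched in plain Python. This is an illustration only, not the library's implementation: `ThresholdClusterer` and `sketch_experiment` are hypothetical names, the two-column CSV layout is an assumption, and the `testResample` file name is a guess by analogy with the `trainResample` name mentioned in the description.

```python
import csv
import os
import tempfile


class ThresholdClusterer:
    """Hypothetical toy clusterer: splits 1-D values at the training mean."""

    def fit(self, X):
        self.threshold_ = sum(X) / len(X)
        return self

    def predict(self, X):
        return [0 if x < self.threshold_ else 1 for x in X]


def sketch_experiment(trainX, clusterer, results_path, trainY=None,
                      testX=None, testY=None, resample_id=0):
    """Fit on trainX, then write predicted clusters (and labels, if given)."""
    os.makedirs(results_path, exist_ok=True)
    clusterer.fit(trainX)  # the labels are never passed to the clusterer

    def write(name, X, y):
        # Assumed file name pattern: <name>Resample<resample_id>.csv
        path = os.path.join(results_path, f"{name}Resample{resample_id}.csv")
        preds = clusterer.predict(X)
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            for i, pred in enumerate(preds):
                # First column: true label if supplied, otherwise left empty.
                writer.writerow(["" if y is None else y[i], pred])
        return path

    written = [write("train", trainX, trainY)]
    if testX is not None:
        written.append(write("test", testX, testY))
    return written


out_dir = tempfile.mkdtemp()
paths = sketch_experiment([1.0, 2.0, 9.0, 10.0], ThresholdClusterer(), out_dir,
                          trainY=[0, 0, 1, 1], testX=[0.5, 11.0])
```

Note that `trainY` and `testY` only reach the file writer in this sketch, mirroring the statement that labels are ignored by the clusterer.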

Parameters:
trainX : pd.DataFrame or np.array

The data to cluster.

clusterer : BaseClusterer

The clustering object

results_path : str

Where to write the results to

trainY : np.array, default = None

Train data true class labels, only used for file writing, ignored by the clusterer

testX : pd.DataFrame or np.array, default = None

Test attribute data, if present it is used for predicting testY

testY : np.array, default = None

Test data true class labels, only used for file writing, ignored by the clusterer

cls_name : str, default = None

Name of the clusterer, written to the results file, ignored if None

dataset_name : str, default = None

Name of problem, written to the results file, ignored if None

resample_id : int, default = 0

Resample identifier, defaults to 0
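Because the description warns that existing results are overwritten, a caller may want to check for a results file before running. The helper below is a minimal sketch of such a check: `results_exist` is a hypothetical function, and the flat `<stem><resample_id>.csv` layout under `results_path` is an assumption, since the real writer may nest files in sub-directories.

```python
import os
import tempfile


def results_exist(results_path, resample_id=0, stem="trainResample"):
    """Return True if a results file for this resample is already on disk.

    The file name pattern is an assumption based on the description above.
    """
    candidate = os.path.join(results_path, f"{stem}{resample_id}.csv")
    return os.path.isfile(candidate)


out_dir = tempfile.mkdtemp()
before = results_exist(out_dir, resample_id=0)  # nothing written yet

# Simulate an earlier run by creating an empty results file.
open(os.path.join(out_dir, "trainResample0.csv"), "w").close()
after = results_exist(out_dir, resample_id=0)
```

A caller could skip the experiment, or pick a different `resample_id`, whenever the check returns True.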