This repository is not being proactively maintained or receiving new implementation currently (This message updated 26/06/2023). Feel free to open still bug reports, we may get around to fixing them eventually but response may be delayed. For the latest time series algorithms implemented and maintained by our group, see the Python based aeon toolkit.
A Weka-compatible Java toolbox for time series classification, clustering and transformation. For the python sklearn-compatible version, see aeon.
Find out more info about our broader work and dataset hosting for the UCR univariate and UEA multivariate time series classification archives on our website.
This codebase is actively being developed for our research. The dev branch will contain the most up-to-date, but stable, code.
We are looking into deploying this project on Maven or Gradle in the future. For now there are two options:
- download the jar file and include as a dependency in your project, or you can run experiments through command line, see the examples on running experiments
- fork or download the source files and include in a project in your favourite IDE you can then construct your own experiments (see our examples) and implement your own classifiers.
This codebase mainly represents the implementation of different algorithms in a common framework, which at the time leading up to the Great Time Series Classification Bake Off in particular was a real problem, with implementations being in any of Python, C/C++, Matlab, R, Java, etc. or even combinations thereof.
We therefore mainly provide implementations of different classifiers as well as experimental and results analysis pipelines with the hope of promoting and streamlining open source, easily comparable, and easily reproducible results, specifically within the TSC space.
While they are obviously very important methods to study, we shall very likely not be implementing any kind of deep learning methods in our codebase, and leave those rightfully in the land of optimised languages and libraries for them. See aeon for implemented deep learning methods for time series data.
Our examples run through the basics of using the code, however the basic layout of the codebase is this:
- evaluation/
- contains classes for generating, storing and analysing the results of your experiments
- experiments/
- contains classes specifying the experimental pipelines we utilise, and lists of classifier and dataset specifications. The 'main' class is Experiments.java, however other experiments classes exist for running on simulation datasets or for generating transforms of time series for later classification, such as with the Shapelet Transform.
- tsml/ and multivariate_timeseriesweka/
- contain the TSC algorithms we have implemented, for univariate and multivariate classification respectively.
- machine_learning/
- contains extra algorithm implementations that are not specific to TSC, such as generalised ensembles or classifier tuners.
The lists of implemented TSC algorithms shall continue to grow over time. These are all in addition to the standard Weka classifiers and non-TSC algorithms defined under the machine_learning package.
We have implemented the following bespoke classifiers for univariate, equal length time series classification:
Distance Based | Dictionary Based | Kernel Based | Shapelet Based | Interval Based | Hybrids |
---|---|---|---|---|---|
DD_DTW | BOSS | Arsenal | LearnShapelets | TSF | HIVE-COTE |
DTD_C | cBOSS | ROCKET | ShapeletTransform | TSBF | Catch22 |
ElasticEnsemble | TDE | FastShapelets | LPS | ||
NN_CID | WEASEL | ShapeletTree | CIF | ||
SAX_1NN | SAXVSM | DrCIF | |||
ProximityForest | SpatialBOSS | RISE | |||
DTW_kNN | SAX_1NN | STSF | |||
FastDTW | BafOfPatterns... | ||||
FastElasticEn... | BOSSC45 | ||||
ShapeDTW_1NN | BoTSWEnsemble | ||||
ShapeDTW_SVM | BOSSSpatialPy... | ||||
SlowDTW_1NN | |||||
KNN |
And we have implemented the following bespoke classifiers for multivariate, equal length time series classification:
NN_ED_D | MultivariateShapeletTransform |
NN_ED_I | ConcatenateClassifier |
NN_DTW_D | MultivariateHiveCote |
NN_DTW_I | WEASEL+MUSE |
STC_D | MultivariateSingleEnsemble |
NN_DTW_A | MultivariateAbstractClassifier |
MultivariateAbstractEnsemble |
Currently quite limited, aside from those already shipped with Weka.
UnsupervisedShapelets | |
K-Shape | |
DictClusterer | |
TTC | |
AbstractTimeSeriesCLusterer |
SimpleBatchFilters that take an Instances (the set of time series), transforms them and returns a new Instances object.
ACF | ACF_PACF | ARMA |
BagOfPatternsFilter | BinaryTransform | Clipping |
Correlation | Cosine | DerivativeFilter |
Differences | FFT | Hilbert |
MatrixProfile | NormalizeAttribute | NormalizeCase |
PAA | PACF | PowerCepstrum |
PowerSepstrum | RankOrder | RunLength |
SAX | Sine | SummaryStats |
Transformers We will be shifting over to a bespoke Transformer interface
ShapeletTransform | |
catch22 |
This project acts as the general open-source codebase for our research, especially the Great Time Series Classification Bake Off. We are also trialling a process of creating stable branches in support of specific outputs.
Current branches of this type are:
- paper/cawpe/ in support of "A probabilistic classifier ensemble weighting scheme based on cross-validated accuracy estimates"
- paper/cawpeExtension/ in support of "Mixing hetero- and homogeneous models in weighted ensembles" (Accepted/in-press)
Lead: Anthony Bagnall (@TonyBagnall, @tony_bagnall, [email protected])
- James Large (@James-Large, @jammylarge, [email protected])
- Jason Lines (@jasonlines),
- George Oastler (@goastler),
- Matthew Middlehurst (@MatthewMiddlehurst, @M_Middlehurst, [email protected]),
- Michael Flynn (GitHub - @MJFlynn, Twitter - @M_J_Flynn, Email - [email protected])
- Aaron Bostrom (@ABostrom, @_Groshh_, [email protected]),
- Patrick Schäfer (@patrickzib)
- Chang Wei Tan (@ChangWeiTan)
- Alejandro Pasos Ruiz ([email protected])
- Conor Egan (@c-eg)
We welcome anyone who would like to contribute their algorithms!
GNU General Public License v3.0