Python time series benchmark. The main goal of this repository is to allow researchers and developers to compare several time series forecasting models or libraries. The repository contains data and the necessary wrappers for forecasting univariate and multivariate time series.
The repository includes the following libraries and models for univariate time series forecasting:
- FEDOT - AutoML framework that supports the time series forecasting task. Name in the repository: `FEDOT`.
- AutoTS - automated time series forecasting library. Name in the repository: `AutoTS`.
- pmdarima - statistical library for fitting time series models. Name in the repository: `pmdarima`.
- prophet - procedure for forecasting time series data based on an additive model. Name in the repository: `prophet`.
- H2O - (lagged transformation + H2O) AutoML platform for tabular data. Name in the repository: `H2O`.
- TPOT - (lagged transformation + TPOT) Tree-based Pipeline Optimization Tool, an AutoML library for tabular data. Name in the repository: `TPOT`.
- naive forecaster - repeats the last observation. Name in the repository: `repeat_last`.
- naive forecaster - forecasts the average value of the time series. Name in the repository: `average`.
- naive forecaster ETS - simple exponential smoothing model. Name in the repository: `ets`.
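H2O and TPOT work on tabular data, so the list above pairs them with a lagged transformation. The sketch below shows one common way to do that with a sliding window; the function name `lagged_transform` and its parameters `window` and `horizon` are illustrative assumptions, not the repository's actual implementation.

```python
from typing import List, Tuple


def lagged_transform(
    series: List[float], window: int, horizon: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Turn a univariate series into tabular (features, targets) rows.

    Hypothetical helper: each feature row holds `window` consecutive values,
    and the matching target row holds the next `horizon` values.
    """
    features, targets = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        features.append(series[start:start + window])
        targets.append(series[start + window:start + window + horizon])
    return features, targets


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = lagged_transform(series, window=3, horizon=2)
for row, target in zip(X, y):
    print(row, "->", target)
```

Each row of `X` can then be passed to any tabular regressor, which is presumably how a tabular AutoML tool is applied to a forecasting task.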
Available datasets (check data folder):
- FRED (license) - Federal Reserve Economic Data, an online database of economic time series (source link). Name in the repository: `FRED`.
- TEP (license) - Tennessee Eastman Process, a model of an industrial chemical process (source link). Name in the repository: `TEP`.
- SMART - readings of household appliances in kW at 1-minute intervals from a smart meter, together with weather conditions (source link). Name in the repository: `SMART`.
Below is a brief description of datasets:
Dataset | Total number of time series | Average row length | Minimum row length | Maximum row length | Percentage of non-stationary time series |
---|---|---|---|---|---|
FRED | 12 | 3674 | 468 | 17520 | 67 |
TEP | 41 | 12801 | 12801 | 12801 | 5 |
SMART | 28 | 503911 | 503911 | 503911 | 21 |
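The count and length columns of a table like the one above could be derived as in the sketch below. The `describe_lengths` helper and the toy dataset are hypothetical. The stationarity column is left out because the source does not say which test was used.

```python
from statistics import mean
from typing import Dict, List


def describe_lengths(dataset: Dict[str, List[float]]) -> Dict[str, int]:
    """Summarise a dataset given as {series label: list of values}."""
    lengths = [len(values) for values in dataset.values()]
    return {
        "total": len(lengths),
        "average_length": round(mean(lengths)),
        "min_length": min(lengths),
        "max_length": max(lengths),
    }


# Toy data, not taken from the repository
toy = {"a": [0.0] * 10, "b": [0.0] * 20, "c": [0.0] * 30}
summary = describe_lengths(toy)
print(summary)
```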
The repository includes the following libraries and models for multivariate time series forecasting:
- FEDOT - AutoML framework that supports both univariate and multivariate time series forecasting tasks. Name in the repository: `FEDOT`.
- naive forecaster - repeats the last observation. Name in the repository: `repeat_last`.
- naive forecaster - forecasts the average value of the time series. Name in the repository: `average`.
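The naive baselines are simple enough to sketch in a few lines. The functions below are illustrative assumptions about what `repeat_last` and `average` compute, plus a hand-written simple exponential smoothing for the `ets` baseline from the univariate list. The repository's own wrappers may differ, for example by delegating to a statistics library.

```python
from statistics import mean
from typing import List


def repeat_last(history: List[float], horizon: int) -> List[float]:
    """Forecast every future step as the last observed value."""
    return [history[-1]] * horizon


def average(history: List[float], horizon: int) -> List[float]:
    """Forecast every future step as the mean of the history."""
    return [mean(history)] * horizon


def simple_exp_smoothing(
    history: List[float], horizon: int, alpha: float = 0.5
) -> List[float]:
    """Flat forecast from the final smoothed level.

    `alpha` is an assumed smoothing factor in (0, 1]; larger values
    weight recent observations more heavily.
    """
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


history = [2.0, 4.0, 6.0, 8.0]
print(repeat_last(history, 3))
print(average(history, 3))
print(simple_exp_smoothing(history, 2))
```

Baselines like these give a floor for the comparison: a more complex model that cannot beat them on a dataset is adding little.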
At the moment, the repository contains one dataset of multivariate time series forecasting:
- SSH - data collected by simulating sea surface height (SSH) with the NEMO (Nucleus for European Modelling of the Ocean) model. The data contain sea level measurements (in meters) at different geographical locations. For each time series, the coordinates (x and y) and the label are known. The task is to forecast each series from the previous values of that series and of all the other series. Name in the repository: `SSH`.
The picture below shows where the time series are located and gives examples of the structure of some of them.
The model is designed to iteratively generate a forecast for each time series in the dataset. It can use the historical values not only of the target series but also of the neighboring (exogenous) time series.
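One way to picture this setup is as a feature table in which each row concatenates the recent values of the target series and of every other series. The sketch below is a minimal illustration under that assumption; `build_rows`, the labels `p1` and `p2`, and the requirement that all series have equal length are hypothetical and not taken from the repository.

```python
from typing import Dict, List, Tuple


def build_rows(
    series_by_label: Dict[str, List[float]], target: str, window: int
) -> Tuple[List[List[float]], List[float]]:
    """Build one-step-ahead training rows for `target`.

    Each feature row holds the last `window` values of the target series
    followed by the last `window` values of every other (exogenous) series.
    Assumes all series have the same length.
    """
    others = sorted(label for label in series_by_label if label != target)
    labels = [target] + others
    n = len(series_by_label[target])
    features, targets = [], []
    for end in range(window, n):
        row: List[float] = []
        for label in labels:
            row.extend(series_by_label[label][end - window:end])
        features.append(row)
        targets.append(series_by_label[target][end])
    return features, targets


data = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [10.0, 20.0, 30.0, 40.0],
}

# Each series takes its turn as the target; the rest act as exogenous inputs
for label in data:
    X, y = build_rows(data, target=label, window=2)
    print(label, X, y)
```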
See documentation for a more detailed explanation.
Check experiments folder
Benchmark contains two forecasting tasks:
- univariate time series
- multivariate time series
For each task there is a folder with a configuration file for launching experiments. The result tables for both tasks are in progress.
Use the following command to install this module:
pip install git+https://github.com/ITMO-NSS-team/pytsbe.git
This module is designed so that you can add your library to it as easily as possible.
Follow these steps to make the changes:
- Make a fork of this repository, or create a separate branch
- Add a new Forecaster class
- If required, add a new class to serialize additional launch information
- Create a pull request and ask our team to review it
- After the code review, address the review comments and merge the code into the main branch
See the contribution guide for more details.
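To give a feel for the second step, here is a minimal sketch of what a forecaster wrapper might look like. The `BaseForecaster` interface, its `fit` and `predict` methods, and the `DriftForecaster` example are all hypothetical stand-ins. The real base class and method names are defined in the repository and described in the contribution guide.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class BaseForecaster(ABC):
    """Hypothetical stand-in for the benchmark's Forecaster interface."""

    @abstractmethod
    def fit(self, history: List[float]) -> None:
        """Train the model on historical values."""

    @abstractmethod
    def predict(self, horizon: int) -> List[float]:
        """Return `horizon` future values."""


class DriftForecaster(BaseForecaster):
    """Extends the straight line through the first and last observation."""

    def __init__(self) -> None:
        self._last: Optional[float] = None
        self._slope: float = 0.0

    def fit(self, history: List[float]) -> None:
        if len(history) < 2:
            raise ValueError("need at least two observations")
        self._last = history[-1]
        self._slope = (history[-1] - history[0]) / (len(history) - 1)

    def predict(self, horizon: int) -> List[float]:
        if self._last is None:
            raise RuntimeError("call fit() before predict()")
        return [self._last + self._slope * step for step in range(1, horizon + 1)]


model = DriftForecaster()
model.fit([1.0, 2.0, 3.0, 4.0])
forecast = model.predict(2)
print(forecast)
```

Keeping training and prediction behind a small shared interface is what lets a benchmark run every library through the same experiment loop.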
Nikitin, Nikolay O., et al. "Automated evolutionary approach for the design of composite machine learning pipelines." Future Generation Computer Systems 127 (2022): 109-125.
Other papers are available on ResearchGate.