The Tornado framework, designed and implemented for adaptive online learning and data stream mining in Python.

The Tornado Framework


Tornado is a framework for data stream mining, written in Python. The framework includes implementations of various incremental/online learning algorithms as well as concept drift detection methods.

Running the framework requires Python 3.5 or above (either 32-bit or 64-bit). Tornado's implementations use the numpy, scipy, matplotlib, and pympler packages. You can install these packages with pip, for example:

pip install numpy

Although you can install Python with an installer from https://www.python.org/downloads/, I highly recommend Anaconda, a Python distribution that includes the numpy, scipy, and matplotlib packages by default. You can download an Anaconda installer from https://www.anaconda.com/download/. Note that you still need to install the pympler package for Anaconda. To do so, run the following command in a command prompt or terminal:

conda install -c conda-forge pympler

Once you have all the packages installed, you may run the framework.
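
Before running the sample scripts, it can help to confirm that the interpreter and packages listed above are actually available. The following is a minimal sketch that uses only the Python standard library; the helper name `missing_packages` is hypothetical and is not part of Tornado.

```python
import importlib.util
import sys

# Packages named in this README as required by Tornado.
REQUIRED = ["numpy", "scipy", "matplotlib", "pympler"]


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The README asks for Python 3.5 or above.
    if sys.version_info < (3, 5):
        print("Python 3.5 or above is required; found", sys.version.split()[0])
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any package reported as missing can be installed with pip or conda as described above.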

Three sample scripts show how to use the framework:

  • github_prequential_test.py - This file lets you evaluate an adaptive algorithm, i.e. a pair of a learner and a drift detector, prequentially. In this example, Naive Bayes is the learner and the Fast Hoeffding Drift Detection Method (FHDDM) is the detector. You can find the incremental learners in tornado/classifier/ and the drift detectors in tornado/drift_detection/. The outputs are written to the created project directory.


  • github_prequential_multi_test.py - This file lets you run multiple adaptive algorithms together against a data stream. While the algorithms learn from the instances of the stream, the framework tells you which adaptive algorithm is optimal by considering classification, adaptation, and resource consumption measures. The outputs are written to the created project directory.


  • github_generate_stream.py - This file lets you use the Tornado framework to generate synthetic data streams containing concept drifts. You can find the stream generators in tornado/streams/generators/.
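
To illustrate the ideas behind these scripts, the sketch below runs a prequential (test-then-train) evaluation of a learner paired with a drift detector on a synthetic stream that contains one abrupt concept drift. It is a self-contained illustration and does not use Tornado's API: the classes `NaiveBayes` and `FHDDM` and the functions `drifting_stream` and `prequential` are hypothetical names written for this example. The detector is a simplified version based on the published description of FHDDM (a sliding window of prediction results compared against a Hoeffding bound), the window size and delta are illustrative values, and resetting the learner on drift is an assumed adaptation strategy.

```python
import math
import random
from collections import defaultdict, deque


class NaiveBayes:
    """Incremental Naive Bayes for nominal attributes with Laplace smoothing."""

    def __init__(self, n_values=2):
        self.n_values = n_values              # distinct values per attribute (assumed equal)
        self.class_counts = defaultdict(int)  # class -> count
        self.value_counts = defaultdict(int)  # (class, attribute index, value) -> count

    def predict(self, x, labels):
        total = sum(self.class_counts.values())
        best, best_score = labels[0], -math.inf
        for c in labels:
            score = math.log((self.class_counts[c] + 1) / (total + len(labels)))
            for i, v in enumerate(x):
                score += math.log((self.value_counts[(c, i, v)] + 1)
                                  / (self.class_counts[c] + self.n_values))
            if score > best_score:
                best, best_score = c, score
        return best

    def learn(self, x, y):
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.value_counts[(y, i, v)] += 1


class FHDDM:
    """Simplified FHDDM-style detector (a sketch, not Tornado's implementation)."""

    def __init__(self, n=25, delta=1e-7):
        self.window = deque(maxlen=n)  # 1 = correct prediction, 0 = wrong
        self.epsilon = math.sqrt(math.log(1 / delta) / (2 * n))  # Hoeffding bound
        self.p_max = 0.0               # highest windowed accuracy seen so far

    def update(self, correct):
        """Add one prediction result; return True if a drift is signalled."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window.maxlen:
            return False
        p = sum(self.window) / len(self.window)
        self.p_max = max(self.p_max, p)
        return self.p_max - p > self.epsilon

    def reset(self):
        self.window.clear()
        self.p_max = 0.0


def drifting_stream(n_instances, drift_at, seed=0):
    """Yield (x, y) pairs; the concept flips from y = x[0] to y = 1 - x[0]."""
    rng = random.Random(seed)
    for t in range(n_instances):
        x = (rng.randint(0, 1), rng.randint(0, 1))
        yield x, (x[0] if t < drift_at else 1 - x[0])


def prequential(stream, labels=(0, 1)):
    """Test on each instance first, then train on it; reset the learner on drift."""
    learner, detector = NaiveBayes(), FHDDM()
    errors, seen, drifts = 0, 0, []
    for t, (x, y) in enumerate(stream):
        correct = learner.predict(x, labels) == y
        errors += not correct
        seen += 1
        if detector.update(correct):
            drifts.append(t)
            learner = NaiveBayes()  # assumed strategy: start over after a drift
            detector.reset()
        learner.learn(x, y)
    return errors / seen, drifts


if __name__ == "__main__":
    error_rate, drifts = prequential(drifting_stream(2000, drift_at=1000))
    print("error rate:", error_rate)
    print("drift signalled at instances:", drifts)
```

In this sketch the detector should signal a drift shortly after instance 1000, once enough wrong predictions have accumulated in the window. Comparing several learner and detector pairs on the same stream, as github_prequential_multi_test.py does, would amount to running such a loop once per pair and comparing the resulting measures.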

Citation

Please cite the following papers or thesis if you use Tornado or any of its components:

  1. Pesaranghader, Ali. "A Reservoir of Adaptive Algorithms for Online Learning from Evolving Data Streams", Ph.D. Dissertation, Université d'Ottawa/University of Ottawa, 2018.
    DOI: http://dx.doi.org/10.20381/ruor-22444
  2. Pesaranghader, Ali, et al. "Reservoir of Diverse Adaptive Learners and Stacking Fast Hoeffding Drift Detection Methods for Evolving Data Streams", Machine Learning Journal, 2018.
    Pre-print available at: https://arxiv.org/abs/1709.02457, DOI: https://doi.org/10.1007/s10994-018-5719-z
  3. Pesaranghader, Ali, et al. "A framework for classification in data streams using multi-strategy learning", International Conference on Discovery Science, 2016.
    Pre-print available at: http://iwera.ir/~ali/papers/ds2016.pdf, DOI: https://doi.org/10.1007/978-3-319-46307-0_22


Ali Pesaranghader © 2020
Under MIT License
