A curated list of awesome big data frameworks, resources and other awesomeness.
Updated Feb 12, 2019
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
A stream processor for dull stuff written in Go
Utils for streaming large files (S3, HDFS, gzip, bz2...)
Python
Updated Mar 22, 2019
Pravega - Streaming as a new software defined storage primitive
Java
Updated Mar 22, 2019
Real Time Analytics and Data Pipelines based on Spark Streaming
Scala
Updated Jul 7, 2017
A list about Apache Kafka
Updated Jan 2, 2019
A Java Toolbox for Scalable Probabilistic Machine Learning
Java
Updated Nov 12, 2018
A multi-output/multi-label and stream data framework. Inspired by MOA and MEKA, following scikit-learn's philosophy.
Streamdata.io Javascript package containing SDK, documentation and sample applications
JavaScript
Updated Dec 17, 2018
Grid Solutions Framework
C#
Updated Mar 9, 2019
Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent…
Java
Updated Feb 21, 2019
Open Source Phasor Data Concentrator
Decentralized, end-to-end streaming application and real-time user interface platform—home of the WARP multiplexed st…
Java
Updated Mar 10, 2019
Source code for the Kafka Streams in Action Book
Java
Updated Aug 28, 2018
Natural Series Database
Scala
Updated Mar 15, 2019
Simple HTTP client (uses RxSwift for stream data)
Swift
Updated Feb 25, 2019
OharaStream - A powerful ETL tool and easy-to-use visual stream processing
Window-Based Hybrid CPU/GPU Stream Processing Engine
Java
Updated Jan 28, 2019
Combinatorial Game Theory through Conformal Geometric Algebra
JavaScript
Updated Mar 21, 2019
IMTSL - Incremental and Multi-feature Tensor Subspace Learning
MATLAB
Updated Feb 23, 2015
🌲 Implementation of the Robust Random Cut Forest algorithm for anomaly detection on streams
Python
Updated Mar 7, 2019
A Pythonic and ultra fast template engine DSL.
Python
Updated Mar 7, 2019
Sample Applications for Pravega.
Java
Updated Mar 21, 2019
A simple and eloquent workflow for streaming messages to micro-services.
JavaScript
Updated Jun 8, 2017
Clustering for arbitrary data and dissimilarity function
Python
Updated Feb 20, 2019
Twitter Dynamic Dataset API. Create any dataset you want.
A Node.js and JavaScript synchronous data pipeline processing, data sharing and stream processing library. Actionable…
JavaScript
Updated Aug 8, 2017
HyperStream
Simple demonstration of how to build a complex real time machine learning visualization tool.
Python
Updated Mar 26, 2016