Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spar…
#big-data
Repositories 1,098
Scalable, Available, Stable, Performant, and Intelligent System Design Patterns
system-design
backend
scalability
site-reliability-engineering
sre
interview
architecture
devops
site-reliability
design-patterns
back-end
back-end-development
interview-questions
design-systems
awesome-list
microservices
distributed-systems
design-system
tech
big-data
Updated Sep 15, 2018
A realtime, decentralized, offline-first, graph database engine.
machine-learning
ml
artificial-intelligence
ai
big-data
blockchain
encrypted
end-to-end
p2p
peer-to-peer
decentralized
graph
cryptography
crypto
offline-first
realtime
iot
crdt
protocol
database
JavaScript
Updated Sep 20, 2018
Distributed SQL query engine for big data
Java
Updated Sep 21, 2018
Stream Framework is a Python library that allows you to build news feeds, activity streams, and notification systems …
Python
Updated Jan 21, 2018
Alluxio (formerly Tachyon): Unify Data at Memory Speed
Java
Updated Sep 21, 2018
The most widely used Python to C compiler
Open Source Fast Scalable Machine Learning Platform For Smarter Applications (Deep Learning, Gradient Boosting, Rando…
h2o
machine-learning
data-science
deep-learning
big-data
ensemble-learning
gbm
random-forest
naive-bayes
pca
opensource
distributed
multi-threading
java
python
r
hadoop
spark
gpu
automatic
Java
Updated Sep 21, 2018
CatBoost is an open-source gradient boosting on decision trees library with categorical features support out of the b…
Reproducible Data Science at Scale!
Kubernetes Chinese Guide / Cloud Native Application Architecture Practice Handbook - https://jimmysong.io/kubernetes-handbook
Open Source In-Memory Data Grid
Java
Updated Sep 21, 2018
BigDL: Distributed Deep Learning Library for Apache Spark
Scala
Updated Sep 21, 2018
Bare-bones examples of machine learning in TensorFlow
tensorflow
tensorflow-tutorials
distributed-computing
simple
big-data
linear-regression
tensorflow-examples
tensorflow-exercises
Python
Updated Mar 14, 2017
A large-scale entity and relation database supporting aggregation of properties
Java
Updated Sep 21, 2018
An easy-to-use, self-service, open BI reporting and dashboard platform.
JavaScript
Updated Sep 5, 2018
data-science
data-visualization
dashboard
data-engineering
d3
d3js
chart
data
yaml
csv
json
gist
github-gist
big-data
business-intelligence
data-driven
just-dashboard
JavaScript
Updated Aug 23, 2018
A search engine which can hold 100 trillion lines of log data.
Go
Updated May 22, 2017
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine L…
Java
Updated Apr 10, 2018
Distributed Big Data Orchestration Service
big-data
bigdata
orchestration
configuration
configuration-management
java
spring-boot
distributed-systems
netflixoss
cloud
netflix-oss
microservice
microservices
Java
Updated Sep 12, 2018
TrailDB is an efficient tool for storing and querying series of events
C
Updated Sep 7, 2017
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processin…
Updated Jul 25, 2018
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
spark
python
pyspark
data-analysis
mllib
ipython-notebook
notebook
ipython
data-science
machine-learning
big-data
bigdata
Jupyter Notebook
Updated Sep 6, 2017
Java
Updated Sep 17, 2018
trade as a fool
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Jupyter Notebook
Updated Apr 20, 2017
Support content for my blog
sciblog
sciblog-support
machine-learning
artificial-intelligence
deep-learning
neural-networks
examples
code-examples
programming-exercise
data-science
big-data
analytics
Jupyter Notebook
Updated Aug 15, 2018
A Vue component that supports big data lists with high scroll performance.
JavaScript
Updated Sep 20, 2018
MooseFS - Open Source Network Distributed File System. MooseFS 3.0 is stable and recommended for production environme…
dfs
software-defined-storage
posix
filesystem
file-system
distributed-file-system
clustering
distributed-storage
distributed-computing
c
fuse
big-data
snapshot
storage-tiering
ha
high-availability
scalability
storage
moosefs
hadoop
C
Updated Jul 23, 2018