Pinned repositories
Repositories
-
luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and more. It also comes with built-in Hadoop support.
-
chartify
Python library that makes it easy for data scientists to create charts.
-
terraform-gke-kubeflow-cluster
Terraform module for creating GKE clusters to run Kubeflow
-
SPTDataLoader
The HTTP library used by the Spotify iOS client
-
featran
A Scala feature transformation library for data science and machine learning
-
klio
Smarter data pipelines for audio.
-
scio
A Scala API for Apache Beam and Google Cloud Dataflow.
-
ratatool
A tool for data sampling, data generation, and data diffing
-
styx
"The path to execution": Styx is a service that schedules batch data processing jobs in Docker containers on Kubernetes.
-
github-java-client
A Java client for the GitHub API
-
web-scripts
A collection of base configs and CLI wrappers used to speed up development @ Spotify.
-
dbeam
DBeam exports SQL tables into Avro files using JDBC and Apache Beam
-
magnolify
A collection of Magnolia add-on modules
-
zoltar
Common library for serving TensorFlow, XGBoost and scikit-learn models in production.
-
reactochart
📈 React chart component library 📉
-
dockerfile-mode
An Emacs mode for handling Dockerfiles
-
big-data-rosetta-code
Code snippets for solving common big data problems on various platforms. Inspired by Rosetta Code.
-
missinglink
Build-time tool for detecting link problems in Java projects
-