Here are 51 public repositories matching the data-pipelines topic...
A Python library for building data applications: ETL, ML, Data Pipelines, and more.
Updated
Aug 10, 2020
Python
MLeap: Deploy Spark Pipelines to Production
Updated
Jul 31, 2020
Scala
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Updated
Aug 10, 2020
TypeScript
Relational data pipelines for the science lab
Updated
May 26, 2020
Python
An open-source PHP reporting framework for writing data reports and building dashboards.
Spark-Transformers: Library for exporting Apache Spark MLLIB models to use them in any Java application with no other dependencies.
Updated
Dec 15, 2017
Java
The fastest way to access and manage datasets for PyTorch and TensorFlow. Easily build scalable data pipelines.
https://activeloop.ai
Updated
Aug 10, 2020
Python
A Pachyderm deep learning tutorial for conference workshops
Updated
Aug 2, 2017
Python
ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform
Updated
Jul 30, 2020
Python
Example of an ETL Pipeline using Airflow
Updated
Aug 30, 2017
Python
Framework for data processing
Updated
Nov 10, 2019
Python
Provides an extensible solution for creating Data Processing Pipelines in F#.
Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation, validation and loading of data from S3 -> Redshift -> S3
Updated
Sep 30, 2019
Python
Framework to quickly build and maintain Smart Data Lakes
Updated
Aug 10, 2020
Scala
Using Apache Airflow to author, run and monitor complex data pipelines.
Updated
Oct 24, 2018
Jupyter Notebook
The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.
Updated
Jul 27, 2020
Python
Create production-ready Dataflow projects in a zap! ⚡
Updated
Jan 2, 2020
Python
Versioned machine learning pipelines
Updated
Jun 28, 2020
Python
An example Pachyderm ML pipeline using Nervana Neon
Updated
Mar 23, 2017
Python
Building data processing pipelines for document processing with NLP using Apache NiFi and related services
Updated
Aug 3, 2020
Jupyter Notebook
Easy-to-use in-app micro-ETL framework for building data processing pipelines.
A framework for microservices
Updated
Nov 5, 2018
JavaScript
A framework for fast development of scalable data pipelines following a simple design pattern
Updated
Jul 29, 2020
Python
A suite of tools written in Pyraf, Astropy, Scipy, and Numpy to process individual QuickReduced images into single stacked images using a set of "best practices" for ODI data.
Updated
May 18, 2020
Python
Project 5 - Data Engineering Nanodegree
Updated
Jun 26, 2019
Python
Quick way to deploy Airflow Multi-Node Cluster (a.k.a. Airflow Celery Executor Setup)
Updated
Aug 6, 2020
Python
Source code for guide to run Apache Airflow on Kubernetes
Updated
Apr 13, 2020
Python
Updated
Jul 28, 2020
Python
Supplementary material for DOLAP 2019 submission
Implemented a data warehouse and a data lake on AWS, data modeling with Postgres and Apache Cassandra, and a data pipeline with Apache Airflow
Updated
Jul 2, 2020
Jupyter Notebook