A Python stream processing engine modeled after Yahoo! Pipes
#etl
Repositories 664
Data processing & ETL framework for Ruby
Ruby
Updated Apr 9, 2019
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Python
Updated Apr 15, 2019
Sync data between persistence engines, like ETL only not stodgy
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, t…
Shell
Updated Apr 16, 2019
Actively curated list of awesome BI tools. PRs welcome!
Updated Apr 18, 2019
React components to build CSV files on the fly based on an array or object literal of data
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
Python
Updated Jan 8, 2019
This repository is a getting started guide to Singer.
Makefile
Updated Apr 11, 2019
Cascading is a feature rich API for defining and executing complex and fault tolerant data processing workflows on va…
Logical Replication extension for PostgreSQL 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slo…
Topics: postgresql, replication, logical-decoding, database-replication, subscription, publish-subscribe, data-transformation, data-transport, etl, cdc, zero-downtime
Updated Mar 14, 2019
SmartCode = IDataSource -> IBuildTask -> IOutput => Build Everything!!!
C#
Updated Apr 18, 2019
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Topics: etl, spark, machine-learning, transformations, svmlight, hadoop-ecosystem, writables, schema, pipeline, formatter, datapipeline, data-munging
Java
Updated May 21, 2018
The premier open source Data Quality solution
Java
Updated Apr 18, 2019
R client for the Elasticsearch HTTP API
R
Updated Apr 11, 2019
HTML
Updated Mar 27, 2019
Power of appbase.io via CLI, with nifty imports from your favorite data sources
Go
Updated Apr 16, 2019
ETL Framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value formatted files)
C#
Updated Apr 19, 2019
A MongoDB to Elasticsearch connector
TypeScript
Updated Feb 19, 2019
A cross-platform command line tool for parallelised content extraction and analysis.
Java
Updated Apr 19, 2019
Bender - Serverless ETL Framework
Java
Updated Jul 2, 2018
Linked Data & RDF Manufacturing Tools in Clojure
Clojure
Updated Apr 15, 2019
Example DAGs using hooks and operators from Airflow Plugins
Topics: apache-airflow, dag, etl, airflow-plugins, selenium, google-analytics, facebook-ads, sftp, marketo, salesforce, hubspot, marketo-sdk, imap, airflow, mongodb, mailgun, singer, dbt
Python
Updated Jul 24, 2018
Metl is a simple, web-based integration platform that allows for several different styles of data integration includi…
Java
Updated Apr 19, 2019
Example project implementing best practices for PySpark ETL jobs and applications.
Python
Updated Feb 19, 2019
mito ETL tool
Python
Updated Jan 20, 2017
A visual ETL development and debugging tool for big data
Java
Updated Feb 25, 2019
Big Data Toolkit for the JVM
StorageTapper is a scalable realtime MySQL change data streaming and transformation service
Go
Updated Jan 7, 2019