Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Updated Jan 4, 2024 - Java
valmi.io is an open-source reverse-ETL (data activation) platform that loads data from warehouses into SaaS platforms, webhook APIs, and similar destinations.
Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0
Set up a Cost-Effective Modern Data Stack for a Charity
🧱 Tinkering around with a bunch of open source data tools.
Simple, low-cost orchestration for a finance data pipeline. All you need is Python and SQL.
Deploy an open source and modern data stack in no time with Terraform
A minimal modern data stack with working data pipelines in a single Docker container.
A curated selection of tools, libraries, and services that help tame your dataflow so you can productively build ambitious, data-driven, reactive applications on a streaming lakehouse
A list of tools frequently used in "the modern data stack". Mostly SaaS or open-source apps to work with data and get value from it.
Extract, Transform, and Aggregate (ETA) tool. Increase data fidelity and decrease data volume.
Ansible playbook to set up a host that serves Airbyte (via docker-compose) behind an oauth2-proxy
A minimalistic, free Modern Data Stack for everyone. ELT layer.
Building a Modern Data Stack with Open Source tools
A minimalistic, free Modern Data Stack for everyone. Visualisation layer.
COVID-19 dashboard built on the Modern Data Stack, a new data architecture that takes advantage of cloud technology.