An end-to-end GoodReads data pipeline for building a data lake, a data warehouse, and an analytics platform.
Updated Mar 9, 2020 · Python
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
Build, run and manage your data pipelines with Python or SQL on any cloud
Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
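The Spark-to-Redshift pipeline above follows the classic extract → transform → load pattern with an orchestrator enforcing task order. A minimal stdlib sketch of that pattern is below; the in-memory `staging` and `warehouse` structures and all task bodies are hypothetical stand-ins (the actual repo runs Spark jobs against Redshift, scheduled by Airflow).

```python
from graphlib import TopologicalSorter  # stdlib DAG helper, Python 3.9+

# Hypothetical in-memory stand-ins for a staging area and the warehouse
# (assumptions for illustration, not the repo's actual code).
staging: dict = {}
warehouse: list = []

def extract():
    # Pull raw records from a source system into staging.
    staging["raw"] = [{"id": 1, "amount": "10"}, {"id": 2, "amount": "25"}]

def transform():
    # Cast types, as a Spark transformation step might.
    staging["clean"] = [
        {"id": r["id"], "amount": int(r["amount"])} for r in staging["raw"]
    ]

def load():
    # Append the cleaned batch to the warehouse table.
    warehouse.extend(staging["clean"])

# Task dependencies, Airflow-style: extract >> transform >> load.
dag = {"transform": {"extract"}, "load": {"transform"}}
tasks = {"extract": extract, "transform": transform, "load": load}

# Run tasks in dependency order, as an orchestrator would.
for name in TopologicalSorter(dag).static_order():
    tasks[name]()
```

In Airflow the same ordering would be declared with operators and the `>>` dependency syntax inside a DAG definition rather than executed inline.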
Code examples showing flow deployment to various types of infrastructure
Classwork projects and homework completed through the Udacity Data Engineering Nanodegree
Deploy a Prefect flow to a serverless AWS Lambda function
ETL pipeline combined with supervised learning and grid search to classify text messages sent during a disaster event
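Grid search, as used in the disaster-message project above, just means exhaustively evaluating each hyperparameter setting and keeping the best. Here is a minimal stdlib sketch over a toy keyword classifier; the sample messages, keyword set, and threshold parameter are all hypothetical (the actual project would use a real corpus and a library such as scikit-learn).

```python
# Toy labeled messages (1 = needs aid) standing in for a disaster dataset.
train = [
    ("need water and food", 1),
    ("trapped under rubble send help", 1),
    ("lovely weather today", 0),
    ("watching the game tonight", 0),
]

# Hypothetical aid-related vocabulary for the toy classifier.
AID_WORDS = {"water", "food", "help", "rescue", "trapped"}

def predict(text, threshold):
    # Classify as 'needs aid' if enough aid-related keywords appear.
    hits = sum(word in AID_WORDS for word in text.split())
    return 1 if hits >= threshold else 0

def accuracy(threshold):
    # Fraction of training messages classified correctly.
    correct = sum(predict(text, threshold) == label for text, label in train)
    return correct / len(train)

# Exhaustive grid search over the single hyperparameter.
grid = [1, 2, 3]
best = max(grid, key=accuracy)
```

A real pipeline would cross-validate instead of scoring on the training set and would search over several parameters at once (e.g. with `itertools.product` or scikit-learn's `GridSearchCV`).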
Apache Spark Guide
Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database
Marshmallow serializer integration with pyspark
A data engineering pipeline for digital marketers.
Hiring challenge for a Data Scientist position
Learnings from multiple Silicon Valley companies: Netflix, Facebook, Google, and startups
Solution for the Ultimate Student Hunt Challenge (1st place).
Using Great Expectations and Notion's API, this repo aims to provide data quality for our databases in Notion.
A scalable, flexibly deployable solution for analyzing social media content
An environment for analyzing Twitter