This is a quick-and-dirty data analytics platform based on Spark, Hadoop, and JupyterHub. All of these tools are deployed automatically with Docker and docker-compose.
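As a rough illustration of how a notebook on this stack might attach to the Spark cluster, here is a minimal PySpark sketch. The master URL and the `spark-master` host name are assumptions (they depend on how docker-compose names the services), not part of this repo:

```python
# Minimal sketch: attach a notebook session to the Spark cluster.
# NOTE: the master URL below assumes a docker-compose service named
# "spark-master"; adjust it to match your actual compose file.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("quick-analytics-demo")
    .master("spark://spark-master:7077")  # assumed service name/port
    .getOrCreate()
)

# Smoke test: a trivial DataFrame round-trip through the cluster.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
print(df.count())
spark.stop()
```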
A GitHub Action to lint, test, build docs, package, and run your kedro pipelines. Supports any Python version you give it (provided that version is also supported by pyenv).
This code connects to the fixer.io API and downloads currency exchange rates, keeping EUR as the base currency. It also provides functions to download historical data for any date and to compute the average conversion rate between two dates.
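A minimal sketch of the kind of client described above. The function names and the `FIXER_API_KEY` placeholder are illustrative assumptions, not this repo's actual API; fixer.io's `/latest` and `/{YYYY-MM-DD}` endpoints return EUR-based rates on the free plan:

```python
# Illustrative fixer.io client sketch; function names are assumptions.
from datetime import date, timedelta
import requests

BASE_URL = "http://data.fixer.io/api"
FIXER_API_KEY = "your-access-key"  # assumption: supplied by the user

def latest_rates() -> dict:
    """Download the latest EUR-based exchange rates."""
    resp = requests.get(f"{BASE_URL}/latest", params={"access_key": FIXER_API_KEY})
    resp.raise_for_status()
    return resp.json()["rates"]

def historical_rates(day: date) -> dict:
    """Download EUR-based rates for a specific historical date."""
    resp = requests.get(f"{BASE_URL}/{day.isoformat()}",
                        params={"access_key": FIXER_API_KEY})
    resp.raise_for_status()
    return resp.json()["rates"]

def average_rate(currency: str, start: date, end: date) -> float:
    """Average the EUR -> currency rate over every day in [start, end]."""
    days = (end - start).days + 1
    total = sum(historical_rates(start + timedelta(n))[currency]
                for n in range(days))
    return total / days
```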
The Project Workspace contains a data set of real messages that were sent during disaster events. I will build a machine learning pipeline to categorize these messages so that they can be routed to the appropriate disaster relief agency. The project includes a web app where an emergency worker can input a new message and get classification results in several categories; the web app will also display visualizations of the data. This project shows off my software skills, including my ability to create basic data pipelines and write clean, organized code.
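To give a feel for the kind of multi-output text classification pipeline described, here is a minimal scikit-learn sketch. The toy DataFrame and category column names are stand-ins, not the project's actual data:

```python
# Sketch of a multi-output text-classification pipeline; data is illustrative.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Toy stand-in for the disaster-message data set.
df = pd.DataFrame({
    "message": ["we need water and food", "the bridge collapsed",
                "medical help required", "shelter needed after the storm"],
    "water":          [1, 0, 0, 0],
    "infrastructure": [0, 1, 0, 0],
    "medical":        [0, 0, 1, 0],
    "shelter":        [0, 0, 0, 1],
})
X = df["message"]
y = df[["water", "infrastructure", "medical", "shelter"]]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),  # raw text -> sparse TF-IDF features
    ("clf", MultiOutputClassifier(RandomForestClassifier())),  # one model per category
])
pipeline.fit(X, y)
print(pipeline.predict(["please send drinking water"]))
```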
I'm new to the idea of Data Vault 2.0 and I'm reading the Dan Linstedt book to understand it better.
To get a better perspective on how dbtvault works, I would like to know how difficult you think it would be to add support for BigQuery.
Are there specific features of Snowflake that make it better suited for running dbt/dbtvault?
Thanks,
Jacob