# dataengineering
Here are 145 public repositories matching this topic...
A Data Platform built for AWS, powered by Kubernetes.
kubernetes
aws
jupyter
analytics
gpu
jupyterhub
data-analysis
redshift
mach
workbench
datalake
dataengineering
eks
eks-cluster
orbit-workbench
Updated Sep 2, 2021 - Python
Build, test, deploy, iterate - Dev and prod tool for data science pipelines
productivity
data-science
data
production
build-automation
datascience
build-tool
build-system
dataengineering
dataeng
Updated Jul 15, 2019 - Python
Predict stock price based on financial news feeds
Updated Apr 6, 2018 - Jupyter Notebook
ELT for the DataOps era: an open source data integration tool. This is a read-only mirror of https://gitlab.com/meltano/meltano
open-source
tap
data
opensource
integration
pipelines
target
dataops
loaders
elt
extract-data
data-pipelines
singer
connectors
dataengineering
targets
taps
meltano
dataops-platform
meltano-sdk
Updated Sep 2, 2021 - Python
Apply for a job at Olist's Data Team: http://bit.ly/olist-bsa
Updated May 21, 2020
Data engineering interview Q&A for the data community, by the data community
Updated Jun 7, 2020 - Python
data
grafana
orchestration
statsd-metrics
dataengineering
workflow-orchestration
airflow-metrics
statsd-format
airflow-containers
Updated Jul 10, 2021 - Python
Kedro CLI plugin that generates a static kedro viz site (HTML, CSS, JS) that can be deployed on many serverless tools.
Updated Sep 2, 2021 - Python
Forecasting Solar Power: Analysis of using an LSTM Neural Network
data-science
lstm
forecasting
machinelearning
deeplearning
solar-energy
renewable-energy
dataengineering
neuralnetworks
solarpower
electricity-grid
forecasting-solar-power
Updated Feb 7, 2020 - Jupyter Notebook
Contains the basics (data structures, algorithms, Cracking the Coding Interview Q&A, etc.) for data engineers.
python
java
basic
algorithms
cracking-the-coding-interview
interview
data-structures
coding-interviews
dataengineering
Updated Aug 23, 2019 - Jupyter Notebook
A GitHub Action to lint, test, build docs for, package, and run your kedro pipelines. Supports any Python version you give it, as long as pyenv also supports it.
Updated Jun 16, 2021 - Shell
Materials for the course Introduction to Data Engineering: data pipelines
Updated Apr 8, 2021 - Python
Courses and projects on Data Camp
Updated Jun 28, 2020 - Python
QuantumBlack Hackathon organised by Analytics Vidhya
Updated Jul 23, 2019 - Jupyter Notebook
Project by the 3GTeam group, presented at A3Data's Data Engineering Hackathon in June 2021.
Updated Jun 26, 2021 - Python
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on writing pyspark code.
python
aws
big-data
spark
aws-emr
pyspark
dataengineering
big-data-analytics
ec2-spot
emr-cluster
wordcloud-generator
ec2-spot-instances
Updated Jul 18, 2021 - Python
Dockerizing an Apache Spark Standalone Cluster
docker
apache-spark
hive
docker-compose
pyspark
hdfs
hadoop-cluster
hue
hadoop-docker
dataengineering
hive-metastore
dataengineer
Updated Aug 7, 2021 - VBA
Compressing data using Python and compression techniques for better data storage and transfer.
Updated Sep 13, 2019 - Python
This is a quick-and-dirty data analytics platform based on Spark, Hadoop and Jupyterhub. All these tools are deployed automatically with docker and docker-compose.
python
docker
platform
devops
data-science
big-data
spark
hadoop
jupyter
analytics
docker-compose
jupyter-notebook
datascience
dataengineering
Updated Oct 10, 2019 - Shell
Data Scraping, Data Models/ORMs, Workflow code commits
Updated Jan 15, 2021 - Python
Pipeline validation using the Great Expectations library
Updated Jul 25, 2019 - Python
Extracts logs based on events from Sysmon. Available as a package, CLI, and UI.
Updated May 22, 2020 - Python
My solutions to the practice tests provided at http://nn02.itversity.com/cca175/ by ITVersity.
Updated Jul 15, 2020
12 projects for my WeChat doc, one every month
Updated Jun 28, 2020 - Jupyter Notebook
Prescreening Tasks for Data Engineer
Updated Jul 26, 2021 - Jupyter Notebook
I'm new to the idea of Data Vault 2.0 and I'm reading the Dan Linstedt book to understand it better.
To help me get a better perspective on how dbtvault works, I would like to know: how difficult do you think it would be to add support for BigQuery?
Are there specific features of Snowflake that make it better for running dbt/dbtvault?
Thanks,
Jacob