This is a quick-and-dirty data analytics platform based on Spark, Hadoop, and JupyterHub. All of these tools are deployed automatically with Docker and docker-compose.
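As a rough illustration of how a notebook on this stack might attach to the Spark cluster, here is a minimal PySpark sketch. The master URL and the `spark-master` host name are assumptions (they depend on how docker-compose names the services), not part of this repo:

```python
# Minimal sketch: attach a notebook session to the Spark cluster.
# NOTE: the master URL below assumes a docker-compose service named
# "spark-master"; adjust it to match your actual compose file.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("quick-analytics-demo")
    .master("spark://spark-master:7077")  # assumed service name/port
    .getOrCreate()
)

# Smoke test: a trivial DataFrame round-trip through the cluster.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
print(df.count())
spark.stop()
```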
A GitHub Action to lint, test, build docs, package, and run your kedro pipelines. Supports any Python version you give it (provided that version is also supported by pyenv).
This code connects to the fixer.io API and downloads currency exchange rates, keeping EUR as the base currency. It also provides functions to download historical data for any date and to compute the average conversion rate between two dates.
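A minimal sketch of the kind of client described above. The function names and the `FIXER_API_KEY` placeholder are illustrative assumptions, not this repo's actual API; fixer.io's `/latest` and `/{YYYY-MM-DD}` endpoints return EUR-based rates on the free plan:

```python
# Illustrative fixer.io client sketch; function names are assumptions.
from datetime import date, timedelta
import requests

BASE_URL = "http://data.fixer.io/api"
FIXER_API_KEY = "your-access-key"  # assumption: supplied by the user

def latest_rates() -> dict:
    """Download the latest EUR-based exchange rates."""
    resp = requests.get(f"{BASE_URL}/latest", params={"access_key": FIXER_API_KEY})
    resp.raise_for_status()
    return resp.json()["rates"]

def historical_rates(day: date) -> dict:
    """Download EUR-based rates for a specific historical date."""
    resp = requests.get(f"{BASE_URL}/{day.isoformat()}",
                        params={"access_key": FIXER_API_KEY})
    resp.raise_for_status()
    return resp.json()["rates"]

def average_rate(currency: str, start: date, end: date) -> float:
    """Average the EUR -> currency rate over every day in [start, end]."""
    days = (end - start).days + 1
    total = sum(historical_rates(start + timedelta(n))[currency]
                for n in range(days))
    return total / days
```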
The Project Workspace contains a data set of real messages that were sent during disaster events. I will build a machine learning pipeline to categorize these messages so that they can be routed to the appropriate disaster relief agency. The project includes a web app where an emergency worker can input a new message and get classification results in several categories; the web app will also display visualizations of the data. This project shows off my software skills, including my ability to create basic data pipelines and write clean, organized code.
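To give a feel for the kind of multi-output text classification pipeline described, here is a minimal scikit-learn sketch. The toy DataFrame and category column names are stand-ins, not the project's actual data:

```python
# Sketch of a multi-output text-classification pipeline; data is illustrative.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Toy stand-in for the disaster-message data set.
df = pd.DataFrame({
    "message": ["we need water and food", "the bridge collapsed",
                "medical help required", "shelter needed after the storm"],
    "water":          [1, 0, 0, 0],
    "infrastructure": [0, 1, 0, 0],
    "medical":        [0, 0, 1, 0],
    "shelter":        [0, 0, 0, 1],
})
X = df["message"]
y = df[["water", "infrastructure", "medical", "shelter"]]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),  # raw text -> sparse TF-IDF features
    ("clf", MultiOutputClassifier(RandomForestClassifier())),  # one model per category
])
pipeline.fit(X, y)
print(pipeline.predict(["please send drinking water"]))
```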
I'm new to the idea of Data Vault 2.0 and I'm reading the Dan Linstedt book to understand it better.
To get a better perspective on how dbtvault works, I would like to know how difficult you think it would be to add support for BigQuery.
Are there specific features of Snowflake that make it better suited for running dbt/dbtvault?
Thanks,
Jacob