data-pipeline

Description & context

Users can give their nodes names to identify them more easily. When a name is not explicitly specified, Kedro auto-generates a default one. You can see this in the name property on Node.
The auto-generated name for a node currently looks something like this: func_name(inputs) -> outputs (see the implementation of the __str__ method on the Node class).
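
The fallback described above can be illustrated with a small sketch. This is not Kedro's code: SketchNode, its constructor arguments, and the bracketed rendering of inputs and outputs are assumptions made for illustration. Only the overall func_name(inputs) -> outputs shape and the name property come from the description.

```python
from typing import Callable, List, Optional


class SketchNode:
    """Hypothetical stand-in for a pipeline node; not Kedro's Node class."""

    def __init__(self, func: Callable, inputs: List[str], outputs: List[str],
                 name: Optional[str] = None):
        self._func = func
        self._inputs = inputs
        self._outputs = outputs
        self._name = name

    def __str__(self) -> str:
        # Default label in the func_name(inputs) -> outputs shape.
        # The "[a, b]" list rendering is an assumption of this sketch.
        ins = "[" + ", ".join(self._inputs) + "]"
        outs = "[" + ", ".join(self._outputs) + "]"
        return f"{self._func.__name__}({ins}) -> {outs}"

    @property
    def name(self) -> str:
        # Prefer the explicit name; otherwise fall back to the string form.
        return self._name or str(self)


def clean(raw):
    """Placeholder node function used only for the demonstration."""
    return raw


auto = SketchNode(clean, ["raw"], ["cleaned"])
named = SketchNode(clean, ["raw"], ["cleaned"], name="clean_raw_data")
print(auto.name)
print(named.name)
```

The first node has no explicit name, so its name is derived from the function and its datasets. The second keeps the name the user supplied.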

This is

Write unit test coverage for SafeDataset and SafeDataLoader, along with the functions in utils.py.

There's a warning note in README.md detailing:

Warning - the AnalyzeDocument process from AWS Textract costs $50 per 1,000 PDF pages. Be careful when deploying this CDK stack as you could unintentionally rack up an expensive AWS bill quickly if you're not paying attention.

This might not be enough - if a user finds this project and doesn't read the documentation, they could inadvertently

~~The serial package being used supports multiple options beyond the baud rate which might be useful to expose as optional arguments.~~

We should add more options to the serial protocol. See comment below.

Allow trembita to integrate with Grafana and Prometheus for monitoring pipeline performance and visualising the pipeline itself.

Here are 192 public repositories matching this topic...

quantumblacklabs / kedro

adilkhash / Data-Engineering-HowTo

GoogleCloudPlatform / data-science-on-gcp

reugn / go-streams

infoslack / awesome-kafka

msamogh / nonechucks

timkpaine / tributary

unnati-xyz / scalable-data-science-platform

ubisoft / mobydq

productml / blurr

aeksco / aws-pdf-textract-pipeline

openbridge / ob_bulkstash

yc9701 / pansori

wybiral / hookah

alexcasalboni / serverless-data-pipeline-sam

outbrain / Aletheia

xtTech / dc-sdk-js

itkpi / trembita

tikal-fuseday / delta-architecture

cid-harvard / pandas-to-postgres

ianlini / feagen

ooni / pipeline

electronick1 / stairs

aslotte / mldotnet-real-time-data-streaming-workshop

jay-johnson / network-pipeline

newsdev / nyt-entity-service

shirosaidev / saisoku

ixlan / machine-learning-data-pipeline

rayyan17 / jobAnalytics_and_search

mdh266 / AirflowETL
