
data-engineering

Here are 1,009 public repositories matching this topic...

superset
lakeFS
ozkatz commented Nov 7, 2021

What

The ability to take a data object (or a prefix, such as a partition) and get back the commit that added or modified it.

Why

This is valuable lineage information that is currently available in lakeFS but not easily exposed. The feature would mimic the behavior of git blame.

How

Given the lakeFS API already supports listing the log of commits for an object or prefix (🎉), this could be a `
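A minimal sketch of the blame idea described above, assuming the commit log for a path is returned newest-first. The names `Commit`, `blame`, and `list_commits` are hypothetical and are not part of the lakeFS API; `list_commits` stands in for whatever call lists the commit log for an object or prefix.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record; field names are illustrative only."""
    id: str
    committer: str
    message: str


def blame(path: str,
          list_commits: Callable[[str], Iterable[Commit]]) -> Optional[Commit]:
    """Return the most recent commit that touched `path`, or None.

    Assumes `list_commits(path)` yields commits newest-first, so the
    first entry is the commit that last added or modified the path.
    """
    return next(iter(list_commits(path)), None)
```

Because only the first log entry is consumed, a lazily paginated log would not need to be fetched in full.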

A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.

  • Updated Nov 15, 2021
anks7190 commented Jan 27, 2021

Hi,

I am using some basic functions from pyjanitor, such as clean_names() and collapse_levels(), in code that I want to productionise.
There are limitations on the size of the production code base.
Currently, the requirements.txt for pyjanitor alone is huge.
I don't think I need all of those dependencies in my code.
How can I remove the unnecessary ones?

edublancas commented Oct 27, 2021

load_dotted_path raises the following error if it is unable to load the module:

Traceback (most recent call last):
  File "/Users/Edu/Desktop/import-error/script.py", line 4, in <module>
    load_dotted_path('tests.quality.fn')
  File "/Users/Edu/dev/ploomber/src/ploomber/util/dotted_path.py", line 128, in load_dotted_path
    module = importlib.import_module(mod)
  File "/Users/
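For context, a dotted-path loader built on importlib.import_module can be sketched as follows. This is an illustrative stand-in, not ploomber's implementation: only the function name and the importlib.import_module call come from the traceback above, and the error wording is an assumption.

```python
import importlib


def load_dotted_path(dotted_path):
    """Load an attribute from a dotted path such as 'pkg.module.fn'.

    Hypothetical sketch: splits on the last dot, imports the module
    part, and returns the named attribute, re-raising import failures
    with a message that names the full dotted path.
    """
    mod, _, name = dotted_path.rpartition(".")
    if not mod:
        raise ValueError(
            f"Expected a dotted path like 'module.attr', got {dotted_path!r}")

    try:
        module = importlib.import_module(mod)
    except ModuleNotFoundError as e:
        # Keep the original exception as the cause so the traceback
        # still shows which import actually failed.
        raise ModuleNotFoundError(
            f"Could not load {dotted_path!r}: "
            f"module {mod!r} could not be imported ({e})") from e

    try:
        return getattr(module, name)
    except AttributeError as e:
        raise AttributeError(
            f"Could not load {dotted_path!r}: "
            f"module {mod!r} has no attribute {name!r}") from e
```

Chaining with `from e` preserves the underlying import error while the outer message tells the user which dotted path triggered it.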
davidradl commented Nov 17, 2021

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

A large amount of output goes to the log; this should not happen by default.

Expected Behavior

Much less content in the output of the FVT and the build by default.

To see all the output, switch on debug in the logging configuration.

Steps To Reproduce

Run the build.

Env
