data-engineering
Here are 1,108 public repositories matching this topic...
Current behavior
Uploading a blob whose file name already exists raises an error:
azure.core.exceptions.ResourceExistsError: The specified blob already exists.
RequestId:5bef0cf1-b01e-002e-6
Proposed behavior
The task should take in an overwrite argument and pass it to [this line](https://github.com/PrefectHQ/prefect/blob/6cd24b023411980842fa77e6c0ca2ced47eeb83e/src/prefect/
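A minimal sketch of the proposed behavior, assuming the task simply forwards an `overwrite` flag to the underlying upload call. The `InMemoryContainer` class, the local `ResourceExistsError`, and `upload_task` are hypothetical stand-ins written for illustration; they are not Prefect's or Azure's actual code.

```python
class ResourceExistsError(Exception):
    """Local stand-in for azure.core.exceptions.ResourceExistsError."""


class InMemoryContainer:
    """Hypothetical in-memory stand-in for a blob container client."""

    def __init__(self):
        self.blobs = {}

    def upload_blob(self, name, data, overwrite=False):
        # Mirror the reported behavior: refuse to replace an existing
        # blob unless the caller explicitly opts in.
        if name in self.blobs and not overwrite:
            raise ResourceExistsError("The specified blob already exists.")
        self.blobs[name] = data


def upload_task(container, name, data, overwrite=False):
    """Hypothetical task body that forwards `overwrite` to the upload call."""
    container.upload_blob(name, data, overwrite=overwrite)
    return name
```

Defaulting `overwrite` to `False` keeps the current behavior for existing flows, so only callers that opt in would replace a blob.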
Describe the bug
Data Docs columns shrink to a width of one character when the batch is built from a long query.
To Reproduce
Steps to reproduce the behavior:
- make a batch from a long query string
- run validation
- render result to data docs
- See screenshot
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4
Tell us about the problem you're trying to solve
Add the streams that the Chargebee connector does not yet support.
Describe the solution you’d like
A clear and concise description of what you want to see happen, or the change you would like to see
Describe the alternative you’ve considered or used
A clear and concise description of any alternative solutions or features you've considered or are u
Under the hood, the Benthos csv input uses the csv.Reader struct from the standard encoding/csv package.
The current implementation of csv input doesn't allow setting the LazyQuotes field.
We have a use case where we need to set the LazyQuotes field in order to make things work correctly.
The current DynamoDB implementation does sequential gets (https://github.com/feast-dev/feast/blob/master/sdk/python/feast/infra/online_stores/dynamodb.py#L163)
Possible Solution
A better approach is to use a multi-get operation, or at least to run these queries in parallel and collect the results.
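The parallel variant can be sketched as follows, assuming each read is an independent, I/O-bound call. `fetch_all` and `fake_get` are hypothetical names used for illustration; `fake_get` stands in for a single online-store read and is not Feast's API.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(keys, fetch_one, max_workers=8):
    """Fetch every key concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as `keys`,
        # even when the individual calls finish out of order.
        return list(pool.map(fetch_one, keys))


def fake_get(key):
    # Hypothetical stand-in for one sequential read against the store.
    return {"entity_id": key, "value": key * 2}
```

Threads suit this case because the work is dominated by network waits. A true multi-get request would reduce the number of round trips further, at the cost of handling per-request size limits.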
Description
When uploading from a Windows machine using the command line, the directory structure is not preserved. Every file is uploaded under a flat name with the directory names prepended to it.
This may be because paths on the Windows source machine are separated by \ rather than /.
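If the separator is indeed the cause, one possible fix is to normalize the local path before using it as the remote name. The sketch below assumes the upload key is derived from the relative local path; `to_remote_key` is a hypothetical helper, not part of the tool in question.

```python
from pathlib import PureWindowsPath


def to_remote_key(local_relative_path):
    """Convert a Windows-style relative path to a '/'-separated key."""
    # PureWindowsPath accepts both '\\' and '/' as separators and never
    # touches the filesystem, so this behaves the same on any OS.
    return PureWindowsPath(local_relative_path).as_posix()
```

For example, `to_remote_key(r"data\2022\jan\file.csv")` returns `"data/2022/jan/file.csv"`, and a path that already uses `/` is returned unchanged.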
Steps to Reproduce
- Have a recursive directory s
When we show data for a metric, we currently don't include the current day's data. Users who are just getting set up may only have events from today and want to check that the query works; because events from 'today' are excluded, they can't see any results.
TODO:
- In packages/back-end/src/services/experiments.ts on line 329, instead of using the current date as the value
If a user runs any ploomber command and passes -e pipeline.yaml in a directory that doesn't have such a file, the error isn't very clear:
ploomber status -e pipeline.yaml
Traceback (most recent call last):
  File "/Users/Edu/dev/ploomber/src/ploomber/cli/io.py", line 34, in wrapper
    fn(**kwargs)
  File "/Users/Edu/dev/ploomber/src/ploomber/cli/status.py", line 15
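One way to make this failure clearer is to check for the file up front and raise an error that names the missing path. This is only a sketch under that assumption: `EntryPointNotFound` and `resolve_entry_point` are hypothetical names, not ploomber's actual API.

```python
from pathlib import Path


class EntryPointNotFound(Exception):
    """Hypothetical error type for a missing entry point file."""


def resolve_entry_point(path):
    """Return the entry point as a Path, or fail with a readable message."""
    candidate = Path(path)
    if not candidate.is_file():
        # Name both the missing file and the directory that was searched,
        # so the user does not have to read a traceback to find the cause.
        raise EntryPointNotFound(
            f"Entry point {str(candidate)!r} does not exist in "
            f"{str(Path.cwd())!r}. Check the value passed to -e."
        )
    return candidate
```

A CLI wrapper could catch this exception and print only the message, without the traceback.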
If they are not class methods, the method is invoked for every test, and a session is created for each of those tests.
class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
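The effect of class-level setup can be shown without Spark. The sketch below uses a hypothetical `FakeSession` counter in place of a real session and the standard `unittest` hook `setUpClass`; it illustrates the general pattern and is not the code from the issue.

```python
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive resource such as a Spark session."""

    created = 0

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so every test shares one session.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def run_suite():
    """Run both tests and return how many sessions were created."""
    FakeSession.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedSessionTest)
    unittest.TextTestRunner(verbosity=0).run(suite)
    return FakeSession.created
```

Moving the same construction into an instance-level `setUp` would create one session per test method instead of one per class.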
Background
This thread was born out of the discussion in #968, in an effort to make the documentation more beginner-friendly and easier to understand.
One of the subtasks mentioned in that thread was to go through the function docstrings and include a minimal working example in each of the public functions in pyjanitor.
Criteria reiterated here for the benefit of discussion:
It sh
The height of a selected field's number is slightly smaller than that of the non-selected ones.
It is hard to notice, but if you select and deselect a field a few times you'll see the line moving up and down.
We should fix the height so that it is the same whether or not the field is selected.
![image](https://user-images.githubusercontent.com/5906984/151686275-8b03668f-5226-4223-a561-2c0fe6444019.pn
Running the latest version of Apache Superset from the Docker image.
When the feature flag SQLLAB_BACKEND_PERSISTENCE is set to true, selecting a schema in SQL Lab produces a Superset error about trouble finding the schema.
The following error appears in the logs when trying to select a schema:
404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and tr