data-engineering
Here are 801 public repositories matching this topic...
Current behavior
Right now, the Azure connection string can be passed as a string at initialization or read from the AZURE_STORAGE_CONNECTION_STRING environment variable.
The connection string property is not serialized with the storage object, so the only way to get this to work is to have AZURE_STORAGE_CONNECTION_STRING available when the flow is retrieved from storage. For most agent types, t
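The lookup order described above (explicit argument first, then the environment) can be sketched as follows. This is a minimal illustration, not the library's actual code: the function name `resolve_connection_string` and its `environ` parameter are hypothetical, and only the variable name AZURE_STORAGE_CONNECTION_STRING comes from the issue.

```python
import os

ENV_VAR = "AZURE_STORAGE_CONNECTION_STRING"  # name taken from the issue text


def resolve_connection_string(explicit=None, environ=None):
    """Hypothetical helper: prefer the explicit value, else fall back to the environment."""
    env = os.environ if environ is None else environ
    value = explicit or env.get(ENV_VAR)
    if not value:
        # Mirrors the failure mode in the issue: nothing was passed and
        # the variable is absent where the flow is loaded.
        raise ValueError(f"No connection string passed and {ENV_VAR} is not set")
    return value
```

Because the value is resolved at the point of use rather than stored on the object, a process that deserializes the storage object still needs the variable set, which is the limitation the issue describes.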
Describe the bug
When trying to run scaffolding (profiling) command, it fails because of commas in columns.
To Reproduce
Steps to reproduce the behavior:
- Run great_expectations suite scaffold scaffold-name on a datasource where commas are in a column
- Bug: pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5323 saw 2
Expected behavior
D
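The "Expected 1 fields ... saw 2" message is what a CSV parser reports when an unquoted comma inside a value is read as a field separator. A minimal sketch of that effect, using Python's standard `csv` module rather than pandas or great_expectations, with made-up sample data and a hypothetical helper `field_counts`:

```python
import csv
import io


def field_counts(text):
    """Return the number of fields the csv module sees on each line."""
    return [len(row) for row in csv.reader(io.StringIO(text))]


# One header column; the value contains a comma (sample data is invented).
unquoted = "name\nSmith, John\n"
quoted = 'name\n"Smith, John"\n'

print(field_counts(unquoted))  # header has 1 field, data row is split in 2
print(field_counts(quoted))    # quoting keeps the value in a single field
```

Whether the data in the issue was unquoted or used a different delimiter is not stated, so this only illustrates the general mechanism.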
Auth system test
The current system tests use the default admin user for all requests.
The auth test should create users with fewer privileges and check that actions pass or are blocked according to their permissions.
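One way to picture such a test is a permission matrix that is checked for each non-admin user. The sketch below is purely illustrative: the roles, actions, and the `is_allowed` function are hypothetical stand-ins for whatever API the real system tests would call.

```python
# Hypothetical role-to-actions table standing in for the real auth backend.
PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}


def is_allowed(role, action):
    """Return True if the role may perform the action (unknown roles get nothing)."""
    return action in PERMISSIONS.get(role, set())


def blocked_actions(role, actions=("read", "write", "delete")):
    """List the actions that should be rejected for a given role."""
    return [a for a in actions if not is_allowed(role, a)]
```

A system test would then issue each action as the restricted user and assert that exactly the actions in `blocked_actions(role)` are refused.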
Hi,
I am using some basic functions from pyjanitor, such as clean_names() and collapse_levels(), in code that I want to productionise.
There are limits on the size of the production code base.
Currently, the requirements.txt for pyjanitor alone is huge.
I don't think I need all of those dependencies in my code.
How can I remove the unnecessary ones?
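If only a couple of simple helpers are needed, one option is to reimplement the small part you rely on. The sketch below is a rough, dependency-free approximation of column-name cleaning (lowercasing and replacing runs of non-alphanumeric characters with underscores). It is an assumption about the behaviour you need, not pyjanitor's implementation, and `clean_column_names` is a hypothetical name.

```python
import re


def clean_column_names(columns):
    """Lowercase names and collapse non-alphanumeric runs into single underscores."""
    cleaned = []
    for name in columns:
        slug = re.sub(r"[^0-9a-z]+", "_", str(name).strip().lower())
        cleaned.append(slug.strip("_"))
    return cleaned
```

With pandas, the result could be assigned back via `df.columns = clean_column_names(df.columns)`.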
If they are not class methods, the method would be invoked for every test and a session would be created for each of those tests.
`class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
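The point about class-level setup can be demonstrated without Spark. In this sketch, `FakeSession` is a hypothetical stand-in for an expensive session object; it counts how many times it is constructed when the fixture lives in `setUpClass`.

```python
import io
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive session; counts constructions."""
    created = 0

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so every test shares one session.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def run_and_count():
    """Run the test class and return (sessions created, tests run)."""
    FakeSession.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedSessionTest)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return FakeSession.created, result.testsRun
```

Moving the construction into `setUp` instead would create one session per test, which is the cost the comment above warns about.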
Same as #281 but for SQLAlchemy client:
We can use the flavor property in the constructor to determine whether we are dealing with a SQLite database or not.
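A minimal sketch of that idea, assuming the flavor can be derived from the scheme of a SQLAlchemy-style connection URL. The class `SqlClient` and its attributes are hypothetical and do not reflect the project's actual client.

```python
class SqlClient:
    """Hypothetical client that records the database flavor at construction time."""

    def __init__(self, url):
        self.url = url
        # "postgresql+psycopg2://..." -> "postgresql"; "sqlite:///x.db" -> "sqlite"
        self.flavor = url.split(":", 1)[0].split("+", 1)[0].lower()
        self.is_sqlite = self.flavor == "sqlite"
```

The constructor can then branch on `self.is_sqlite` for any SQLite-specific setup.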


Screenshot
Description
Unnecessary scrollbar
Design input
Please remove them.
cc @yousoph