pydata
Here are 82 public repositories matching this topic...
Clients created with worker_client or get_client don't respect the timeout settings (e.g. distributed.comm.timeouts.connect). The timeout can be set programmatically, but it defaults to 3 rather than falling back to the config file. I think this should be as simple as replacing timeout=3 with timeout=None throughout that code path.
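A minimal sketch of the fallback described above, assuming a plain dictionary stands in for the config file. The helper names (`resolve_timeout`, `connect_hardcoded`, `connect_with_fallback`) and the value 10.0 are hypothetical and are not part of the library's API; only the config key and the `timeout=3` versus `timeout=None` defaults come from the issue.

```python
# Hypothetical stand-in for settings loaded from a config file;
# the key mirrors the one named in the issue, the value is made up.
CONFIG = {"distributed.comm.timeouts.connect": 10.0}


def resolve_timeout(timeout=None, config=CONFIG):
    """Return an explicit timeout if given, otherwise the configured one."""
    if timeout is None:
        return config["distributed.comm.timeouts.connect"]
    return timeout


def connect_hardcoded(timeout=3):
    # A literal default shadows the config: callers who pass nothing get 3.
    return resolve_timeout(timeout)


def connect_with_fallback(timeout=None):
    # A None default defers to the config unless the caller overrides it.
    return resolve_timeout(timeout)
```

With the `None` default, a caller who passes nothing picks up the configured value, while an explicit argument still takes precedence.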
Instructions for how to install pyjanitor via pipenv
Some folks might use pipenv for environment management. The recent update requires a prerelease dependency (black, as menti
Problem description
Our dask update graphs are not properly optimized.
We usually use dask.dataframe optimization and set ave_width=repartition_ratio for kartothek.io.dask.dataframe.update_dataset_from_ddf graphs. We should return an optimized graph from update_dataset_from_ddf to make our users' lives simpler.
We already have code that does this; whoever picks this up can ping me.
In trying to write tests for #189, I'm finding it very difficult to add columns to existing tests, as in some cases, like the all_types table, the table is defined in a separate file from the tests and multiple tests try to write to the same table.
Additionally, our test suite doesn't prove that the data that are uploaded are the same as the data downloaded for all types.
We should consider m
The assert_array_shape call is used a lot in REGENIE, and serves both as a runtime check and as documentation for the reader. We should sprinkle it liberally through other functions.
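As an illustration of how such a check can double as documentation, here is a minimal sketch of a shape assertion. The name `check_array_shape` and its signature are hypothetical and are not taken from the project's actual `assert_array_shape`; it assumes only that the argument exposes a `.shape` attribute, as NumPy and Dask arrays do.

```python
def check_array_shape(x, *expected):
    """Raise ValueError unless x.shape matches expected; None matches any size."""
    shape = tuple(x.shape)
    if len(shape) != len(expected):
        raise ValueError(
            f"expected {len(expected)} dimensions, got shape {shape}"
        )
    for axis, (actual, want) in enumerate(zip(shape, expected)):
        # None acts as a wildcard for axes whose size is not constrained.
        if want is not None and actual != want:
            raise ValueError(
                f"axis {axis}: expected size {want}, got {actual} (shape {shape})"
            )
```

Placed at the top of a function, a call such as `check_array_shape(X, n_samples, None)` tells the reader which axes are expected to line up and fails early when they do not.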
I just ran into an issue when trying to use to_csv with distributed workers that don't share a file system. I shouldn't have been surprised that writing to a local file system from a distributed worker doesn't work. It shouldn't work. But the error I got was just a File Not Found error. That brought me to: dask/dask#2656 (comment) - which was the answer.