dask
Here are 289 public repositories matching this topic...
Describe the bug
cudf.pivot doesn't understand "values" when it is given as a single column name instead of a list and the column name includes a number
Steps/Code to reproduce bug
import cudf
df = cudf.DataFrame([
['A', 'a', 0, 0, 0],
['A', 'b', 1, 1, 1],
['A', 'c', 2, 2, 2],
['B', 'a', 0, 0, 0],
['B', 'b', 1, 1, 1],
['B', 'c', 2, 2, 2],
['C', 'a', 0, 0, 0],
[
pydata/xarray#5865 (reply in thread)
I wonder if it's possible to implement a built-in function like:
da.str.format("%.2f") or xr.string_format(da, "%.2f")
To wrap:
import xarray as xr
da = xr.DataArray([5., 6., 7.])
das = xr.DataArray("%.2f")
das.str % da
<xarray.DataArray (dim_0: 3)>
array(['5.00', '6.00', '7.00'], dtype='<U4')
Dim
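The element-wise behaviour requested above can be sketched without xarray at all. The snippet below is only an illustration in plain Python: `string_format` is a hypothetical helper name taken from the request, not an existing xarray function, and it works on a plain list rather than a DataArray.

```python
def string_format(values, fmt):
    """Apply a printf-style format to every element (hypothetical helper)."""
    # Same idea as `das.str % da` in the snippet above, but on a plain list.
    return [fmt % value for value in values]


formatted = string_format([5.0, 6.0, 7.0], "%.2f")
print(formatted)
```

A built-in version would presumably also need to preserve dimensions and coordinates, which this sketch ignores.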
Is your feature request related to a problem? Please describe.
Implements classification_report for classification metrics (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html).
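For orientation, a report of this kind usually lists per-class precision, recall, F1 and support. The following is a minimal pure-Python sketch of those quantities, assuming hard label predictions; `classification_summary` is a hypothetical name and this is not the scikit-learn implementation or the one proposed in the request.

```python
from collections import Counter


def classification_summary(y_true, y_pred):
    """Per-class precision, recall, F1 and support (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    # Count correct predictions per class, plus predicted and true totals.
    true_positive = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    predicted = Counter(y_pred)
    actual = Counter(y_true)
    report = {}
    for label in labels:
        precision = true_positive[label] / predicted[label] if predicted[label] else 0.0
        recall = true_positive[label] / actual[label] if actual[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual[label],
        }
    return report


report = classification_summary([0, 1, 1, 0], [0, 1, 0, 0])
for label, row in report.items():
    print(label, row)
```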
The stumpy.snippets feature is now completed in #283 which follows this work:
We have a rough notebook t
What happened:
When creating a LocalCluster object the comm is started on a random high port, even if there are no other clusters running.
What you expected to happen:
Should use port 8786.
Minimal Complete Verifiable Example:
$ conda create -n dask-lc-test -c conda-forge -y python=3.8 ipython dask distributed
$ conda activate dask-lc-test
The `d
Feature Request
Is your feature request related to a problem? Please describe.
Whenever I report a bug, I need to confirm what satpy version I am using. This is of course important, but it's also an extra step that could be semi-automated.
Describe the solution you'd like
I would like that debug_on() prints the relevant versions. When we report bugs, we anyway call `debu
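One way such a version printout could look is sketched below using the standard library's `importlib.metadata`. The helper name `collect_versions` and the list of packages are assumptions made for illustration; this is not how satpy's `debug_on()` is actually implemented.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(package_names):
    """Map each package name to its installed version (hypothetical helper)."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            # Report missing packages instead of failing the whole debug dump.
            versions[name] = "not installed"
    return versions


# The package list here is an illustrative guess, not satpy's dependency list.
for name, ver in collect_versions(["satpy", "dask", "xarray"]).items():
    print(f"{name}: {ver}")
```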
Code Sample, a minimal, complete, and verifiable piece of code
from pyresample.boundary import Boundary
b = Boundary(my_lons, my_lats)
print(b.contour_poly.area())
Problem description
The above code doesn't fail if the provided lons/lats are 2D (not sure on 3D+), but the class and all functions/utilities underneath it assume 1D arrays. The end results are incor
The ML implementation is still a bit experimental - we can improve on this:
- SHOW MODELS and DESCRIBE MODEL
- Hyperparameter optimizations, AutoML-like behaviour
- @romainr brought up the idea of exporting models (#191, still missing: onnx - see discussion in the PR by @rajagurunath)
- and some more showcases and examples
Does HyperGBM's make_experiment return the best model?
How does it handle parameter tuning? That is, what is its search space (e.g. for XGBoost)?
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=1, memory='1GB')
print(cluster.job_script())
#!/usr/bin/env bash
#SBATCH -J dask-worker
#SBATCH -n 1
#SBATCH --cpus-per-task=1
#SBATCH --mem=954M
#SBATCH -t 00:30:00
/home/lesteve/miniconda3/bin/python -m distributed.cli.dask_worker tcp://192.168.0.11:44065 --nthreads 1 --memory-limit 1000.00MB -
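The two memory figures in this job script (`--mem=954M` and `--memory-limit 1000.00MB`) are consistent with the same 1 GB request expressed in different units. The sketch below shows that arithmetic; it assumes the `--mem` value is the byte count rounded up to whole mebibytes, and `mib_ceil` is a hypothetical helper, not a dask-jobqueue function.

```python
import math


def mib_ceil(n_bytes):
    """Round a byte count up to whole mebibytes (hypothetical helper)."""
    return math.ceil(n_bytes / 2**20)


one_gb = 10**9  # memory='1GB' interpreted as a decimal gigabyte (assumption)

# Binary units for the scheduler directive, decimal units for the worker flag.
print(f"--mem={mib_ceil(one_gb)}M")
print(f"--memory-limit {one_gb / 10**6:.2f}MB")
```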
Problem description
Reading a dataset with eager's read functionality raises a ValueError when providing columns.
Example code (ideally copy-pastable)
import pandas as pd
from tempfile import TemporaryDirectory
from functools import partial
from storefact import get_store_from_url
from kartothek.io.eager import store_dataframes_as_dataset, read_dataset_as_data
NWP examples
Example for numerical weather prediction
to be added to initialised datasets
Data sources (to) implement(ed):
- GEFS https://www.ncei.noaa.gov/thredds/catalog/model-gefs-003/202008/20200831/catalog.html
- DWD https://opendata.dwd.de/weather/nwp/
relates to #600
Passing resampling
Without thinking I put resampling="bilinear" and got an error when I called .compute()
Traceback (most recent call last):
File "carajas.py", line 92, in <module>
band_medianNP = band_median.compute()
File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/xarray/core/dataarray.py", line 899, in compute
return new.load(**kwargs)
File "/home/ubuntu/anaco
Hi,
System and Software
- aicsimageio Version: 4.5.2
- Python Version: 3.10
- Operating System: Windows 10
Description
aicsimageio installs fine, but when we try pip install aicsimageio[czi] it fails with the following error
ERROR: Command errored out with exit status 1:
command: 'C:\Program Files\Python310\python.exe' -u -c 'import io, os, sys, setuptools, to
Currently all of the metrics computed are independent of a target variable or column, but if lens.summarise took the name of a column as the target variable, the output of some metrics could be more interpretable even if the target variable is not used in any kind of predictive modelling.
A good example of this could be PCA (see #14), which could plot the different categories of the target va
I naively tried to do
dd.merge(a, b, on="column_with_ten_values"), where a and b were both large DataFrames with thousands of partitions each. Eventually the compute failed with:
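The error output is cut off in this listing. As a rough illustration of why a merge on a key with only ten distinct values can become very large, the sketch below counts the rows an inner join would produce. It uses plain Python rather than dask, and the helper name and table sizes are made up for the example.

```python
from collections import Counter


def joined_row_count(left_keys, right_keys):
    """Rows produced by an inner join on the given key columns (hypothetical helper)."""
    left = Counter(left_keys)
    right = Counter(right_keys)
    # Each key contributes (rows on the left) * (rows on the right).
    return sum(count * right[key] for key, count in left.items())


# Ten distinct key values, 1,000 rows per value on each side (made-up sizes).
left_keys = [i % 10 for i in range(10_000)]
right_keys = [i % 10 for i in range(10_000)]
rows = joined_row_count(left_keys, right_keys)
print(rows)
```

With these sizes, two 10,000-row inputs already yield ten million output rows, so the output grows with the product of the per-key counts rather than with the input size.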