dask

Here are 289 public repositories matching this topic...

gjoseph92
gjoseph92 commented Mar 2, 2022

I naively tried to do dd.merge(a, b, on="column_with_ten_values"), where a and b were both large DataFrames with thousands of partitions each.

Eventually the compute failed with:

File /opt/conda/envs/coiled/lib/python3.9/site-packages/dask/dataframe/multi.py:275, in merge_chunk()

File /opt/conda/envs/coiled/lib/python3.9/site-packages/pandas/core/frame.py:9329, i
good first issue dataframe documentation
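
A rough way to see why a merge on a column with only ten distinct values can get out of hand: an inner join emits, for every key, the product of the matching row counts on each side. The sketch below uses plain Python and made-up row counts to estimate the output size before running the merge. `estimate_join_rows` is a hypothetical helper for illustration, not part of dask, and the data is invented.

```python
from collections import Counter


def estimate_join_rows(left_keys, right_keys):
    """Estimate the number of rows an inner join on one key column produces."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each shared key contributes (rows on the left) * (rows on the right).
    return sum(
        n_left * right_counts[key]
        for key, n_left in left_counts.items()
        if key in right_counts
    )


# Illustrative data only: 10 distinct keys, 1,000 rows per key on each side.
left = [i % 10 for i in range(10_000)]
right = [i % 10 for i in range(10_000)]

print(estimate_join_rows(left, right))  # 10 keys * 1,000 * 1,000 = 10,000,000
```

Two inputs of 10,000 rows each already yield ten million output rows here, so the same pattern across thousands of partitions grows far faster than the inputs do.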
KazukiOnodera
KazukiOnodera commented Mar 29, 2022

Describe the bug
cudf.pivot doesn't understand "values" when it is a single column name instead of a list and the column name contains a number

Steps/Code to reproduce bug

import cudf
df = cudf.DataFrame([
    ['A', 'a', 0, 0, 0],
    ['A', 'b', 1, 1, 1],
    ['A', 'c', 2, 2, 2],
    ['B', 'a', 0, 0, 0],
    ['B', 'b', 1, 1, 1],
    ['B', 'c', 2, 2, 2],
    ['C', 'a', 0, 0, 0],
    [
bug good first issue cuDF (Python)
xarray
jacobtomlinson
jacobtomlinson commented Jan 14, 2021

What happened:

When creating a LocalCluster object the comm is started on a random high port, even if there are no other clusters running.

What you expected to happen:

Should use port 8786.

Minimal Complete Verifiable Example:

$ conda create -n dask-lc-test -c conda-forge -y python=3.8 ipython dask distributed
$ conda activate dask-lc-test

The `d

bug good first issue
gerritholl
gerritholl commented Jan 12, 2022

Feature Request

Is your feature request related to a problem? Please describe.

Whenever I report a bug, I need to confirm what satpy version I am using. This is of course important, but it's also an extra step that could be semi-automated.

Describe the solution you'd like

I would like debug_on() to print the relevant versions. When we report bugs, we already call `debu

enhancement good first issue
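
One way the version report requested above could be gathered is with the standard library's `importlib.metadata`. This is only a sketch under assumptions: `collect_versions` and `format_versions` are hypothetical helpers, the package list is illustrative, and nothing here reflects how satpy actually implements `debug_on()`.

```python
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_versions(versions):
    """Render the collected versions as one 'name: version' line per package."""
    lines = []
    for name, version in sorted(versions.items()):
        shown = version if version is not None else "not installed"
        lines.append(f"{name}: {shown}")
    return "\n".join(lines)


# The package list is an assumption chosen for illustration.
print(format_versions(collect_versions(["satpy", "dask", "xarray", "numpy"])))
```

Returning `None` for a missing package instead of raising keeps the report usable in partial environments, which is typically where bug reports come from.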
djhoese
djhoese commented Feb 22, 2021

Code Sample, a minimal, complete, and verifiable piece of code

from pyresample.boundary import Boundary
b = Boundary(my_lons, my_lats)
print(b.contour_poly.area())

Problem description

The above code doesn't fail if the provided lons/lats are 2D (not sure about 3D+), but the class and all functions/utilities underneath it assume 1D arrays. The end results are incor

nils-braun
nils-braun commented Feb 5, 2021

The ML implementation is still a bit experimental - we can improve on this:

  • SHOW MODELS and DESCRIBE MODEL
  • Hyperparameter optimizations, AutoML-like behaviour
  • @romainr brought up the idea of exporting models (#191, still missing: onnx - see discussion in the PR by @rajagurunath)
  • and some more showcases and examples
good first issue machine learning
lesteve
lesteve commented May 19, 2020
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=1, memory='1GB')
print(cluster.job_script())
#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -n 1
#SBATCH --cpus-per-task=1
#SBATCH --mem=954M
#SBATCH -t 00:30:00

/home/lesteve/miniconda3/bin/python -m distributed.cli.dask_worker tcp://192.168.0.11:44065 --nthreads 1 --memory-limit 1000.00MB -
help wanted good first issue
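
The two memory figures in the generated script above look different but are consistent if `1GB` is read as 10^9 bytes. Expressed in units of 2^20 bytes and rounded up, that is 954, matching `--mem=954M`. The worker's `--memory-limit 1000.00MB` is the same amount in decimal megabytes. A small sketch of the arithmetic follows. The helper name is hypothetical, and both the round-up and the 2^20 reading of the `M` suffix are assumptions inferred from the output shown, not taken from the dask-jobqueue source.

```python
import math


def bytes_to_mib_ceil(n_bytes):
    """Convert a byte count to whole mebibytes (2**20 bytes), rounding up."""
    return math.ceil(n_bytes / 2**20)


one_gb = 10**9  # assumption: '1GB' means 10**9 bytes

print(bytes_to_mib_ceil(one_gb))  # 954, the value in '#SBATCH --mem=954M'
print(one_gb / 10**6)             # 1000.0, the value in '--memory-limit 1000.00MB'
```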
NeroCorleone
NeroCorleone commented Aug 11, 2020

Problem description

Reading a dataset with eager's read functionality raises a ValueError when providing columns.

Example code (ideally copy-pastable)

import pandas as pd

from tempfile import TemporaryDirectory
from functools import partial
from storefact import get_store_from_url

from kartothek.io.eager import store_dataframes_as_dataset, read_dataset_as_data
good first issue usability
climpred
RichardScottOZ
RichardScottOZ commented Mar 25, 2021

Without thinking, I set resampling="bilinear" and got an error when I called .compute():

Traceback (most recent call last):
  File "carajas.py", line 92, in <module>
    band_medianNP = band_median.compute()
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/xarray/core/dataarray.py", line 899, in compute
    return new.load(**kwargs)
  File "/home/ubuntu/anaco
good first issue
timdnewman
timdnewman commented Jan 26, 2022

Hi,

System and Software

  • aicsimageio Version: 4.5.2
  • Python Version: 3.10
  • Operating System: Windows 10

Description

aicsimageio installs fine, but when we try pip install aicsimageio[czi] it fails with the following error:

ERROR: Command errored out with exit status 1:
command: 'C:\Program Files\Python310\python.exe' -u -c 'import io, os, sys, setuptools, to

good first issue admin
zblz
zblz commented Aug 15, 2017

Currently all of the metrics computed are independent of a target variable or column, but if lens.summarise took the name of a column as the target variable, the output of some metrics could be more interpretable even if the target variable is not used in any kind of predictive modelling.

A good example of this could be PCA (see #14), which could plot the different categories of the target va
