dask

Is your feature request related to a problem? Please describe.

I am trying to drop particular indices from a dimension that doesn't have coordinates.

Following: [drop_sel() documentation](http://xarray.pydata.org/en/stable/generated/xarray.Dataset.drop_sel.html#xarray.Dataset.drop_se

Now that mpdist has been implemented, it would be useful to have an mpdist tutorial

The paper is here

The support website is here

The goal of this tutorial would be to reproduce Figure 5 from the paper

In our API docs we currently use

.. autosummary::
   Client
   Client.call_stack
   Client.cancel
    ...

to generate a table of Client methods at the top of the page. Later on we use

.. autoclass:: Client
   :members:

to display the docstrings for all the public methods on Client (here an example for

Describe the bug
According to the multiscene documentation, the property all_same_area does:

Determine if all contained Scenes have the same ‘area’.

However, I have created a multiscene where all scenes have the same area (they just differ between datasets), yet the property returns Fa

There are various examples of how to define an AreaDefinition throughout the documentation and in test files like https://github.com/pytroll/pyresample/blob/master/pyresample/test/test_files/areas.yaml. However, some of these have projection definitions that define only the a parameter (equatorial radius). This is technically incomplete, and although it works in PROJ, rasterio/GDAL are a little more strict an

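from dask_jobqueue import SLURMCluster

# Request one core and 1 GB of memory per job
cluster = SLURMCluster(cores=1, memory='1GB')
# Print the batch script that would be submitted to SLURM
print(cluster.job_script())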

#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -n 1
#SBATCH --cpus-per-task=1
#SBATCH --mem=954M
#SBATCH -t 00:30:00

/home/lesteve/miniconda3/bin/python -m distributed.cli.dask_worker tcp://192.168.0.11:44065 --nthreads 1 --memory-limit 1000.00MB -
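
The two memory figures in the script above (954M for SLURM, 1000.00MB for the worker) look inconsistent at first glance. A minimal sketch of the arithmetic that reproduces both, assuming '1GB' is read as 10^9 bytes and the SLURM value is rounded up to whole mebibytes; the helper name format_mem_ceil_mib is hypothetical and is not claimed to be dask-jobqueue's own code:

```python
import math


def format_mem_ceil_mib(n_bytes):
    """Round a byte count up to whole mebibytes, formatted like '954M'."""
    return "%dM" % math.ceil(n_bytes / 1024**2)


# Assumption: '1GB' means decimal gigabytes, i.e. 10**9 bytes.
requested = 1 * 1000**3

# Value in the style of the "#SBATCH --mem=" line (binary units, rounded up).
sbatch_mem = format_mem_ceil_mib(requested)

# Value in the style of "--memory-limit" (decimal megabytes).
worker_limit = "%.2fMB" % (requested / 1000**2)

print(sbatch_mem)
print(worker_limit)
```

Under these assumptions both numbers describe the same 10^9 bytes, expressed once in binary and once in decimal units.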

Problem description

Our dask update graphs are not properly optimized.

We usually use dask.dataframe optimization and set ave_width=repartition_ratio for kartothek.io.dask.dataframe.update_dataset_from_ddf graphs. We should return an optimized graph from update_dataset_from_ddf to make our users' lives simpler.

We already have code that does this; whoever picks this up can ping me.

Currently all of the metrics computed are independent of a target variable or column, but if lens.summarise took the name of a column as the target variable, the output of some metrics could be more interpretable even if the target variable is not used in any kind of predictive modelling.

A good example of this could be PCA (see #14), which could plot the different categories of the target va

I've been using nb_black lately and it's wonderful. https://github.com/dnanhkhoa/nb_black

We can enable it in our notebook examples with %load_ext lab_black; it auto-formats every cell with black, keeping the code clean and readable without us having to worry about it.

As the server speaks the Presto protocol, which most BI tools understand, it would be nice to test and showcase it with some of them.
For example, I have done a very quick test with both Hue and Metabase; both look promising, but the test also revealed some additional quirks in the server implementation.

Here are 193 public repositories matching this topic...

dask / dask

pydata / xarray

TDAmeritrade / stumpy

jmcarpenter2 / swifter

dask / distributed

ironmussa / Optimus

itamarst / eliot

pytroll / satpy

ranaroussi / pystore

timkpaine / paperboy

JiaweiZhuang / xESMF

pytroll / pyresample

dask / dask-jobqueue

JDASoftwareGroup / kartothek

dask / dask-ec2

facultyai / lens

ironmussa / Bumblebee

pangeo-data / climpred

nils-braun / dask-sql

dymaxionlabs / dask-rasterio

chmp / framequery

dask / knit

NCAR / ncar-python-tutorial

JSybrandt / agatha

LDO-CERT / orochi

MITgcm / xmitgcm

dgerlanc / dask-scaling-dataframe

radix-ai / graphchain

jrbourbeau / madpy-dask

lesommer / oocgcm
