scipy

Here are 929 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Jul 24, 2020
  • Python
tadej-redstone commented Sep 22, 2020

When merging a dask dataframe, the index of the result contains duplicates. This seems to be caused by the number of partitions. See the example below:

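import pandas as pd
import dask.dataframe as dd

# Left side: a dask dataframe split into two partitions
a = dd.from_pandas(pd.DataFrame({'a': [1, 2, 3, 4]}), npartitions=2)
# Right side: a plain pandas dataframe
b = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5]})

# The index of the computed result contains duplicates
a.merge(b, on='a').compute()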

Returns

a b
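One plausible explanation, sketched here with pandas only: if each partition is merged against `b` separately, every partial result gets a fresh index starting at 0, and concatenating them repeats index values. The two-way split below is a hypothetical stand-in for dask's partitions, not dask's actual implementation, and `reset_index(drop=True)` is just one possible workaround.

```python
import pandas as pd

left = pd.DataFrame({'a': [1, 2, 3, 4]})
right = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5]})

# Hypothetical stand-in for npartitions=2: split the left frame in half.
partitions = [left.iloc[:2], left.iloc[2:]]

# pandas' merge returns a fresh 0..n-1 index for each partial result.
merged_parts = [part.merge(right, on='a') for part in partitions]

# Concatenating the parts keeps each part's own index, so values repeat.
combined = pd.concat(merged_parts)
has_duplicates = bool(combined.index.duplicated().any())

# One possible workaround: rebuild a unique index after combining.
deduplicated = combined.reset_index(drop=True)
```

If this is indeed the cause, calling `reset_index(drop=True)` on the computed result should give a unique index, at the cost of discarding the original index values.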
