scipy

Here are 957 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Oct 1, 2020
  • Python
tadej-redstone commented Sep 22, 2020

When merging a dask DataFrame, the resulting index contains duplicate labels. This seems to be caused by the number of partitions. See the example below:

import pandas as pd
import dask.dataframe as dd

a = dd.from_pandas(pd.DataFrame({'a': [1,2,3,4]}), npartitions=2)
b = pd.DataFrame({'a': [1,2,3,4], 'b': [2,3,4,5]})

a.merge(b, on='a').compute()

Returns

a b
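
The behaviour described above can be modelled without dask. The following is a minimal sketch, assuming that each partition is merged on its own and the pieces are then concatenated; it is not dask's actual implementation, and the names `left`, `partitions`, `combined`, and `fixed` are hypothetical. It uses only pandas to show how the per-partition merges could each produce a fresh 0-based index, and how `reset_index(drop=True)` might serve as a workaround.

```python
import pandas as pd

left = pd.DataFrame({'a': [1, 2, 3, 4]})
b = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5]})

# Assumption: stand in for npartitions=2 by splitting the left frame in two.
partitions = [left.iloc[:2], left.iloc[2:]]

# Merging on a column gives every result its own 0-based integer index.
merged_parts = [part.merge(b, on='a') for part in partitions]

# Concatenating without ignore_index keeps the per-chunk labels,
# so the same labels appear once per chunk.
combined = pd.concat(merged_parts)
index_labels = combined.index.tolist()

# Possible workaround: discard the duplicated labels and renumber the rows.
fixed = combined.reset_index(drop=True)
fixed_labels = fixed.index.tolist()

print(index_labels)
print(fixed_labels)
```

If the index carries no meaning, the same `reset_index(drop=True)` call could presumably be applied to the computed result of the dask merge as well.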

verde
