#

dataframe

Here are 541 public repositories matching this topic...

danfojs
goodPointP commented Nov 22, 2021

It would be really useful to have a method that inserts a column into an existing DataFrame between two existing columns. I know about .addColumn, but that seems to place the new column at the end of the DataFrame.

For example:

df.print()

A | B 
======
7 | 5
3 | 6

df.insert({ "afterColumn": "A", "newColumnName": "C", "data": [4,1], inplace: true })
df.print()
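
The requested behaviour can be sketched independently of danfojs. The snippet below is only an illustration in plain Python, assuming a table stored as an ordered mapping of column name to values; `insert_after` is a hypothetical helper, not a danfojs or pandas API.

```python
def insert_after(columns, after_column, new_name, data):
    """Return a new column mapping with `new_name` placed right after `after_column`.

    Hypothetical helper for illustration only; it is not part of danfojs.
    """
    if after_column not in columns:
        raise KeyError(f"column {after_column!r} not found")
    if new_name in columns:
        raise ValueError(f"column {new_name!r} already exists")
    if len(data) != len(columns[after_column]):
        raise ValueError("new column length does not match the table")

    result = {}
    for name, values in columns.items():
        result[name] = values
        if name == after_column:
            # Dicts keep insertion order, so the new column lands here.
            result[new_name] = list(data)
    return result


table = {"A": [7, 3], "B": [5, 6]}
table = insert_after(table, "A", "C", [4, 1])
print(list(table))  # ['A', 'C', 'B']
```

Returning a new mapping rather than mutating the input corresponds to `inplace: false`; an in-place variant would have to rebuild the column order on the original object.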

alamb commented Nov 28, 2021

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We have had performance regressions in the TPC-H plans due to plan changes (e.g. apache/arrow-datafusion#1367).

In order to avoid this situation in the future, we would like to have the explain plans committed as tests (so we can evaluate any changes to those plans

tv
DataFrame
anks7190 commented Jan 27, 2021

Hi,

I am using some basic functions from pyjanitor, such as clean_names() and collapse_levels(), in code that I want to productionise.
There are limits on the size of the production code base.
Currently, the requirements.txt for pyjanitor alone is huge.
I don't think I need all of those dependencies for my code.
How can I remove the unnecessary ones?

pdpipe
shaypal5 commented Nov 19, 2021

If I'm using a pdp.cond.Condition object, for example pdp.cond.HasAllColumns(['a', 'b']), as a precondition (or a post-condition) in my pipeline stage and it fails, I expect an informative exception message, such as "Not all required columns ['a', 'b'] found in input dataframe" (or better yet, "Required columns ['a'] not found in input dataframe", in the case only "a" is missing).
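
A minimal sketch of the kind of message being asked for, written in plain Python rather than against pdpipe's internals; `check_has_all_columns` is a hypothetical helper, and the way pdpipe itself raises on a failed condition is not assumed here.

```python
def check_has_all_columns(available, required):
    """Raise an error that names only the columns that are actually missing.

    Hypothetical helper for illustration; not part of pdpipe.
    """
    missing = [col for col in required if col not in available]
    if missing:
        raise ValueError(
            f"Required columns {missing} not found in input dataframe"
        )


message = None
try:
    # Input has columns "b" and "c"; the stage requires "a" and "b".
    check_has_all_columns(["b", "c"], ["a", "b"])
except ValueError as err:
    message = str(err)

print(message)  # Required columns ['a'] not found in input dataframe
```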

Thi
