
dataframe

Here are 606 public repositories matching this topic...

Ben-Epstein
Ben-Epstein commented Mar 12, 2022

Thank you for reaching out and helping us improve Vaex!

Before you submit a new Issue, please read through the documentation. Also, make sure you search through the Open and Closed Issues - your problem may already be discussed or addressed.

Description
Please provide a clear and concise description of the problem. This should contain all the steps nee

ttnghia
ttnghia commented Jun 15, 2022

We are about to add set operations (rapidsai/cudf#11043). To be consistent with other libraries and frameworks (such as Presto: https://prestodb.io/docs/current/functions/array.html), we should rename lists::drop_list_duplicates to lists::distinct. The implementation should also move into set_operations.hpp|cu so that it is easy to locate and consistent with the other set operations.

feature request good first issue libcudf helps: Spark
danfojs
kylemcdonald
kylemcdonald commented Mar 2, 2022

I would like to convert a DataFrame to a JSON object the same way that Pandas does with to_dict().

toJSON() treats rows as elements in an array, and ignores the index labels. But to_dict() uses the index as keys.
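For reference, the pandas behaviour being described can be sketched as follows. This is only an illustration on a small made-up frame (the column names and index labels are hypothetical); it shows how the row-oriented output drops the index labels while the index-oriented output keeps them as keys.

```python
import pandas as pd

# Hypothetical example frame; labels "x" and "y" stand in for any index.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# Row-oriented: a list of row dicts, index labels are dropped.
by_rows = df.to_dict(orient="records")

# Index-oriented: index labels become the top-level keys.
by_index = df.to_dict(orient="index")

# Default orientation: column -> {index label -> value}.
by_column = df.to_dict()

print(by_rows)
print(by_index)
print(by_column)
```

The index-oriented form appears to be the closest match to a result keyed by index label.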

Here is an example of what I have in mind:

function to_dict(df) {
    const rows = df.toJSON();
    const entries = df.index.map((e, i) => ({ [e]: rows[i] }));
  
enhancement good first issue
gam-phon
gam-phon commented Jul 3, 2022

Ulcer Index

in development version: I am ignoring everget

highest_close = close.rolling(length).max()
downside = scalar * (close - highest_close)
downside /= highest_close
d2 = downside * downside
_ui = d2.rolling(length).sum()
ui = np.sqrt(_ui / length)

In the development version, I sometimes get RuntimeWarning: invalid value encountered in sqrt after searching abo
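A self-contained sketch of the calculation above, for experimenting with the warning. The visible text does not state the cause; one possibility (an assumption here, not a confirmed diagnosis) is that the rolling sum occasionally returns a tiny negative number through floating-point error, which np.sqrt then rejects. The function name, the default length and scalar values, and the clip step are all illustrative choices, not the library's code.

```python
import numpy as np
import pandas as pd


def ulcer_index(close: pd.Series, length: int = 14, scalar: float = 100.0) -> pd.Series:
    """Ulcer Index following the steps quoted above (defaults are assumed)."""
    highest_close = close.rolling(length).max()
    # Percentage drawdown from the rolling high; always <= 0.
    downside = scalar * (close - highest_close) / highest_close
    d2 = downside * downside
    mean_sq = d2.rolling(length).sum() / length
    # Assumption: clamp tiny negative round-off values to zero before sqrt.
    # NaN entries from the warm-up period pass through unchanged.
    return np.sqrt(mean_sq.clip(lower=0))


# Small made-up price series with a single dip.
prices = pd.Series([10.0, 10.0, 10.0, 9.0, 10.0, 10.0])
print(ulcer_index(prices, length=3))
```

Clamping hides the symptom only; if the negative values turn out to be large, the cause lies elsewhere.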

bug help wanted good first issue
andygrove
andygrove commented Jul 6, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Someone is asking about this feature on StackOverflow: https://stackoverflow.com/questions/72876397/impute-and-add-new-calculated-column-with-rust-datafusion

Describe the solution you'd like
Add with_column(&self, name: &str, expr: Expr). This should add the new column to the data f

enhancement good first issue
DataFrame
anks7190
anks7190 commented Jan 27, 2021

Hi,

I am using some basic functions from pyjanitor, such as clean_names() and collapse_levels(), in code that I want to productionise.
There are limits on the size of the production code base.
Currently, the requirements.txt for pyjanitor alone is huge.
I don't think I need all of the dependencies in my code.
How can I remove the unnecessary ones?

help wanted good first issue available for hacking infrastructure
pdpipe
yarkhinephyo
yarkhinephyo commented Nov 28, 2021

For pipeline stages provided by pdpipe.basic_stages, supplying conditions to the prec and post keyword arguments may not produce the correct error messages.

Example Code

import pandas as pd
import pdpipe as pdp
df = pd.DataFrame([[1,4],[4,5],[1,11]], [1,2,3], ['a','b'])
pline = pdp.PdPipeline([
  pdp.FreqDrop(2, 'a', prec=pdp.cond.HasAllColumns(['x']))
])
pline.apply(
skrawcz
skrawcz commented May 11, 2022

Is your feature request related to a problem? Please describe.
The main friction in getting the examples up and running is installing the dependencies. A Docker container with them preinstalled would make it easier for people to get started with Hamilton.

Describe the solution you'd like

  1. A docker container, that has different python virtual environments, that has the dependencies t
documentation good first issue help wanted
