dataframe

vaex.from_arrays(s=['a,b']).s.str.replace(r'(\w+)',r'--\g<1>==',regex=True)

When the pattern contains a capture group, str.replace() fails, while str_pandas.replace() returns the correct result.
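For reference, the output the report presumably expects can be reproduced with Python's standard `re` module, using the same pattern and replacement string. This is only a sketch of the intended behaviour and does not call vaex; treating `re.sub` as the baseline is an assumption.

```python
import re

# Same pattern and replacement as in the report above.
pattern = r'(\w+)'
replacement = r'--\g<1>=='

# Each run of word characters is wrapped using the first capture group.
result = re.sub(pattern, replacement, 'a,b')
print(result)  # --a==,--b==
```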

Name: vaex
Version: 4.6.0
Summary: Out-of-Core DataFrames to visualize and explore big tabular datasets
Home-page:

Versions

Python 3.9 / Polars 0.10.27 / Windows 10

Describe your bug / reproduce behaviour

>>> # create trivial float series and observe the resulting repr
>>> import polars as pl
>>> pl.from_records( data=[1.0, 0.0, -1.0], columns=['test'] )

shape: (3, 1)
┌──────┐
│ test │
│ ---  │
│ f64  │
╞══════╡
│ 1    │   # <- integer repr
├╌╌╌╌╌╌┤
│ 0.0  │   # <- fl

Describe the bug

Failed to execute Series.drop_duplicates.

In [75]: a = md.DataFrame(np.random.rand(10, 2), columns=['a', 'b'], chunk_size=2)                  

In [76]: a['a'].drop_duplicates().execute()

Which version are you running? The latest version is on GitHub. Pip is for major releases.

import pandas_ta as ta
print(ta.version)

pandas-ta .3.2b0
Do you have TA Lib also installed in your environment?

$ pip list

TA-Lib = .4.19

Upgrade.

$ pip install -U git+https://github.com/twopirllc/pandas-ta

upgraded to .3.14b0
Same CMF resul

Is your feature request related to a problem? Please describe.
The Series.map() function should enable the usage of index in the passed lambda, just like the normal Array.map() function does. My example use case is calculating a moving average, which requires referencing values next to the current position in the Series.
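The request can be illustrated with a language-neutral sketch in plain Python. The helper `map_with_index` and the callback signature `(value, index, values)` are hypothetical and only mirror how an index-aware map could look; they are not part of any library's API.

```python
from typing import Callable, List, Sequence


def map_with_index(
    values: Sequence[float],
    fn: Callable[[float, int, Sequence[float]], float],
) -> List[float]:
    """Hypothetical index-aware map: fn receives value, index and the full sequence."""
    return [fn(value, i, values) for i, value in enumerate(values)]


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Trailing moving average; early positions use a shorter window."""

    def average_at(value: float, i: int, seq: Sequence[float]) -> float:
        start = max(0, i - window + 1)
        chunk = seq[start:i + 1]
        return sum(chunk) / len(chunk)

    return map_with_index(values, average_at)


smoothed = moving_average([1.0, 2.0, 3.0, 4.0], window=2)
print(smoothed)  # [1.0, 1.5, 2.5, 3.5]
```

Without the index, the callback only sees the current value and cannot reach its neighbours, which is why the moving average needs the extra arguments.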

Describe the solution you'd like
I would like to be able to writ

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We would like to compute a power function (like 2**4 = 2 * 2 * 2 * 2) in datafusion
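As a rough illustration of the requested semantics, here is an element-wise power over two columns in plain Python. The function name `power` and the rule that a null in either input yields a null are assumptions made for this sketch, not DataFusion's actual behaviour.

```python
from typing import List, Optional


def power(
    base: List[Optional[float]],
    exponent: List[Optional[float]],
) -> List[Optional[float]]:
    """Element-wise base ** exponent; None in either input gives None (assumed)."""
    return [
        None if b is None or e is None else b ** e
        for b, e in zip(base, exponent)
    ]


result = power([2, 3, None], [4, 2, 5])
print(result)  # [16, 9, None]
```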

@matthewmturner asks for it here: apache/arrow-datafusion#147 (comment)

Describe the solution you'd like

Implement the power function as described in

Example:
In the image below the word starships should begin on a new line to avoid being split.

Terminal width is provided to determine how many columns to print. The terminal width or the total width of the column headers may be used to wrap the text in the footer.
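One way to picture the proposal is the following Python sketch, which wraps footer text at word boundaries. The function `wrap_footer`, its parameters, and the choice to sum the header widths without separators are all assumptions for illustration; they are not taken from the tool's code.

```python
import shutil
import textwrap
from typing import List, Optional


def wrap_footer(
    text: str,
    header_widths: Optional[List[int]] = None,
    fallback_width: int = 80,
) -> List[str]:
    """Wrap footer text to the header width if given, else to the terminal width."""
    if header_widths:
        # Assumption: total width is the plain sum of the column header widths.
        width = sum(header_widths)
    else:
        width = shutil.get_terminal_size(fallback=(fallback_width, 24)).columns
    # break_long_words=False keeps a word such as "starships" in one piece.
    return textwrap.wrap(text, width=max(width, 1), break_long_words=False)


lines = wrap_footer("Luke pilots several starships", header_widths=[12, 8])
print(lines)  # ['Luke pilots several', 'starships']
```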

Hi,

I am using some basic pyjanitor functions, such as clean_names() and collapse_levels(), in code that I want to put into production.
There are limits on the size of the production code base.
Currently, the requirements.txt for pyjanitor alone is huge.
I don't think my code requires all of those dependencies.
How can I remove the unnecessary ones?

For pipeline stages provided by the pdpipe.basic_stages, supplying conditions to the prec and post keyword arguments may not return the correct error messages.

Example Code

import pandas as pd; import pdpipe as pdp;
df = pd.DataFrame([[1,4],[4,5],[1,11]], [1,2,3], ['a','b'])
pline = pdp.PdPipeline([
  pdp.FreqDrop(2, 'a', prec=pdp.cond.HasAllColumns(['x']))
])
pline.apply(

Here are 551 public repositories matching this topic...

vaexio / vaex

modin-project / modin

haifengl / smile

pola-rs / polars

databricks / koalas

jtablesaw / tablesaw

adamerose / PandasGUI

mars-project / mars

ballista-compute / ballista

twopirllc / pandas-ta

javascriptdata / danfojs

apache / arrow-datafusion

alexhallam / tv

hosseinmoein / DataFrame

microsoft / Mobius

sngyai / Sequoia

RedisLabs / spark-redis

pyjanitor-devs / pyjanitor

rocketlaunchr / dataframe-go

uwdata / arquero

pdpipe / pdpipe

MrPowers / spark-daria

andygrove / datafusion

Squarespace / datasheets

shramos / Awesome-Cybersecurity-Datasets

sfu-db / connector-x

michaelchu / optopsy

dmnfarrell / pandastable

techascent / tech.ml.dataset

Gmousse / dataframe-js
