#arrow

Here are 251 public repositories matching this topic...

miguelusque commented Jan 15, 2022

Is your feature request related to a problem? Please describe.
Hi,

While porting some code from pandas to cuDF, I noticed that cuDF Series do not support the unstack method.
As an additional request, it would be great if fill_value could be supported in both the cudf.DataFrame.unstack and cudf.Series.unstack methods. Thanks!
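
For reference, the pandas behavior the request refers to can be sketched as follows. This is a minimal illustration using pandas only, not cuDF; the index labels and values are made up for the example.

```python
import pandas as pd

# Hypothetical two-level index; the ("b", "y") combination is deliberately missing.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x")],
    names=["outer", "inner"],
)
series = pd.Series([1, 2, 3], index=index)

# Pivot the innermost index level into columns; the missing cell becomes NaN.
wide = series.unstack()

# Same pivot, but missing cells are filled with 0 instead of NaN.
filled = series.unstack(fill_value=0)
```

In pandas the keyword is spelled fill_value (singular) on both Series.unstack and DataFrame.unstack.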

Describe the solution you'd like
To have that meth

andygrove commented Jan 23, 2022

Describe the bug

I have a data set created by Apache Spark and I tried to query it from the DataFusion CLI. It failed, saying that a parquet file was corrupt.

❯ CREATE EXTERNAL TABLE store_sales STORED AS PARQUET LOCATION 'store_sales.dat';
0 rows in set. Query took 0.002 seconds.
❯ select count(*) from store_sales;
Parquet reader thread terminated due to error: ParquetError(Gener

NeroCorleone commented Aug 11, 2020

Problem description

Reading a dataset with eager's read functionality raises a ValueError when providing columns.

Example code (ideally copy-pastable)

import pandas as pd

from tempfile import TemporaryDirectory
from functools import partial
from storefact import get_store_from_url

from kartothek.io.eager import store_dataframes_as_dataset, read_dataset_as_data

Max-Meldrum commented Jan 10, 2022

An Operator that both filters and maps.

Akin to Rust's own FilterMap but on a Stream rather than Iterator.
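
The filter-and-map idea can be sketched in Python as a lazy generator. This is only an illustration of the semantics, not Arcon's API: filter_map and parse_int are hypothetical helpers, and parsing to int is an assumption, since the preview does not show the target type.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def filter_map(func: Callable[[T], Optional[U]], items: Iterable[T]) -> Iterator[U]:
    """Lazily apply func to each item and keep only results that are not None."""
    for item in items:
        result = func(item)
        if result is not None:
            yield result


def parse_int(text: str) -> Optional[int]:
    """Return the parsed integer, or None when the text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


strings = ["1", "two", "NaN", "four", "5"]
# Elements that fail to parse are dropped; the rest are mapped to integers.
parsed = list(filter_map(parse_int, strings))
```

Because the generator yields one element at a time, the same function works on an unbounded stream of inputs as well as on a finite list.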

let strings = ["1", "two", "NaN", "four", "5"];
let mut app = Application::default()
  .iterator(strings, |conf| {
     conf.set_arcon_time(ArconTime::Process);
  })
  .filter_map(|s| s.parse().ok())
  .b
