arrow
Here are 251 public repositories matching this topic...
Is your feature request related to a problem? Please describe.
Hi,
While porting some code from Pandas to cuDF, I noticed that cuDF Series do not support the unstack method.
As an additional request, it would be great if fill_value could be supported in both cudf.DataFrame.unstack and cudf.Series.unstack. Thanks!
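For reference, the pandas behaviour being asked for can be sketched as follows. This is a minimal illustration using pandas only (not cuDF); the index labels and values are made up for the example.

```python
import pandas as pd

# Made-up two-level index; the ("b", "y") combination is deliberately missing.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x")], names=["outer", "inner"]
)
series = pd.Series([1, 2, 3], index=index)

# Series.unstack pivots the innermost index level into columns.
# Without fill_value, the missing ("b", "y") cell becomes NaN.
wide = series.unstack()

# With fill_value, the missing cell is filled with the given value instead.
wide_filled = series.unstack(fill_value=0)

print(wide)
print(wide_filled)
```

The request is for cudf.Series.unstack to accept the same call, and for fill_value to be honoured by both the Series and DataFrame variants.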
Describe the solution you'd like
To have that meth
We no longer need to control the number of concurrent kernels, since now we control the number of concurrent tasks
Describe the bug
I have a data set created by Apache Spark and I tried to query it from the DataFusion CLI. It failed, saying that a parquet file was corrupt.
CREATE EXTERNAL TABLE store_sales STORED AS PARQUET LOCATION 'store_sales.dat';
0 rows in set. Query took 0.002 seconds.
❯ select count(*) from store_sales;
Parquet reader thread terminated due to error: ParquetError(Gener
Not sure if this would be interesting, but:
When registering a table:
addr: 0.0.0.0:8084
tables:
  - name: "example"
    uri: "/data/"
    option:
      format: "parquet"
      use_memory_table: false
add to the options:
glob
pattern: "file_typev1*.parquet"
or regexp
pattern: "\wfile_type\wv1\w*.parquet"
It would allow selecting in uri's with different exte
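To make the difference between the two proposed pattern kinds concrete, here is a small sketch in plain Python. The function name select_files, the kind argument, and the file names are all hypothetical; the regular expression is a simplified one written for this example, not the one quoted above, and nothing here reflects how the server actually resolves a uri.

```python
import re
from fnmatch import fnmatchcase


def select_files(names, pattern, kind="glob"):
    """Return the names matching pattern, interpreted as a glob or a regexp."""
    if kind == "glob":
        # Shell-style wildcards: * matches any run of characters.
        return [name for name in names if fnmatchcase(name, pattern)]
    if kind == "regexp":
        compiled = re.compile(pattern)
        # fullmatch requires the whole file name to match the expression.
        return [name for name in names if compiled.fullmatch(name)]
    raise ValueError(f"unknown pattern kind: {kind!r}")


# Hypothetical directory listing under the table's uri.
names = [
    "file_typev1_a.parquet",
    "file_typev1_b.parquet",
    "file_typev2_a.parquet",
    "notes.csv",
]

glob_hits = select_files(names, "file_typev1*.parquet")
regexp_hits = select_files(names, r"file_typev1\w*\.parquet", kind="regexp")
print(glob_hits)
print(regexp_hits)
```

Both calls select only the two v1 parquet files here; the regexp form is stricter about which characters may appear between the prefix and the extension.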
It would be helpful to have Fletchgen output warnings for unused metadata fields that start with fletcher_. For example, (this happened to me) when someone adds fletchgen_epc to Schema metadata instead of Field metadata.
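The kind of check being requested could look roughly like the sketch below. It works on plain dictionaries rather than Arrow schemas, and the set of recognized keys is a placeholder for illustration only, not Fletchgen's actual list; the helper name unused_prefixed_keys is also hypothetical.

```python
def unused_prefixed_keys(metadata, recognized, prefix="fletcher_"):
    """Return metadata keys that carry the prefix but are not recognized here."""
    return sorted(
        key for key in metadata if key.startswith(prefix) and key not in recognized
    )


# Placeholder: keys assumed to be meaningful at schema level in this example.
recognized_schema_keys = {"fletcher_name", "fletcher_mode"}

# A field-level key mistakenly placed in the schema metadata.
schema_metadata = {
    "fletcher_name": "Example",
    "fletcher_epc": "4",
    "author": "someone",
}

unused = unused_prefixed_keys(schema_metadata, recognized_schema_keys)
for key in unused:
    print(f"warning: metadata key '{key}' is not used at schema level")
```

Keys without the prefix are ignored, so unrelated user metadata does not trigger warnings.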
Problem description
Reading a dataset with eager's read functionality raises a ValueError when providing columns.
Example code (ideally copy-pastable)
import pandas as pd
from tempfile import TemporaryDirectory
from functools import partial
from storefact import get_store_from_url
from kartothek.io.eager import store_dataframes_as_dataset, read_dataset_as_data…
FilterMap Operator
An Operator that both filters and maps.
Akin to Rust's own FilterMap, but on a Stream rather than an Iterator.
let strings = ["1", "two", "NaN", "four", "5"];
let mut app = Application::default()
    .iterator(strings, |conf| {
        conf.set_arcon_time(ArconTime::Process);
    })
    .filter_map(|s| s.parse().ok())
    .b…
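The filter-and-map semantics described above can be illustrated outside of Arcon with a short Python sketch. It operates on an ordinary iterable rather than a stream, the helper names filter_map and parse_int are made up for the example, and it assumes the strings are parsed as integers (the target type is not visible in the snippet above).

```python
def filter_map(func, items):
    """Apply func to each item and yield only the results that are not None."""
    for item in items:
        result = func(item)
        if result is not None:
            yield result


def parse_int(text):
    """Return text parsed as an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


strings = ["1", "two", "NaN", "four", "5"]
parsed = list(filter_map(parse_int, strings))
print(parsed)  # [1, 5]
```

Elements that fail to parse are dropped and the rest are transformed in a single pass, which is the point of combining the two operators.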
Feature Request
Many locales have the bare minimum when it comes to test cases. While I understand it can be tedious and repetitive to write out test case