Updated Jul 12, 2022 - C++
arrow
Here are 265 public repositories matching this topic...
Updated Jul 11, 2022 - Kotlin
The API lists::drop_list_duplicates operates on a pair of keys-values input lists columns with a duplicate_keep_option. It was added for a Spark-specific feature request. We now have lists::distinct, which simply extracts the distinct list elements from the input lists column. That API is more standard and is used by both Python and Spark.
Therefore, we should remove lists::drop_list_duplicates complet
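As a rough illustration of what "extracting distinct list elements" means for a lists column, here is a minimal plain-Python sketch. It does not use the libcudf API: the function name distinct_per_list and the sample data are hypothetical, and it keeps first-occurrence order, which the real lists::distinct may not guarantee.

```python
def distinct_per_list(lists_column):
    """Return a new column where each list keeps one copy of each element."""
    # dict.fromkeys keeps the first occurrence of every element, in order.
    return [list(dict.fromkeys(row)) for row in lists_column]


# Hypothetical input: three rows, each row is one list.
column = [[1, 1, 2, 3, 3], [], [5, 4, 5]]
result = distinct_per_list(column)
print(result)
```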
Updated Apr 20, 2021 - Rust
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In the SO post https://stackoverflow.com/questions/72888852/extract-year-month-day-from-unix-timestamp-column-in-rust-datafusion-dataframe/72941102#72941102, the user needed help translating a Unix time to a timestamp. The current solution is to use a cast, which is verbose and not obvious.
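For context, the conversion being asked about looks like this in plain Python. This is only a sketch of the underlying operation, assuming the column holds seconds since the Unix epoch in UTC. It is not the DataFusion API, and the helper name unix_to_ymd is hypothetical.

```python
from datetime import datetime, timezone


def unix_to_ymd(seconds):
    """Convert seconds since the Unix epoch (UTC) to (year, month, day)."""
    ts = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return ts.year, ts.month, ts.day


print(unix_to_ymd(1657152000))
```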
Updated May 26, 2022 - JavaScript
Updated Jul 12, 2022 - Python
We no longer need to control the number of concurrent kernels, since now we control the number of concurrent tasks
Does the pre-built binary not support databases?
roapi -t "vocabs=sqlite:///data/vocabulary.sqlite"
[2022-05-31T06:48:11Z INFO roapi::context] loading `uri(sqlite:///data/vocabulary.sqlite)` as table `vocabs`
Error: Database error: Enable 'database' feature flag to support this
Would you explain in the README how to enable it?
I'm new to Rust; after some searching I got it workin
Updated May 22, 2020 - Java
Updated Jul 4, 2022 - TypeScript
Updated Jan 3, 2021 - Swift
Updated Oct 15, 2018 - Swift
Updated Feb 11, 2022 - JavaScript
Updated Feb 8, 2021 - Python
Updated May 19, 2021 - Java
Updated Jul 12, 2022 - Scala
It would be helpful to have Fletchgen output warnings for unused metadata fields that start with fletcher_. For example, (this happened to me) when someone adds fletchgen_epc to Schema metadata instead of Field metadata.
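A check of this kind could be sketched as follows. This is a hypothetical illustration on plain dictionaries, not Fletchgen code: the function name, the key names, and the set of recognized schema-level keys are all assumptions made for the example.

```python
def find_unused_metadata(metadata, recognized_keys, prefix="fletcher_"):
    """Return warnings for prefixed metadata keys that are not recognized here."""
    warnings = []
    for key in sorted(metadata):
        if key.startswith(prefix) and key not in recognized_keys:
            warnings.append(f"unused metadata key '{key}'")
    return warnings


# Hypothetical schema-level metadata: 'fletcher_epc' was meant for a field.
schema_metadata = {"fletcher_mode": "read", "fletcher_epc": "4", "author": "me"}
recognized_schema_keys = {"fletcher_mode"}  # assumed set, for illustration only

found = find_unused_metadata(schema_metadata, recognized_schema_keys)
for message in found:
    print("warning:", message)
```

Keys without the prefix, such as "author" above, are ignored, so unrelated user metadata does not trigger warnings.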
Problem description
Reading a dataset with eager's read functionality raises a ValueError when providing columns.
Example code (ideally copy-pastable)
import pandas as pd
from tempfile import TemporaryDirectory
from functools import partial
from storefact import get_store_from_url
from kartothek.io.eager import store_dataframes_as_dataset, read_dataset_as_data
Describe the bug
We have a hard-coded distinct = false parameter in ballista/rust/core/src/serde/physical_plan/mod.rs.
Ok(create_aggregate_expr(
    &aggr_function.into(),
    false, // <-- hard-coded "distinct"
    input_phy_expr.as_slice(),
    &physical_schema,
    name.to_string(),
)?)
To Reproduce
Try running a COUNT(DISTINCT expr) in Ballista
**E
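To show why the distinct flag matters, the sketch below compares COUNT and COUNT(DISTINCT) on the same data. It uses SQLite from the Python standard library rather than Ballista, and the table and values are made up. If an engine ignores the flag, a COUNT(DISTINCT expr) query would presumably return the plain count instead.

```python
import sqlite3

# In-memory database with a hypothetical table containing duplicate values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (2,), (3,), (3,)])

# Plain count includes duplicates; the distinct count does not.
plain = conn.execute("SELECT COUNT(x) FROM t").fetchone()[0]
distinct = conn.execute("SELECT COUNT(DISTINCT x) FROM t").fetchone()[0]
print(plain, distinct)

conn.close()
```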
Move to arrow2
Motivation:
- Improved compile times (at least 2x faster than with arrow-rs).
- Faster Parquet implementation
- Projects are migrating to arrow2 (including DataFusion and Polars)
Updated Jan 13, 2022 - Objective-C
Updated Feb 18, 2021 - Kotlin
Feature Request
Many locales have the bare minimum when it comes to test cases. While I understand it can be tedious and repetitive to write out test case