parquet
Here are 321 public repositories matching this topic...
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Updated Mar 15, 2023 - Rust
A large-scale entity and relation database supporting aggregation of properties
Updated Mar 20, 2023 - Java
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Updated Mar 19, 2023 - Python
Quilt is a data mesh for connecting people with actionable data
Updated Mar 20, 2023 - Jupyter Notebook
Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
Updated Mar 20, 2023 - Go
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Updated Feb 14, 2023 - Python
CSVs sliced, diced & analyzed.
Updated Mar 20, 2023 - Rust
Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, NetworkX, RAPIDS, RDFlib, pySHACL, PyVis, morph-kgc, pslpython, pyarrow, etc.
Updated Mar 10, 2023 - Jupyter Notebook
Simple Windows desktop application for viewing and querying Apache Parquet files
Updated Mar 13, 2023 - C#
High performance distributed data processing engine
Updated May 29, 2021 - JavaScript