Data-centric declarative deep learning framework
Updated May 2, 2023 - Python
AI Vector Database for LLMs/LangChain. Doubles as a Data Lake for Deep Learning. Store, query, version, & visualize any data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Modern columnar data format for ML implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
A curated, but incomplete, list of data-centric AI resources.
The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling to supercharge model performance.
DataCLUE: A data-centric NLP benchmark and toolkit
Vue Form with Laravel-Inspired Validation and Simply Enjoyable Error Messages API. (Form API, Validator API, Rules API, Error Messages API)
A Data Centric annotation tool for your Named Entity Recognition projects
[ICLR'23] Implementation of "Empowering Graph Representation Learning with Test-Time Graph Transformation"
An observer is a wrapper over JSON data that provides an interface to know when data is changed, with a focus on performance and memory efficiency.
Code for a Top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI
From local functions to cloud deployed pipelines
Sample notebooks that use the Openlayer Python API
[ACL 2023] The code for our paper Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach
Open-source Data Backend written in Java and based on PostgreSQL & GraphQL.
Quickly set up an image labelling web application for manually tagging images for machine learning tasks.
Data-Oriented Microservices Architecture Framework using DDS
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
ndn-hydra: A Python-coded NDN distributed repository with five focused attributes: resiliency, scalability, usability, efficiency, and security.