-
Updated
Oct 1, 2021
# data-quality
Here are 141 public repositories matching this topic...
search
data-science
machine-learning
natural-language-processing
reinforcement-learning
computer-vision
deep-learning
production
data-engineering
data-discovery
recsys
data-quality
applied-data-science
applied-machine-learning
Create HTML profiling reports from pandas DataFrame objects
python
data-science
machine-learning
statistics
deep-learning
jupyter
pandas-dataframe
exploratory-data-analysis
jupyter-notebook
eda
pandas
artificial-intelligence
exploration
data-analysis
html-report
data-exploration
pandas-profiling
data-quality
data-profiling
big-data-analytics
-
Updated
Oct 3, 2021 - Jupyter Notebook
lynnro314
commented
Sep 13, 2021
andyndang
commented
Jan 7, 2021
We're using marshmallow to parse the whylogs config from YAML.
However, pydantic is much more powerful, as it lets users set config via various mechanisms, from YAML and JSON to environment settings.
We should consider moving to pydantic.
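To illustrate the layered configuration the comment above is asking for, here is a minimal standard-library sketch in which defaults are overridden by a JSON document and then by environment variables. It is not pydantic's API and not the whylogs config schema: `SessionConfig`, its fields, `load_config`, and the `APP_` prefix are all hypothetical names chosen for the example.

```python
import json
import os
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SessionConfig:
    # Hypothetical fields for illustration; not the real whylogs config schema.
    project: str = "default"
    verbose: bool = False
    cache_size: int = 1


def _coerce(value: str, target_type: type):
    """Convert an environment-variable string to the field's declared type."""
    if target_type is bool:
        return value.strip().lower() in {"1", "true", "yes", "on"}
    return target_type(value)


def load_config(json_text: str = "{}", env=None, prefix: str = "APP_") -> SessionConfig:
    """Build a config from defaults, then JSON, then environment overrides."""
    env = os.environ if env is None else env
    # JSON keys must match field names; unknown keys raise TypeError.
    config = SessionConfig(**json.loads(json_text))
    overrides = {}
    for field in fields(SessionConfig):
        key = prefix + field.name.upper()
        if key in env:
            overrides[field.name] = _coerce(env[key], field.type)
    return replace(config, **overrides)
```

Called as `load_config('{"project": "demo"}', env={"APP_VERBOSE": "true"})`, this keeps `project` from the JSON text and takes `verbose` from the environment. A settings library would typically add validation and richer type handling on top of this precedence order.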
Data validation and organization of metadata for data frames and database tables
data-validation
testing-tools
schema-validation
data-management
yaml-configuration
data-quality
data-frames
data-verification
easy-to-understand
reporting-tool
database-tables
data-assertions
data-checker
data-inference
data-profiler
data-dictionaries
-
Updated
Sep 28, 2021 - R
re_data - data quality framework. Built on top of dbt, re_data helps you find, debug, and resolve problems in your data.
data-analysis
dbt
data-quality-checks
data-quality
dataquality
open-source-tooling
data-monitoring
data-quality-monitoring
data-testing
dbt-packages
data-observability
-
Updated
Oct 1, 2021 - Python
Data profiling, testing, and monitoring for SQL accessible data.
python
data-science
airflow
monitoring
metrics
data-engineering
dbt
observability
pandas-profiling
data-quality
data-profiling
data-monitoring
data-quality-monitoring
data-unit-tests
airflow-operators
data-testing
data-pipeline-monitoring
data-observability
soda-sql
-
Updated
Oct 1, 2021 - Python
WeDataSphere is a finance-grade, one-stop open-source suite for big data platforms. Currently, the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
-
Updated
Dec 14, 2020
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis
workflow
quality
compare
dss
data-quality
quality-improvement
quality-check
linkis
datashperestudio
data-quality-model
-
Updated
Sep 29, 2021 - Java
Profile and monitor your ML data pipeline end-to-end
java
statistics
spark
apache-spark
dataset
data-quality
calculate-statistics
aiops
mlops
ai-pipelines
approximate-statistics
statistical-properties
whylogs
-
Updated
Sep 28, 2021 - Java
-
Updated
Sep 21, 2021 - Vue
Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020)
-
Updated
Feb 21, 2021 - Jupyter Notebook
An RDF Unit Testing Suite
unit-testing
schema
validation
rdf
data-validation
schema-validation
web-ontology-language
data-quality-checks
data-quality
shacl
-
Updated
Sep 20, 2021 - Java
Data governance and data quality checking/monitoring platform (Django + jQuery + MySQL)
-
Updated
Jun 10, 2021 - Python
NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to write C# or Java code to specify your tests, nor do you need Visual Studio or Eclipse to compile your test suite. Just create an XML file and let the framework interpret it and run your tests. The framework is designed as an add-on to NUnit, but with the possibility of porting it easily to other testing frameworks.
database
etl
nunit
test-automation
test-framework
business-intelligence
cube
data-quality-checks
data-quality
-
Updated
Sep 17, 2021 - C#
yu-iskw
commented
Aug 3, 2021
Motivation
As odd-platform supports redshift and so on, it would be awesome to support BigQuery integration.
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
yarn
hadoop
apm
developer-tools
data-analysis
hadoop-cluster
devops-tools
data-quality
optimization-framework
cluster-monitoring
monitoring-tool
hadoop-monitor
yarn-hadoop-cluster
aiops
hadoop-monitoring
-
Updated
Sep 24, 2021 - Java
Great Expectations Airflow operator
-
Updated
Sep 23, 2021 - Python
Library for data quality assessment and interaction with the datos.gov.co portal
-
Updated
Sep 29, 2021 - Python
Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.
-
Updated
May 8, 2021 - Python
A tool to help improve data quality standards in observational data science.
-
Updated
Sep 23, 2021 - JavaScript
Automated data quality suggestions and analysis with Deequ on AWS Glue
-
Updated
May 15, 2021 - Scala
A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.
-
Updated
Oct 1, 2020 - Jupyter Notebook
DTCleaner: data cleaning using multi-target decision trees.
-
Updated
Jun 21, 2016 - Java
DataOps for Government
-
Updated
Sep 13, 2018
hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.
-
Updated
Dec 13, 2017 - Python
Migrated to: https://gitlab.com/Oslandia/osm-data-classification
-
Updated
Sep 16, 2019 - Python
R package based on "Column Names as Contracts" blog post (https://emilyriederer.netlify.app/post/column-name-contracts/)
data-validation
r-package
data-quality
schema-design
controlled-vocabulary
variable-naming
variable-names
-
Updated
Oct 2, 2021 - R
The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)
-
Updated
Apr 16, 2021 - R

Describe the bug
data docs columns shrink to 1 character width with long query
To Reproduce
Steps to reproduce the behavior: