Updated Jul 14, 2021
#data-quality
Here are 132 public repositories matching this topic...
search
data-science
machine-learning
natural-language-processing
reinforcement-learning
computer-vision
deep-learning
production
data-engineering
data-discovery
recsys
data-quality
applied-data-science
applied-machine-learning
Create HTML profiling reports from pandas DataFrame objects
python
data-science
machine-learning
statistics
deep-learning
jupyter
pandas-dataframe
exploratory-data-analysis
jupyter-notebook
eda
pandas
artificial-intelligence
exploration
data-analysis
html-report
data-exploration
pandas-profiling
data-quality
data-profiling
big-data-analytics
Updated Jul 12, 2021 - Jupyter Notebook
nopcoder commented Jul 1, 2021
Currently, lakeFS registers OpenAPI handlers and handles all specific routes.
For a call to /api/v1/test, an unknown path under the API prefix, the mux serves the request with the UI handler and returns a valid HTML (UI) page.
The expected behaviour is to return a non-2xx status code with a JSON error, preferably in the internal error format, so the developer will handle an error and not fai
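The routing behaviour requested in this issue can be sketched in a few lines. This is only an illustration in Python, not lakeFS code (lakeFS itself is written in Go): the `route` function, the `KNOWN_API_ROUTES` set, and the `{"message": ...}` error shape are all hypothetical stand-ins, since the issue does not spell out the internal error format.

```python
import json
from typing import Tuple

API_PREFIX = "/api/v1/"

# Hypothetical registered API routes; the real lakeFS route table differs.
KNOWN_API_ROUTES = {"/api/v1/repositories", "/api/v1/user"}


def route(path: str) -> Tuple[int, str, str]:
    """Return (status, content_type, body) for a request path."""
    if path in KNOWN_API_ROUTES:
        # A registered API route is handled normally.
        return 200, "application/json", json.dumps({"ok": True})
    if path.startswith(API_PREFIX):
        # Unknown path under the API prefix: answer with a JSON error
        # instead of falling through to the UI handler.
        # The error shape here is an assumption, not the lakeFS format.
        return 404, "application/json", json.dumps({"message": "not found"})
    # Any other path is served by the UI handler as an HTML page.
    return 200, "text/html", "<html><body>UI</body></html>"
```

With this ordering, a client calling `/api/v1/test` receives a 404 with a JSON body and can handle it as an error, rather than receiving a 200 HTML page it cannot parse.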
Data validation and organization of metadata for data frames and database tables
mysql
r
spark
data-validation
sqlite
postgresql
data-frame
data-engineering
sparklyr
mssql
easy-to-use
data-quality
data-profiling
tibble
reporting-tools
email-reports
thresholds
database-tables
failure-thresholds
Updated Jul 14, 2021 - R
WeDataSphere is a finance-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Updated Dec 14, 2020
Data profiling, testing, and monitoring for SQL accessible data.
python
data-science
airflow
monitoring
metrics
data-engineering
dbt
observability
pandas-profiling
data-quality
data-profiling
data-monitoring
data-quality-monitoring
data-unit-tests
airflow-operators
data-testing
data-pipeline-monitoring
data-observability
soda-sql
Updated Jul 14, 2021 - Python
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis
workflow
quality
compare
dss
data-quality
quality-improvement
quality-check
linkis
datashperestudio
data-quality-model
Updated Jun 15, 2021 - Java
re_data - data quality framework
data-analysis
dbt
data-quality-checks
data-quality
dataquality
open-source-tooling
data-monitoring
data-quality-monitoring
data-testing
dbt-packages
data-observability
Updated Jul 12, 2021 - Python
Profile and monitor your ML data pipeline end-to-end
java
statistics
spark
apache-spark
dataset
data-quality
calculate-statistics
aiops
mlops
ai-pipelines
approximate-statistics
statistical-properties
whylogs
Updated Jun 18, 2021 - Java
Updated May 26, 2021 - Vue
Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020)
Updated Feb 21, 2021 - Jupyter Notebook
An RDF Unit Testing Suite
unit-testing
schema
validation
rdf
data-validation
schema-validation
web-ontology-language
data-quality-checks
data-quality
shacl
Updated Jun 13, 2021 - Java
Data governance and data quality inspection/monitoring platform (Django+jQuery+MySQL)
Updated Jun 10, 2021 - Python
NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to develop C# or Java code to specify your tests, nor do you need Visual Studio or Eclipse to compile your test suite. Just create an XML file and let the framework interpret it and run your tests. The framework is designed as an add-on to NUnit, but it can be ported easily to other testing frameworks.
database
etl
nunit
test-automation
test-framework
business-intelligence
cube
data-quality-checks
data-quality
Updated May 15, 2021 - C#
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
yarn
hadoop
apm
developer-tools
data-analysis
hadoop-cluster
devops-tools
data-quality
optimization-framework
cluster-monitoring
monitoring-tool
hadoop-monitor
yarn-hadoop-cluster
aiops
hadoop-monitoring
Updated Jun 23, 2021 - Java
Library for evaluating data quality and interacting with the datos.gov.co portal
Updated Jun 2, 2021 - Python
Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.
Updated May 8, 2021 - Python
Great Expectations Airflow operator
Updated Jun 23, 2021 - Python
Automated data quality suggestions and analysis with Deequ on AWS Glue
Updated May 15, 2021 - Scala
A tool to help improve data quality standards in observational data science.
Updated Jun 16, 2021 - JavaScript
A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.
Updated Oct 1, 2020 - Jupyter Notebook
DTCleaner: data cleaning using multi-target decision trees.
Updated Jun 21, 2016 - Java
DataOps for Government
Updated Sep 13, 2018
Migrated to: https://gitlab.com/Oslandia/osm-data-classification
Updated Sep 16, 2019 - Python
hive_compared_bq compares and validates two (SQL-like) tables, and graphically shows the rows and columns that differ.
Updated Dec 13, 2017 - Python
R package based on "Column Names as Contracts" blog post (https://emilyriederer.netlify.app/post/column-name-contracts/)
data-validation
r-package
data-quality
schema-design
controlled-vocabulary
variable-naming
variable-names
Updated Jan 20, 2021 - R
The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)
Updated Apr 16, 2021 - R
Describe the bug
Data Docs columns shrink to a one-character width when the query is long.
To Reproduce
Steps to reproduce the behavior:
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4