Here are 62 public repositories matching this topic...
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
🆔 A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
Updated Sep 7, 2020 (Python)
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Updated Sep 2, 2020 (JavaScript)
A toolkit for record linkage and duplicate detection in Python
Updated Sep 16, 2020 (Python)
🆔 Command line tool for deduplicating CSV files
Updated Mar 31, 2020 (Python)
🆔 Examples for using the dedupe library
Updated May 6, 2020 (Python)
A list of free data matching and record linkage software.
Entity resolution for Elasticsearch.
Link Discovery Framework for Metric Spaces.
Updated Jul 28, 2020 (JavaScript)
Link Wikidata items to large catalogs
Updated Apr 1, 2020 (Python)
Record Linkage ToolKit (Find and link entities)
Updated Jun 4, 2020 (Python)
Resources for tackling record linkage / deduplication / data matching problems
Implementation in Apache Spark of the EM algorithm to estimate parameters of Fellegi-Sunter's canonical model of record linkage.
Updated Aug 21, 2020 (Python)
Python implementation of anonymous linkage using cryptographic linkage keys
Updated Sep 16, 2020 (Python)
Distributed Bayesian Entity Resolution in Apache Spark
Updated Apr 26, 2020 (Scala)
A simple command line interface to the datamade/dedupe library.
Updated Oct 22, 2019 (Jupyter Notebook)
CLK hash: hash PII for entity matching
Updated Sep 14, 2020 (Python)
Merge Dirty Data with Clean Reference Tables
Updated Jun 20, 2019 (Python)
Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.
Updated Jul 16, 2020 (Python)
Phonetic Spelling Algorithms in R
Learned string similarity for entity names using optimal transport.
Updated Nov 21, 2019 (Python)
Privacy Preserving Record Linkage Service
Updated Sep 14, 2020 (Python)
Examples of spark-lucenerdd
Updated Jul 8, 2020 (Scala)
A browser user interface for manual labeling of record pairs.
Updated Dec 26, 2019 (JavaScript)
Fork of the Freely Extensible Biomedical Record Linkage program
Updated Nov 4, 2016 (Python)
A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling).
Updated Jul 28, 2017 (Python)
Python implementations of record linkage blocking techniques.
Updated Sep 14, 2020 (Python)
Tools for EHR patient de-duplication (aka entity resolution)
Updated May 4, 2018 (Python)
Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018).
Updated Feb 21, 2019 (Python)
Is your feature request related to a problem? Please describe.
Currently, MapType columns are not supported for Spark DataFrames.
Describe the solution you'd like
Add support for MapType Spark DataFrame columns