Fast, secure, efficient backup program
Go
Updated Jul 16, 2019
Deduplicating archiver with compression and authenticated encryption.
C
Updated Jul 17, 2019
Prometheus Alertmanager
Go
Updated Jul 16, 2019
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
C
Updated Jul 12, 2019
Find duplicate files
Python
Updated Jun 26, 2019
Data deduplication engine, supporting optional compression and public key encryption.
Rust
Updated Apr 26, 2019
A powerful duplicate file finder and an enhanced fork of 'fdupes'.
C
Updated Jun 21, 2019
A toolkit for record linkage and deduplication written in Python
Python
Updated Jul 12, 2019
Quickly detect already witnessed data.
Go
Updated Jul 16, 2017
A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.
C
Updated Jun 14, 2019
A list of free data matching and record linkage software.
Updated Apr 26, 2019
Locality-sensitive hashing using MinHash in Python/Cython to detect near-duplicate text documents
Python
Updated Jan 24, 2019
Userspace tools for managing VDO volumes.
C
Updated Jun 14, 2019
Spark RDD with Lucene's query and entity linkage capabilities
Scala
Updated Jul 14, 2019
BlobStash is your personal database.
Go
Updated May 30, 2019
Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tr…
Python
Updated Apr 11, 2019
Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.
Python
Updated Apr 14, 2019
The Dropbox for IPFS (without the icky stuff)
Python
Updated Oct 23, 2017
CLI utility to find duplicate files
C
Updated Nov 15, 2018
Record Linkage ToolKit (Find and link entities)
Python
Updated May 3, 2019
Dedupe/batch geocode addresses and venues around the world with libpostal
Python
Updated Feb 11, 2019
Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in Ja…
JavaScript
Updated Jan 2, 2019
Benji Backup: Block-based deduplicating backup software for Ceph RBD images, iSCSI targets, image files, and devices
Python
Updated Jul 16, 2019
A simple command line interface to the datamade/dedupe library.
Jupyter Notebook
Updated Apr 2, 2018
Resources for tackling record linkage / deduplication / data matching problems
Updated Dec 24, 2018
Continuous data protection for GNU/Linux (cdpfgl).
C
Updated Mar 15, 2019
Fast and secure open-source backup
Go
Updated Jul 17, 2019
A backup program that performs deduplication, compression, and encryption
C++
Updated Dec 18, 2017
A Scalable Data Cleaning Library for PySpark.
Python
Updated Apr 4, 2019
CLI tool for image duplicate detection
Go
Updated May 21, 2019