TeamHG-Memex
Repositories
-
autopager
Detect and classify pagination links
-
sklearn-crfsuite
scikit-learn inspired API for CRFsuite
-
eli5
A library for debugging/inspecting machine learning classifiers and explaining their predictions
-
html-text
Extract text from HTML
-
soft404
A classifier for detecting soft 404 pages
-
Formasaurus
Formasaurus tells you the type of an HTML form and its fields using machine learning
-
arachnado
Web Crawling UI and HTTP API, based on Scrapy and Tornado
-
agnostic
Agnostic Database Migrations
-
docker-tor-rotator
Forked from mfasanya/docker-tor-rotator
A rotating SOCKS proxy using Tor, Delegate and HAProxy
-
autologin
A project that attempts to automatically log in to a website given a single seed
-
tensorboard_logger
Log TensorBoard events without touching TensorFlow
-
aquarium
Splash + HAProxy + Docker Compose
-
json-lines
Read JSON lines (jl) files, including gzipped and broken
-
scrapy-kafka-export
Scrapy extension which writes crawled items to Kafka
-
scrapy-cdr
Item definition and utils for storing items in CDR format for Scrapy
-
sitehound-backend
Sitehound's backend
-
autologin-middleware
Scrapy middleware for autologin
-
undercrawler
A generic crawler
-
domain-discovery-crawler
Broad crawler for domain discovery
-
page-compare
Simple heuristic for measuring web page similarity (& data set)
-
hh-page-classifier
Headless Horseman Page Classifier service
-
deep-deep
Adaptive crawler which uses Reinforcement Learning methods
-
scrash-lua-examples
A collection of example Lua scripts and JS utilities
-
MaybeDont
A component that tries to avoid downloading duplicate content
-
sitehound
This is the facade for installation and access to the individual components
-
sshadduser
A simple tool to add a new user with OpenSSH keys.
-
bk-string
A BK Tree based approach to storing and querying strings by Levenshtein Distance.