Pinned Repositories
- crawlurlfrontier Public
Crawl configuration used to test URL Frontier at large scale and to produce WARCs for CommonCrawl.
- crawler4j-frontier-battle Public
- TextClassification Public
A text classification API in Java, originally developed by DigitalPebble Ltd. The API is independent of the underlying ML implementations and can serve as a front end to various ML algorithms. libSVM and liblinear are currently embedded.