Pinned repositories
Repositories
-
pastebin-grab
Archiving pastebin
-
liveleak-grab
Archiving liveleak.com
-
archiveteam-megawarc-factory
Some scripts to process ArchiveTeam uploads
-
grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
-
ludios_wpull
wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved
-
bintray-grab
Forked from OrIdow6/bintray-grab
-
liveleak-items
Managing items for liveleak-grab.
-
urls-grab
Archiving URLs (outlinks) from a variety of sources.
-
reddit-grab
Grabbing everything from reddit.
-
yahooanswers-grab
Saving all questions and answers from Yahoo! Answers.
-
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
-
grab-base-df
Base Dockerfile for warrior project grab scripts
-
bintray-items
Managing items for bintray-grab.
-
seesaw-kit
Making a reusable toolkit for writing seesaw scripts
-
zstd-dictionary-trainer
Training ZSTD dictionaries for use in ZST WARCs.
-
urls-sources
Sources for urls-grab.
-
mediafire-items
Managing items for mediafire-grab.
-
telegram-items
Managing items for telegram-grab.
-
Ubuntu-Warrior
Scripts to build and boot warrior virtual machine containing Docker
-
periscope-grab
Archiving periscope
-
yahooanswers-items
Managing items for yahooanswers-grab.
-
warrior-dockerfile
A Dockerfile for the ArchiveTeam Warrior
-
google-sites-grab
Archiving Google Sites Classic.
-
mediafire-grab
Archiving mediafire.com URLs.
-
github-grab
Archiving GitHub
-
webs-grab
Archiving webs.com
-
webs-items
Managing items for webs-grab.