Here are 66 public repositories matching this topic...
A search engine that doesn't track you.
Updated
Nov 21, 2020
JavaScript
🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump
Updated
Apr 7, 2021
Shell
A command-line toolkit to extract text content and category data from Wikipedia dump files
Updated
Aug 11, 2022
Ruby
Corpus creator for Chinese Wikipedia
Updated
Jun 30, 2021
Python
Wikipedia-based Explicit Semantic Analysis, as described by Gabrilovich and Markovitch
Updated
May 13, 2020
Java
Reading the data from OPIEC - an Open Information Extraction corpus
Updated
Jun 12, 2019
Java
Downloads and imports Wikipedia page histories to a git repository
Updated
Dec 21, 2020
Python
A simple utility to index Wikipedia dumps using Lucene.
Updated
Oct 13, 2020
Java
Extracting useful metadata from Wikipedia dumps in any language.
Updated
Sep 20, 2019
Python
Node.js module for parsing the content of Wikipedia articles into JavaScript objects
Updated
Oct 24, 2021
JavaScript
Collects a multimodal dataset of Wikipedia articles and their images
Updated
May 26, 2022
Python
A Python toolkit to generate a tokenized dump of Wikipedia for NLP
Updated
Oct 12, 2020
Python
Python package for working with MediaWiki XML content dumps
Updated
Oct 24, 2022
Python
Research for a master's degree, operation projizz-I/O
Updated
Dec 27, 2017
Python
A library that assists in traversing and downloading from Wikimedia Data Dumps and their mirrors.
Updated
Apr 21, 2022
Python
Java tool to convert Wikimedia dumps into Java Article POJOs for test or fake data.
Updated
Oct 19, 2022
Java
Updated
Aug 9, 2022
Shell
WikiBank is a new, partially annotated resource for the multilingual frame-semantic parsing task.
Updated
Dec 2, 2019
Python
Wikipedia archive downloader and text parser for every language
Updated
Sep 11, 2020
Shell