Pinned repositories
Repositories
-
shub
Scrapinghub Command Line Client
-
article-extraction-benchmark
Article extraction benchmark: dataset and evaluation scripts
-
shublang
Pluggable DSL that uses pipes to perform a series of linear transformations to extract data
-
webstruct-demo
HTTP demo for https://github.com/scrapinghub/webstruct
-
jira
Forked from pycontribs/jira
Python JIRA Library is the easiest way to automate JIRA. Support for py27 was dropped on 2019-10-14, do not raise bugs related to it.
-
spidermon
Scrapy extension for monitoring spider execution.
-
scrapy-autoextract
Scrapinghub AutoExtract API integration for Scrapy
-
autoextract-poet
web-poet definitions for AutoExtract
-
scrapy-poet
Page Object pattern for Scrapy
-
scrapinghub-autoextract
Python clients for Scrapinghub AutoExtract API
-
scrapy-autounit
Automatic unit test generation for Scrapy.
-
number-parser
Parse numbers written in natural language
-
extruct
Extract embedded metadata from HTML markup
-
scrapinghub-entrypoint-scrapy
Scrapy entrypoint for Scrapinghub job runner
-
dateparser
Python parser for human-readable dates
-
collection-scanner
HubStorage collection scanner library
-
kafka-consumer-group-exporter
Forked from braedon/prometheus-kafka-consumer-group-exporter
Prometheus Kafka Consumer Group Exporter
-
price-parser
Extract price amount and currency symbol from a raw text string
-
splash
Lightweight, scriptable browser as a service with an HTTP API
-
crawlera-clients
Crawlera HTTPS clients collection
-
kafka-docker
Forked from wurstmeister/kafka-docker
-
autoextract-spiders
Pre-built Scrapy spiders for AutoExtract
-
spidyquotes
Example site for web scraping tutorials
-
scrapinghub-stack-scrapy
Software stack with latest Scrapy and updated deps