Repositories
-
scrapy-crawlera-fetch
Scrapy Downloader Middleware for Crawlera Fetch API
-
scrapy-jsonschema
Scrapy schema validation pipeline and Item builder using JSON Schema
-
scrapy-magicfields
Scrapy middleware to add extra fields to items, such as a timestamp, response fields, and spider attributes
-
scrapy-crawlera
Crawlera middleware for Scrapy
-
scrapy-deltafetch
Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls
-
scrapy-dotpersistence
A Scrapy extension to sync the `.scrapy` folder to an S3 bucket
-
scrapy-pagestorage
A Scrapy extension to store request and response information in a storage service
-
scrapy-jsonrpc
Scrapy extension to control spiders using JSON-RPC
-
scrapy-hcf
Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs
-
scrapy-djangoitem
Scrapy extension to write scraped items using Django models
-
scrapy-monkeylearn
A Scrapy pipeline to categorize items using MonkeyLearn
-
scrapy-splitvariants
Scrapy spider middleware to split an item into multiple items using a multi-valued key
-
scrapy-querycleaner
Scrapy spider middleware to clean up query parameters in request URLs
-
scrapy-bigml
Scrapy pipeline for writing items to BigML datasets
-
scrapy-streamitem
Scrapy support for working with streamcorpus Stream Items