Scrapy Redis

Scrapy-Redis provides Redis-backed components for Scrapy, enabling distributed crawling with shared request queues and item pipelines.

It swaps Scrapy's in-process scheduler and duplicate filter for Redis-backed versions, so several crawler processes, on one machine or many, can work from the same queue and the same record of visited requests.

Key Features:

  • Distributed Queue — Share request queues across multiple Scrapy spiders via Redis
  • Deduplication — Redis-based request fingerprinting prevents duplicate crawls
  • Shared Scheduler — Coordinate multiple crawler instances with a centralized scheduler
  • Item Pipeline — Push scraped items to Redis for downstream processing
  • Drop-In — Minimal configuration changes to distribute existing Scrapy spiders
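
As an illustration of the "minimal configuration changes" mentioned above, here is a sketch of the project settings that switch a Scrapy project over to the Redis-backed components. The class paths are the ones given in the Scrapy-Redis README; the Redis URL and the pipeline priority are assumptions for a local setup and would need adjusting for a real deployment.

```python
# settings.py (sketch): route scheduling, deduplication, and item output
# through Redis instead of Scrapy's in-process defaults.

# Store the request queue in Redis so every crawler process shares it.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Keep request fingerprints in Redis so duplicates are filtered across processes.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the queue and fingerprints when a spider closes, so a crawl can resume.
SCHEDULER_PERSIST = True

# Push scraped items to Redis for downstream consumers.
# The priority value 300 is an arbitrary choice for this sketch.
ITEM_PIPELINES = {"scrapy_redis.pipelines.RedisPipeline": 300}

# Assumed address: a Redis server on the local machine with the default port.
REDIS_URL = "redis://localhost:6379"
```

With these settings in place, existing spiders should run unchanged while sharing their queue through Redis; starting a second process of the same spider on another machine that points at the same Redis server is what spreads the work.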

Whether you're scaling Scrapy crawlers across multiple servers, building real-time scraping pipelines, or coordinating distributed extraction, Scrapy-Redis provides the essential components for distributed Scrapy deployments.
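
To make the deduplication feature listed above more concrete, the sketch below shows the general idea of fingerprint-based filtering against a shared set. It is not the library's code: `fingerprint`, `SeenSet`, and `should_crawl` are hypothetical names, the hashing is a simplification of what Scrapy actually does, and an in-memory Python set stands in for the Redis set so the example runs without a server.

```python
import hashlib


def fingerprint(method: str, url: str, body: bytes = b"") -> str:
    """Hash the parts of a request that identify it (simplified, hypothetical)."""
    h = hashlib.sha1()
    h.update(method.upper().encode())
    h.update(b"\n")
    h.update(url.encode())
    h.update(b"\n")
    h.update(body)
    return h.hexdigest()


class SeenSet:
    """In-memory stand-in for a Redis set shared by all crawler processes."""

    def __init__(self) -> None:
        self._members: set[str] = set()

    def add(self, member: str) -> bool:
        """Return True only if the member was new, like Redis SADD returning 1."""
        if member in self._members:
            return False
        self._members.add(member)
        return True


def should_crawl(seen: SeenSet, method: str, url: str, body: bytes = b"") -> bool:
    """A request is scheduled only the first time its fingerprint is seen."""
    return seen.add(fingerprint(method, url, body))


if __name__ == "__main__":
    seen = SeenSet()
    for url in ["https://example.com/a", "https://example.com/a", "https://example.com/b"]:
        print(url, should_crawl(seen, "GET", url))
```

Because the add and the membership check happen in one step, two crawler processes that discover the same URL at the same moment cannot both schedule it, provided the shared set performs that step atomically, as a Redis set does.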

  • Stars: 5.6K
  • Forks: 1.6K
  • Last commit: last year
  • License: MIT
  • Language: Python