All tools

ZenRows

AI-Ready Scraping With Anti-Bot Bypass

Scraping APIs

A commercial scraping API for bypassing bot defenses, rendering JavaScript, using residential proxies, and returning AI-ready results.

Stagehand

Automate Web Workflows With AI

Stars
23.3K
Forks
1.6K
Last commit
21 days ago

Open-source TypeScript framework for building AI-assisted browser automations that combine plain-language steps with Playwright code.

Browserbase

Reliable Cloud Browsers for AI Agents

Headless Browsers

Run hosted browser sessions for AI automation and scraping with search, fetch, stealth, observability, and replay built in.

Browse AI

No-code AI scraping and monitoring for any site

AI Web Scraping

Extract and monitor data from any website without code, turn pages into spreadsheets or APIs, and connect to thousands of apps.

ParseHub

Free visual scraper for dynamic, interactive sites

Web Scraping

Extract data from dynamic websites by clicking elements, handle AJAX, forms, and dropdowns, then export to JSON, Excel, or API.

Octoparse

Turn web pages into structured data, no code

Web Scraping

Build scrapers visually and turn web pages into structured data in minutes, with scheduling and an optional managed data service.

Perplexity Sonar

Generative AI Search API with Real-Time Research

SERP APIs

Perplexity Sonar is a generative AI search API that provides real-time web research with citations, customizable sources, and multiple performance tiers.

ValueSERP

Low-Cost Google SERP API for Data Collection

SERP APIs

ValueSERP provides real-time Google search data with rates as low as $0.50 per 1,000 searches and minimal maintenance.

SearchAPI.io

Multi-Engine Real-Time SERP API

SERP APIs

SearchAPI.io delivers real-time SERP data from Google, Bing, Baidu, YouTube, and Amazon with structured JSON output.

Zyte API

All-in-One Web Scraping API with AI Extraction

Scraping Frameworks, SERP APIs

Zyte API is a full-stack web scraping API that combines anti-bot bypassing, browser rendering, and AI-powered data extraction in a single endpoint.

Exa AI

Semantic Web Search API Powered by AI

SERP APIs

Exa AI is a modern search engine and web search API that uses semantic embeddings to deliver high-quality, real-time web data for AI applications.

Tavily

Real-Time Search API for AI Agents and RAG

SERP APIs

Tavily is a real-time search engine for AI agents and RAG workflows, delivering fast web search and content extraction through a single secure API.

Brave Search API

Independent Web Search API with 35B+ Pages

SERP APIs

Brave Search API provides enterprise-grade access to an independent index of 35+ billion web pages with real-time results optimized for AI and RAG pipelines.

Cloudflare R2

S3-compatible storage with zero egress fees

Object Storage

Cloudflare R2 is S3-compatible object storage with zero egress fees, making it ideal for storing and serving frequently accessed scraped data.

AWS Athena

Serverless SQL Queries on Amazon S3

ETL Tools

Amazon Athena is a serverless, interactive query service that lets you analyze petabyte-scale data in S3 using standard SQL with pay-per-query pricing.

AWS Glue

Serverless Data Integration Service by AWS

ETL Tools

AWS Glue is a fully managed, serverless ETL service that discovers, prepares, and integrates data from over 100 sources into lakes and warehouses.

Webshare

Fast and affordable proxy servers

Proxy Services

Webshare provides self-serve proxy servers across many locations, offering datacenter and residential proxy options for scraping workflows.

IPRoyal

Premium proxies at affordable prices

Proxy Services

IPRoyal offers 32M+ residential and datacenter proxies with no contracts, non-expiring traffic, and easy self-service for scraping at any scale.

Smartproxy

AI-ready proxy and scraping solutions

Proxy Services

Smartproxy (now Decodo) offers residential and datacenter proxy products with rotating and sticky sessions for scalable scraping.

Render

Cloud platform for web services and cron jobs

Cloud Compute

Render is a cloud platform that deploys web services, background workers, and cron jobs with simple scaling for scraping workloads.

Railway

Deploy apps from GitHub with zero config

Cloud Compute

Railway deploys web apps and services from GitHub with auto-scaling, built-in databases, and instant previews for scraping workloads.

AWS Lambda

Serverless compute for event-driven pipelines

Cloud Compute

AWS Lambda runs code without managing servers, ideal for event-driven scraping pipelines triggered by schedules, queues, or API calls.

Cloudflare Workers

Run serverless code at the edge worldwide

Cloud Compute

Cloudflare Workers lets you run JavaScript and TypeScript at the edge with ultra-low latency for lightweight scrapers, proxies, and APIs.

ScraperAPI

Scale data collection with a simple API

Stars
2
Forks
1
Last commit
2 years ago

ScraperAPI handles proxy rotation, browsers, and CAPTCHAs so developers can scrape any public page with a single API call at scale.

ScrapingBee

Web scraping API with proxy and browser handling

Stars
29
Forks
4
Last commit
5 months ago

ScrapingBee is a web scraping API that handles headless browsers, rotates proxies, and offers AI-powered extraction so you can focus on your data.

DataForSEO

Comprehensive API Stack for SEO and Marketing Data

Stars
39
Forks
12
Last commit
5 months ago

DataForSEO provides SEO and digital marketing data via API, serving as a trusted data partner for 750+ software companies and agencies worldwide.

CapSolver

AI-powered automatic CAPTCHA solving service

Stars
63
Forks
10
Last commit
last year

CapSolver is an AI-powered CAPTCHA solver that automatically bypasses reCAPTCHA, Cloudflare, AWS WAF, and other challenges for automation.

Oxylabs

Premium proxy service for data at scale

Stars
93
Forks
5
Last commit
6 months ago

Oxylabs offers 175M+ residential and datacenter proxies with web scraping APIs and browser automation tools for large-scale public data extraction.

SerpAPI

Real-Time Google Search Results API

Stars
116
Forks
11
Last commit
7 months ago

SerpAPI is a real-time API that delivers structured Google search results, handling proxies, CAPTCHAs, and rich data parsing automatically.

Apify

Full-stack web scraping and data extraction platform

Stars
161
Forks
21
Last commit
5 months ago

Apify is a cloud platform for web scraping, browser automation, and data extraction with 19,000+ ready-made tools and code templates.

BeautifulSoup

Python library for parsing HTML and XML

Stars
204
Forks
57
Last commit
3 years ago

Beautiful Soup is a Python library for pulling data out of HTML and XML files, providing Pythonic idioms for navigating and searching parse trees.

Modal

Serverless Python platform for AI and data teams

Stars
433
Forks
0

Modal is a serverless Python platform that runs scraping, AI, and data workloads at scale with sub-second cold starts and instant autoscaling.

2Captcha

CAPTCHA solving service with API integration

Stars
714
Forks
126
Last commit
5 months ago

2Captcha is a CAPTCHA solving service that bypasses reCAPTCHA, Cloudflare Turnstile, hCaptcha, and other challenges via a simple API.

Zendriver

Async-first undetected browser automation

Stars
1.1K
Forks
75
Last commit
7 months ago

Zendriver is a fast, async-first undetected browser automation framework based on nodriver with Docker support for scalable scraping.

Scrapy Cluster

Distributed on-demand scraping with Scrapy

Stars
1.2K
Forks
322
Last commit
2 years ago

Scrapy Cluster uses Redis and Kafka to create a distributed, on-demand Scrapy crawling cluster for coordinated large-scale web scraping.

Frontera

Scalable crawl frontier framework

Stars
1.3K
Forks
217
Last commit
last year

Frontera is a Python crawl frontier framework for managing when and what to crawl, enabling web crawlers of any scale with Scrapy integration.

Every tool in the directory: scraping, anti-blocking, data processing, databases, infrastructure, and utilities.