LLM-Scraper

LLM Scraper is a TypeScript library that extracts structured data from any webpage using LLMs with Zod schemas, Playwright, and streaming support.

LLM Scraper is a lightweight, type-safe TypeScript library that transforms any webpage into structured data using large language models. Built on Playwright, it supports multiple LLM providers and delivers fully typed extraction with Zod schemas.

Key Features:

  • Multi-Model Support — Works with GPT, Claude, Gemini, Llama, and Qwen model series
  • Type-Safe Schemas — Define extraction schemas with Zod or JSON Schema for full type safety
  • Six Formatting Modes — Process pages as pre-processed HTML, raw HTML, markdown, text, screenshots, or custom formats
  • Streaming Extraction — Stream structured objects as they are extracted for real-time processing
  • Code Generation — Generate reusable extraction code from your schema definitions
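To make the type-safe schema idea concrete, here is a minimal, dependency-free sketch of the general pattern: a schema validates the JSON a model returns at runtime and also supplies the static TypeScript type. It is not LLM Scraper's API. The `Parser` type and the `str`, `num`, `arrayOf`, `objectOf`, and `parseModelOutput` helpers are hypothetical stand-ins for what Zod provides, and the `reply` string is a hard-coded placeholder for a model response.

```typescript
// Hypothetical stand-in for a Zod-style schema: a function that returns a
// typed value or throws. None of these names come from LLM Scraper or Zod.
type Parser<T> = (input: unknown) => T;

const str: Parser<string> = (input) => {
  if (typeof input !== "string") throw new Error("expected string");
  return input;
};

const num: Parser<number> = (input) => {
  if (typeof input !== "number" || Number.isNaN(input)) {
    throw new Error("expected number");
  }
  return input;
};

// Validates every element of an array with the given item parser.
function arrayOf<T>(item: Parser<T>): Parser<T[]> {
  return (input) => {
    if (!Array.isArray(input)) throw new Error("expected array");
    return input.map((element) => item(element));
  };
}

// Validates each declared key of an object; the result type is inferred
// from the shape, so callers get static types without writing them twice.
function objectOf<S extends Record<string, Parser<unknown>>>(
  shape: S
): Parser<{ [K in keyof S]: ReturnType<S[K]> }> {
  return (input) => {
    if (typeof input !== "object" || input === null || Array.isArray(input)) {
      throw new Error("expected object");
    }
    const record = input as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(shape)) {
      out[key] = shape[key](record[key]);
    }
    return out as { [K in keyof S]: ReturnType<S[K]> };
  };
}

// Example schema: a list of stories, each with a title and a score.
const storySchema = objectOf({ title: str, points: num });
const topStories = arrayOf(storySchema);

// Parses raw model output (assumed to be a JSON string) against a schema.
function parseModelOutput<T>(schema: Parser<T>, rawJson: string): T {
  return schema(JSON.parse(rawJson));
}

// Placeholder for a model reply; a real scraper would obtain this from an LLM.
const reply = '[{"title":"Show HN: Example","points":128}]';
const stories = parseModelOutput(topStories, reply);
console.log(stories[0].title, stories[0].points);
```

Because the return type is derived from the schema, `stories[0].title` is a `string` and `stories[0].points` is a `number` at compile time, and a reply that does not match the shape fails at the parsing step instead of flowing into downstream code.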

Whether you're building data pipelines, content aggregators, or AI-powered scrapers, LLM Scraper provides simple, typed web extraction.

  • Stars: 6.2K
  • Forks: 369
  • Last commit: 4 months ago
  • License: MIT
  • Language: TypeScript