Amazon Athena is a serverless, interactive query service that lets you analyze petabyte-scale data in Amazon S3 using standard SQL, billed per query by the amount of data scanned.
Heritrix is the Internet Archive's open-source, extensible, archival-quality web crawler designed for large-scale web preservation and data collection.
BullMQ is an open-source, Redis-based job and message queue library for Node.js, Python, and other runtimes, trusted by thousands of companies processing billions of jobs daily.
Dagster is a data orchestration platform that helps teams build, schedule, and monitor reliable data pipelines with integrated lineage and observability.
Temporal is a durable execution platform that makes distributed applications fault-tolerant by automatically capturing state and recovering from failures.