Analytics Text-to-SQL Dataset
A community-driven, open dataset of real-world analytics use cases for training and benchmarking text-to-SQL systems.
Overview
Most text-to-SQL benchmarks use synthetic or academic data. Real-world analytics involves:
Complex joins across fact and dimension tables
Domain-specific terminology (slow-moving inventory, high-value customers, claims processing)
Nuanced logic (return rate benchmarking, cohort revenue, seasonality indexing)
Domain knowledge that generic datasets don’t capture
This dataset closes that gap. Contributions are accepted across all domains.
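To make the points above concrete, here is a minimal sketch of the kind of question/SQL pair such a dataset could hold. Everything in it is an assumption made for illustration: the `dim_product` and `fact_sales` tables, the sample rows, the `question`/`sql` field names, and the 90-day definition of "slow-moving" are hypothetical and do not describe the dataset's actual schema or format.

```python
import sqlite3

# Hypothetical star schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER,
    sale_date TEXT,   -- ISO-8601 dates, so string comparison orders correctly
    units INTEGER
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
INSERT INTO fact_sales VALUES
    (1, 1, '2024-06-20', 5),
    (2, 2, '2024-01-15', 2),
    (3, 1, '2024-06-25', 3);
""")

# Hypothetical record shape: a natural-language question paired with SQL.
# "Slow-moving" is assumed here to mean no sales in the trailing 90 days.
example = {
    "question": "Which products are slow-moving, with no sales "
                "in the 90 days before 2024-06-30?",
    "sql": """
        SELECT p.name
        FROM dim_product AS p
        LEFT JOIN fact_sales AS s
          ON s.product_id = p.product_id
         AND s.sale_date >= date('2024-06-30', '-90 days')
        GROUP BY p.product_id, p.name
        HAVING COUNT(s.sale_id) = 0
        ORDER BY p.name
    """,
}

# The LEFT JOIN keeps products with no matching recent sales;
# HAVING then selects exactly those products.
rows = conn.execute(example["sql"]).fetchall()
slow_moving = [name for (name,) in rows]
print(slow_moving)
```

The domain term ("slow-moving") has no meaning in SQL until someone fixes a business definition for it, and the join between the fact and dimension tables is needed so that products with no sales at all still appear in the result.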