NYC TLC taxi trip records
1.5 billion yellow, green, and FHV trips since 2009, distributed as monthly Parquet files. Stories use DuckDB aggregates computed at build time. The Playground tab runs DuckDB WASM in your browser: queries execute locally against remote Parquet files, with no query server involved.
About this dataset
Scale
1.5 billion rows, 50+ GB of Parquet. Monthly files going back to 2009.
Architecture
Stories use frozen aggregates built with DuckDB at deploy time. The Playground tab uses DuckDB WASM: queries run in your browser, with no query server involved.
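As a rough illustration of what "frozen aggregates" could look like, the sketch below runs one DuckDB query over a Parquet file and writes the result to a JSON file that a story page could load. It is not the site's actual build script: the function names, the output path, and the choice of aggregate are hypothetical, and the `tpep_pickup_datetime` column assumes the yellow cab schema. It also assumes the `duckdb` Python package is installed.

```python
import json
from pathlib import Path


def monthly_trip_count_sql(parquet_source: str) -> str:
    """Build a DuckDB query counting trips per pickup month.

    Assumption: the file follows the yellow cab schema, which has a
    tpep_pickup_datetime timestamp column.
    """
    escaped = parquet_source.replace("'", "''")  # escape single quotes for SQL
    return (
        "SELECT strftime(tpep_pickup_datetime, '%Y-%m') AS pickup_month, "
        "count(*) AS trips "
        f"FROM read_parquet('{escaped}') "
        "GROUP BY 1 ORDER BY 1"
    )


def freeze_aggregate(parquet_source: str, out_path: Path) -> list:
    """Run the aggregate once and write it to a static JSON file."""
    # Imported lazily so the SQL builder above can be used without duckdb.
    import duckdb

    # In-memory database. For https:// sources, recent DuckDB releases
    # typically autoload the httpfs extension; older ones may need
    # INSTALL httpfs / LOAD httpfs first.
    con = duckdb.connect()
    try:
        rows = con.execute(monthly_trip_count_sql(parquet_source)).fetchall()
    finally:
        con.close()

    records = [{"month": month, "trips": trips} for month, trips in rows]
    out_path.write_text(json.dumps(records, indent=2))
    return records
```

A deploy step might call `freeze_aggregate(url, Path("monthly_trips.json"))` once per story, so the published page only ever reads the small JSON file rather than the full Parquet data.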
Coverage
Yellow and green cabs, FHV (for-hire vehicles), and FHVHV (high-volume for-hire vehicles: Uber, Lyft, Via). Schema includes fare, tip, distance, and pickup/dropoff zones.
Stories from this dataset
story
The black car takeover
How Uber and Lyft hollowed out the yellow cab industry, 2017–2023.
story
NYC at 3 am
Where the city ends up after midnight — yellow cab drop-offs by zone.
story
Tip geography
Which neighborhoods tip the most, and what the data can and cannot say.
story
The congestion pricing effect
Did the $9 toll shift demand away from the Manhattan CBD?
Available Parquet files
Monthly files from the TLC's public distribution. Use these URLs in the Playground tab with read_parquet('url').
→ Open the Playground to query these files directly.
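For querying several months at once, a small helper can generate the file URLs and the matching read_parquet call to paste into the Playground. This is a sketch under assumptions: the base URL and the `{service}_tripdata_{YYYY}-{MM}.parquet` naming follow the pattern commonly seen for the TLC's public files, but neither is stated on this page, so check them against the file list before relying on them. The function names are hypothetical.

```python
# Assumption: base URL and file naming of the TLC's public distribution.
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def monthly_file_url(service: str, year: int, month: int) -> str:
    """URL for one month; service is e.g. 'yellow', 'green', 'fhv', 'fhvhv'."""
    return f"{BASE_URL}/{service}_tripdata_{year:04d}-{month:02d}.parquet"


def monthly_file_urls(service, start_year, start_month, end_year, end_month):
    """URLs for every month in the inclusive range."""
    urls = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        urls.append(monthly_file_url(service, year, month))
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return urls


def trip_count_sql(urls) -> str:
    """DuckDB query counting rows across all files; read_parquet accepts a list."""
    quoted = ", ".join("'" + u.replace("'", "''") + "'" for u in urls)
    return f"SELECT count(*) AS trips FROM read_parquet([{quoted}])"


if __name__ == "__main__":
    # Print a query covering yellow cab trips for the first quarter of 2024.
    print(trip_count_sql(monthly_file_urls("yellow", 2024, 1, 2024, 3)))
```

Passing a list to read_parquet lets DuckDB treat the monthly files as one table, which keeps the query itself unchanged as the date range grows.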