Transportation · Machine Learning
1.4M trips · XGBoost

NYC Taxi Trip Duration Forecasting at City Scale

Problem

NYC taxi trip duration is driven by time of day, origin-destination patterns, and city-scale congestion. Simple time averages produce inaccurate ETAs in operational routing systems, leading to customer dissatisfaction and driver inefficiency.

Approach

Trained an XGBoost model on 1.4 million trip records with engineered temporal and geospatial features. Incorporated weather data and traffic patterns, and used Folium maps as a geographic sanity check on trip coordinates. Applied Pandas-based data cleaning and leak-safe train-test splits so that evaluation reflects what the model would see in production.
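The feature engineering and leak-safe split described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names (`pickup_datetime`, `pickup_lat`, `pickup_lon`, `dropoff_lat`, `dropoff_lon`), the helper names, and the choice of a chronological split are all assumptions made for the example. It uses only the standard library so it runs anywhere; the same logic would map onto Pandas columns.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def make_features(trip):
    """Build temporal and geospatial features for one trip (a dict).

    Only information known at pickup time is used, so nothing derived
    from the target (for example the dropoff timestamp) leaks in.
    Field names are hypothetical.
    """
    ts = trip["pickup_datetime"]
    hour = ts.hour + ts.minute / 60.0
    return {
        # Cyclic encoding keeps 23:00 and 00:00 close together.
        "hour_sin": math.sin(2 * math.pi * hour / 24),
        "hour_cos": math.cos(2 * math.pi * hour / 24),
        "weekday": ts.weekday(),             # Monday = 0
        "is_weekend": int(ts.weekday() >= 5),
        "distance_km": haversine_km(
            trip["pickup_lat"], trip["pickup_lon"],
            trip["dropoff_lat"], trip["dropoff_lon"],
        ),
    }


def chronological_split(trips, test_fraction=0.2):
    """Train on the earliest trips, test on the latest ones.

    One way to keep a split leak-safe: the model is never evaluated
    on trips that precede its training data.
    """
    ordered = sorted(trips, key=lambda t: t["pickup_datetime"])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]
```

The resulting feature dictionaries could then be assembled into a table and passed to a gradient-boosted regressor. A time-ordered split is only one option; grouping by day or week would be another way to keep near-duplicate trips from straddling the train-test boundary.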

Result

A high-accuracy trip duration model deployed as a Streamlit app and built to be production-ready for routing and driver-ETA systems. It supports route planning in a complex urban environment and is designed to hold up on real-world data rather than only in a toy notebook.

  • XGBoost model trained on 1.4M+ trip records
  • Production-ready for routing and ETA systems
  • Temporal and geospatial feature engineering
  • Weather and traffic pattern integration
Training data: 1.4M trips
Features: Temporal · Geo · Weather
Model: XGBoost
Deployment: Streamlit
Transportation · XGBoost · Urban Analytics · Geospatial · Time Series