GNNs for Traffic Forecasting
The Traffic Forecasting Task
Input: X ∈ ℝ^{N × T × d}, i.e. N sensor readings over T past timesteps, each with d features (speed, volume, occupancy)
Output: X̂ ∈ ℝ^{N × H × d}, i.e. predictions for H future timesteps
Graph: G = (V, E, W) where V = sensors, E = road segments connecting sensors, W = edge weights (distance, travel time, or correlation)
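As a rough illustration of distance-based edge weights, the sketch below uses a thresholded Gaussian kernel, a construction commonly used with these benchmarks. The function name, the toy distance matrix, and the 0.1 threshold are assumptions made for the example, not values taken from this article.

```python
import numpy as np

def gaussian_kernel_adjacency(dist, threshold=0.1):
    """Turn pairwise road distances into edge weights W.

    dist: (N, N) array; dist[i, j] is the road distance from sensor i to j,
          np.inf where no route is recorded (the graph may be directed).
    threshold: weights below this value are zeroed to keep W sparse
               (0.1 is an illustrative choice).
    """
    finite = dist[np.isfinite(dist)]
    sigma = finite.std()                         # kernel width taken from the data
    weights = np.exp(-np.square(dist / sigma))   # infinite distance -> weight 0
    weights[weights < threshold] = 0.0
    return weights

# Toy example: three sensors along a one-way road (distances in km).
dist = np.array([
    [0.0,    1.0,    2.0],
    [np.inf, 0.0,    1.0],
    [np.inf, np.inf, 0.0],
])
W = gaussian_kernel_adjacency(dist)
```

Nearby sensors get weights close to 1, distant ones are pruned to 0, and the asymmetry of `dist` carries over to `W`, which matters for models that use edge direction.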
Standard benchmarks:
- METR-LA: 207 sensors on LA freeways, 4 months, 5-min intervals
- PEMS-BAY: 325 sensors in Bay Area, 6 months
Typical forecasting horizons: 15 min (3 steps), 30 min (6 steps), 60 min (12 steps).
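To make the input and output shapes concrete, here is a minimal sketch that slices a raw sensor series into (past, future) training pairs. The helper name `make_windows`, the choice T = H = 12, and the random data are assumptions for illustration only.

```python
import numpy as np

def make_windows(series, T=12, H=12):
    """Slice a (num_steps, N, d) series into supervised (input, target) pairs.

    Returns X with shape (num_samples, N, T, d) and
            Y with shape (num_samples, N, H, d).
    """
    num_steps = series.shape[0]
    num_samples = num_steps - T - H + 1
    X = np.stack([series[s : s + T] for s in range(num_samples)])
    Y = np.stack([series[s + T : s + T + H] for s in range(num_samples)])
    # Move time behind the sensor axis: (samples, T, N, d) -> (samples, N, T, d)
    return X.transpose(0, 2, 1, 3), Y.transpose(0, 2, 1, 3)

# One day of 5-minute readings (288 steps) for 207 sensors, d = 1 (speed only).
rng = np.random.default_rng(0)
series = rng.uniform(0.0, 70.0, size=(288, 207, 1))
X, Y = make_windows(series)
print(X.shape, Y.shape)
```

With 5-minute data, H = 12 covers the 60-minute horizon, and the 15- and 30-minute horizons correspond to steps 3 and 6 of each target window.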
Why Graphs Improve over ARIMA and LSTM
ARIMA / LSTM (per-sensor): each sensor is modelled independently, so spatial correlations cannot be captured. An effect such as "upstream congestion causes downstream slowdown" is invisible to the model.
CNN on grid: grids work for regular spatial layouts (weather stations on a regular grid). Traffic networks are irregular: sensors follow road geometry, not a grid.
GNN + temporal model: captures both spatial (road network structure) and temporal (recurrent patterns) dependencies.
DCRNN (Diffusion Convolutional Recurrent Neural Network)
DCRNN (Li et al., 2018) uses bidirectional random walk diffusion as the spatial module inside a sequence-to-sequence GRU:
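Diffusion convolution (captures directional traffic flow):
X ⋆_G f_θ = Σ_{k=0}^{K-1} ( θ_{k,1} (D_O^{-1} W)^k + θ_{k,2} (D_I^{-1} W^T)^k ) X
where D_O and D_I are the diagonal out-degree and in-degree matrices of W, K is the number of diffusion steps, and θ are the learnable filter parameters.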
Forward diffusion follows traffic direction (upstream → downstream). Backward diffusion captures reverse influence (a road closure downstream affects upstream traffic).
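The bidirectional diffusion described above can be sketched in a few lines of NumPy. This is a simplified single-layer version under stated assumptions: the function names are hypothetical, the filter coefficients are passed in by hand rather than learned, and there is one scalar coefficient per diffusion step instead of a full weight matrix.

```python
import numpy as np

def random_walk_matrix(A):
    """Row-normalise A (D^{-1} A); rows with zero degree stay all-zero."""
    deg = A.sum(axis=1, keepdims=True)
    return np.divide(A, deg, out=np.zeros_like(A, dtype=float), where=deg > 0)

def diffusion_conv(X, W, theta_fwd, theta_bwd):
    """sum_k theta_fwd[k] * (D_O^{-1} W)^k X + theta_bwd[k] * (D_I^{-1} W^T)^k X.

    X: (N, d) node signal; W: (N, N) weighted directed adjacency.
    theta_fwd, theta_bwd: length-K coefficient lists (learned in practice).
    """
    P_fwd = random_walk_matrix(W)      # transition matrix along edge direction
    P_bwd = random_walk_matrix(W.T)    # transition matrix against edge direction
    x_fwd = X.astype(float)
    x_bwd = X.astype(float)
    out = np.zeros_like(x_fwd)
    for t_f, t_b in zip(theta_fwd, theta_bwd):
        out += t_f * x_fwd + t_b * x_bwd
        x_fwd = P_fwd @ x_fwd          # one more diffusion step in each direction
        x_bwd = P_bwd @ x_bwd
    return out

# Toy directed road: sensor 0 -> sensor 1 -> sensor 2.
W = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)
X = np.array([[1.0], [0.0], [0.0]])
Y = diffusion_conv(X, W, theta_fwd=[0.5, 0.3], theta_bwd=[0.5, 0.2])
```

Because the two transition matrices differ on a directed graph, the forward and backward terms spread the same signal to different neighbours, which is what lets the layer distinguish the two directions of influence.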
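Encoder-decoder: DCRNN encodes the T past steps with a diffusion-GRU encoder and decodes the H future steps with a diffusion-GRU decoder. The decoder is trained with scheduled sampling, which mitigates exposure bias by gradually replacing ground-truth inputs with the model's own predictions.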
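A minimal sketch of scheduled sampling with an inverse sigmoid decay schedule follows. The function names and the value tau = 2000 are illustrative assumptions; the schedule shape is the point, not the specific constants.

```python
import numpy as np

def teacher_forcing_prob(step, tau=2000.0):
    """Inverse sigmoid decay: probability of feeding the ground truth to the decoder.

    Starts near 1 (almost always ground truth) and decays towards 0
    (almost always the model's own prediction) as training proceeds.
    """
    return tau / (tau + np.exp(step / tau))

def next_decoder_input(y_true, y_pred, step, rng, tau=2000.0):
    """Pick the decoder input for the next timestep during training."""
    use_truth = rng.random() < teacher_forcing_prob(step, tau)
    return y_true if use_truth else y_pred

rng = np.random.default_rng(0)
for step in (0, 10_000, 20_000, 40_000):
    print(step, round(float(teacher_forcing_prob(step)), 4))
```

Early in training the decoder sees mostly true values, which stabilises learning; later it sees mostly its own outputs, which matches the conditions it faces at inference time.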
Result on METR-LA (MAE, as reported in the DCRNN paper): 2.77 at the 15-min horizon vs 3.44 for FC-LSTM (no graph), about 19% lower; at the 60-min horizon, 3.60 vs 4.37, about 18% lower.
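For reference, MAE on these benchmarks is usually computed with missing readings excluded. The sketch below assumes missing values are stored as 0 in the targets, which is a common convention for these datasets but an assumption here; the function name is hypothetical.

```python
import numpy as np

def masked_mae(pred, target, null_value=0.0):
    """Mean absolute error over entries where the target is not a missing-value marker."""
    mask = target != null_value
    if not mask.any():
        return float("nan")
    return float(np.abs(pred[mask] - target[mask]).mean())

# Speeds in mph; the middle reading is missing (stored as 0) and is ignored.
pred = np.array([60.0, 50.0, 30.0])
target = np.array([62.0, 0.0, 33.0])
print(masked_mae(pred, target))  # 2.5
```

Without the mask, a sensor outage would count as a large error for any model that predicts normal traffic speed.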
STGCN (Spatio-Temporal Graph Convolutional Network)
STGCN (Yu et al., 2018) replaces recurrence with 1D temporal convolutions for speed:
Block: [Temporal gated conv] → [Spatial ChebNet] → [Temporal gated conv]
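The spatial ChebNet layer in the middle of the block can be sketched as a Chebyshev polynomial filter on the scaled graph Laplacian. This is a simplified version assuming a symmetric (undirected) W and one scalar coefficient per polynomial order; the function names are hypothetical.

```python
import numpy as np

def scaled_laplacian(W):
    """L_tilde = 2 L / lambda_max - I, with L the symmetric normalised Laplacian."""
    deg = W.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(L).max()      # requires symmetric W
    return 2.0 * L / lam_max - np.eye(len(W))

def cheb_conv(X, W, theta):
    """y = sum_k theta[k] * T_k(L_tilde) X, with T_k = 2 L_tilde T_{k-1} - T_{k-2}."""
    L_t = scaled_laplacian(W)
    t_prev, t_curr = X, L_t @ X                # T_0 X and T_1 X
    out = theta[0] * t_prev
    if len(theta) > 1:
        out = out + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_prev, t_curr = t_curr, 2.0 * L_t @ t_curr - t_prev
        out = out + theta[k] * t_curr
    return out

# Undirected 3-node path graph with one feature per node.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0], [2.0], [3.0]])
Y = cheb_conv(X, W, theta=[0.5, 0.3, 0.2])
```

The recurrence avoids an eigendecomposition at every layer: a filter of order K only needs K sparse matrix-vector products and reaches neighbours up to K hops away.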
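Temporal gated convolution (GLU):
Γ *_T Y = P ⊙ σ(Q)
where P and Q are the two halves (split along the channel axis) of a 1D convolution over time applied to the input Y, σ is the sigmoid function, and ⊙ is the element-wise product.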
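A gated temporal convolution of this kind can be sketched as follows: a 1D convolution over time produces twice the output channels, and one half gates the other through a sigmoid. The function name and shapes are assumptions for illustration, and residual connections are left out to keep the sketch short.

```python
import numpy as np

def gated_temporal_conv(X, kernel, bias):
    """Temporal gated convolution (GLU), applied to every node independently.

    X: (N, T, c_in); kernel: (K_t, c_in, 2 * c_out); bias: (2 * c_out,).
    Returns (N, T - K_t + 1, c_out): no padding, so the sequence gets shorter.
    """
    K_t, _, two_c_out = kernel.shape
    c_out = two_c_out // 2
    T_out = X.shape[1] - K_t + 1
    # Gather all K_t-step windows: (N, T_out, K_t, c_in)
    windows = np.stack([X[:, t : t + K_t, :] for t in range(T_out)], axis=1)
    conv = np.einsum("ntkc,kco->nto", windows, kernel) + bias
    P, Q = conv[..., :c_out], conv[..., c_out:]
    return P * (1.0 / (1.0 + np.exp(-Q)))      # P gated by sigmoid(Q)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 12, 2))                # 5 nodes, 12 timesteps, 2 channels
kernel = rng.normal(size=(3, 2, 8))            # K_t = 3, c_out = 4
out = gated_temporal_conv(X, kernel, np.zeros(8))
print(out.shape)  # (5, 10, 4)
```

Every output timestep depends only on a fixed window of inputs, so all timesteps can be computed at once, unlike a GRU where step t must wait for step t - 1.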
No recurrence means the model is fully parallelisable over time; the STGCN paper reports roughly an order-of-magnitude training speed-up over a recurrent graph baseline.
Result: accuracy broadly comparable to DCRNN on METR-LA at short horizons, with much faster training.
Graph WaveNet (Wu et al., 2019)
Adds an adaptive adjacency matrix that is learned from data, not just from road geometry:
Ã_adp = SoftMax(ReLU(E_1 E_2^T))
where E_1, E_2 ∈ ℝ^{N × d} are learnable node embeddings (here d is the embedding size, not the number of input features). The adaptive adjacency captures non-geographic correlations, i.e. sensors that are far apart but behaviourally correlated, such as those on parallel highways.
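As a sketch of how two embedding tables can yield an adjacency matrix, the code below applies a ReLU and a row-wise softmax to the product of the embeddings. The embeddings are random here purely for illustration (in training they would be updated by gradient descent), and the embedding size of 10 is an assumed value.

```python
import numpy as np

def adaptive_adjacency(E1, E2):
    """A_adp = softmax(relu(E1 @ E2.T)), with the softmax taken over each row."""
    scores = np.maximum(E1 @ E2.T, 0.0)                    # ReLU removes weak links
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, emb_dim = 207, 10
E1 = rng.normal(size=(N, emb_dim))     # "source" node embeddings
E2 = rng.normal(size=(N, emb_dim))     # "target" node embeddings
A_adp = adaptive_adjacency(E1, E2)
```

Each row of `A_adp` sums to 1, so it can be used like a transition matrix. Using two separate embedding tables lets the learned matrix be asymmetric, and the parameter count is 2 · N · emb_dim rather than N².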
Also uses dilated causal convolutions (as in WaveNet) for temporal modelling: when the dilation grows with depth, the receptive field widens much faster than with standard 1D convolutions, without adding parameters.
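The effect of dilation on the receptive field can be checked with a small single-channel sketch. The function names, kernel size, and dilation schedules below are illustrative assumptions rather than the configuration of any particular model.

```python
import numpy as np

def dilated_causal_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], treating x as 0 before t = 0."""
    K = len(weights)
    pad = (K - 1) * dilation
    x_padded = np.concatenate([np.zeros(pad), x])
    return sum(
        weights[k] * x_padded[pad - k * dilation : pad - k * dilation + len(x)]
        for k in range(K)
    )

def receptive_field(kernel_size, dilations):
    """Number of input timesteps that can influence one output of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# An impulse at t = 3 only affects outputs at t >= 3 (causality).
x = np.zeros(8)
x[3] = 1.0
y = dilated_causal_conv(x, weights=[1.0, 0.5], dilation=2)

print(receptive_field(2, [1, 1, 1, 1]))   # 5  (four standard layers)
print(receptive_field(2, [1, 2, 4, 8]))   # 16 (four dilated layers, same parameter count)
```

Both stacks have four layers with two weights each, but doubling the dilation at every layer more than triples the history that each output can see.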
Industrial Deployment
Google Maps: uses graph-based models for ETA (estimated time of arrival) prediction. The road network is a graph; historical traffic patterns are the training signal. DeepMind and Google reported that GNN-based models improved real-time ETA accuracy by up to 50% in some cities.
DiDi / Uber: ride-hailing platforms use traffic forecasting to optimise driver positioning and surge pricing. GNNs process city-wide sensor networks in real time.
Summary
| Model | Spatial | Temporal | Speed |
|---|---|---|---|
| ARIMA | None | Statistical | Fast |
| LSTM | None | Recurrent | Medium |
| DCRNN | Diffusion GCN | Encoder-decoder GRU | Slow (recurrent) |
| STGCN | ChebNet | Gated 1D conv | Fast (parallel) |
| Graph WaveNet | Adaptive adjacency | Dilated causal conv | Fast |
Traffic forecasting is the canonical spatio-temporal GNN application: it offers a clean problem definition, public benchmarks, and real-world deployment at scale. Progress here has directly translated into improved navigation systems, logistics optimisation, and urban planning tools.
References
- Li, Y., Yu, R., Shahabi, C., & Liu, Y. (2018). Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. ICLR 2018 (DCRNN: bidirectional diffusion convolution on road graphs combined with GRU encoder-decoder for traffic speed prediction).
- Yu, B., Yin, H., & Zhu, Z. (2018). Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. IJCAI 2018 (STGCN: fully convolutional approach replacing recurrent temporal processing with gated 1D convolution for faster training).
- Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph WaveNet for Deep Spatial-Temporal Graph Modeling. IJCAI 2019 (Graph WaveNet: learns the graph structure adaptively alongside dilated causal convolutions for long-range traffic patterns).
