Structural-IRL
Tractable welfare measurement in activity-based dynamic discrete choice models
Long-term research agenda (JSPS DC1 application)
A central question in transport policy is: does this policy make residents better or worse off, and by how much? Activity-based dynamic discrete choice models (DDCM) can answer this with behavioral realism — but exact welfare estimation has remained computationally intractable at city scale.
This research develops Structural Inverse Reinforcement Learning (Structural-IRL), an algorithm that resolves this barrier by exploiting two proven mathematical connections:
- DDCM estimation is structurally equivalent to inverse reinforcement learning (IRL)
- DDCM backward induction has the same computational graph as a graph neural network (GNN)
These connections allow deep learning and reinforcement learning methods to be incorporated into DDCM estimation — but only under conditions that preserve the welfare guarantee. Structural-IRL establishes those conditions and builds an algorithm that is both tractable and welfare-exact.
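As an illustration of the two connections, the following is a minimal sketch, not the Structural-IRL algorithm itself. It assumes a finite horizon, deterministic transitions, and i.i.d. type-I extreme value shocks with unit scale, so that the expected value function is a logsum. This is the same recursion as soft value iteration in maximum-entropy IRL. The recursion is written over an edge list, so each step is a per-edge "message" followed by a logsum aggregation at the source node, which is the pattern of a GNN layer. The function names, the three-state network, and the utility values are hypothetical and chosen only for the example.

```python
import numpy as np

def backward_induction(src, dst, util, n_states, horizon, beta=1.0):
    """Logsum backward induction over an edge list (src -> dst).

    Returns V with shape (horizon + 1, n_states); V[horizon] is the
    terminal value, assumed to be zero here.
    """
    V = np.zeros((horizon + 1, n_states))
    for t in range(horizon - 1, -1, -1):
        # One message per edge: flow utility plus discounted continuation value.
        msg = util + beta * V[t + 1, dst]
        # Aggregate messages at each source node with a log-sum-exp.
        agg = np.full(n_states, -np.inf)
        np.logaddexp.at(agg, src, msg)
        V[t] = agg
    return V

def choice_probs(src, dst, util, V_next, beta=1.0):
    """Logit choice probability of each edge, given next-period values."""
    msg = util + beta * V_next[dst]
    agg = np.full(V_next.shape[0], -np.inf)
    np.logaddexp.at(agg, src, msg)
    return np.exp(msg - agg[src])

# Hypothetical toy network: 3 states, every state reachable from every state.
src = np.repeat(np.arange(3), 3)  # source state of each edge
dst = np.tile(np.arange(3), 3)    # destination state of each edge
util = np.array([0.0, -1.0, -0.5, -1.0, 0.5, -0.8, -0.5, -0.8, 0.2])

V = backward_induction(src, dst, util, n_states=3, horizon=4)
p = choice_probs(src, dst, util, V[1])  # choice probabilities at t = 0
```

In this sketch the probabilities in `p` sum to one over the outgoing edges of each state, and `V[0]` holds the expected maximum utility of a full day from each starting state (up to an additive constant), which is the quantity a welfare measure is built from.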
Research themes:
- Theme 1 — Establish the conditions under which RL/DL connections preserve the welfare guarantee
- Theme 2 — Design and implement Structural-IRL satisfying those conditions
- Theme 3 — Validate Structural-IRL empirically against the exact estimation baseline and carry out city-scale spatial welfare analysis
Policy motivation:
For a new bus rapid transit corridor, existing methods estimate aggregate ridership or average travel-time savings. Structural-IRL can estimate, for each household, the net change in daily well-being — identifying which neighborhoods gain, which lose, and where complementary measures are needed.
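One common way to formalize such a household-level measure is as a difference of logsum value functions, sketched here under stated assumptions. The notation is not taken from the source: $V_t$ is the expected value function, $u(s,a)$ the flow utility of action $a$ in state $s$, $s'$ the state reached from $s$ by $a$ (deterministic transitions assumed), $\beta$ the discount factor, and $s_{h,0}$ the initial state of household $h$. Shocks are assumed i.i.d. type-I extreme value with unit scale.

```latex
\begin{align}
V_t(s) &= \log \sum_{a \in A(s)} \exp\bigl( u(s,a) + \beta\, V_{t+1}(s') \bigr),
  \qquad V_T(s) = 0, \\
\Delta W_h &= V_0^{\text{policy}}(s_{h,0}) - V_0^{\text{base}}(s_{h,0}).
\end{align}
```

Here $\Delta W_h$ is in utility units. Expressing it in monetary terms would additionally require a marginal utility of income, which this text does not specify.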
Foundation:
Builds on completed master’s thesis work, an exact DDCM framework applied to 825 persons in Higashi-Hiroshima, for which exact estimation takes about 17 hours on a GPU.