DDCM Validation Framework Summary
Source: 0 - Inbox/drafts/new materials for DDCM (condition)/validation_process.md
Date: 2026-01-27
Overview
This document summarizes the validation framework for the DDCM algorithm implementation. The framework integrates academic standards (Västberg et al., 2020), Approximate Dynamic Programming (ADP) validation (Powell, 2011), and Relational MDP (RMDP) verification.
Validation Philosophy: Theoretical Correctness → Approximation Quality → Empirical Validation → Predictive Performance → Policy Testing
Validation Components
1. Theoretical Validation (Algorithm Correctness)
- Objective: Verify the core Dynamic Programming (DP) algorithm follows theory.
- Key Tests:
- Bellman Equation: Verify log-sum-exp computation and terminal boundary conditions.
- Choice Probabilities: Ensure probabilities sum to 1.0 and satisfy the independence of irrelevant alternatives (IIA) property.
- Value Function Properties: Check for monotonicity (value decreases as time runs out) and boundedness.
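The checks above can be sketched on a toy problem. This is a minimal illustration, not the framework's actual test suite: it assumes a single-state, finite-horizon logit model with unit scale, and the names `logsumexp`, `choice_probs`, `backward_induction`, and the utility values are all hypothetical.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) using the max-shift trick."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def choice_probs(vs):
    """Logit choice probabilities from alternative-specific values."""
    lse = logsumexp(vs)
    return [math.exp(v - lse) for v in vs]

def backward_induction(utilities, horizon, terminal_value=0.0):
    """V[t] = logsumexp_a(u[a] + V[t+1]), with V[horizon] = terminal_value."""
    values = [0.0] * (horizon + 1)
    values[horizon] = terminal_value
    for t in range(horizon - 1, -1, -1):
        values[t] = logsumexp([u + values[t + 1] for u in utilities])
    return values

utilities = [0.5, 0.2, 0.0]  # hypothetical per-period utilities
V = backward_induction(utilities, horizon=5)

# Terminal boundary condition holds exactly.
assert V[-1] == 0.0

# Probabilities sum to 1 within the 1e-6 tolerance.
p = choice_probs(utilities)
assert abs(sum(p) - 1.0) < 1e-6

# IIA: the odds of alternative 0 vs. 1 do not change when alternative 2 is removed.
p_reduced = choice_probs(utilities[:2])
assert abs(p[0] / p[1] - p_reduced[0] / p_reduced[1]) < 1e-9

# Monotonicity: value decreases as the remaining time shrinks.
assert all(V[t] > V[t + 1] for t in range(5))
```

The monotonicity assertion holds here because the per-period log-sum is positive for these utilities; a model with strongly negative utilities would need the check adapted.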
2. ADP Approximation Validation
- Objective: Assess the accuracy of the approximate value function ($\tilde{V}$) relative to the optimal value function ($V^*$).
- Key Tests:
- Small Problem Comparison: Compare approximate vs. exact solutions on small state spaces (Target: Relative Error < 5%).
- Basis Function Validation: Check coefficient magnitudes, feature correlations, and residual analysis.
- Optimality Gap: Estimate the gap between approximate and optimal policies (Target: < 5%).
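One way to turn the 5% targets above into concrete checks is sketched below. The normalization is an assumption: RMSE is divided by the mean absolute value of $V^*$, and the optimality gap is taken relative to the optimal policy value. The function names and sample numbers are hypothetical.

```python
import math

def relative_rmse(v_approx, v_exact):
    """RMSE between approximate and exact values, scaled by mean |V*| (assumed normalization)."""
    if len(v_approx) != len(v_exact) or not v_exact:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - e) ** 2 for a, e in zip(v_approx, v_exact)) / len(v_exact)
    scale = sum(abs(e) for e in v_exact) / len(v_exact)
    if scale == 0.0:
        raise ValueError("exact values are all zero; relative error is undefined")
    return math.sqrt(mse) / scale

def optimality_gap(policy_value_approx, policy_value_optimal):
    """Relative loss from following the approximate policy instead of the optimal one."""
    if policy_value_optimal == 0.0:
        raise ValueError("optimal policy value is zero; relative gap is undefined")
    return (policy_value_optimal - policy_value_approx) / abs(policy_value_optimal)

# Hypothetical values on a small state space where the exact solution is available.
v_exact = [10.0, 8.0, 6.0, 4.0]
v_approx = [10.2, 7.9, 6.1, 3.9]

assert relative_rmse(v_approx, v_exact) < 0.05
assert optimality_gap(policy_value_approx=9.7, policy_value_optimal=10.0) < 0.05
```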
3. Relational MDP (RMDP) Validation
- Objective: Validate lifted representations and state abstractions used to handle large state spaces.
- Key Tests:
- Correctness (Formal Verification): Compare lifted solution to grounded solution on small instances (Value Error $\approx$ 0).
- Scalability: Confirm computational time grows polynomially, not exponentially, with the number of objects/zones.
- Abstraction Validity: Verify that objects treated as indistinguishable are truly equivalent (Partition Equivalence Test).
- Markov Property: Ensure the relational abstraction preserves the Markov property.
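The source does not define the Partition Equivalence Test, so the sketch below takes one common reading as an assumption: a partition is valid if all states in a block share the same reward and the same total transition probability into every block (the strong lumpability condition, under which the aggregated process remains Markov). The function name, the single-action transition matrix, and the numbers are hypothetical.

```python
def is_valid_partition(P, r, blocks, tol=1e-9):
    """Check reward equality and block-to-block transition equality within each block.

    P: row-stochastic transition matrix (list of lists) for a single action.
    r: per-state rewards.
    blocks: list of lists of state indices forming a partition.
    """
    for block in blocks:
        ref = block[0]
        for s in block[1:]:
            # States in one block must earn the same reward.
            if abs(r[s] - r[ref]) > tol:
                return False
            # They must also send the same probability mass into every block.
            for target in blocks:
                mass_s = sum(P[s][j] for j in target)
                mass_ref = sum(P[ref][j] for j in target)
                if abs(mass_s - mass_ref) > tol:
                    return False
    return True

# Hypothetical 3-zone example: zones 0 and 1 are symmetric, zone 2 is distinct.
P = [
    [0.5, 0.3, 0.2],
    [0.3, 0.5, 0.2],
    [0.1, 0.1, 0.8],
]
r = [1.0, 1.0, 0.0]

assert is_valid_partition(P, r, [[0, 1], [2]])       # symmetric zones can be merged
assert not is_valid_partition(P, r, [[0, 2], [1]])   # rewards differ, so this fails
```

With multiple actions, the same check would be repeated per action; that extension is omitted here.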
4. Empirical Validation
- Objective: Reproduce the observed data used for estimation (within-sample fit).
- Benchmarks:
- Mode shares: < 2% absolute error (Chi-square test).
- Trips/day: < 5% relative error (MAPE).
- Activity start times: p > 0.05 (KS Test).
- Statistical Tests: Chi-Square for categorical outcomes, KS Test for continuous distributions.
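The benchmark quantities above can be computed with a few lines of standard-library Python. This is a sketch under stated assumptions: the function names and sample data are hypothetical, and only the test statistics are computed. P-values would come from a statistics library, for example `scipy.stats.chisquare` and `scipy.stats.ks_2samp`.

```python
def max_abs_share_error(observed_counts, simulated_counts):
    """Largest absolute difference between observed and simulated mode shares."""
    obs_total = sum(observed_counts.values())
    sim_total = sum(simulated_counts.values())
    modes = set(observed_counts) | set(simulated_counts)
    return max(
        abs(observed_counts.get(m, 0) / obs_total - simulated_counts.get(m, 0) / sim_total)
        for m in modes
    )

def mape(observed, predicted):
    """Mean absolute percentage error, expressed as a fraction."""
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / len(observed)

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both empirical CDFs past the current value x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Hypothetical observed vs. simulated data.
observed_modes = {"car": 520, "transit": 300, "walk": 180}
simulated_modes = {"car": 5100, "transit": 3100, "walk": 1800}
assert max_abs_share_error(observed_modes, simulated_modes) < 0.02

trips_observed = [2.0, 4.0]    # trips/day by segment
trips_predicted = [2.05, 3.9]
assert mape(trips_observed, trips_predicted) < 0.05
```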
5. Predictive Validation
- Objective: Assess generalization to unseen data (Out-of-Sample).
- Key Tests:
- Holdout Validation: 70/30 train/test split. Target: Test Error < 1.5 × Train Error.
- Cross-Validation: 5-Fold CV. Target: Parameter CV < 10%.
- Sensitivity Analysis: Parameter sensitivity and discretization robustness.
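The two numeric targets above can be checked as follows. Note the assumption: "Parameter CV" is read here as the coefficient of variation of a parameter estimate across folds, which the source does not state explicitly. Function names and fold estimates are hypothetical.

```python
import statistics

def holdout_ratio_ok(train_error, test_error, max_ratio=1.5):
    """True if the test error stays below max_ratio times the training error."""
    return test_error < max_ratio * train_error

def coefficient_of_variation(estimates):
    """Sample standard deviation divided by the absolute mean of the estimates."""
    mean = statistics.mean(estimates)
    if mean == 0:
        raise ValueError("mean estimate is zero; coefficient of variation is undefined")
    return statistics.stdev(estimates) / abs(mean)

# Hypothetical estimates of one coefficient from 5-fold cross-validation.
fold_estimates = [-0.052, -0.049, -0.051, -0.050, -0.048]

assert coefficient_of_variation(fold_estimates) < 0.10
assert holdout_ratio_ok(train_error=0.040, test_error=0.052)
assert not holdout_ratio_ok(train_error=0.040, test_error=0.070)
```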
6. Computational Validation
- Objective: Ensure numerical accuracy and implementation correctness.
- Key Tests:
- CPU-GPU Consistency: Verify results match within numerical precision (diff < 1e-5).
- Numerical Stability: Test log-sum-exp and ensure no NaN/Inf values.
- Implementation Verification: Unit tests (> 90% coverage) and integration tests.
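A minimal sketch of the stability and consistency checks follows. It uses plain Python lists as stand-ins for CPU and GPU outputs, so the device comparison is only illustrative. All names and values are hypothetical.

```python
import math

def stable_logsumexp(xs):
    """log(sum(exp(x))) with the max-shift trick to avoid overflow."""
    m = max(xs)
    if m == -math.inf:
        # Every alternative is infeasible; avoid computing exp(-inf - (-inf)) = NaN.
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def naive_logsumexp(xs):
    """Direct evaluation, kept only to show how it fails on large inputs."""
    return math.log(sum(math.exp(x) for x in xs))

def all_finite(values):
    """True if no value is NaN or +/-Inf."""
    return all(math.isfinite(v) for v in values)

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two result vectors."""
    if len(a) != len(b):
        raise ValueError("result vectors must have equal length")
    return max(abs(x - y) for x, y in zip(a, b))

large = [1000.0, 1000.0]
try:
    naive_logsumexp(large)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

assert naive_overflowed  # math.exp(1000.0) overflows a double
assert abs(stable_logsumexp(large) - (1000.0 + math.log(2.0))) < 1e-9
assert all_finite([stable_logsumexp([-1000.0, -1000.0])])

# Hypothetical value-function outputs from two backends.
cpu_values = [1.353000, 2.706000]
gpu_values = [1.353004, 2.706003]
assert max_abs_diff(cpu_values, gpu_values) < 1e-5
```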
7. Reporting Standards
- Requirements:
- Standardized tables for model specification, parameter estimates, and validation metrics.
- Completion of a comprehensive validation checklist before submission.
Key Success Criteria Summary
| Category | Metric | Target |
|---|---|---|
| Theoretical | Probability Sum | 1.0 ± 1e-6 |
| ADP | Approximation RMSE | < 5% of $\lvert V^* \rvert$ |
| RMDP | Grounded vs Lifted Error | $\approx$ 0 |
| Empirical | Mode Share Error | < 2% |
| Empirical | Trip Frequency Error | < 5% |
| Predictive | Parameter CV | < 10% |
| Computational | CPU-GPU Diff | < 1e-5 |