Build quarterly model retraining pipeline with drift detection

## Motivation

Notebook 08 confirmed data drift: median `elapsed_days` at k=5 halved from 39d (2017Q1) to 16d (2018Q4). Without retraining, the model's probability estimates will drift from reality.

## Proposed pipeline

1. **Drift trigger**: monitor median `elapsed_days` at k=5 monthly. Alert if it shifts >5 days from the trailing 3-month baseline.
2. **Retraining**: rolling 18-month window, exclude `elapsed_days` (leaky), use temporal CV for evaluation.
3. **Gate**: only deploy if new model AUC ≥ 0.780 on the most recent quarter's hold-out. Otherwise keep previous model and page on-call.
4. **Logging**: record Brier score, AUC, and feature importances for each retrain. Flag if SHAP rank correlation vs. previous model < 0.7 (concept drift).

## Key decisions from Notebook 10

- `elapsed_days` must remain excluded (temporal leakage confirmed)
- Platt scaling is not needed (raw probabilities are well-calibrated, Brier 0.066)
- Minimum training size for 95% of peak AUC at k=5: ~1,000–1,500 cases

## References

- Notebook 08: `notebooks/08_temporal_cv.ipynb`
- Notebook 10: `notebooks/10_leakage_calibration.ipynb`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build quarterly model retraining pipeline with drift detection #7

Motivation

Proposed pipeline

Key decisions from Notebook 10

References

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Build quarterly model retraining pipeline with drift detection #7

Description

Motivation

Proposed pipeline

Key decisions from Notebook 10

References

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions