Abstract
In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on the patient's baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is to estimate the optimal regime, that is, the regime that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.
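The backward-induction idea behind Q-learning can be sketched for two decision points as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are simulated, treatments are binary and randomized, the working models are linear, and every function and variable name (`simulate`, `q_learning_two_stage`, `decision_rule`) is hypothetical.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Simulate a hypothetical two-stage randomized study (synthetic data)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)                    # baseline covariate
    a1 = rng.integers(0, 2, size=n)            # stage-1 treatment, coded 0/1
    x2 = 0.5 * x1 + 0.3 * a1 + rng.normal(size=n)  # evolving covariate
    a2 = rng.integers(0, 2, size=n)            # stage-2 treatment, coded 0/1
    # Outcome (larger is better); the true stage-2 contrast is 0.5 + 1.0 * x2.
    y = (x1 + x2 + a1 * (0.5 - x1) + a2 * (0.5 + x2)
         + rng.normal(size=n))
    return x1, a1, x2, a2, y


def fit_ols(design, response):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def q_learning_two_stage(x1, a1, x2, a2, y):
    """Fit linear Q-functions backward in time; return contrast coefficients."""
    ones = np.ones_like(y)

    # Stage 2: regress the outcome on the history and the treatment contrast.
    main2 = np.column_stack([ones, x1, a1, a1 * x1, x2])
    contrast2 = np.column_stack([ones, x2])
    design2 = np.hstack([main2, a2[:, None] * contrast2])
    coef2 = fit_ols(design2, y)
    beta2, psi2 = coef2[:5], coef2[5:]

    # Pseudo-outcome: predicted outcome under the best stage-2 action.
    v2 = main2 @ beta2 + np.maximum(contrast2 @ psi2, 0.0)

    # Stage 1: regress the pseudo-outcome on baseline history and treatment.
    main1 = np.column_stack([ones, x1])
    design1 = np.hstack([main1, a1[:, None] * main1])
    coef1 = fit_ols(design1, v2)
    psi1 = coef1[2:]
    return psi1, psi2


def decision_rule(psi, features):
    """Recommend treatment 1 where the estimated contrast is positive."""
    return (features @ psi > 0).astype(int)


x1, a1, x2, a2, y = simulate()
psi1, psi2 = q_learning_two_stage(x1, a1, x2, a2, y)
print("stage-1 contrast coefficients:", psi1)
print("stage-2 contrast coefficients:", psi2)
```

Each estimated rule recommends the treatment coded 1 whenever its fitted contrast is positive. In this sketch the stage-2 model is correctly specified by construction, while the linear stage-1 model is only a working approximation to the pseudo-outcome.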
Original language | English (US) |
---|---|
Pages (from-to) | 640-661 |
Number of pages | 22 |
Journal | Statistical Science |
Volume | 29 |
Issue number | 4 |
State | Published - 2014 |
Keywords
- Advantage learning
- Bias-variance trade-off
- Model misspecification
- Personalized medicine
- Potential outcomes
- Sequential decision-making
ASJC Scopus subject areas
- Statistics and Probability
- Mathematics (all)
- Statistics, Probability and Uncertainty