Review Articles

Deep advantage learning for optimal dynamic treatment regime

Shuhan Liang

Department of Statistics, North Carolina State University, Raleigh, NC, USA

Wenbin Lu

Department of Statistics, North Carolina State University, Raleigh, NC, USA

lu@stat.ncsu.edu

Rui Song

Department of Statistics, North Carolina State University, Raleigh, NC, USA

Pages 80-88 | Received 02 May 2017, Accepted 14 Apr 2018, Published online: 16 May 2018

ABSTRACT

Recently, deep learning has achieved state-of-the-art performance on many difficult tasks. Deep neural networks allow for flexible modelling and process features without requiring domain knowledge. Advantage learning (A-learning) is a popular method for estimating dynamic treatment regimes (DTRs). It models the advantage function, which is directly relevant to the optimal treatment decision, and makes no assumptions about the baseline function. However, there is a paucity of literature on deep A-learning. In this paper, we present a deep A-learning approach to estimating the optimal DTR. We use an inverse probability weighting method to estimate the difference between potential outcomes. Parameter sharing in convolutional neural networks (CNNs) greatly reduces the number of parameters in the network, which allows for high scalability. Convexified convolutional neural networks (CCNNs) relax the constraints of CNNs for optimisation purposes. Different CNN and CCNN architectures are implemented for contrast function estimation. Both simulation results and an application to the STAR*D (Sequenced Treatment Alternatives to Relieve Depression) trial indicate that the proposed methods outperform the penalised least squares estimator.
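The inverse probability weighting idea mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a single decision stage, a binary treatment coded 0/1, a known propensity of 0.5, and a simulated linear contrast function. The function name `ipw_contrast_pseudo_outcome` and the simulation settings are hypothetical, and the linear fit stands in for the CNN and CCNN estimators, which are not reproduced here. The sketch relies on the standard fact that Y(A - p)/(p(1 - p)) has conditional mean E[Y(1) - Y(0) | X], so the baseline function never needs to be modelled.

```python
import numpy as np


def ipw_contrast_pseudo_outcome(y, a, propensity):
    """Return Y * (A - p) / (p * (1 - p)).

    Given X, this quantity has conditional mean E[Y(1) - Y(0) | X]
    when p = P(A = 1 | X) is the true propensity score.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    p = np.asarray(propensity, dtype=float)
    return y * (a - p) / (p * (1.0 - p))


def simulate(n=200_000, seed=0):
    """Simulate a randomised single-stage study (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    p = np.full(n, 0.5)                # known propensity, as in a randomised trial
    a = rng.binomial(1, p)
    baseline = 1.0 + x ** 2            # nuisance term, deliberately not modelled
    contrast = 2.0 * x                 # true treatment contrast
    y = baseline + a * contrast + rng.normal(size=n)
    return x, a, y, p, contrast


x, a, y, p, true_contrast = simulate()
pseudo = ipw_contrast_pseudo_outcome(y, a, p)

# Least squares through the origin: regress the pseudo-outcome on x.
# A flexible model (for example a neural network) could replace this step.
slope = np.sum(x * pseudo) / np.sum(x * x)

# Estimated rule: treat when the fitted contrast is positive.
rule = (slope * x > 0).astype(int)
agreement = np.mean(rule == (true_contrast > 0))
```

In this toy setting the fitted slope should land near the true value of 2 even though the quadratic baseline is ignored, which is the property that makes contrast-based estimation attractive.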

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the National Institutes of Health [5P01CA142538].

Shuhan Liang is a Ph.D. student in the Department of Statistics at North Carolina State University. Her research interests focus on machine learning and optimal treatment regime estimation.

Wenbin Lu is a professor in the Department of Statistics at North Carolina State University. He received his Ph.D. in Statistics from Columbia University in 2003. His research interests focus on biostatistics, high-dimensional data analysis, and machine and reinforcement learning.

Rui Song is an associate professor in the Department of Statistics at North Carolina State University. She received her Ph.D. in Statistics from the University of Wisconsin at Madison in 2006. Her research interests focus on high-dimensional statistical learning, semiparametric inference and dynamic treatment regimes.