University of Illinois Chicago

Guided Policy Gradient for Dynamic Treatment Plan Prediction with Symptom Burden in Head and Neck Cancer

thesis
posted on 2025-05-01, 00:00 authored by Harshal Jagdishbhai Hirpara
Head and Neck Cancer (HNC) accounts for approximately 3–4% of all cancers worldwide and affects regions such as the lips, tongue, throat, larynx, nose, and salivary glands. The primary risk factors for HNC are excessive tobacco and alcohol consumption and infection with Human Papillomavirus (HPV). Treatment typically follows a sequential three-stage approach, beginning with Definitive Surgery (DS), followed by Inductive Chemotherapy (IC), and concluding with either Radiotherapy (RT) alone or Radiotherapy with Concurrent Chemotherapy (RT/CC). However, treatment plans are highly patient-specific, and individual stages may be omitted based on the patient's condition and multidisciplinary medical decisions. Determining an optimal Dynamic Treatment Regime (DTR) requires extensive collaboration among specialists, often involving multiple iterations to reach a consensus.

To address this challenge, this study proposes a Deep Reinforcement Learning (DRL) framework that automates DTR planning by leveraging historical patient data. Two guided Policy Gradient (PG) approaches, Regularization and Fine Tuning, were explored, with a Behavior Cloning (BC) model serving as the guide. Guidance keeps the DRL models aligned with clinical decisions by penalizing deviations only when significant discrepancies occur.

This study uses data from 676 patients diagnosed with HNC, all of whom received at least RT at MD Anderson Cancer Center (MDACC) between 2010 and 2021. The data consist of patients' medical records and include clinical, diagnostic, treatment-related, and patient-reported scores (PRS) information. The data were de-identified and collected at the University of Texas under Institutional Review Board (IRB) approval and transferred to the University of Illinois Chicago (UIC) under a Material Transfer Agreement. As per the notice of determination of human subject research by the UIC Office for the Protection of Research Subjects, this dataset does not meet the definition of human subject research at UIC.

Further, variations of the original dataset were created to analyze the impact of PRS on symptoms such as fatigue, pain, and nausea. These include:

  • A dataset in which PRS was excluded before determining any treatment.
  • A modified dataset in which individual PRS values were replaced with high- or low-burden clusters using the Symptom Burden Model (SBM) API developed at the University of Iowa.

This study presents a comprehensive analysis of DRL-based DTR models trained on all datasets, demonstrating the potential of DRL to optimize personalized HNC treatment plans while reducing the need for extensive manual decision-making.
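
The abstract does not state the exact form of the guidance term, so the following is only one possible reading of "penalizing deviations only when significant discrepancies occur": a policy-gradient loss plus a hinge penalty that activates once the divergence between the Behavior Cloning (BC) policy and the learned policy exceeds a threshold. The use of KL divergence, the names `guided_pg_loss`, `threshold`, and `weight`, and the three-action example are all assumptions made for illustration and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution over actions."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def guided_pg_loss(policy_logits, bc_probs, action, advantage,
                   threshold=0.1, weight=1.0):
    """Policy-gradient loss with a thresholded BC penalty (hypothetical form).

    The penalty is zero while KL(BC || policy) stays at or below `threshold`,
    and grows linearly with the excess once the policy drifts further.
    """
    probs = softmax(policy_logits)
    # Standard REINFORCE-style term for the sampled action.
    pg_term = -math.log(probs[action]) * advantage
    # Hinge on the divergence: only "significant" deviations are penalized.
    divergence = kl_divergence(bc_probs, probs)
    penalty = weight * max(0.0, divergence - threshold)
    return pg_term + penalty, divergence, penalty


if __name__ == "__main__":
    # Hypothetical BC distribution over three treatment options.
    bc_probs = [0.7, 0.2, 0.1]
    for name, logits, action in [("close to BC", [1.2, 0.0, -0.7], 0),
                                 ("far from BC", [-2.0, 0.0, 3.0], 2)]:
        loss, div, pen = guided_pg_loss(logits, bc_probs, action, advantage=1.0)
        print(f"{name}: divergence={div:.4f} penalty={pen:.4f} loss={loss:.4f}")
```

In this sketch a policy that nearly matches the BC distribution incurs no penalty, while one that concentrates on an action the BC model considers unlikely is pulled back in proportion to how far it exceeds the threshold. A Fine Tuning variant would instead initialize the policy from the BC model; that is not shown here.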

Advisor

Xinhua Zhang

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Masters

Degree Name

MS, Master of Science

Committee Member

G. Elisabeta Marai, Wei Tang

Thesis type

application/pdf

Language

  • en
