When you hear the words "mental health," your reaction probably falls somewhere on a spectrum, depending on cultural context and accumulated life experiences. Public perception of psychological illness has long shaped how we approach its treatment and research. In this dissertation we carefully synthesize prior years of research, building on its strengths and setting aside its shortcomings, to create an approach to explainable NLP in the mental health domain, culminating in a working system that administers and scores a clinically validated, standardized interview task.
Most long-standing natural language processing research in this domain has focused on open-source data drawn from social media, such as blog posts and forums. These collections are often unverifiable and untrustworthy, and they do not truly represent how psychological conditions affect individuals. Still, social media has proven useful in some ways: because large-scale clinical data collection is difficult, researchers have often relied on open-source data to form reasonable conclusions about NLP and AI in this domain.
We start this thesis with a comprehensive review of research in this area, filtering and screening more than 84,000 research papers to select 500 for in-depth review. Through qualitative analyses and quantitative statistical testing, we identify reasons for the lack of adaptable NLP systems contributing to patient care outcomes, and we study how each paper's content, definitions, motivations, and methods correlate with its outcomes.
Later, we investigate why researchers and social media users alike gravitate toward these platforms to study and discuss mental health. We use three datasets drawn from online social media to establish the strengths of transfer learning over traditional machine learning in this space, and we analyze group differences between subjects from these online sources.
To move toward a real clinical system, we introduce a novel dataset. We recruit 644 participants from three diagnostic groups: people with schizophrenia (n=247), people with bipolar disorder (n=286), and healthy control subjects (n=110). Expert clinicians then conduct a focused interview, the Social Skills Performance Assessment (SSPA), with each participant; the interview is designed to reveal social abilities through standardized interactions in a clinical setting. Experts annotate the interviews across seven clinical dimensions, and we obtain verbatim transcriptions from a third-party service for our own downstream processing.
With this dataset, we empirically establish that standard feature engineering approaches can be used for binary classification and initial benchmarking of models across diagnostic groups. We then perform feature understanding experiments, using statistical tests to discern the importance of features and their effects across diagnostic categories and the demographic categories of age and biological sex. To further distinguish subjects' speech patterns, we perform unsupervised topic modeling experiments with large language models, providing visual and analytical understanding of how speech changes across diagnostic groups. A carefully defined prompting setup helps us examine the topic modeling outcomes further to identify and isolate themes.
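Purely for illustration, and not drawn from the dissertation itself, the kind of per-feature group comparison described above might be sketched in Python as follows; the feature, group sizes, and values are hypothetical placeholders, and nonparametric tests stand in for whichever statistical tests the thesis actually applies.

```python
# Illustrative sketch only: comparing one engineered speech feature across
# diagnostic groups with nonparametric tests. All values are synthetic.
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(0)

# Hypothetical per-participant values of a single feature
# (e.g., mean words per turn), one array per diagnostic group.
schizophrenia = rng.normal(loc=9.0, scale=2.5, size=50)
bipolar = rng.normal(loc=11.0, scale=2.5, size=50)
control = rng.normal(loc=13.0, scale=2.5, size=50)

# Omnibus test: does the feature differ across the three groups?
h_stat, p_omnibus = kruskal(schizophrenia, bipolar, control)
print(f"Kruskal-Wallis H={h_stat:.2f}, p={p_omnibus:.4f}")

# Pairwise follow-up (e.g., schizophrenia vs. control).
u_stat, p_pair = mannwhitneyu(schizophrenia, control, alternative="two-sided")
print(f"Mann-Whitney U={u_stat:.1f}, p={p_pair:.4f}")
```

The same pattern extends to demographic splits (age bands, biological sex) by grouping the feature values along those variables before testing.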
We conclude this dissertation with experiments leveraging state-of-the-art large language models as assistants in the SSPA task. We create an autonomous encoder-decoder, sequence-to-sequence system that can interview patients and predict clinical scores across SSPA dimensions. We report syntactic, semantic, and alignment scores for the interview generation task, and RMSE and standard accuracy metrics for clinical score and class prediction. The model shows promise as a new end-to-end system for focused probing, and we note its strengths, limitations, and bottlenecks along with explanations of its patterns and hallucinations. We end this thesis with recommendations for future systems that can be trusted, moved to production, and used to help explain psychological conditions through the deep lens of natural language processing.
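As a final illustrative sketch, again not the dissertation's own evaluation code, the reported RMSE and accuracy metrics for clinical score and class prediction could be computed along these lines; the gold and predicted SSPA scores are invented placeholders.

```python
# Illustrative sketch only: scoring predicted SSPA dimension values with RMSE
# and a coarse class-level accuracy. All scores are made-up placeholders.
import numpy as np
from sklearn.metrics import mean_squared_error, accuracy_score

# Hypothetical gold and predicted scores on one SSPA dimension (1-5 scale).
gold = np.array([4.0, 3.5, 2.0, 5.0, 3.0, 4.5])
pred = np.array([3.8, 3.0, 2.5, 4.5, 3.2, 4.0])

rmse = np.sqrt(mean_squared_error(gold, pred))
print(f"RMSE: {rmse:.3f}")

# A coarse class view: round continuous scores to the nearest integer label
# and compute standard accuracy over those labels.
gold_classes = np.rint(gold).astype(int)
pred_classes = np.rint(pred).astype(int)
print(f"Accuracy: {accuracy_score(gold_classes, pred_classes):.3f}")
```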
Advisor
Natalie Parde
Department
Computer Science
Degree Grantor
University of Illinois Chicago
Degree Level
Doctoral
Degree Name
PhD, Doctor of Philosophy
Committee Member
Barbara Di Eugenio
Joseph Michaelis
Colin Depp
Cornelia Caragea