Text Modeling and Mining for Healthcare Using Deep Learning
Thesis posted on 01.12.2021, 00:00 by Shaika Chowdhury
Clinical texts are generated at an ever-increasing rate from sources such as electronic health records (EHRs), medical forums and social networks. This data is information-rich, and distilling the relevant knowledge from it can facilitate various learning and prediction tasks in the healthcare domain. However, working with clinical texts is non-trivial and poses the following challenges: diverse expressions, heterogeneity, polysemy, data scarcity and irregular structure. This thesis focuses on effectively modeling and mining textual data for medical applications, using deep learning techniques to tackle these challenges. To detect the diverse mentions related to pharmacovigilance in social media posts, we design a multi-task framework that benefits from joint learning of three related tasks. To extract useful patient knowledge from heterogeneous EHR data into a meaningful encoded representation, we model the data as concept graphs and fuse them using meta-embedding learning. To mine context-aware domain knowledge that addresses the limited labeled data and polysemy issues in medical natural language inference (NLI), we supplement the medical ontology with other external resources. To mine structured section information from medical reports for efficient information extraction, we tackle the irregular section ordering issue by encoding both the semantic and topical dependencies of the sections with a dual sequential encoding model. Lastly, to extract clinically relevant information from patient-doctor conversations, we use a span-based model that performs comprehensive extraction, including diverse and overlapping entity mentions, and combine it with a noteworthy utterance prediction model for enhanced performance.