FEI-DISSERTATION-2017.pdf (1.17 MB)
Open Classification and Change Detection in the Similarity Space
thesis
posted on 2017-10-27, 00:00 authored by Geli Fei
The rapid emergence of new topics and the highly diverse nature of online text data have brought new challenges to existing text classification techniques. One of the main challenges is their inability to handle unseen classes of documents, which stems from the closed world assumption, under which all test classes are assumed to be known at training time. A more realistic scenario is to expect unseen classes during testing (open world). This problem is called open (world) classification. In this thesis, we start by studying three closely related research problems in open classification. First, we study the problem of text classification under negative covariate shift. Then we proceed to study the general problem of open (world) classification. Furthermore, we propose cumulative machine learning, where unseen classes of documents are not only detected but also incorporated into the existing system in an efficient manner. One of the key techniques used in the above research is the transformation of documents to a similarity space to detect a special type of change in the test class distribution, namely the arrival of unseen classes. In the last part of this thesis, we explore the use of similarity-based approaches in detecting a new type of change in social media accounts. In particular, we study the problem of detecting changed-hands online review accounts. Extensive experiments have shown that the proposed approaches are highly effective.
History
Advisor
Liu, Bing
Chair
Liu, Bing
Department
Computer Science
Degree Grantor
University of Illinois at Chicago
Degree Level
- Doctoral
Committee Member
Di Eugenio, Barbara; Gmytrasiewicz, Piotr; Yu, Philip S; Mahmud, Jalal
Submitted date
May 2017
Issue date
2017-03-20