University of Illinois at Chicago

Unsupervised Feature Selection for Heterogeneous Data

thesis
posted on 2017-10-27, 00:00 authored by Xiaokai Wei
In the era of big data, many data mining applications must contend with high-dimensional data. Feature selection has therefore become an important technique, since it can alleviate the curse of dimensionality, speed up the learning process, and improve interpretability. My Ph.D. research focuses on unsupervised feature selection, because class labels are usually expensive to obtain. Evaluating the quality of features is typically more challenging in the unsupervised setting than in the supervised one, owing to the lack of guidance from class labels. We designed several new criteria that have desirable properties and can effectively identify discriminative features without using class labels. Moreover, thanks to improved data collection capabilities, data samples often come in heterogeneous forms, such as networked data, multi-modal/multi-view data, and data equipped with complex side information. Such heterogeneous information (e.g., network structure and additional views) can be highly useful when class labels are not available. In this dissertation, we design algorithms for such heterogeneous data to effectively select high-quality features. We believe the resulting models provide new perspectives on unsupervised feature selection and address the challenges posed by the heterogeneity of big data.

History

Advisor

Yu, Philip S.

Chair

Yu, Philip S.

Department

Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

  • Doctoral

Committee Member

Gmytrasiewicz, Piotr; Hu, Yuheng; Zhang, Xinhua; Ziebart, Brian

Submitted date

May 2017

Issue date

2017-04-14
