Unsupervised Learning from Multi-view Data
As technology advances, data increasingly span multiple modalities or come from multiple sources. Such data are called multi-view data. Multiple views usually provide complementary information about the same underlying objects, so learning from multi-view data can achieve better performance than relying on any single view. Moreover, as data volumes explode, most multi-view data are unlabeled, and labeling them is expensive. Unsupervised learning from multi-view data is therefore very important in many real-world applications. In practice, however, multi-view data are usually heterogeneous (different views have different feature spaces), incomplete, large-scale, and high-dimensional. These challenges prevent existing unsupervised learning methods from being applied directly to real-world multi-view data. This dissertation presents my Ph.D. research on unsupervised learning from multi-view data. First, we present the first algorithm to solve the multiple incomplete views clustering problem, which collectively learns the kernel matrices for the different views. Second, we propose a more general tensor-based multi-incomplete-view clustering method, which models the multiple incomplete views as a tensor and learns the latent features by sparse tensor factorization. Third, we present a faster multi-incomplete-view clustering algorithm based on weighted nonnegative matrix factorization. Finally, we propose an online multi-view unsupervised feature selection algorithm that addresses the scalability and high-dimensionality challenges.
Incomplete multi-view data
Nonnegative matrix factorization
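
The abstract names weighted nonnegative matrix factorization as the basis of the faster multi-incomplete-view clustering algorithm. The sketch below is not the dissertation's algorithm. It only illustrates the general idea on a single data matrix, assuming a binary weight matrix W (1 for an observed entry, 0 for a missing one), a squared-error objective, and standard multiplicative updates. The function names weighted_nmf and masked_error, the rank, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_nmf(X, W, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor X ~= U @ V.T with U, V >= 0, fitting only entries where W > 0.

    Illustrative multiplicative updates for the objective
    sum(W * (X - U @ V.T) ** 2); not the dissertation's method.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Strictly positive initialization keeps the updates well defined.
    U = rng.random((n, rank)) + 0.1
    V = rng.random((d, rank)) + 0.1
    WX = W * X  # missing entries contribute nothing to the fit
    for _ in range(n_iter):
        R = W * (U @ V.T)
        U *= (WX @ V) / (R @ V + eps)
        R = W * (U @ V.T)
        V *= (WX.T @ U) / (R.T @ U + eps)
    return U, V


def masked_error(X, W, U, V):
    """Squared reconstruction error over the observed entries only."""
    return float(np.sum(W * (X - U @ V.T) ** 2))


# Synthetic nonnegative rank-3 data with roughly 30% of entries missing.
rng = np.random.default_rng(42)
X = rng.random((30, 3)) @ rng.random((20, 3)).T
W = (rng.random(X.shape) > 0.3).astype(float)

# n_iter=0 returns the random initialization, used here as a baseline.
U0, V0 = weighted_nmf(X, W, rank=3, n_iter=0)
U, V = weighted_nmf(X, W, rank=3, n_iter=200)

err_before = masked_error(X, W, U0, V0)
err_after = masked_error(X, W, U, V)
print("masked error at initialization:", err_before)
print("masked error after 200 updates:", err_after)
```

In a multi-view setting, one such factorization per view would typically be tied together through a shared latent factor, and the rows of that factor would then be clustered. How the dissertation couples the views and chooses the weights is not stated in the abstract, so that part is deliberately left out.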