Convex Latent Representation Learning with Generalized Invariance
Thesis posted on 01.12.2020, authored by Vignesh Ganapathiraman
Finding representations of data that are useful for the underlying prediction task has been an important and active pursuit in machine learning. Modern deep learning algorithms have been hugely successful in making accurate data-driven predictions by automatically learning highly discriminative representations of data via multiple nonlinear transformations. In addition to learning discriminative representations, it is also useful to encode problem- or task-specific information (priors) directly in the representations. For instance, in image classification, it is useful to enforce the constraint that a model's prediction should not change when the image is spatially perturbed, a property known as translation invariance. One way to realize non-trivial structured priors in a model is via carefully designed losses and regularizers. However, such regularization approaches result in hard non-convex optimization problems, which often pose serious challenges in training and seldom come with guarantees of an optimal solution (a global optimum).

In this thesis, we address these challenges in representation learning by: (1) exploring priors that encode non-trivial problem- or data-specific structures; (2) designing efficient convex models and training algorithms that learn these structures automatically in a data-dependent fashion; and (3) providing learning and approximation guarantees wherever possible. Our first approach to convex representation learning explicitly models structured priors in the latent layer of a two-layer neural network. The resulting non-convex optimization problem is then relaxed to obtain a convex model that retains all the structural regularities of the original non-convex formulation. Second, we develop a new convex representation learning framework, based on semi-inner-product spaces, that models so-called generalized invariances in an efficient and scalable manner.
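To make the translation-invariance prior mentioned above concrete, the sketch below shows one well-known way such an invariance can hold by construction: a representation built from channel-wise global pooling is unchanged when the input image is circularly shifted. This is a minimal, hypothetical illustration (the function name and the pooling choice are assumptions for exposition), not the convex models developed in the thesis.

```python
import numpy as np

def global_pool_features(image):
    """A toy translation-invariant representation: channel-wise global sums.

    Hypothetical illustration only; the thesis's actual models encode
    invariance via convex relaxations, not this pooling.
    """
    return image.sum(axis=(0, 1))

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))  # toy 8x8 RGB "image"

# Circularly shift the image by 2 pixels in each spatial direction.
shifted = np.roll(image, shift=(2, 2), axis=(0, 1))

# The pooled representation is identical for the original and shifted
# images, so any predictor built on it cannot change its output under
# (circular) translation of the input.
assert np.allclose(global_pool_features(image), global_pool_features(shifted))
```

A regularization-based alternative would instead penalize the difference between a model's outputs on an image and its shifted copies, which is exactly the kind of loss design the abstract notes leads to hard non-convex problems.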