Gibbs Posterior Distributions: New Theory and Applications
Thesis posted on 08.02.2018 by Nicholas A Syring
Bayesian inference is, by far, the most well-known statistical method for updating beliefs about a population feature of interest in light of new data. Current beliefs, characterized by a probability distribution called a prior, are updated by combining with data, which is modeled as a random draw from another probability distribution. The Bayesian framework, therefore, depends heavily on the choices of model distributions for prior and data, and it is the latter that is of particular concern in this dissertation. Often, as will be shown in various examples, it is particularly difficult to make a good choice of data model: a bad choice may lead to misspecification and inconsistency of the posterior distribution, or may introduce nuisance parameters, increasing computational burden and complicating the choice of prior. Some particular statistical problems that may give Bayesians pause are classification and quantile regression. In these two problems a mathematical function called a loss function serves as the natural connection between the data and the population feature. Statistical inference based on loss functions can avoid having to specify a probability model for the data and parameter, which may be incorrect. Bayes' Theorem cannot reconcile a posterior update using anything other than a probability model for data, so alternative methods are needed, besides Bayes, in order to take advantage of loss functions in these types of problems. Gibbs posteriors, like Bayes posteriors, incorporate prior information and new data via an updating formula. However, the Gibbs posterior does not require modeling the data with a probability model as in Bayes; rather, data and parameter may be linked by a more general function, like the loss functions mentioned above. 
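For orientation, the updating formula mentioned above is commonly written in the following form. This is a sketch in assumed notation, not notation taken from this abstract: \ell_\theta is the loss function, R_n is the empirical risk, \pi is the prior, and \omega > 0 is a scale parameter.

```latex
\[
  \pi_n(d\theta) \;\propto\; \exp\{-\omega \, n \, R_n(\theta)\} \, \pi(d\theta),
  \qquad
  R_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell_\theta(X_i).
\]
```

Under this form, taking \ell_\theta(x) = -\log p_\theta(x) for a probability model p_\theta and setting \omega = 1 gives back the usual Bayes posterior, which is one way to see the Gibbs posterior as a generalization of it.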
The Gibbs approach offers many potential benefits, including robustness when the data distribution is not known and a natural avoidance of nuisance parameters, but Gibbs posteriors are not common in the statistics literature. In an effort to raise awareness of Gibbs posteriors, this dissertation both develops new theoretical foundations and presents numerous examples highlighting the usefulness of Gibbs posteriors in statistical applications. Two new asymptotic results for Gibbs posteriors are contributed. The main conclusion of the first result is that Gibbs posteriors have similar asymptotic behavior to a class of statistical estimators called M-estimators in a wide range of problems. The main advantage of the Gibbs posterior, then, is its ability to incorporate prior information. The second result extends results for Bayesian posteriors to Gibbs posteriors in statistical problems where the population feature of interest is a set with a smooth boundary. Additionally, two main applications are considered, one in medical statistics and one in image analysis. The first application concerns the minimum clinically important difference (MCID), a parameter designed to indicate whether the effect of a medical treatment is practically significant. Modeling for the purpose of inference on the MCID is non-trivial, and concerns about bias from a misspecified parametric model or inefficiency from a nonparametric model motivate using the alternative Gibbs approach, which balances robustness and efficiency. The second application concerns the detection of an image boundary when the image pixels are observed with noise. Likelihood-based methods for the image boundary require modeling the pixel intensities inside and outside the image boundary, even though these are typically of no practical interest. However, a Gibbs posterior can be defined directly on the image boundary parameter, thereby avoiding this issue.
Finally, the Gibbs posterior comes with a scale parameter, also referred to as a learning rate, which mainly affects its finite sample performance. Current research directions do not agree on how to select the learning rate. This dissertation presents a new method, called Gibbs posterior calibration (GPC), to select the learning rate so that Gibbs posterior credible regions are approximately calibrated to their nominal frequency coverage probabilities. Simulation results demonstrate that the proposed algorithm yields highly efficient credible regions in a variety of applications when compared to existing methods.
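To make the role of the learning rate concrete, the following is a minimal sketch of a Gibbs posterior for a population median. It assumes absolute-error loss, a flat prior, and a grid approximation. It is not the GPC algorithm from the dissertation: the function names, the simulated data, and the candidate learning rates are all hypothetical choices made for illustration.

```python
import numpy as np


def gibbs_posterior_grid(x, grid, learning_rate, log_prior=None):
    """Normalized Gibbs posterior weights on a grid of candidate medians.

    Uses absolute-error loss |x - theta| (an illustrative assumption);
    a flat prior is used when log_prior is None.
    """
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Total loss n * R_n(theta) = sum_i |x_i - theta| at each grid point.
    total_loss = np.abs(x[:, None] - grid[None, :]).sum(axis=0)
    log_post = -learning_rate * total_loss
    if log_prior is not None:
        log_post = log_post + log_prior(grid)
    log_post = log_post - log_post.max()  # stabilize before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()


def credible_interval(grid, weights, level=0.95):
    """Equal-tailed credible interval read off the grid CDF."""
    cdf = np.cumsum(weights)
    alpha = 1.0 - level
    lower = grid[np.searchsorted(cdf, alpha / 2)]
    upper = grid[np.searchsorted(cdf, 1 - alpha / 2)]
    return lower, upper


# Hypothetical data: heavy-tailed draws with true median 0.
rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=200)
grid = np.linspace(-2.0, 2.0, 2001)

# The same data and prior give different interval widths
# depending on the learning rate.
intervals = {}
for omega in (0.5, 1.0, 2.0):
    weights = gibbs_posterior_grid(x, grid, omega)
    intervals[omega] = credible_interval(grid, weights)
    print(f"learning rate {omega}: 95% interval {intervals[omega]}")
```

In this sketch a larger learning rate concentrates the Gibbs posterior and narrows the credible interval, so the learning rate directly controls whether an interval covers the true value more or less often than its nominal level. That is the quantity a calibration method such as GPC is described as targeting.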