Assessment of a Spline-Based Spatio-Temporal Model for Use with Educational Datasets
Thesis posted on 17.02.2017 by Shannon Milligan
In this thesis, we argued for an increase in the use of spatio-temporal analysis in educational research. Though educational research frequently addresses questions that lend themselves well to spatial/spatio-temporal analyses, the use of such analyses is almost non-existent in this field. We argued that the dearth of usage stems from three challenges: complications of large datasets, complex statistical models, and model selection. To overcome these challenges, we provided a three-part toolkit for educational researchers. The first part of our toolkit was the introduction of a Bayesian linear spatio-temporal model, which addressed the second challenge through features such as the use of an ANOVA-based model. This first part of our toolkit also addressed the first challenge through the use of Bayesian Ridge Regression and thin-plate splines, which accommodate spatio-temporal variables. The model further addressed this challenge by including marginal maximum likelihood estimation, which, along with ridge regression, avoided the lengthy computation time often involved in Markov chain Monte Carlo sampling. The second part of our toolkit addressed the second challenge by demonstrating the ease of use of our model with the Bayesian Ridge Regression software, which uses a point-and-click interface similar to SPSS. To demonstrate this ease of use, we analyzed two educational datasets, one relatively large (n = 18,506) and one smaller (n = 2,729). To further demonstrate the application of the model and the interpretation of the results, we also included sample posterior output and a guide for how to run this particular model. The ease of interpreting the results also addressed the third challenge, in part by demonstrating the use of posterior credible intervals to determine significant predictors.
We also demonstrated the ease of determining the best combination of predictors to include in a model by using selected model fit indices. The third part of our toolkit addressed the descriptive side of spatio-temporal analysis, which emphasizes further exploration of spatial/spatio-temporal relationships (e.g., through mapping). We emphasized this exploration in part to caution researchers against conflating correlation and causation, especially when sensitive variables (e.g., race and ethnicity) are included.