Rater Stability in a High-Stakes Performance Assessment: A Longitudinal Investigation

McNaughton, Tara M

thesis

posted on 2018-07-25, 00:00 authored by Tara M McNaughton

The certification of medical practitioners frequently includes a performance assessment to ensure competence. Although such assessments offer richer evaluations of examinee performance than other exam types, the reliance on expert judgement in evaluating examinees presents some concerns. The subjective nature of the rating task may allow factors unrelated to examinee performance to influence ratings, and raters may have idiosyncratic perceptions of performance levels. To assess inter- and intra-rater differences, I used the Many-Facet Rasch Measurement model to quantify rater severity and rating scale category use. Applying a partial credit model to the rater facet, I used rater category thresholds to calculate a category breadth measure to identify central tendency and extremism. This method compares favorably with other indices used to identify these rater effects. The category breadth method identifies a slightly larger proportion of raters as exhibiting effects while providing more precise feedback to raters and rater trainers. Using hierarchical linear models, I assessed the stability of rater severity and consistency measures longitudinally. Most raters demonstrated stable severity; however, a sizable minority did not. Therefore, caution is warranted when using rater severity in common-element equating designs. Conversely, nearly all raters demonstrated stable consistency measures, suggesting that rater consistency does not improve with experience. More intensive training for new raters or the use of practice ratings as a screening tool for rater selection may be necessary to improve rating quality.

History

Advisor

Smith, Jr., Everett V

Chair

Smith, Jr., Everett V

Department

Educational Psychology

Degree Grantor

University of Illinois at Chicago

Degree Level

Doctoral

Committee Member

Yin, Yue; Thomas, Michael K; Dobria, Lidia; Incrocci, Maria

Submitted date

May 2018

Issue date

2018-04-05

Keywords

rater effects; rater severity; rating quality; Longitudinal; Rasch

Licence

In Copyright