Assessing Potential Predictors of Rater Fit Measures in the Establishment of Performance Standards

Incrocci, Maria

thesis

posted on 2016-02-16, 00:00 authored by Maria Incrocci

The purpose of this study was to determine the extent to which two rater background-related variables (i.e., a rater’s gender and content domain expertise) and two item characteristic-related variables (i.e., an item’s difficulty classification and content domain classification) could account for variance in rater fit indices in the context of a standard setting for a certification examination. Licensing and certification organizations convene groups of subject matter experts in the field (i.e., raters) and engage them in a standard-setting process to recommend a cut score (performance standard) for classifying individuals as proficient, knowledgeable, and competent. During a standard setting, it is common practice for raters to examine individual items and then estimate the proportion of minimally qualified candidates who, in the rater’s judgment, would answer each item correctly. Rater fit refers to the level of accuracy or precision that an individual rater attains when providing these estimates. In this study, the fit indices were based on the variance of raters’ proportion-correct estimates of the performance of minimally qualified candidates on a 200-item certification examination and empirical data gathered on the performance of another group of minimally qualified candidates who took the same items. The 24 raters who participated in the 2011 standard setting were faculty members who had taught in U.S. colleges and schools of pharmacy. The researcher used a hierarchical linear model to conduct a two-level (items nested within raters) analysis, with the rater fit indices as the outcome variable. The two item characteristic-related variables accounted for 91% of the variance in the rater fit indices, suggesting that the ability to provide accurate proportion-correct estimates for minimally qualified candidates was related to an item’s difficulty level and content domain classification.
By contrast, the ability to provide accurate proportion-correct estimates for minimally qualified candidates was not related to a rater’s gender or content domain expertise. The study’s findings support the standard-setting experts’ view that rater training that includes multiple practice rounds, discussions, interactions, and feedback can be influential in decreasing the variance in raters’ proportion-correct estimates.
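
As a rough illustration of the idea of rater fit described above, the sketch below compares each rater's proportion-correct estimates with empirical proportions for the same items. The abstract does not give the formula for the fit indices used in the thesis, so the mean squared deviation computed here is an assumption chosen for simplicity, not the study's measure. The function names and the small data set are hypothetical.

```python
from statistics import mean


def squared_deviations(estimates, empirical):
    """Per-item squared gap between one rater's estimates and empirical proportions."""
    if len(estimates) != len(empirical):
        raise ValueError("estimates and empirical must have the same length")
    return [(e - p) ** 2 for e, p in zip(estimates, empirical)]


def rater_fit(estimates, empirical):
    """Mean squared deviation for one rater; smaller values mean closer agreement.

    This is an illustrative stand-in, not the fit index used in the thesis.
    """
    return mean(squared_deviations(estimates, empirical))


# Hypothetical data: 2 raters, 4 items (not taken from the study).
empirical = [0.50, 0.75, 0.25, 1.00]
ratings = {
    "rater_a": [0.50, 0.75, 0.25, 1.00],
    "rater_b": [0.75, 0.50, 0.50, 0.75],
}

fit = {name: rater_fit(est, empirical) for name, est in ratings.items()}
for name, value in fit.items():
    print(name, value)
```

In a two-level analysis like the one the abstract describes, per-item values such as those returned by `squared_deviations` would be the natural item-level records nested within each rater, with item and rater characteristics entered as predictors.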

History

Advisor

Myford, Carol M.

Department

Education

Degree Grantor

University of Illinois at Chicago

Degree Level

Doctoral

Committee Member

Dobria, Lidia; Smith, Everett; Yin, Yue; Wurster, Dale

Submitted date

2015-12

Language

en

Issue date

2016-02-16

Keywords

performance standards; standard setting; rater fit

Licence

In Copyright