Posted on 2016-10-19, 00:00. Authored by Shalini T. Reddy.
Construct-aligned rating scales (CARS) have been shown to improve the inter-rater reliability of performance assessments. This study examined the use of a milestones-based CARS to score chart-stimulated recall (CSR) assessments for postgraduate medical trainees and compared its inter-rater reliability with that of a standard anchored rating scale (SARS).
Six standardized videos of medical trainees participating in CSR exercises were developed. A six-member clinician-educator expert panel generated gold-standard scores for each scenario using a peer review process. Clinical teaching faculty (raters) were recruited and randomized to two groups to score the six standardized videos, following a prospective crossover design. The two scoring forms (CARS and SARS) contained eleven identical items and differed only in their scoring anchors. Each group scored three videos using either CARS or SARS; two weeks later, the groups switched forms and scored three new videos. Rater scores were compared to gold-standard scores using inter-rater reliability indices. A generalizability study was conducted to examine sources of variability in scores and reliability. Scoring was conducted between September and December 2015.
Twenty-two faculty raters were recruited to participate. Raters scoring with CARS performed significantly better on two items (appropriately using consultants and recognizing characteristics of a patient) and on the overall score. The percentage of score variance attributable to raters was nearly five times greater for SARS. Raters using CARS had significantly greater odds of scoring accurately, even after controlling for baseline rater characteristics (OR = 2.59, p < .001).
Findings from this study indicate significantly greater scoring accuracy and reliability when CARS was used. We provide validity evidence for a milestones-mapped CARS form for scoring CSR in internal medicine postgraduate trainees. These results have implications for the development of scoring rubrics, which may benefit from a CARS-type model rather than SARS.