Measuring the reliability of diagnostic mastery classifications at multiple levels of reporting


As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting must are needed. The purpose of this paper is to summarize one simulation-based method for estimating and reporting reliability for an operational, large-scale, diagnostic assessment system. This assessment system reports the results and associated reliability evidence at the individual skill level for each academic content standard and broader content strands. The system also summarizes results for the overall subject using achievement levels, which are often included in state accountability metrics. Results are summarized as measures of association between true and estimated mastery status for each level of reporting.

Paper presented at the National Council on Measurement in Education annual meeting, New York, NY.