Reliability for the Dynamic Learning Maps Assessments: A comparison of methods

Abstract

Dynamic Learning Maps® (DLM®) alternate assessments use diagnostic classification models (DCMs) to report the knowledge, skills, and understandings of students with the most significant cognitive disabilities. To meet the needs of various stakeholders, student achievement is reported at multiple levels of aggregation. Reported results include mastery classifications for each individual skill, as well as aggregations of mastered skills within alternate content standards (Essential Elements [EEs]), content strands (conceptual areas/claims/domains), subject areas, and an overall performance level for the subject. This reporting structure ensures that fine-grained information is available to help teachers target specific instructional goals while also providing the high-level overview of student achievement that is often necessary for state accountability systems. Because results are reported at multiple levels, the reliability of each level of scoring must also be evaluated. To assess reliability, DLM assessments use an innovative simulated retest methodology. This method works by first generating a hypothetical second administration of the DLM assessment for each student and then comparing results across the observed and simulated test administrations. Although prior work has provided theoretical support for this method, the reliability estimates from the simulation methodology have not been compared empirically to other estimates of reliability used in the DCM literature. This report describes a comparison between the simulation methodology for reliability used for DLM assessments and popular non-simulation approaches to evaluating reliability for DCMs. The findings provide evidence that the simulation methodology used for DLM assessments yields reliability estimates that are consistent with traditional approaches. The non-simulation estimates all fall within the ranges of estimates provided by the summaries of the simulated retests, indicating that the simulation method does not introduce systematic bias.
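To make the simulated retest idea concrete, the sketch below shows one simplified way such a retest could be summarized; it is not the operational DLM procedure. It assumes that posterior mastery probabilities from the observed administration are available (here called mastery_probs), classifies mastery at an assumed 0.5 threshold, redraws mastery status from the same posteriors to stand in for a hypothetical second administration, and summarizes agreement as the proportion of matching classifications. The operational method simulates item responses and re-scores them with the model; the function name and inputs here are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulated_retest_agreement(mastery_probs, n_retests=100, seed=0):
    """Illustrative sketch of a simulated retest summary for a DCM.

    mastery_probs: array of shape (n_students, n_skills) holding each
    student's posterior probability of mastering each skill from the
    observed administration. Returns the average proportion of skill
    classifications that agree across the observed and simulated
    administrations (an assumed summary statistic, not the DLM one).
    """
    rng = np.random.default_rng(seed)
    # Observed mastery classifications at an assumed 0.5 threshold.
    observed = (mastery_probs >= 0.5).astype(int)
    agreements = []
    for _ in range(n_retests):
        # Hypothetical second administration: redraw each student's
        # mastery status from the same posterior probabilities.
        retest = rng.binomial(1, mastery_probs)
        agreements.append((observed == retest).mean())
    return float(np.mean(agreements))

# Example with fabricated posteriors for three students and four skills.
probs = np.array([
    [0.95, 0.80, 0.30, 0.10],
    [0.60, 0.55, 0.70, 0.20],
    [0.05, 0.15, 0.90, 0.85],
])
print(simulated_retest_agreement(probs))
```

In practice, agreement would be summarized at each reporting level (skill, EE, conceptual area, subject), which is why the reliability evidence described in this report is evaluated separately for each level of aggregation.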
