educational-assessment

Empirical methods for evaluating maps: Illustrations and results

Learning progressions and learning map structures are increasingly being used as the basis for the design of large-scale assessments. Of critical importance to these designs is the validity of the map structure used to build the assessments. Most …

Dynamic Learning Maps

The Dynamic Learning Maps® Alternate Assessment System is a large-scale assessment administered by the Center for Accessible Teaching, Learning, and Assessment Systems at the University of Kansas.

Evaluating Model Estimation Processes for Diagnostic Classification Models

PhD dissertation submitted to the School of Education at the University of Kansas.

Measuring reliability of student mastery classification at multiple levels

As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting are needed. The purpose of this paper …

A hierarchical IRT model for identifying group-level aberrant growth

As cheating on high-stakes tests continues to threaten the validity of score interpretations, approaches for detecting cheating proliferate. Most research focuses on individual scores, but recent events show group-level cheating is also occurring. …

Using simulation to evaluate retest reliability of assessment results

As diagnostic assessment systems become more prevalent as large-scale operational assessments, consideration must be given to the method of reporting reliability. Alternatives to traditional reliability methods must be explored that are consistent …

Construct Irrelevance

Construct irrelevance, as the name might suggest, refers to measuring phenomena that are not included in the definition of the construct. This is generally considered to be one of the two biggest threats to the validity of an assessment, along with …

Creating an EM:IP Cover Graphic Using ggplot2

This time last year, I submitted a graphic to the Educational Measurement: Issues and Practice (EM:IP) cover showcase competition. In April, at the annual National Council on Measurement in Education conference, it was announced that I was one of four winners who would be featured on the cover of EM:IP this year. Earlier this week, the issue with my graphic was released! The graphic illustrates different levels of compensation in multidimensional item response theory (MIRT) models.