Table 5 Wilcoxon two-sample test results (Holm-Bonferroni adjusted) for each of the six quality metrics and the overall cognitive domain judgment item

From: Can automated item generation be used to develop high quality MCQs that assess application of knowledge?

| Item | Wilcoxon median z-statistic (out of 3) | Empirical type I error | Adjusted critical type I threshold |
|---|---|---|---|
| 2 (b) | 2.55 | .01 | .007 |
| Overall cognitive domain judgment | 2.02 | .04 | .008 |
| 1 (a) | 1.91 | .06 | .010 |
| 4 (d) | 1.01 | .31 | .013 |
| 6 (f) | 0.92 | .36 | .017 |
| 3 (c) | 0.46 | .65 | .025 |
| 5 (e) | −0.43 | .67 | .050 |

Overall cognitive domain judgment is "the item tests factual knowledge only" / "the item tests application of knowledge"
(a) The central idea is in the stem (i.e., stem is required to answer the item)
(b) The directions in the stem are very clear
(c) There are no obvious cues or item flaws (grammatical cues, conspicuous right answer, etc.)
(d) The length of the choices is about equal
(e) All distractors are plausible
(f) This is a high-quality item
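
The adjusted thresholds in the table follow the pattern of the Holm step-down procedure: with seven comparisons, the smallest p-value is compared against α/7, the next against α/6, and so on down to α/1. The sketch below reproduces that pattern under two assumptions that the table itself does not state: a family-wise α of 0.05, and that the "empirical type I error" column is a two-sided p-value from the standard normal approximation of the z-statistic. The function names are illustrative only and are not taken from the article.

```python
import math

ALPHA = 0.05  # assumed family-wise error rate; not stated in the table

# (item label, z-statistic) pairs copied from the table, already in
# ascending order of p-value (i.e., descending |z|).
Z_STATS = [
    ("2", 2.55),
    ("Overall cognitive domain judgment", 2.02),
    ("1", 1.91),
    ("4", 1.01),
    ("6", 0.92),
    ("3", 0.46),
    ("5", -0.43),
]


def two_sided_p(z):
    """Two-sided p-value for a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def holm_thresholds(m, alpha=ALPHA):
    """Holm critical values: alpha/m, alpha/(m-1), ..., alpha/1."""
    return [alpha / (m - k) for k in range(m)]


def holm_decisions(p_values, alpha=ALPHA):
    """Return reject flags (in the original order) from Holm's procedure."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    thresholds = holm_thresholds(len(p_values), alpha)
    reject = [False] * len(p_values)
    for rank, idx in enumerate(order):
        if p_values[idx] > thresholds[rank]:
            break  # step-down: stop at the first non-rejection
        reject[idx] = True
    return reject


p_values = [two_sided_p(z) for _, z in Z_STATS]
thresholds = holm_thresholds(len(Z_STATS))
decisions = holm_decisions(p_values)

# Rows are already sorted by p-value, so thresholds line up by position.
for (label, z), p, t, r in zip(Z_STATS, p_values, thresholds, decisions):
    print(f"{label:35s} z={z:6.2f}  p={p:.3f}  threshold={t:.4f}  reject={r}")
```

Under these assumptions the computed p-values agree with the table's second column to two decimals, and the smallest p-value (about .011 for item 2) already exceeds its threshold of α/7 ≈ .007, so the step-down procedure stops at the first row and no comparison is flagged as significant.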