The difficulty of a cognitive item is traditionally defined as the proportion of people who answered the item correctly. If, for example, 80% of test takers selected the correct option for Item 1, we'd say Item 1's difficulty was 0.80. Note that on this scale a higher value indicates an easier item.


But what if there is more than one right answer to Item 1? What do we do when the scoring of a cognitive item is no longer dichotomous (right/wrong) but polytomous, with different responses carrying different weights? Under such conditions we might consider a different way of expressing item difficulty, selecting one of the following Lertap methods.


1) proportional


Under this method, item difficulty is the number of people who selected one of the correct answers, divided by the total number of people responding. A response counts as correct if its corresponding weight is greater than zero. This method does not take into account any differences which may exist among the response weights.


2) item mean


A second way of assessing the difficulty of a cognitive item is to simply use the item's average, its mean.  If an item has just one correct answer, and if the weight for that answer is 1.00, then the item's mean will be identical to the proportional index of difficulty.


3) item mean / max. weight (default)


Item means can be greater than 1.00. Traditionally, item difficulty has been measured on a scale which goes from 0.00 to 1.00; if we divide the item mean by the greatest response weight, we effectively re-scale the mean so that it falls back into the 0.00 to 1.00 range. This method of indexing item difficulty does exactly that. When there's only one correct answer to an item, it yields the same result as 1) above.
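
To make the three methods concrete, here is a minimal Python sketch of the calculations as described above. It is an illustration only, not Lertap's own code: the function name item_difficulty, the method labels, and the sample data are all made up for this example, and it assumes that a response with no listed weight scores zero.

```python
def item_difficulty(responses, weights, method="mean_over_max"):
    """Difficulty of one item under one of the three methods described above.

    responses: list of the options people selected, e.g. ["A", "B", ...]
    weights:   dict mapping each option to its scoring weight
    method:    "proportional", "mean", or "mean_over_max" (hypothetical labels)
    """
    if not responses:
        raise ValueError("at least one response is required")

    n = len(responses)
    # Assumption: an option with no listed weight scores zero.
    scores = [weights.get(r, 0) for r in responses]

    if method == "proportional":
        # Any response with a weight greater than zero counts as correct.
        return sum(1 for s in scores if s > 0) / n

    mean = sum(scores) / n
    if method == "mean":
        return mean

    if method == "mean_over_max":
        max_weight = max(weights.values())
        if max_weight <= 0:
            raise ValueError("the item needs at least one positive weight")
        # Re-scale the mean so that it falls in the 0.00 to 1.00 range.
        return mean / max_weight

    raise ValueError(f"unknown method: {method}")


# Made-up data: ten people, option A is worth 2 points, option B is worth 1.
weights = {"A": 2, "B": 1, "C": 0, "D": 0}
responses = ["A"] * 5 + ["B"] * 3 + ["C"] * 2

for m in ("proportional", "mean", "mean_over_max"):
    print(m, round(item_difficulty(responses, weights, m), 2))
```

With this made-up data, eight of ten people chose a response with a positive weight, so the proportional index is 0.80; the item mean is 13 / 10 = 1.30, which lies outside the traditional range; dividing by the maximum weight of 2 brings it back to 0.65.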


As indicated, Lertap's default method is 3), item mean divided by the maximum response weight. To change to one of the other methods: (1) make a change in Row 19 of the System worksheet in the Lertap5.xlsm file; (2) save and close the Lertap5.xlsm file.


Finally, we should mention where cognitive item difficulties are displayed.  They're shown in the item difficulty bands found towards the bottom of the Stats1f report, and they have their very own column in the Stats1b report.


When the item difficulty calculation method has been set to 2) above, an item mean may exceed 1.00 and so fall outside Lertap's item difficulty bands, which use a 0.00 to 1.00 scale. In this case, Lertap temporarily switches to method 3) for the bands only, re-scaling the mean so that it will fall into one of them. The Stats1b report still displays the item mean itself, without re-scaling.
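
The re-scaling used for the bands can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lertap's code: the function names are invented, and the layout of ten equal bands labelled .00 through .90 is assumed here for the sake of the example.

```python
def value_for_bands(item_mean, max_weight):
    """Re-scale an item mean to the 0.00 to 1.00 range, as method 3) does."""
    if max_weight <= 0:
        raise ValueError("max_weight must be positive")
    return item_mean / max_weight


def band_label(value, n_bands=10):
    """Label of the band a 0.00 to 1.00 value falls into.

    Assumption: n_bands equal-width bands; a value of exactly 1.00 is
    placed in the top band.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie between 0.00 and 1.00")
    index = min(int(value * n_bands), n_bands - 1)
    return f".{index}0"


# An item with mean 1.30 and maximum weight 2 cannot be banded directly,
# but its re-scaled value of 0.65 can.
scaled = value_for_bands(1.30, 2)
print(band_label(scaled))
```

In this sketch the item mean of 1.30 is left untouched for reporting purposes; only the value passed to band_label is re-scaled.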