Lertap technical docs

Some somewhat technical papers; not all of them are boring.

• Visual item analysis with quintile plots.

Assessing the quality of cognitive test items has traditionally relied on tables of numeric data, embroidered with core global measures such as estimates of test reliability.

Lertap's quintile plots provide an alternative method: pictures.

Have a look at some examples, and see if you too might not be found wearing smiles for quintiles. Click here to have an initial look; you'll branch out to a pdf document, about 400 KB. See the new "packed plots" in action here.

Then, pour yerself another cuppa something, sit back, and take in a short practical example from some quality achievement tests developed and used in Central Java ('Jateng'). (PDF file, about 110 KB.)
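For the curious, here is a rough sketch of the arithmetic that sits behind a quintile plot: sort the test takers by total score, cut them into five groups, and work out the proportion of each group choosing each option on an item. The function name `quintile_table` and the data are made up for illustration; this is not Lertap's own code, and Lertap's way of handling tied scores and group boundaries may well differ.

```python
from collections import Counter

def quintile_table(scores, responses, options):
    """Proportion of each score group (lowest fifth first) choosing each option.

    Assumes at least five examinees, so that no group is empty.
    """
    # Indices of examinees, ordered from lowest to highest test score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    table = []
    for q in range(5):
        members = order[q * n // 5:(q + 1) * n // 5]
        counts = Counter(responses[i] for i in members)
        table.append({opt: counts[opt] / len(members) for opt in options})
    return table

# Hypothetical data: ten examinees, one item, keyed answer "A".
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
responses = ["B", "C", "B", "A", "A", "C", "A", "A", "A", "A"]
table = quintile_table(scores, responses, options=["A", "B", "C"])
for q, row in enumerate(table, start=1):
    print(q, row)
```

Plotting each option's proportions across the five groups gives the picture: for a well-behaved item, the line for the keyed answer climbs from the lowest group to the highest, while the distractor lines fall away.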

• Using cut scores to denote subject mastery.

The Standards for Educational and Psychological Testing (1999), published by the American Educational Research Association, recommend the use of 'special' statistics when the measurement process involves the use of cut scores. Mastery, licensing, and certification tests are examples of applications which typically use cut scores, often on a pass-fail basis.

Starting with version 5.6.3, Lertap supports all of the 'special' statistics recommended in the Standards. We've got a whiz-bang, top-flight paper which you'll not want to miss if you hope to make the cut. Have a read (PDF file, about 500 KB).

NCCA, the National Commission for Certifying Agencies, has a special report form used to summarise results from mastery (or pass/fail) exams. A paper which indicates how Lertap's output links in with the information requested by NCCA is here.

(Caution inserted June 2007: the top-flight paper just mentioned fails to point out a potential error in version 5.6.3. Two of the three conditional standard error of measurement calculations made by version 5.6.3, CSEM1 and CSEM2, may be inaccurate if users have employed one or more *mws lines in their CCs worksheet. CSEM1 and CSEM2 assume all test items are scored on a right/wrong basis. While this is far and away the normal case, the use of *mws lines changes the picture, and can in some situations introduce error in the CSEM1 and CSEM2 figures. Less than 5% of the Lertap-using world employs *mws lines; this caution is for their eyes only.)

• Differential item functioning (DIF).

Support for DIF analyses was added to the Excel 2007 version of Lertap in September, 2009.

Lertap uses Mantel-Haenszel (M-H) methods for assessing the possibility of differential item functioning.

A feature of Lertap's implementation of M-H involves the ability to make charts of group item responses, including empirical item response function plots.

Read about it / see all about it with a wee click about here (PDF file, about 800 KB).
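As a taster, the core Mantel-Haenszel statistic can be sketched in a few lines. Test takers are matched on total score; within each score level a 2x2 table counts right and wrong answers for the reference and focal groups, and the tables are pooled into a common odds ratio. The function names and counts below are hypothetical, and the delta rescaling shown is the one commonly reported in the DIF literature; whether Lertap reports exactly these figures is an assumption, so check the paper.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio pooled over score levels.

    Each stratum is (a, b, c, d):
      a = reference group, item right    b = reference group, item wrong
      c = focal group, item right        d = focal group, item wrong
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level contributes nothing
        num += a * d / n
        den += b * c / n
    return num / den

def mh_delta(odds_ratio):
    """Rescale the odds ratio to the delta metric; 0 means no DIF."""
    return -2.35 * math.log(odds_ratio)

# Hypothetical counts at two score levels.
strata = [(10, 10, 5, 15), (20, 5, 15, 10)]
alpha_mh = mantel_haenszel_odds_ratio(strata)
print(round(alpha_mh, 3), round(mh_delta(alpha_mh), 3))
```

An odds ratio above 1 (a negative delta) means that, among test takers with matched scores, the item favoured the reference group; a value near 1 suggests no DIF.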

• Response similarity analysis (RSA).

In July 2005 work started on equipping Lertap with tools to enable users to investigate the possibility of student cheating.

A description of the basics of this work may be seen by clicking here (PDF file, about 600 KB. Note: Lelp's Response similarity analysis topic is very relevant to this document).

The initial methods used in Lertap's "response similarity analysis" were tested with several data sets from two major testing centres, and a paper was prepared for journal submission. Other programs, such as Integrity, Scrutiny!, and SCheck, are mentioned in this paper. You may take in a near-final draft version of the paper with a click here (PDF file, about 300 KB).

In early 2006 an updated version was released, with much improved support. Read all about it (PDF file, about 350 KB).

• Iteman 4 and Lertap 5

Iteman is another item analysis program recognised (not to mention used) all over the world. These documents point to some of the differences between the latest versions of Iteman and Lertap 5. For a very general discussion see this small PDF file. For a more detailed account of one particular area of difference, item performance summaries and flags, you won't want to go without reading this PDF file.

• About eigenvalues, scree tests, and coefficient alpha.

A copy of a 2005 journal article on interpreting some of Lertap's output; it suggests a new method for guesstimating the number of factors underlying a set of test items. (PDF file, about 450 KB.)
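For readers who would like to see coefficient alpha stripped down to its arithmetic, here is a minimal sketch using the standard formula (item variances compared with the variance of the total score). The function name and the tiny data set are invented for illustration; this is not the article's method for counting factors, nor Lertap's code.

```python
from statistics import pvariance

def coefficient_alpha(rows):
    """Coefficient alpha for item scores; one row per examinee, one column per item."""
    k = len(rows[0])
    # Variance of each item, and of the total test score.
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical right/wrong scores: four examinees, three items.
rows = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = coefficient_alpha(rows)
print(alpha)
```

Population variances are used throughout; sample variances would give the same alpha, provided the same kind is used for the items and for the total.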

• Production mode.

We have worked a bit with an Australian university to suggest the design of a system which will accept output from a scanner, and automatically (1) reformat it so that it's Lertappable, and (2) have Lertap make reports and graphs. This document gets into macros, and more macros. Holy macro! (Word doc file, about 190 KB.)

Note: Lelp's Production mode topic is very relevant to this paper.

• Programming Lertap

Talking about macros, the Macs menu may be used to link home-grown code modules to the toolbar, making it possible to customise Lertap so that it's specially tailored to local needs.

The technical aspects underpinning this capability can require some knowledge of programming, but we have an example or two ready for all users to enjoy (Word doc file, about 180 KB).

Note: to fully pursue the use of macros in Lertap, you should not be without a read of Lelp's Macs menu topic.

• Scoring open-response items.

It is possible to get Lertap to score open-ended, short-answer, and free-response questions. This document explains how, using real data from a graduate student in Minnesota. (Word doc file, about 120 KB.)

• Lertap's correlation coefficients.

As millions of people the world over launch their Lertap careers, the question sometimes arises as to why some of Lertap's results differ from results obtained from other programs. The answer has to do with correcting correlation coefficients for something your parents probably never mentioned: part-whole contamination. (Word doc file, about 200 KB.)
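To see part-whole contamination in action, consider this small sketch. An item's score is part of the total test score, so correlating the item with the total inflates the result; one common remedy is to correlate the item with the total of the remaining items instead. The function names and data are hypothetical, and the document itself is the place to find out exactly which correction Lertap applies.

```python
import math

def pearson(x, y):
    """Plain product-moment correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def item_total_correlations(rows, item):
    """Return (uncorrected, corrected) item-total correlations for one item."""
    item_scores = [r[item] for r in rows]
    totals = [sum(r) for r in rows]
    # Corrected version: take the item out of the total before correlating.
    rest = [t - s for t, s in zip(totals, item_scores)]
    return pearson(item_scores, totals), pearson(item_scores, rest)

# Hypothetical right/wrong scores: four examinees, three items.
rows = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
uncorrected, corrected = item_total_correlations(rows, item=2)
print(round(uncorrected, 3), round(corrected, 3))
```

With these made-up data the corrected figure is noticeably lower than the uncorrected one, which is the sort of gap that can appear when comparing a program that makes the correction with one that does not.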

• Solutions for Excel's 255-character limit.

Users with a tendency to create very long lines in Lertap's CCs worksheet can find themselves up against an Excel limitation. This document might help you out, should you long for lengthy CCs lines some day. (Word doc file, about 230 KB.)

Note for Excel 2007 users: this limitation is gone! Excel 2007 has significant enhancements in some areas (but in other areas may not be as good as previous versions; see this paper).

• Experimental features in Lertap 5.

Mentions some special statistics which are available in Lertap, but not normally output. These include biserial correlation coefficients, and classical test estimates of two item-response theory (IRT) parameters. Updated in 2005 to reference Dawber's doctoral research. (Web page, will open in a new browser window.)

• Item analysis in criterion-referenced situations.

Comments on the use of item analysis for competency-based testing, by Ian Boyd, director of one of Western Australia's technical and further education colleges. The mastery test procedures discussed in this paper are supported by Lertap 5. (Word doc file, about 120 KB.)

• Rasching an achievement test (draft of 27 May 2008).

Examines the use of two Rasch (IRT) oriented systems, ConQuest 2.0 and Winsteps, using an achievement test traditionally processed with classical test theory (CTT) and Lertap. Argues that there may be little or (likely) even no gain in using Rasch scaling, pointing out that there is reason to question the Rasch assertion of "fundamental measurement" and interval scaling. (PDF file, about 450 KB.)

• Some CTT and IRT comments.

For years playground bullies have been running amuck, suggesting that classical test theory is outdated. You may have seen them: they wear strange hats, and t-shirts with "IRT is me" printed on them. This paper looks at some recent literature which compares the use of CTT and IRT methods; bastante interestante! (Web page, will open in a new browser window.)
