The LenguaBIg09c.xlsx workbook may be downloaded by clicking here.
As to possible exercises to put to you, there being no baseball or cricket on TV today to watch, well, here ya'go:
On the 40-item "Core" subtest, how many items ended up with an entry in Stats1b's ? column? You'll find that there weren't many at all. Should any of the items have been double-keyed? If so, how would test reliability have been affected? What do the quintile plots look like? (Most of them are actually fairly good.)
We found four of the trial items to be worth their salt: I6, I15, I23, and I31. What would the reliability of a subtest formed from the 40 core items, plus these four new ones, be? (According to Spearman-Brown, it could be around 0.83.)
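For anyone wanting to check the Spearman-Brown arithmetic by hand, here is a minimal sketch in Python. It is not part of Lertap; the function name and the trial reliability values are made up for illustration. The core subtest's actual reliability is not quoted in this topic, so substitute the figure from your own Lertap output. The formula also assumes the added items behave much like the existing ones, which may not hold for every trial item.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project the reliability of a test whose length changes by length_factor.

    reliability   -- reliability of the existing test (0 to 1)
    length_factor -- new length divided by old length (e.g. 44 / 40)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# 40 core items plus the four promising trial items.
core_items = 40
added_items = 4
length_factor = (core_items + added_items) / core_items

# HYPOTHETICAL reliabilities, for illustration only; replace with the
# value reported for the 40-item core subtest in your own analysis.
for core_reliability in (0.70, 0.80, 0.90):
    projected = spearman_brown(core_reliability, length_factor)
    print(f"core = {core_reliability:.2f} -> projected = {projected:.3f}")
```

Adding only four items to a 40-item subtest is a length factor of 1.1, so the projected gain will be modest whatever the starting value.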
You could put mastery=50 on the first *sub line to check on the "Prop. consistent placings" for the core items. You could also put it on the third *sub line, and then compare results: does adding the ten trial items have any impact on "Prop. consistent placings"? What about just adding our four promising ones, I6, I15, I23, and I31?
There are two columns in the Data worksheet, "Reg." (district, or region) and "Gender", which can be used to get score and item breakouts of the sort exemplified in the M.Nursing sample. However, before the Reg. column could be used in a breakout, its codes would have to be recoded, because they are numeric. See the caveat at the top of this topic for help and further comments; it's easy to fix this limitation.
Knowing the exact source of this dataset (a secret), and the importance of the test to the students involved, we can suggest that a check for cheating would likely uncover some cases of inappropriate behavior. If this interests you, look at the Negocios sample to get an idea of what's what, that is, how to go about checking for cheating. An RSA analysis (response-similarity analysis, Lertap's method for cheat checking) should involve only students who may have had a chance to cheat; ordinarily, this means that the student records involved in an RSA analysis will be from a single test venue. However, we don't have the test venue codes for this dataset; the best you could do would be to limit the RSA analysis to a single school district, using the codes found in the Data worksheet's "Reg." column. For comments on how to make a dataset which includes only selected records, read here.
Or, you might just put your walking shoes on, and let them take you out for a stroll in the park (watch out for Lertap hawkers).