Thursday 29 November 2007

High-stakes national assessments and ranking

It's been a long day full of many, many presentations. Fortunately the last presentation was actually one of the more interesting ones, and I did not have to fight too hard against the embarrassment of falling asleep. It was a presentation by Jakob Wandall from Skolestyrelsen about the new national computer-based assessments that have been introduced in Danish secondary education.

While the technical side of this was interesting (they were using computer adaptive testing, for instance), the most interesting bit of the talk had nothing to do with technology at all. It had to do with how the test results were used, presented and regulated.

In England, high-stakes tests are a very big deal. The main reason is that they are inevitably translated into rankings and funding consequences, leading to teachers and schools becoming completely obsessed with assessments, drilling students until they are blue in the face in the idle expectation that it might raise the school a place or two in the oh-so-important regional league tables. It is this abomination that I think the Danes have elegantly addressed (apparently with the English system as the example of what they wanted to avoid at all costs, and understandably so!).

The publication of the results of these national benchmarks is strictly regulated. The national average is published and used for policy purposes, but no regional or individual result is made public. Teachers can review all results of all their students, down to the responses to individual questions, but are forbidden to communicate these results to anyone other than the student and their parents (and this communication is not in the form of a grade, but of a textual report with feedback). Students have to be given their result by a qualified teacher who discusses the results and provides relevant feedback on the performance.

So it is impossible for a school, a local authority or the press to rate and rank schools just on the numerical outcomes of a single test. The system provides stakeholders at every level with the relevant information, without the detrimental effects of publication that we see in the US and UK. I think we've got a lot to learn from the Scandinavian approach to education.

Monday 26 November 2007

The ideal assessment engine

I've been looking into criteria for assessment technologies a lot lately. One reason is that we are looking into migrating our current system to a new platform (as the old one, Authorware, is no longer supported). The other reason is that I have been invited by the Joint Research Centre to take part in a workshop on quality criteria for computer based assessments. I will be posting on the outcomes of that workshop next week. For now though, here are some of my thoughts on the topic.

Flexibility
The main strength of our current system is flexibility. This has several aspects that are all important in their own right:
  • Flexibility in design: The layout of the question can be modified as desired, using media and such to create an authentic and relevant presentation
  • Flexible interactions: There is no point in systems that offer five parameterized question types where all you can do is define a title, a question text and some alternatives, and tick the right answer. Interactions that test and support higher order skills are, or should be, more complex than that.
  • Detailed and partial scoring: A discriminating question does not just tell you whether you were completely right or completely wrong. It can tell you the degree to which you were right, and which elements of your answer had value. It might also penalize you for serious and fundamental mistakes (see the sketch after this list).
  • Detailed feedback: A lot of the mistakes learners make are predictable. If we allow assessment systems to capture these mistakes and give targeted feedback, learners can practice their skills while lecturers focus their time on the more in-depth problems that require their personal engagement.
  • Extensive question generation and randomization options: For the re-usability of assessments, generating questions using rules and algorithms gives a single question almost infinite re-usability. On the assessment level, the same is true for assessment generation based on large banks of questions tagged with subject matter and difficulty.
So far, no real news for TRIADS users (although no proprietary system I know of really supports this well).
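
To make the scoring and generation ideas above a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the question, the weights, the function names); it only illustrates the combination of rule-based question generation, partial credit per answer element, a penalty for one predictable fundamental mistake, and targeted feedback on that mistake:

    import random

    # Hypothetical sketch: one generated question with partial scoring,
    # a penalty for a predictable fundamental mistake, and targeted feedback.

    def generate_question(rng=random):
        """Generate one instance of a parameterized question."""
        a, b = rng.randint(2, 9), rng.randint(2, 9)
        return {
            "prompt": f"What are the sum and product of {a} and {b}?",
            "answer": {"sum": a + b, "product": a * b},
            # The predictable mistake: confusing the two operations.
            "swapped": {"sum": a * b, "product": a + b},
        }

    def score_response(question, response):
        """Return a partial score in [0, 1] and a list of feedback lines."""
        score, feedback = 0.0, []
        for part, weight in (("sum", 0.5), ("product", 0.5)):
            given = response.get(part)
            if given == question["answer"][part]:
                score += weight   # partial credit for each correct element
            elif given == question["swapped"][part]:
                score -= 0.25     # penalty for the fundamental mistake
                feedback.append(f"It looks like you swapped the operations for the {part}.")
            else:
                feedback.append(f"The {part} is not correct; check your working.")
        return max(score, 0.0), feedback

    q = generate_question(random.Random(7))
    print(q["prompt"])
    # A response with the correct sum but the swapped product:
    print(score_response(q, {"sum": q["answer"]["sum"], "product": q["swapped"]["product"]}))

The arithmetic is beside the point, of course; what matters is that the scoring logic knows which wrong answers are meaningful and can respond to them specifically.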

Questions without assessments
As Dylan Wiliam so eloquently put it at the ALT-C conference (you can find his podcast on the matter at http://www.dylanwiliam.net/), the main value of learning technology lies in its capacity "to allow teachers to make real-time instructional decisions, thus increasing student engagement in learning, and the responsiveness of instruction to student needs." I could not agree more. However, this means that questions should not exist only within assessments, but should instead be embedded within the materials and activities. Questions become widgets that can of course still function within an assessment, but that also work on their own, without losing the ability to record and respond to interaction. This, as far as I'm aware, is uncharted territory for assessment systems. Territory that we hope to explore in the next iteration of our assessment engine.
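
To illustrate what such a question widget might look like, here is a rough sketch, again in Python and again with every name invented for the occasion. The question object carries its own scoring and feedback, and reports each interaction to whatever observer happens to be listening, whether that is a full assessment engine, a page of learning material, or nothing at all:

    from typing import Callable, Optional

    class QuestionWidget:
        """A question that works on its own or inside an assessment."""

        def __init__(self, prompt: str, correct: str,
                     on_response: Optional[Callable[[str, bool], None]] = None):
            self.prompt = prompt
            self.correct = correct
            self.on_response = on_response  # observer, e.g. an assessment recorder

        def respond(self, answer: str) -> str:
            ok = answer.strip().lower() == self.correct.lower()
            if self.on_response:  # record the interaction, if anyone is listening
                self.on_response(answer, ok)
            return "Correct!" if ok else f"Not quite; have another look: {self.prompt}"

    # Standalone, embedded in learning material: no assessment involved.
    q = QuestionWidget("What is the capital of Denmark?", "Copenhagen")
    print(q.respond("Copenhagen"))

    # The very same widget inside an assessment that records every response.
    log = []
    q2 = QuestionWidget("What is the capital of Denmark?", "Copenhagen",
                        on_response=lambda answer, ok: log.append((answer, ok)))
    q2.respond("Aarhus")
    print(log)

The design choice is simply the observer pattern: the widget never assumes an assessment context, so the same object can be dropped into learning materials or wired into an engine that records and aggregates responses.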

Wednesday 21 November 2007

e-APEL article in Response

The new online journal Response has published a 'work in progress' report I wrote on the e-APEL project that I'm involved in. I'm afraid it is rather dated, as the journal took more than 8 months to actually publish this version. Still, for those interested in the accreditation of prior learning, or in IT projects in education in general, it might be a worthwhile read.