The "Why" Behind the Favorites
I know what you might be thinking: It’s not fair to play favorites, even with something as inanimate as an item type. And you’d be absolutely right; stop reading this article right now! Joking aside, my “favorite” item type is any one that best helps me achieve my particular measurement goals; in other words, my favorite can and does change based on my or my clients’ specific needs.
The "Why" Behind the Least Favorites
Conversely, my least favorite item type is any one I’m being forced to use because of a strict numeric target (say, x items of a specific type on an exam), where my item-writing cohorts and I must jam content into a format rather than using the item type that will result in the best measurement. The item type that measures the skill in the best way possible is always going to win out for me (and I believe most, if not all, veteran item writers would agree with me on that point).
The 10 Best Item Types
After spending a decade creating items and exams, particularly in Caveon’s robust Scorpion tool, I have formed some preferences, though. And I’d love to share them with you. Think of this list less as prescriptive and more as my own psychometric version of “Oprah’s Favorite Things.” (“You get a great item type, you get a great item type, you ALL get great item types!”) Or, if you prefer, sing this list to the tune of “These are a few of my favorite things” a la The Sound of Music.
Discrete Option Multiple-Choice™ (DOMC)
The DOMC™ item type hits all the marks for me:
- Adaptable: Suitable for most measurement goals. With options shown one at a time to the test taker, there’s a lot of flexibility. For example, you can control how many total options are in your option pool (hundreds, or just two if you want). You also choose how many total options are presented to the test taker, how many must be answered correctly for the test taker to score a point, and more. Item writers have a lot of control to make the item best measure the skill.
- Easy to train writers on
- Fast to build
- Built-in security: Who doesn’t love a hard-working, multi-tasking item type that prevents many forms of test fraud and helps keep your exams healthy for longer?
- Fun: It’s relatively new to the industry (compared to the 100-year-old multiple-choice item type) so my teams feel like psychometric trailblazers when we fill our exams with these item types.
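The one-option-at-a-time flow described above can be sketched in a few lines of Python. This is a rough illustration only, not Caveon’s actual implementation; `draw_options` and `score_domc` are hypothetical helpers:

```python
import random

def draw_options(option_pool, num_presented, rng=random):
    """Randomly draw which options from the pool will be shown, one at a time."""
    return rng.sample(option_pool, num_presented)

def score_domc(presented, responses, required_correct):
    """presented: (option_text, is_key) pairs in the order shown.
    responses: the test taker's yes/no call on each option as it appears.
    The item scores a point only if enough options are judged correctly."""
    correct = sum(resp == is_key
                  for (_, is_key), resp in zip(presented, responses))
    return int(correct >= required_correct)
```

Because the pool, the number of options drawn, and the required-correct threshold are all parameters, the same structure supports everything from a two-option item to a pool of hundreds.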
SmartItem™
A SmartItem contains multiple variations, all of which work together to cover an objective entirely. Each time the item is administered, the computer generates a random variation. Utilizing SmartItem technology comes with huge benefits, such as increasing fairness, decreasing cheating and theft, and saving money. As an exam developer, creating SmartItems, and managing exam development projects that utilize them, is a joy. Like DOMC, you get security built in to your exam, and you have the satisfaction of using smart computer technology to build hundreds or thousands of variations within one item. Item types where a computer does a lot of the work for you rank high on my Favorite Things list. I should note that a SmartItem is technically a treatment for an item type: it can be used to improve any item type you can think of, from multiple choice to short answer. Technicalities!
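As a rough illustration of the idea (not Caveon’s actual SmartItem technology), a variation-generating item can be as simple as a function that draws fresh values each time it is administered; `generate_variation` is a made-up example covering one narrow objective:

```python
import random

def generate_variation(rng=random):
    """Hypothetical sketch: an addition 'item' whose numbers are drawn
    fresh on each administration, so every delivery is a different
    variation of the same objective (adding two two-digit numbers)."""
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    return f"What is {a} + {b}?", a + b
```

One function, thousands of possible variations: that is the property that makes pre-knowledge of any single variation far less useful to a cheater.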
Multiple Choice
While not “smart” like the previous two item types, the multiple-choice item type is very easy to create and train writers on, since it’s been around for so long and writers have seen it before. Randomizing options in multiple-choice items and delivering on-the-fly forms can help make these traditional item types a little more secure.
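Option randomization amounts to shuffling the options for each delivery while keeping track of where the key lands. A minimal sketch (the function name and item structure are assumptions, not any particular tool’s API):

```python
import random

def randomized_form(item, rng=random):
    """Return a delivery copy of a multiple-choice item with its options
    shuffled, tracking where the key landed so the form can be scored."""
    options = list(item["options"])
    rng.shuffle(options)
    return {"stem": item["stem"],
            "options": options,
            "key_index": options.index(item["key"])}
```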
Matching & Build List
Matching items work great when 1:1 relationships need to be tested, such as matching a capital city to its nation. Build Lists are wonderful when a task involves steps that must be performed in a specific order. In Caveon’s Scorpion (the system I develop items in most), these two item types can become “smart.” For example, I could build one matching item that contains every country in Europe and its capital city. You can make either type a bit more challenging by adding distractors (i.e., extra options in the matching item that don’t match anything, and extra steps in the build list that are never used).
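The scoring logic for a matching item with distractors can be sketched as follows; `score_matching` is a hypothetical helper, and distractors earn no credit simply because they never appear as values in the answer key:

```python
def score_matching(answer_key, response):
    """answer_key: prompt -> correct match; response: the test taker's pairing.
    Distractor options never appear as values in the answer key, so
    choosing one can never earn credit."""
    return sum(response.get(prompt) == match
               for prompt, match in answer_key.items())
```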
Short Answer
Short answer items are popular in academia, where part of the skill being measured is synthesizing, analyzing, and evaluating information, and presenting it coherently in written form. However, I’ve also enjoyed using Short Answer to measure how well a test taker can generate a specific line of code or a solution to a math problem. It is a great way to measure a skill whose answer can be stated in only one way.
Hot Spot
Hot Spot items require an image to be uploaded; specific areas within the image represent the correct answers, which the test taker must click. Imagine, for example, a map of North America with the country names omitted. A student is asked to click on Canada. If they click the part of the image where Canada is, they earn a point. The Hot Spot item type is not as ubiquitous as the item types above because its use cases are limited to images. Still, it’s a wonderful item type for certain measurement goals.
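The underlying check is simple: did the click land inside the correct region of the image? A minimal sketch, assuming rectangular hot spots and pixel coordinates (real tools often support arbitrary shapes):

```python
def in_hotspot(click, region):
    """click: (x, y) pixel coordinates of the test taker's click.
    region: (x0, y0, x1, y1) bounding box of the correct area in the image."""
    x, y = click
    x0, y0, x1, y1 = region
    return x0 <= x <= x1 and y0 <= y <= y1
```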
True/False
True/False items are simple to create; the downside is that a test taker who doesn’t know the content has a 50/50 chance of guessing the correct answer. However, one way I often get around the “guessing” problem is to use a cluster of True/False items working together, in which a test taker must get each of, say, five True/False items correct to earn a single point. The cluster reduces the probability of guessing the correct answer, which serves most measurement goals better.
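The arithmetic behind the cluster is straightforward: with five two-option items that must all be answered correctly, the guessing probability drops from 50% to (1/2)^5, or about 3.1%. A quick sketch:

```python
def guess_probability(num_items, options_per_item=2):
    """Chance of earning the cluster's point by pure guessing, when every
    item in the cluster must be answered correctly."""
    return (1 / options_per_item) ** num_items

# A single True/False item is 50% guessable; a 5-item cluster is ~3.1%.
```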
Scale
Scale item types are great for gathering survey data or standard-setting data. You can define the scale yourself, though a range of 0 to 100 is typical.
Likert
Like the Scale item type, a Likert item can help you collect survey data. For example, a series of Likert items can be used as an end-of-course survey to assess student satisfaction with the course.
Essay
The essay item type is terrific for assessing specific measurement goals in rhetoric, composition, writing, etc. As a student, I loved essay exams because they allowed for more nuance, analysis, justification, and exploration than multiple choice. As an instructor, I also utilized this item type often. However, as an exam manager, mainly for IT organizations, I find my clients’ measurement goals often do not include essay writing. In a large-scale assessment program, essay items are also difficult to scale without computer-automated grading.
Whether you’re just starting your journey as an assessment professional or you’re a veteran, choosing which types of items to use in your program’s exams is a critical step. The exercise itself—in generating a list of “favorites” or “most-used” and exploring the reasons why—is something I highly recommend. And remember, “When the dog bites, when the bee stings, when you’re feeling sad, simply remember your favorite [item types], and then you won’t feel so bad.”