Posted by Erika Johnson
updated over a week ago
You have been given the task of building an examination for your organization. After spending two weeks panicking about how you would do this and procrastinating on the work that must be done, you are finally ready to begin the test development process. But where in the world do you begin? Why do you need to create this exam? You know you need to construct test items, but which item types are the best fit for your exam? Who is your audience? How do you determine that?
Luckily for you, Caveon has an amazing team of experts on hand in our Caveon Secure Exam Development (C-SEDs) department to help. But if you want to try your hand at test development on your own, here’s some information on best practices to guide you on your way.
First things first. Before creating your test, you need to: 1) determine why you are testing your candidates, and 2) figure out who exactly will be taking your exam. Assessing the purpose of your exam is the first vital step of the development process. You do not want to test just to test; you want to scope out the “why” of your exam: why this exam is important to your organization, and what you are trying to achieve by having your test takers sit for it. You can narrow down your purpose for testing by asking yourself a few questions:
Learning the purpose of your exam will help you plan how best to set up your exam: which exam type to use, which types of exam items will best measure the skills of your candidates (we will discuss this in a minute), and so on. Determining this purpose will also help you identify your test audience. Whether they are students still in school, individuals looking to qualify for a position, or experts seeking certification in a certain product or field, it’s important to make sure your exam is actually testing at the appropriate level. Your exam will not be valid if your items are too easy or too hard, so keeping the minimally qualified candidate (MQC) in mind during every step of the exam development process will help ensure you are capturing valid test results overall.
MQC is the acronym for “minimally qualified candidate.” The MQC is a conceptualization of the assessment candidate who possesses the minimum knowledge, skills, experience, and competence to just meet the expectations of a credentialed individual. If the credential is entry level, the expectations of the MQC will be lower than if the credential is designated at an intermediate or expert level. Think of an ability continuum that goes from low ability to high ability. Somewhere along that ability continuum, a cut point will be set. Candidates who score below that cut point are not qualified and will fail the test. Candidates who score at or above that cut point are qualified and will pass. The minimally qualified candidate, though, should just barely make the cut. It’s important to focus on the word “qualified,” because even though this candidate will likely gain more expertise over time, they are already deemed to have the requisite knowledge and abilities to perform the job.
You’ve determined the purpose of your exam and identified the audience. Now it’s time to decide on the exam type and which item types to use that will be most appropriate to measure the skills of your test takers. The type of exam you choose depends on what you are trying to test and the kind of tool you are using to deliver your exam (note that you should always make sure the software you use to develop and deliver your exam is thoroughly vetted—here are some things to look for). The type of items you choose depends on your measurement goals and what you are trying to assess. It is essential to take all of this into consideration before moving forward with development. Let’s take a look at some common exam types for you to consider.
Fixed-form delivery is a method of testing where every test taker receives the same items. An organization can keep more than one fixed form in rotation, with each live form presenting the same items in a different randomized order. Forms can also be built from a larger item bank, with each form published as a fixed set of items equated for comparable difficulty and content coverage.
A computerized adaptive testing (CAT) exam adapts to the candidate's ability in real time by selecting different questions from the item bank in order to provide a more accurate measurement of their ability level on a common scale. Every time a test taker answers an item, the computer re-estimates the test taker’s ability based on all the previous answers and the difficulty of those items. The computer then selects, as the next item, one that the test taker should have roughly a 50% chance of answering correctly.
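To make the adaptive loop concrete, here is a minimal sketch in Python. It assumes a Rasch (one-parameter) model, which this article does not specify, and the item bank, item IDs, and function names are all hypothetical. Under that model, a test taker has a 50% chance of success on an item whose difficulty equals their ability, so picking the item "closest to 50%" means picking the unused item whose difficulty is nearest the current ability estimate.

```python
import math

def p_correct(theta, b):
    """Rasch probability that a test taker of ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_ability(responses, theta=0.0, steps=25):
    """Re-estimate ability from every (difficulty, correct) pair seen so far.

    Uses Newton-Raphson on the Rasch log-likelihood. The estimate is clamped
    to [-4, 4] because all-correct or all-incorrect patterns have no finite maximum.
    """
    for _ in range(steps):
        probs = [p_correct(theta, b) for b, _ in responses]
        gradient = sum(int(correct) - p for (_, correct), p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        if information < 1e-9:
            break
        theta = max(-4.0, min(4.0, theta + gradient / information))
    return theta

def next_item(theta, bank, used):
    """Pick the unused item whose predicted success probability is closest to 50%."""
    candidates = [item_id for item_id in bank if item_id not in used]
    return min(candidates, key=lambda item_id: abs(p_correct(theta, bank[item_id]) - 0.5))

# Hypothetical item bank: item id -> difficulty, on the same scale as ability.
bank = {"A": -1.5, "B": -0.5, "C": 0.0, "D": 0.6, "E": 1.4}

# The test taker answered item C correctly, then item D incorrectly.
responses = [(bank["C"], True), (bank["D"], False)]
theta = estimate_ability(responses)
print(round(theta, 2), next_item(theta, bank, used={"C", "D"}))
```

A production CAT engine would add things this sketch leaves out, such as content balancing, item exposure control, and a stopping rule.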
A linear on-the-fly testing (LOFT) exam draws its items from an item bank so that each person sees a different set of items. The difficulty of the overall test is controlled to be equal for all examinees. LOFT exams depend on large item banks, which can be built with the help of automated item generation (AIG).
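As a rough illustration of how a form can be drawn per examinee while keeping overall difficulty comparable, the sketch below samples a fixed number of items from each content area and accepts the draw only if its mean difficulty lands near a target. The retry-until-it-fits approach, the content areas, the difficulty values, and the tolerance are all assumptions made for this example; real LOFT engines typically use more sophisticated assembly methods.

```python
import random
from statistics import mean

def assemble_form(bank, per_area, target, tolerance, rng, max_tries=10_000):
    """Draw a random form whose mean difficulty lies within `tolerance` of `target`.

    bank:     dict mapping item id -> (content_area, difficulty)
    per_area: dict mapping content_area -> number of items to draw from that area
    """
    by_area = {}
    for item_id, (area, _) in bank.items():
        by_area.setdefault(area, []).append(item_id)

    for _ in range(max_tries):
        form = []
        for area, count in per_area.items():
            form.extend(rng.sample(by_area[area], count))
        if abs(mean(bank[item_id][1] for item_id in form) - target) <= tolerance:
            return form
    raise RuntimeError("No form met the difficulty target; widen the tolerance or enlarge the bank.")

# Hypothetical bank: 20 items in each of two content areas, difficulties from -1.9 to 1.9.
bank = {}
for area in ("safety", "procedures"):
    for k in range(20):
        bank[f"{area}-{k}"] = (area, -1.9 + 0.2 * k)

rng = random.Random(7)  # seeded only so the example is repeatable
blueprint = {"safety": 5, "procedures": 5}
form_one = assemble_form(bank, blueprint, target=0.0, tolerance=0.1, rng=rng)
form_two = assemble_form(bank, blueprint, target=0.0, tolerance=0.1, rng=rng)
print(sorted(form_one))
print(sorted(form_two))
```

Each call returns a ten-item form that follows the same content blueprint and has nearly the same average difficulty, even though the items themselves differ from one examinee to the next.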
The above three exam types can be used with any standard item type. Before moving on, however, there is another more innovative exam type to consider if your delivery method allows for it:
A performance-based assessment measures the test taker's ability to apply the skills and knowledge learned beyond typical methods of study and/or learned through research and experience. For example, a test taker in a medical field may be asked to draw blood from a patient to show they can competently perform the task. Or a test taker wanting to become a chef may be asked to prepare a specific dish to ensure they can execute it properly.
There are many different item types to choose from. (Check out a few of our favorites in this article.) While utilizing more item types on your exam won’t ensure you have more valid test results (as discussed here), it’s important to know what’s available in order to decide on the best item format for your program. Here are a few of the most common items to consider when constructing your test:
A multiple-choice item is a question where a candidate is asked to select the correct response from a choice of four (or more) options.
A multiple-response item is an item where a candidate is asked to select more than one response from a pool of options (e.g., “choose two,” “choose three,” etc.).
Short answer items ask a test taker to synthesize, analyze, and evaluate information, and then to present it coherently in written form.
A matching item requires test takers to connect a definition/description/scenario to its associated correct keyword or response.
A build list item challenges a candidate’s ability to identify and order the steps/tasks needed to perform a process or procedure.
DOMC™ (Discrete Option Multiple Choice) is known as the “multiple-choice item makeover.” Instead of showing all the answer options at once, DOMC presents the options one at a time in random order. For each option, the test taker chooses “yes” or “no.” As soon as the item has been answered correctly or incorrectly, the next question is presented, so a test taker may not see every option. DOMC has been used by award-winning testing programs to prevent cheating and test theft. You can learn more about the DOMC item type in this white paper.
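The one-option-at-a-time flow can be sketched in a few lines of Python. The scoring rule used here is an assumption for a single-key item, not a statement of how DOMC is implemented in any particular product: saying “yes” to the correct option ends the item as correct, while saying “yes” to a wrong option or “no” to the correct option ends it as incorrect. The function name and the sample options are hypothetical.

```python
import random

def administer_domc(options, key, answer_fn, rng):
    """Present options one at a time in random order and score the item.

    options:   list of option texts
    key:       the single correct option (must be in `options`)
    answer_fn: callable taking an option text, returning True for "yes" and False for "no"
    Returns (is_correct, options_shown).
    """
    if key not in options:
        raise ValueError("the key must be one of the options")
    order = list(options)
    rng.shuffle(order)
    shown = []
    for option in order:
        shown.append(option)
        said_yes = answer_fn(option)
        if option == key:
            return said_yes, shown   # "yes" to the key is correct, "no" is incorrect
        if said_yes:
            return False, shown      # "yes" to a wrong option is incorrect
    raise AssertionError("unreachable: the key is always presented eventually")

# Stem (from the wording example later in this article):
# "What is an ingredient found in chocolate chip cookies?"
options = ["flour", "motor oil", "gravel", "sawdust"]
key = "flour"

def knows_answer(option):
    """A simulated test taker who recognizes the correct option."""
    return option == key

is_correct, shown = administer_domc(options, key, knows_answer, random.Random(3))
print(is_correct, f"{len(shown)} of {len(options)} options shown")
```

Because the item stops as soon as it is decided, a test taker often sees only part of the option set, which limits how much of the item can be memorized and shared.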
A self-protecting item, otherwise known as a SmartItem, employs a proprietary technology resistant to cheating and theft. A SmartItem contains multiple variations, all of which work together to cover an entire learning objective completely. Each time the item is administered, the computer generates a random variation. SmartItem technology has numerous benefits, including curbing item development costs and mitigating the effects of testwiseness. You can learn more about the SmartItem in this infographic and this white paper.
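SmartItem technology itself is proprietary, so the sketch below is not how a SmartItem works internally. It only illustrates the general idea of one item rendering a fresh random variation each time it is administered, using a made-up unit-conversion objective. The function name, stem wording, and distractor rules are all hypothetical.

```python
import random

def render_variation(rng):
    """Return one randomly generated variation of a hypothetical unit-conversion item."""
    cups = rng.randint(2, 12)
    key = cups * 8  # 1 US cup = 8 US fluid ounces
    # Distractors model plausible mistakes: wrong factor (4 or 16) or adding instead of multiplying.
    distractors = [cups * 4, cups * 16, cups + 8]
    options = [key] + distractors
    rng.shuffle(options)
    return {
        "stem": f"A cookie recipe calls for {cups} cups of milk. How many fluid ounces is that?",
        "options": options,
        "key": key,
    }

rng = random.Random()  # unseeded, so each administration differs
for _ in range(3):
    item = render_variation(rng)
    print(item["stem"], item["options"])
```

Since the numbers change on every administration, memorizing one rendered version of the item is of little use to the next test taker; they still have to know how to do the conversion.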
Regardless of the exam type and item types you choose, focusing on some best practice guidelines can set up your exam for success in the long run. There are many guidelines for creating tests (see this handy guide, for example), but this list sticks to the most important points. Little things can really make a difference when developing a valid and reliable exam, so be sure to follow along!
Although you want to ensure that your items are difficult enough that not everyone gets them correct, you never want to trick your test takers! Keeping your wording clear and making sure your questions are direct and not ambiguous is very important. For example, asking a question such as “What is the most important ingredient to include when baking chocolate chip cookies?” does not set your test taker up for success. One person may argue that sugar is the most important, while another test taker may say that the chocolate chips are the most necessary ingredient. A better way to ask this question would be “What is an ingredient found in chocolate chip cookies?” or “Place the following steps in the proper order when baking chocolate chip cookies.”
When creating your items, ensuring that each item aligns with the objective being tested is very important. If the objective asks the test taker to identify genres of music from the 1990s, and your item is asking the test taker to identify different wind instruments, your item is not aligning with the objective.
Your items should be relevant to the task that you are trying to test. Coming up with ideas to write on can be difficult, but avoid asking your test takers to identify trivial facts about your objective just to find something to write about. If your objective asks the test taker to know the main female characters in the popular TV show Friends, asking the test taker what color Rachel’s skirt was in episode 3 is not an essential fact that anyone would need to recall to fully understand the objective.
As discussed above, remembering your audience when writing your test items can make or break your exam. To put it into perspective, if you are writing a math exam for a fourth-grade class, but you write all of your items on advanced trigonometry, you have clearly not met the difficulty level for the test taker.
When writing your options, keep these points in mind:
Constructing test items—and creating entire examinations—is no easy undertaking. This article helps you identify your specific purpose for testing and helps you determine the most common exam and item types you can use to measure the skills of your test takers. We’ve gone over general best practices to consider when constructing items, and we’ve sprinkled helpful resources throughout to help you on your exam development journey.
This article helps you tackle the first step of the 8-step assessment process: planning & developing test specifications. To learn more about creating your exam, including how to increase the usable lifespan of your exam, review our ultimate guide on secure exam creation and our workbook on evaluating your testing engine, leveraging secure item types, and increasing the number of items on your tests. And as always, if you need help constructing your test or items, reach out to our incredible C-SEDs team—it's what they do!
Erika is an Exam Development Manager in Caveon’s C-SEDs group. With almost 20 years in the testing industry, nine of which have been with Caveon, Erika is a veteran of both exam development and test security. Erika has extensive experience working with new, innovative test designs, and she knows how to best keep an exam secure and valid.