A SmartItem uses special technology during exam development and delivery so that the item renders differently each time it is given on a test. Translation—no two examinees see the same thing during a test. The SmartItem can prevent all test theft and almost all forms of cheating: it makes stealing test questions and answers pointless, so attempts to gain an unfair advantage are futile. Moreover, to encourage proper studying and instruction, each SmartItem is coded to completely cover its target standard or competency. As a result, “teaching to the test” is no longer a beneficial practice.
The testing industry currently faces several serious challenges, and the SmartItem was created to end them. This simple but significant innovation yields un-stealable tests that last virtually forever, meaning you don’t have to update exam content unless core objectives change. Here’s how the SmartItem addresses each of these perennial challenges:
SmartItems actually enhance the fairness of your exams by eliminating the two largest sources of unfairness: testwiseness and cheating.
Can a SmartItem still be stolen? Well, it can, but it won’t matter. A SmartItem covers the entire breadth and depth of a skill and renders differently for each test taker. (You can learn more in this section.) So, memorizing a SmartItem answer key would accomplish about the same as studying the materials in the first place.
The costs of redevelopment after a security breach or content theft can be financially draining for any testing program. Utilizing SmartItem technology ensures a longer lifespan for your items and tests. The SmartItem eliminates continual authoring and associated expenses, as well as the need to spend money on a multitude of other detection tools. To learn more about the short- and long-term benefits of utilizing SmartItem technology, see page 7 of this booklet.
After two case studies, three scientific experiments, one simulation, and a white paper, here’s what we know.
To learn more about these research initiatives, we invite you to read Caveon’s in-depth book, SmartItem™: Stop Test Fraud, Improve Fairness, and Upgrade the Way You Test. It discusses each of the concepts within this article in greater depth.
A SmartItem has nine distinct properties:
A SmartItem is an item—a single item—used on exams to measure important, clearly identified skills. Similar to traditional items, a SmartItem is created by subject matter experts (SMEs), assigned an I.D. number, stored in an “item bank,” evaluated with expert reviews, and eventually with item analyses. Also, like traditional items, a SmartItem can be used with any type of test design. If and when evaluation indicates a SmartItem isn’t functioning well, the SmartItem can be revised or retired.
These similarities with traditional items are important. There is a tendency to assume SmartItems are something more than items. It is assumed they produce items or are item models—but this is not the case. A SmartItem is an item; it’s just an unusually innovative one.
Just as two equal angles are considered congruent, a SmartItem is an item that is completely congruent with its respective skill. This is an unusual circumstance in the field of testing, and it warrants further explanation.
Let’s look at an example of how traditional items are only congruent with a small part of a skill.
Let’s say the learning objective is this: “Identify the characteristics and basic needs of varieties of living organisms and ecosystems.”
A traditional approach might be to write one question about the characteristics of a specific living organism, one about the basic needs of another specific organism, one about the characteristics of an ecosystem, and one about the basic needs of an ecosystem. As you can see, each traditional item written for this objective measures only a small slice of the entire skill. In comparison, a SmartItem will measure the entire scope of the skill. The SmartItem for this example will be designed to measure all four aspects of the learning objective. It will measure the student’s ability to:
- Identify the characteristics of varieties of living organisms
- Identify the basic needs of varieties of living organisms
- Identify the characteristics of ecosystems
- Identify the basic needs of ecosystems
How can a single item do all of that?
Here’s how: a SmartItem uses unique technology during the development process, including programming logic and SME-supplied variables, that enables it to cover the entire objective. Beyond the supportive technology, writing a SmartItem depends even more on the quality of the description of the skill, including any constraints or examples deemed relevant. You can view a live SmartItem example here.
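To make this concrete, here is a minimal sketch, in Python, of how a variable-driven item for the ecosystems objective above might render. The variable pools, function name, and question wording are illustrative assumptions, not Caveon's actual implementation:

```python
import random

# Hypothetical SME-supplied variable pools for the example objective;
# a real SmartItem's pools would be far richer.
SUBJECTS = {
    "living organism": ["a cactus", "a polar bear", "an earthworm"],
    "ecosystem": ["a coral reef", "a desert", "a rainforest"],
}
ASPECTS = ["characteristics", "basic needs"]

def render(rng: random.Random) -> str:
    """One rendering of the item. Each call can yield a different question,
    but every rendering stays inside the same learning objective."""
    kind = rng.choice(list(SUBJECTS))        # organism or ecosystem
    example = rng.choice(SUBJECTS[kind])     # which organism or ecosystem
    aspect = rng.choice(ASPECTS)             # characteristics or basic needs
    return f"Which of the following describes the {aspect} of {example}?"

print(render(random.Random()))
# e.g. "Which of the following describes the basic needs of a desert?"
```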
A SmartItem displays differently to every test taker. Each unique way a SmartItem is displayed is what we call a “rendering.” Depending on the number of variables and other factors, a SmartItem can render in thousands, hundreds of thousands, or many millions of ways.
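As a back-of-the-envelope illustration of that growth (the variable counts below are hypothetical), the number of renderings multiplies across independent variables:

```python
from math import prod

# Hypothetical counts of values available for each SME-supplied variable
# in a single SmartItem.
values_per_variable = [6, 4, 25, 10, 8]

# Renderings multiply across independent variables.
print(f"{prod(values_per_variable):,} possible renderings")  # 48,000
```

Add one or two more variables of similar size and the count quickly reaches the millions.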
Test security threats related to item harvesting are neutralized if a test taker is unable to predict what they will see on the test. Not only does a SmartItem cover the entire content of a skill, but the sheer number of renderings it can produce makes it impossible for any test taker to share or gain useful pre-knowledge of test content. This makes sharing questions or posting them online a pointless effort; any shared information will simply not be helpful to the next person taking the test.
That leaves only one way to truly prepare for and score well on a test consisting of SmartItems—by genuinely improving your knowledge of the skills being tested. The examinee must properly prepare by setting aside the necessary time to learn about and understand the content.
Each rendering of a SmartItem is created and presented randomly. While it is constrained to the limits of the skill being measured, the SmartItem’s form of randomization ensures the entire scope of each skill can be represented on the exam.
Often referred to as “stratified randomization,” this approach is used by scientists conducting experiments across a wide array of fields. The randomization of a SmartItem is constrained within each skill. If a particular skill is given greater weight, the test designer can arrange for the SmartItem related to that skill to be administered more than once on the test.
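Here is a rough sketch of how skill-constrained randomization and skill weighting might look at the test level; the blueprint structure, item IDs, and weights are assumptions for illustration only:

```python
import random

# Hypothetical test blueprint: one SmartItem per skill, with a weight that
# controls how many times that skill's item appears on the test.
blueprint = {
    "identify_organism_needs": {"item_id": "SI-101", "weight": 2},
    "classify_ecosystems":     {"item_id": "SI-102", "weight": 1},
    "interpret_food_webs":     {"item_id": "SI-103", "weight": 1},
}

def assemble_test(rng: random.Random) -> list[str]:
    """Randomization is stratified by skill: every skill is represented,
    and a more heavily weighted skill's SmartItem is administered more
    than once. Each administration then renders randomly within its skill."""
    administrations = []
    for spec in blueprint.values():
        administrations += [spec["item_id"]] * spec["weight"]
    rng.shuffle(administrations)   # randomize presentation order
    return administrations

print(assemble_test(random.Random()))
# e.g. ['SI-101', 'SI-103', 'SI-101', 'SI-102']
```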
Randomization, stratified or not, has another interesting and positive effect. Test takers have no choice but to prepare across the entire breadth and depth of a described skill, as the specific content measured within that skill will be unpredictable. To learn more about the benefits of randomization in high-stakes and standardized testing, you can view this article or jump ahead to this section.
A SmartItem can be built by writing code, by using the SmartItem Graphical User Interface (GUI), or by creating a preponderance of options (illustrated below). No one method is inherently better than another. Rather, some skills are better suited to a coded question, some to the GUI, and others to a question with many options; depending on the skill, one method may also simply be easier to use than the others. The methods can even be combined within the same SmartItem, as well as across an entire test.
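As one example, here is a simplified sketch of the "preponderance of options" method in Python; the option text, pool sizes, and sampling rule are hypothetical:

```python
import random

# Hypothetical option pools for one skill: many SME-written correct
# answers (keys) and many distractors.
KEYS = [
    "They require water to survive.",
    "They respond to stimuli in their environment.",
    "They use energy to maintain themselves.",
]
DISTRACTORS = [
    "They can survive indefinitely without energy.",
    "They are all capable of photosynthesis.",
    "They never interact with other species.",
    "They do not reproduce.",
]

def render_options(rng: random.Random, n_distractors: int = 3):
    """Each rendering samples one key and a few distractors, so the
    option set differs from rendering to rendering."""
    key = rng.choice(KEYS)
    options = rng.sample(DISTRACTORS, n_distractors) + [key]
    rng.shuffle(options)
    return options, key

options, key = render_options(random.Random())
```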
Imagine two test takers are sitting side-by-side and are administered the same SmartItem at the same time. While these test takers have both been assigned the same SmartItem, they will each see a different rendering. The item is the same, but the specific content within each rendering is different.
Even though the content of a SmartItem is variable, the format of each rendering is recognizable and familiar for individual test takers. For example, if the SmartItem uses the multiple-choice format, then all test takers will see a multiple-choice item. Test takers are generally not aware of, nor concerned by, what is happening behind the scenes, or by what other test takers might be experiencing.
In addition to the values of particular variables that change with each rendering of a SmartItem, certain other characteristics change as well.
Because these characteristics are mostly dictated by the content of the skill, the overall range of these differences will likely be small, but not negligible: they may still affect test taker performance and, therefore, the test taker’s score. Learn more in this section.
Creating a SmartItem is not equivalent to using AIG. AIG technology (like this one) is used to inexpensively build a pool of items that can be used to (1) expand pools for adaptive testing and other test designs and (2) replace compromised items and tests. You can learn more about AIG in this article.
SmartItems can be used for these purposes too. However, the primary purpose of SmartItems is to use them on a test to render on-the-fly to test takers. The value of SmartItems in this context is to serve as a preventative security method, not a reactive one. You can learn more about the differences between AIG and SmartItem technology in this article.
A SmartItem is not another item type or item format. A SmartItem is not equivalent conceptually to multiple-choice, DOMC™, or other selected-response or constructed-response formats. Instead, the SmartItem can be applied to any of these. In fact, SmartItem technology must be applied to an existing item format to work at all.
As with any other item, SmartItems can make use of text, audio, video, animation, and simulation files. The most important factor is that the item type and any included media are selected because they are best suited to measure the prescribed skill.
While SmartItems can prevent theft and many types of cheating by covering the entire skill and producing many renderings, it is only by employing the DOMC item format that you can effectively eliminate testwiseness as well. The traditional multiple-choice item type enables and encourages testwiseness, helping examinees who are good at using test-taking cues get a better score. Even on high-quality exams, the testwiseness effect can inflate an individual’s test score by 5% to 10%.
On the other hand, DOMC enables a multiple-choice-based SmartItem to eliminate almost any unfair advantage a test taker may gain because they are test-wise. You can learn more about combined DOMC and SmartItem capabilities in chapters 7 and 8 of this book, or on this page.
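For readers unfamiliar with DOMC, here is a heavily simplified sketch of the discrete-option flow; the option text, termination rule, and scoring below are illustrative assumptions, not the exact DOMC algorithm:

```python
import random

# Hypothetical option pool for one DOMC-style rendering: (text, is_key).
# This sketch assumes exactly one key in the pool.
OPTIONS = [
    ("They require energy to survive.", True),
    ("They are all capable of photosynthesis.", False),
    ("They never respond to stimuli.", False),
]

def administer_domc(rng: random.Random, answer) -> bool:
    """Present options one at a time in random order; the examinee judges
    each one YES or NO without ever seeing the full option list, which
    removes most option-comparison cues that testwiseness exploits."""
    order = OPTIONS[:]
    rng.shuffle(order)
    for text, is_key in order:
        said_yes = answer(text)   # examinee's YES/NO judgment of this option
        if is_key:
            return said_yes       # correct only if they accepted the key
        if said_yes:
            return False          # accepted a distractor: incorrect
    raise ValueError("option pool must contain a key")

# A simulated examinee who accepts only the statement about energy:
print(administer_domc(random.Random(), lambda text: "energy" in text))  # True
```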
These nine properties are what make a SmartItem so “smart.” SmartItems are unlike any other item used on any type of test anywhere in the world. The innovation is a unique concept, born out of need and made possible by new technology.
High-stakes exam questions have traditionally been static and fixed, with exam forms identical for every test taker. After a moment of contemplation, testing professionals usually wonder whether the variability in SmartItem renderings might have an unfair result. They wonder whether the effect of randomness is tracked in such a way that final test scores can be statistically adjusted to be comparable.
Their concern is a valid one. The commitment to fairness rightfully belongs at the center of all measurement efforts in the field of testing and assessment.
In addition to questions about the differences between individual SmartItem renderings, there are concerns about general difficulty. Could one test taker's overall exam end up being more difficult than another's?
The difficulty of individual SmartItems may vary on a test. With that in mind, it is the design of SmartItems—as well as the presence of a significant number of SmartItems on an exam—that makes them fair. Below are the logical, theoretical, practical, and empirical responses to these item variability and difficulty questions.
The Standards for Educational and Psychological Testing does not, in fact, require that items be equally difficult for all test takers. Item difficulty variation is an inseparable part of every test taker's experience—even when test takers view identical items on the same test form. This follows from the concept of "perception" in psychology: an examinee’s perception of an item is affected by many factors, including their reading ability, primary language, familiarity with the testing modality, and more.
“Can test theory support the use of SmartItems?” After a careful and extensive review of test theory literature, the answer to this question is a conclusive "Yes." As described, SmartItems render randomly within the constraints of each skill being measured by the test. This is the single most important property of SmartItems, and it is supported by both classical and modern test theory.
Despite its newness and unfamiliarity, the SmartItem is a significant innovation because it removes critical sources of systematic error, such as test fraud and testwiseness. In essence, the benefits of SmartItems outweigh the costs.
"Do they work?" The question, while only three words, is multi-faceted and can be restated in several ways.
We have designed, run, and collected data from research projects that "prove" SmartItem technology works as promised. The results are very encouraging. While more research is always welcomed, there is enough empirical evidence today to encourage the serious consideration of the SmartItem.
These results are surprising and almost unbelievable to most testing professionals. It is hard to believe that SmartItems, varying as they do, can be described using common item analysis statistics. It is even harder to believe they can contribute to test reliability and validity. But they can, and they do. Learn more in this white paper.
As a case study of those results, SailPoint uses SmartItem technology exclusively for its two certification exams. Since the tests went operational, the SmartItem technology has performed well: test-level reliability and validity evidence supports the use of SmartItems to produce scores for high-stakes certification decisions. Read all the details about the project in this case study and this booklet.
Getting started is a simple process. Whether you integrate with your current system or use Caveon Scorpion™, you’ll need to train your team to design a great SmartItem. This is not a prescribed process; in fact, we fully expect to learn a lot about the application of SmartItem technology from early adopters. Each content universe is unique, and building a SmartItem for it is something of an art.
There are three basic ways to get started experimenting with the SmartItem.
A SmartItem is a self-protecting item treatment that employs proprietary technology to prevent all test theft and almost all forms of cheating. SmartItem technology targets and solves many of the challenges facing testing today, including test theft and fraud, testwiseness, ballooning costs, and unfairness; as a result, it increases the fairness of exams for all test takers. One well-designed SmartItem can render in millions of ways, compute variable changes in real time, and present a different version of the item to each test taker. SmartItem technology performs well psychometrically, and tests composed of SmartItems demonstrate high levels of validity and reliability.
For more than 18 years, Caveon Test Security has driven the discussion and practice of exam security in the testing industry. Today, as the recognized leader in the field, we have expanded our offerings to encompass innovative solutions and technologies that provide comprehensive protection: Solutions designed to detect, deter, and even prevent test fraud.