The World's Only Test Security Blog
Posted by David Foster, Ph.D.
Change drives improvement in every industry, but what are the seven most groundbreaking innovations in assessment history? Each innovation has had a lasting impact, and one has the potential to change the future of testing.
As I’ve thought about the many innovations that have impacted my life (cell phones, medical advances, and transportation, to name a few), I can’t help but turn my attention to my calling of 36 years—assessment.
I have selected seven incredible innovations in assessment that I think have had the biggest impact on the testing world. I've included a brief description, context, and reasoning on why each innovation is one of the greatest of our time. (Note: I will mention if I have personal experience with one of these innovations.)
The multiple-choice question was invented in the early 1900s by Frederick Kelly. He created it to standardize the testing process and reduce teacher scoring errors and bias.
The item was first used on the Kansas Silent Reading Test in 1915. It was quickly followed two years later by the wholesale adoption of the format for the Army Alpha test. You can learn more about the history of the multiple-choice item in this white paper.
The multiple-choice format has remained popular ever since, as anyone giving or taking tests today can see. Few innovations can boast such a long lifespan, and fewer still have aged so little. Below is an example of the multiple-choice item from the Kansas Silent Reading Test.
In 2018, Caveon introduced the SmartItem. The most important innovation in this list, the SmartItem has the potential to change the future of testing as we know it. This is because SmartItem technology completely eliminates threats from item thieves and harvesters, and it protects against almost all forms of cheating. Another advantage of the SmartItem is cost savings on both test development and test administration. To test takers, the SmartItem appears as any other item would.
SmartItem technology is defined by three main characteristics:
The use of SmartItems is supported by psychometric theory (you can learn more about that in this e-book). It is not unusual for a single SmartItem to vary in hundreds of thousands, or even millions, of ways, including in difficulty. Any item type can be converted into a SmartItem. You can learn more about SmartItem technology in this short booklet, or view a live example of a SmartItem here.
The first Optical Mark Recognition (OMR) scanner was created in 1932, followed by three decades of many newer inventions and patents. You can read more about its history here. Today, scanners continue to be used in K-12 and higher education settings where paper-based tests are given in large quantities.
The image below shows a typical OMR sheet and a scanner from the end of the last century. The OMR technologies from that time significantly advanced and solidified the use and popularity of paper-based multiple-choice testing. The OMR equipment allowed student responses to be collected and scored quickly, particularly on end-of-year summative tests.
Item response theory, or IRT, was introduced around 1950 by Frederic Lord, Georg Rasch, and others. Compared to classical test theory, IRT is generally believed to have brought greater sophistication to test analyses, greater ability to demonstrate reliability, and new applications (such as computerized adaptive testing, or CAT). CAT allowed tests to be administered more quickly (using fewer items) and more securely.
In perhaps the first worldwide use of CAT for a high-stakes test, at Novell, a software company certifying the competence of its administrators and engineers, I used IRT analyses and CAT in over one million exams administered globally from 1990 to 1997. Pictured below is a typical IRT-based item information curve.
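For readers who want to see what an item information curve represents, here is a minimal sketch. It assumes the two-parameter logistic (2PL) model, which is only one of several IRT models and is not necessarily the one used at Novell. The function names and the parameters `a` (discrimination) and `b` (difficulty) are illustrative choices, not taken from any particular testing product.

```python
import math


def prob_correct(theta: float, a: float, b: float) -> float:
    """2PL model (an assumption): chance that a test taker with
    ability theta answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical item: discrimination 1.2, difficulty 0.5.
    for theta in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0):
        info = item_information(theta, a=1.2, b=0.5)
        print(f"theta={theta:+.1f}  information={info:.3f}")
```

Under this model the curve peaks where ability equals the item's difficulty, which is why an adaptive test can measure well with fewer items: it tends to pick items that are most informative near the test taker's current ability estimate.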
Some people equate computerized testing with CAT, but computerized testing is a much broader concept. Certainly, moving testing away from the limitations of paper booklets and answer sheets in the 1980s took CAT off the drawing board and into reality. But it brought so much more, even to tests that were not adaptive.
With tests being computerized, scoring was quicker, even instantaneous, as were the decisions based on those scores. Accommodations for disabilities could be made on the fly, such as text-to-speech and larger fonts. Such testing also encouraged new item formats, such as dragging and dropping objects, the use of speech, scoring flexibility, the use of multiple languages in tests, having people complete tasks during the exam, taking the tests at home, and many, many more.
The first computers to provide this flexibility were mainframes controlling so-called “dumb” terminals. Then, networked personal computers grew in popularity in schools and homes. Today, such computers allow tests to be given on cell phones, tablets, and other mobile devices in any location.
As an avid user of, and firm believer in, technology in testing, I invented a new version of the multiple-choice question in 2009. In it, the answer options aren’t all revealed at once but are presented one at a time until the test taker responds either correctly or incorrectly.
The Discrete Option Multiple Choice™ (DOMC) format makes it almost impossible to use testwiseness (the skill of using testing cues to one’s advantage) on multiple-choice questions. It removes a 5%-10% blight on test scores, one that the testing industry seemed to have accepted over the decades as a necessary evil. DOMC also reduces security problems and promotes fairness. It has significant support from scientific research and is used successfully in a number of operational testing programs today.
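The one-at-a-time presentation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Caveon's implementation: it assumes the test taker gives a yes/no response to each option, that options appear in random order, and that the item ends as soon as a response is decisive. The function `administer_domc` and the callback `says_yes` are hypothetical names.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def administer_domc(
    options: Sequence[str],
    key: str,
    says_yes: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Tuple[bool, List[str]]:
    """Show options one at a time; return (scored_correct, options_shown).

    Assumed rules (hypothetical): endorsing the key is correct,
    endorsing a distractor or rejecting the key is incorrect, and
    rejecting a distractor moves on to the next option.
    """
    rng = rng or random.Random()
    order = list(options)
    rng.shuffle(order)  # random presentation order

    shown: List[str] = []
    for option in order:
        shown.append(option)
        endorsed = says_yes(option)
        if option == key:
            return endorsed, shown  # decisive either way
        if endorsed:
            return False, shown  # endorsed a distractor
    return False, shown  # key was never among the options
```

Because the item stops at the first decisive response, the test taker never gets to compare all of the options side by side, which is the comparison that testwise strategies typically rely on.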
Last but certainly not least, this list would not be complete without a proper tribute to the internet.
In the 1990s, I had heard of the new technology referred to as the “World Wide Web.” While I didn’t understand how it worked (nor was I able to predict its usefulness in the field of testing), one event during my time working at Novell showed its clear value. At that time, our company was spending many thousands of dollars collecting testing results from far-flung reaches of the world. In one case, a single weekend of using a modem to collect data from a test site in Poland cost over one thousand dollars in long-distance charges. Then, almost overnight, we used that new thing, the Internet, to achieve the same result at no cost at all!
Of course, today’s tests can be given using the internet (properly called “online testing”) where tests can be created, administered, and proctored, and then assessment data can be stored, analyzed, and much more. The internet has permeated the testing realm just as it has every other industry, and it is as important as it is widespread. And with the infinite possibilities the internet offers, it's no wonder more and more programs are transitioning to online testing. (By the way, if you are one of those programs, be sure to view this page chock-full of helpful resources for online testing.)
There are many valuable innovations in testing (some may not be on this list), but above are seven of the most impactful and important innovations in assessment history.
What assessment innovations have made the greatest impact on you? How have those innovations impacted your experiences? Whether from a student’s position, career capacity, or otherwise, testing affects us all in many ways.
A psychologist and psychometrician, David has spent 37 years in the measurement industry. During the past decade, amid rising concerns about fairness in testing, David has focused on changing the design of items and tests to eliminate the debilitating consequences of cheating and testwiseness. He graduated from Brigham Young University in 1977 with a Ph.D. in Experimental Psychology, and completed a Biopsychology post-doctoral fellowship at Florida State University. In 2003, David co-founded the industry’s first test security company, Caveon. Under David’s guidance, Caveon has created new security tools, analyses, and services to protect its clients’ exams. He has served on numerous boards and committees, including ATP, ANSI, and ITC. David also founded the Performance Testing Council in order to raise awareness of the principles required for quality skill measurement. He has authored numerous articles for industry publications and journals, and has presented extensively at industry conferences.