CAVEON SECURITY INSIGHTS BLOG

Can You Prove Cheating on Tests Using Statistics?

Posted by Dennis Maynes

Introduction

A misquote from a news site concerning additional security announced by the TEA (Texas Education Agency) for the TAKS (Texas Assessment of Knowledge and Skills) caused me to pause and reflect on using statistical evidence to “prove” that someone cheated on a test. The reporter wrote, "Among other security measures, scramble field test questions on tests to provide proof if someone is copying someone else's answer sheet." (Italics added.)

Being well aware of the controversy surrounding the use of statistics alone to detect potential cheating, I immediately doubted the accuracy of the above statement. In fact, in 2007, TEA announced that “the Texas Education Agency today will immediately initiate the following: … analyze scrambled blocks of test questions to detect answer copying…” TEA later clarified that the scrambling would involve only field test items. News outlets were quick to criticize the scrambling plan, but I applauded TEA’s intent to use statistics to detect potential cheating.

Can Statistics Prove Cheating on Exams?

We naturally ask whether statistical evidence can be relied on to detect invalid test scores. Many authors have expressed the opinion that "statistical evidence must be corroborated by eye-witness accounts before making allegations of cheating."

In reality, statistical evidence should be used to assess the validity of a test score, and not to "prove" cheating. Statistics alone can never prove that cheating occurred, because cheating is a combination of behavior and intent. 

What statistics can tell us is that there is sufficient evidence that a score is invalid and should not be trusted. Statistics can also tell us that the evidence for one hypothesis outweighs the evidence for another.
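To make the idea of weighing one hypothesis against another concrete, here is a minimal sketch using a likelihood ratio. Everything in it is an illustrative assumption rather than a method described in this post: the simple binomial model, the function names, and the probabilities of a correct answer under "normal test taking" and under "pre-knowledge" are all hypothetical.

```python
from math import comb


def binomial_likelihood(correct: int, items: int, p_correct: float) -> float:
    """Probability of exactly `correct` right answers on `items` questions
    when each answer is right independently with probability `p_correct`."""
    return comb(items, correct) * p_correct**correct * (1 - p_correct) ** (items - correct)


def likelihood_ratio(correct: int, items: int, p_normal: float, p_preknowledge: float) -> float:
    """How many times more likely the observed score is under the
    pre-knowledge hypothesis than under the normal-test-taking hypothesis."""
    return (binomial_likelihood(correct, items, p_preknowledge)
            / binomial_likelihood(correct, items, p_normal))


# Hypothetical case: 19 of 20 suspected-disclosed items answered correctly.
# The 0.60 and 0.95 success rates are assumed values chosen for illustration.
ratio = likelihood_ratio(correct=19, items=20, p_normal=0.60, p_preknowledge=0.95)
print(f"Evidence favors pre-knowledge by a factor of about {ratio:.0f}")
```

A ratio above 1 means the data fit the second hypothesis better than the first; a ratio below 1 means the opposite. Note that the ratio says nothing about behavior or intent, only about which explanation of the score the data support.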

For example, we may have sufficient evidence to conclude that it is more likely than not that a particular examinee accessed disclosed test content. Based on that foundation, I believe that corroboration of the statistical evidence is unnecessary if the statistics are reliable. But what is reliable statistical evidence?

The Conditions of Reliable Statistical Evidence

Reliable Evidence Is Factual, Objective, Credible, and Defensible

In my opinion, reliable evidence must meet the following conditions:

    1. It must be factual
    2. It must be objective
    3. It must be credible
    4. It must be defensible

Here's how that breaks down. Statistical evidence is:

    1. Factual when it is based on test result data (an actual record of the test event),
    2. Objective when it provides a statistic with a probability statement,
    3. Credible when the statistics have been shown to work because the models accurately depict actual test taking, and
    4. Defensible when the underlying science withstands scrutiny.

Reliable Evidence Must Be Strong

A fifth criterion applies before taking action on a suspected instance of cheating: the evidence must be strong. Statistical evidence is strong when the calculated probabilities are so small that we no longer believe the observed data are the result of normal test taking. Statistics can provide guidance for determining how strong is strong enough to take action, but ultimately the probability threshold (i.e., the required strength of the statistic) is a matter of policy that must be decided by the testing program administrator.
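As a rough illustration of comparing a calculated probability with a policy threshold, the sketch below computes how likely it is that two examinees would share at least a given number of identical answers by chance. The model is a deliberate simplification and an assumption of this example: it treats every item as having the same chance of a coincidental match, which real answer-copying statistics do not. The function names, the match probability, and the threshold value are hypothetical.

```python
from math import comb


def upper_tail_probability(matches: int, items: int, p_match: float) -> float:
    """Probability of observing `matches` or more identical answers on `items`
    questions if each pair of answers matches independently with probability
    `p_match` (a simplified model of normal, independent test taking)."""
    return sum(
        comb(items, j) * p_match**j * (1 - p_match) ** (items - j)
        for j in range(matches, items + 1)
    )


def is_strong_enough(probability: float, threshold: float) -> bool:
    """The evidence counts as strong only when the probability falls below
    the threshold that the testing program has set as a matter of policy."""
    return probability < threshold


# Hypothetical case: 55 identical answers on a 60-item test, with an assumed
# 50% chance of a coincidental match per item and an assumed policy threshold.
POLICY_THRESHOLD = 1e-6
probability = upper_tail_probability(matches=55, items=60, p_match=0.5)
print(f"Chance probability: {probability:.3g}")
print("Strong enough to act on:", is_strong_enough(probability, POLICY_THRESHOLD))
```

The statistic supplies the probability; the choice of threshold stays with the program administrator, which is why it is a separate parameter here.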

The Statistics Are Well-Suited for the Task at Hand

It is important in any statistical investigation to choose statistics that are designed for the task at hand. For example, if the concern is that answer sheets are being modified, then erasure counts should be analyzed. Having analyzed over one hundred data sets for a wide variety of clients, including state Departments of Education, admissions tests, certification programs, and licensure exams, I can state unequivocally that pre-knowledge of item content is currently one of the most prevalent means of cheating on tests. In the heyday of paper-and-pencil testing, answer copying was predominant. Select statistics appropriate to the type of potential cheating you want to detect.

You can view our Ultimate Guide on Data Forensics to learn more. Until next time, may your tests remain secure.

Dennis Maynes
