Caveon Security Insights Blog

Web Crawling vs Web Patrolling (Why Test Programs Need Humans)

Written by Christie Zervos | May 26, 2021 at 8:59 PM

While it is possible to use an automated “web crawling” system to search out your exposed exam content online, human “web patrolling” is the best approach.

Automated Technology vs. The Human Element

We live in an increasingly automated age. Just today, I deposited a check using online banking, ordered my groceries online, followed Google Maps to avoid a traffic jam, was alerted by my car that it needs an oil change, and scheduled a doctor’s appointment through an app on my phone. Automated technology is transforming our way of life and, in many cases, saving us time, money, and headaches. I’ll be the first to admit that I’m a big fan of the role it plays in my life.

However, there are times when relying on automated technology limits rather than helps. Just think about the last time you tried to communicate your unique needs to an automated customer service number. If you are like me, you found yourself yelling “representative” after just a couple of frustrating minutes. In fact, recognizing this limitation, many customer service hotlines have since pulled back their reliance on a purely automated model, and instead use automation to get you in contact with the correct human representative faster and more efficiently. Sometimes, you just need a human to help you.

The same problems you face when battling a robotic voice over the telephone apply to online security. Let me explain how.

Protecting Your Proprietary Test Content From Being Stolen

Regardless of whether you test remotely or in a testing center, every testing organization knows it needs to be worried about its proprietary test content (items and answer keys) being stolen, then shared and sold on the internet. As Jane Austen (sort of) says, “It is a truth universally acknowledged that live test questions are frequently shared online, and unethical test takers find and use them with alarming ease.”

As such, it is the responsibility of every testing organization to do its best to find and remove any and all test content exposed online before it causes irreparable damage. It is at this point that we can learn lessons from ineffective automated customer support systems. While it is possible to use an automated “web crawling” system to search out your exposed exam content, is this really the best idea?

The Automated Web Crawling Experience

For those of you who don’t know, a web crawler (sometimes known as a “spiderbot”) is an automated program that follows links from page to page and indexes what it finds, looking for data such as your leaked test content. In an ideal world, you plug in what you want the crawler to find, then it simply scours the internet until it finds it.
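To make the idea concrete, here is a minimal sketch of that “plug in what you want and let it scour” loop. It is an illustration under stated assumptions, not a description of any real product: the `PAGES` dictionary is a hypothetical stand-in for the web (a real crawler would fetch pages over the network), and the URLs, the sample question, and the function names are all invented for this example.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real crawler would download pages; this stand-in keeps the sketch runnable offline.
PAGES = {
    "site.example/home": ("Welcome to our study forum.",
                          ["site.example/dumps", "site.example/about"]),
    "site.example/about": ("We help candidates prepare.", ["site.example/home"]),
    "site.example/dumps": ("Q17: Which  protocol uses port 443 by default?", []),
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting changes do not hide a match."""
    return " ".join(text.lower().split())


def crawl_for_phrase(seed, phrase, pages):
    """Breadth-first crawl starting at seed; return URLs whose text contains phrase."""
    target = normalize(phrase)
    queue = deque([seed])
    seen = {seed}
    hits = []
    while queue:
        url = queue.popleft()
        text, links = pages.get(url, ("", []))
        if target in normalize(text):
            hits.append(url)
        # Follow every link not visited yet.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return hits


hits = crawl_for_phrase(
    "site.example/home", "Which protocol uses port 443 by default?", PAGES
)
print(hits)  # ['site.example/dumps']
```

Even in this toy form, the sketch only reaches pages that are linked from its starting point, and it only matches the exact wording it was given. A reworded question, or one shared somewhere no link leads, would go unnoticed.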

The Human Web Patrolling Experience

“Web patrolling,” on the other hand, relies on well-trained humans rather than a computer bot to search known hotspots for exposed content and then get it removed.

At first glance, you might think that this is a perfect moment for automated technology to step in and take the place of human workers. But can algorithms really guarantee your online security? Unfortunately, not quite. Web crawlers are limited in distinct and undeniably impactful ways that restrict their ability to effectively search for exposed exam content online.

Five Reasons to Prioritize Human Web Patrollers

As with automated customer service calls, finding stolen exam content is one instance when the human element isn’t just nice; it is needed. Here are five reasons to rely on living, breathing web patrolling experts instead of an automated bot:

  1. They are flexible and adaptable: Automated web crawlers lack the flexibility of human web patrollers. Human web patrollers are capable of adapting seamlessly to not just the rapid evolution of the online sphere (to the sudden popularity of new social media platforms, to new braindump sites that pop up, etc.), but also to the specific needs of the client. No two clients face the same threats or have the same security needs. A team of human web patrollers can quickly and easily adapt their approach to fit your unique specifications.
  2. They go where no bot has gone before: There are certain areas of the internet that cannot be searched and indexed by a bot, particularly when it comes to social media. A bot cannot tell you if your test questions are being shared on WhatsApp or in private chat rooms. Only humans can find, infiltrate, and uncover content shared in these settings. (And believe me, content sharing on these platforms is becoming increasingly popular and dangerous.)
  3. They interact with fellow humans and follow clues: At times, patrolling the web involves a complicated process of piecing together a series of clues that only humans are able to recognize and sniff out as part of their investigation. There are times when uncovering exposed content only occurs with the help of conversations, private messages, and eliciting tips from those “in the know.” Only humans are capable of following these clues and troubleshooting these areas.
  4. They know where to look: You don’t need to search the entire internet to find exposed exam content. The items from your IT certification exam are probably not being shared in a forum for “dog memes” or on an online dating site. More likely, they are found on specific braindump sites, social media pages, and blog posts. A team of well-trained web patrollers knows the most likely places to start their search for your specific content.
  5. They know the ins and outs of the “international” internet: Humans are uniquely capable of navigating the web in various regions and using various languages. The internet of the United States is not, for example, the internet of South Korea, or of Turkey, or of Germany. Patrolling the web in various regions requires understanding foreign search engines and hotspots of online activity, as well as understanding local consumer behavior and etiquette. Experienced web patrollers are familiar with the online ecosystems in specific regions of the world and can search out any exposed exam content using the local language and dialects.

The Best Approach: Human Web Patrollers with Supplemental Automated Web Crawlers

As we march into the era of automation, it behooves us to remember that algorithms have limitations. Of course, that’s not to say that algorithmic, automated technologies should not be used as important analytical and investigative tools in the right context—they are powerful tools and absolutely should be.

However, online security is, and likely always will be, a human enterprise, one that demands flexibility, adaptability, intelligence, and wisdom. Automated technologies still can’t match the benefits that flesh-and-blood human web patrolling professionals provide. And while automated web crawlers can assist, human web patrollers should always be at the center of your web monitoring solution.