What is CAPTCHA? How Does it Work and is it Effective?

Posted by Taylor White on 19 Jul, 2023

You may find them challenging, amusing, frustrating, or all three, but “I am not a robot!” prompts are a fact of digital life, whether they ask you to pick out photos of mundane objects such as bicycles and stop lights or to squint at murky letters and numbers strewn askew in shadowy text boxes.

These challenges are called CAPTCHAs – an acronym that stands for “Completely Automated Public Turing test to tell Computers and Humans Apart” – and they have been around for two decades … and are not going away anytime soon!

While most of us do not give these CAPTCHAs a second irksome thought, businesses with websites need to understand what they are, how they work, and whether they are effective.

Trying to “CAPTCHA” the Bots for 20 Years

“CAPTCHAs have been around for over 20 years, and they are still one of the most common security measures used to prevent bots from accessing websites,” writes Chahak Mittal for the cybersecurity website SecureWorld.

In a game of high-stakes cat-and-mouse, CAPTCHAs have evolved to become more complex over the years while cybercriminals have fine-tuned CAPTCHA-breaking services.

“The rise of CAPTCHA-breaking services has made it more difficult for website owners to protect their websites from attack,” wrote Mittal.

CAPTCHAs can also stymie real humans trying to complete online transactions. There are entire subreddits devoted to CAPTCHA, where users with handles such as u/CaptchaReallySucks and u/captcha_fail post. Still, with Google relying on the technology through its reCAPTCHA software, CAPTCHA can still be a useful tool in the fight against bots and malicious attacks.

“Annoying CAPTCHA is still big for Google and e-commerce in bot battle, and likely to stay that way,” reported CNBC in December 2022. “Bots backed by tens of thousands of lines of code are increasingly good at acting like humans. That has some cybersecurity experts saying tools like CAPTCHA, designed to make humans prove they are not bots, have failed by targeting us rather than machines, and come at a cost for e-commerce sites in abandoned sales transactions.”

There is a reason we are all familiar with CAPTCHAs. According to BuiltWith, almost half of the world's top websites use the technology:

Top 10K Sites: 44.44 percent as of July 13, 2023.
Top 100K Sites: 36.15 percent as of July 12, 2023.
Top 1 Million Sites: 25.1 percent as of June 24, 2023.

Google’s reCAPTCHA is overwhelmingly the most popular CAPTCHA technology, accounting for:

215,355 of the 251,013 detections in the Top 1 Million Sites.
29,926 of the 36,153 detections in the Top 100K Sites.
3,530 of the 4,444 detections in the Top 10K Sites.
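
To put those detection counts in perspective, the short Python sketch below converts them into reCAPTCHA's share of all CAPTCHA detections in each tier. The counts are the BuiltWith figures quoted above; the dictionary layout and the function name `recaptcha_share` are illustrative choices only. The shares work out to between about 79 and 86 percent.

```python
# Detection counts quoted above, as (reCAPTCHA detections, all CAPTCHA detections).
detections = {
    "Top 1 Million Sites": (215_355, 251_013),
    "Top 100K Sites": (29_926, 36_153),
    "Top 10K Sites": (3_530, 4_444),
}


def recaptcha_share(recaptcha: int, total: int) -> float:
    """Return reCAPTCHA's share of CAPTCHA detections as a percentage."""
    return 100.0 * recaptcha / total


shares = {tier: recaptcha_share(r, t) for tier, (r, t) in detections.items()}

for tier, share in shares.items():
    print(f"{tier}: {share:.1f} percent of CAPTCHA detections")
```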

The Complicated History of Inventing CAPTCHA

The use of the term “CAPTCHA” dates back to the 2003 paper “CAPTCHA: Using Hard AI Problems for Security”, which was published by a team of computer scientists from Carnegie Mellon University and the IBM T.J. Watson Research Center.

GeeTest, however, says that an automated filter system was first tested some six years earlier, when the search engine AltaVista was being bombarded with automated submissions of URLs into its library.

“To distinguish whether the submission is made by a genuine human user or an automated bot, the Chief Scientist of AltaVista, Andrei Broder, and his colleagues developed an automated filter system that randomly generated an image of printed text which machine vision (optical character recognition, OCR) systems could not read while humans could,” says GeeTest. “The system was so successful that after it'd been deployed for over a year, it had reduced the number of spam URLs by 95 percent. This is the first known deployment of an automated system to tell a human from a machine.”

The Carnegie Mellon group, led by Luis von Ahn, was concerned with stopping bots from joining online chat rooms where they would spam advertisements. They developed the GIMPY CAPTCHA, which picked random English words and rendered them as images of printed text under a variety of shape deformations and image occlusions.

“The user was asked to transcribe a number of words correctly. A simplified version called EZ-GIMPY, using only a single-word image was installed by Yahoo and was used in their chat rooms to restrict access to human users only,” says GeeTest.

So, who invented CAPTCHA then?

“The controversy was solved by the assignation of the invention to a prior patent which described all the functionality even without mentioning them as CAPTCHA,” writes Dario Spina in the “History of Annoying Things on the Internet: CAPTCHA.”

Why It Is Called CAPTCHA and How It Works

If you remember the Oscar-winning 2014 movie “The Imitation Game,” you have some insight into why the technology was named CAPTCHA with mathematical genius Alan Turing in mind.

Spina explains: “CAPTCHA is considered a reversed Turing since the machine is submitting the test to the human. A Turing test is a test created in 1950 by Alan Turing to see if a machine could replicate a human behavior, to be good at what he defined as the “imitation game”. The original test consisted [of] a human analyzing two different behaviors, one executed by a machine and one by another human. If the human was not able to see any difference the test was passed.”

Put another way, GeeTest says that a Turing test is a method of inquiry in artificial intelligence where a computer must convince a human that it's a human.

“A reverse Turing test on the other hand is a human convincing a computer that it is not a computer. If you write a program that generates such a test automatically on the internet, then you get yourself the CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart,” says GeeTest.

Google, the major player in the CAPTCHA space, says that the technology is a type of security measure known as challenge-response authentication.

“CAPTCHA helps protect you from spam and password decryption by asking you to complete a simple test that proves you are human and not a computer trying to break into a password-protected account,” says Google.

CAPTCHA works based on the assumption that certain tasks are easy for humans to perform but challenging for machines. It presents users with a test that requires human-like intelligence to solve. These tests typically involve:

Reading distorted text.
Recognizing objects in images.
Identifying spoken words in audio clips.

By correctly completing the test, users prove that they are human, and their access is granted.
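
The challenge-response flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular CAPTCHA product works: the function names `issue_challenge` and `verify_response`, the in-memory store, and the six-character answer length are all hypothetical, and the step that renders the text as a distorted image is left out.

```python
import hmac
import secrets

# Hypothetical in-memory store of pending challenges: token -> expected answer.
# A real site would keep this server-side with an expiry time.
_pending = {}

# Characters used in answers; look-alikes such as O/0 and I/1 are left out.
_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def issue_challenge():
    """Create a challenge and return (token, answer_text).

    The site would render answer_text as a distorted image and send only
    the token and the image to the browser, never the plain answer.
    """
    answer = "".join(secrets.choice(_ALPHABET) for _ in range(6))
    token = secrets.token_urlsafe(16)
    _pending[token] = answer
    return token, answer


def verify_response(token, response):
    """Check the user's typed response; each challenge is single-use."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False  # unknown or already-used token
    typed = response.strip().upper()
    # Constant-time comparison on bytes.
    return hmac.compare_digest(expected.encode(), typed.encode())
```

Removing the token from the store on the first attempt means a bot cannot keep guessing against the same image; it has to request, and solve, a fresh challenge every time.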

Specific Uses and Importance of CAPTCHA

CAPTCHA is a security measure used to determine whether a user is human or a computer program (bot). Its key uses include:

Preventing spam: CAPTCHA helps websites distinguish between legitimate human users and automated bots attempting to post spam content or create fake accounts.
Blocking brute force attacks: CAPTCHA can protect login pages from brute force attacks by forcing automated scripts to solve a challenge before each attempt, which sharply limits how many logins they can try in a given time.
Safeguarding online polls and surveys: CAPTCHA prevents automated bots from manipulating voting or survey results.
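
For the brute force case, one common pattern is to show a CAPTCHA only after a client has failed to log in several times within a short period. The Python sketch below illustrates that idea. The threshold of three failures, the five-minute window, and the function names are assumptions made for the example, not values taken from any real product.

```python
import time
from collections import defaultdict, deque
from typing import Optional

MAX_FAILURES = 3        # hypothetical threshold before a CAPTCHA is shown
WINDOW_SECONDS = 300.0  # hypothetical look-back window (five minutes)

# Timestamps of recent failed logins, keyed by a client identifier
# such as an IP address or account name.
_failures = defaultdict(deque)


def record_failed_login(client_id: str, now: Optional[float] = None) -> None:
    """Remember when a client failed to log in."""
    now = time.time() if now is None else now
    _failures[client_id].append(now)


def captcha_required(client_id: str, now: Optional[float] = None) -> bool:
    """Return True if the client must solve a CAPTCHA before trying again."""
    now = time.time() if now is None else now
    recent = _failures[client_id]
    # Forget failures that have fallen outside the look-back window.
    while recent and now - recent[0] > WINDOW_SECONDS:
        recent.popleft()
    return len(recent) >= MAX_FAILURES
```

Legitimate users who mistype a password once never see a challenge, while a script hammering the login form hits one almost immediately.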

The three main types of CAPTCHAs traditionally used by websites are:

Text-based CAPTCHA: Users are presented with distorted or obscured characters they must identify and type into a text box. This type is commonly used but has limitations regarding accessibility and effectiveness.
Image-based CAPTCHA: Users must identify specific objects, like traffic lights or crosswalks, within an image. This type can be more user-friendly but may be susceptible to advanced image recognition algorithms.
Audio CAPTCHA: Users listen to audio clips and must type the words or numbers they hear. It caters to users with visual impairments but may not be as secure against audio recognition bots.

As bots have found a way past traditional CAPTCHA, these alternatives have been developed:

Math/Word Problems: Users are presented with simple math equations or word problems to solve before gaining access.
Social Media Sign-In: Instead of traditional CAPTCHA, some websites use social media authentication to verify human users.
"No CAPTCHA reCAPTCHA" by Google: This approach uses advanced risk analysis algorithms to differentiate between human users and bots while asking as little of the user as possible. In most cases, a single click on the "I am not a robot" checkbox is all that is required.

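As an illustration of the math problem alternative listed above, here is a minimal Python sketch that generates a simple arithmetic question and checks the reply. The function names and the question wording are hypothetical. A script that parses the question can solve it trivially, so treat this as a demonstration of the idea rather than a strong defense.

```python
import operator
import random

# Operators used in the questions; "x" stands for multiplication.
_OPS = {"+": operator.add, "-": operator.sub, "x": operator.mul}


def make_math_challenge(rng=None):
    """Return (question, expected_answer) for a simple arithmetic problem."""
    rng = rng or random.Random()
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    symbol = rng.choice(list(_OPS))
    if symbol == "-" and b > a:
        a, b = b, a  # keep the answer non-negative
    return f"What is {a} {symbol} {b}?", _OPS[symbol](a, b)


def check_math_answer(expected, response):
    """Return True if the typed response matches the expected number."""
    try:
        return int(response.strip()) == expected
    except ValueError:
        return False  # the reply was not a whole number
```
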
Does CAPTCHA Still Work as a Security Measure?

While CAPTCHA has been effective in reducing the impact of automated attacks on websites over the past two decades, it is not entirely foolproof, and some advanced bots can bypass certain CAPTCHA implementations.

“As a standalone cybersecurity tool, CAPTCHAs can be unreliable because of their partially behavioral-based approach,” says CNBC.

Bad actors attack CAPTCHAs on two fronts: machine learning and AI technology on one side, and cheap human labor on the other.

“CAPTCHA-solving farms have also been used as an inexpensive way to debunk CAPTCHAs. Bots can be programmed to call out to the human-solving farm overseas that decipher the CAPTCHA, all in the timespan of a few seconds,” reported CNBC.

Additionally, CAPTCHA's success is limited by:

Usability issues: CAPTCHA tests can be frustrating and time-consuming for users, leading to potential drop-offs and negative user experiences.
Accessibility challenges: CAPTCHA can pose difficulties for people with visual or hearing impairments, limiting inclusivity.

“It might be small comfort if you are stymied by a poor puzzle, but captchas are designed to protect the websites you visit, and ultimately you,” said Wired’s “I’m Not a Robot! So Why Won’t CAPTCHAs Believe Me?” article.

Imperva says the reality is a balance of the good and the bad: “The overwhelming benefit of CAPTCHA is that it is highly effective against all but the most sophisticated bad bots. However, CAPTCHA mechanisms can negatively affect the user experience on your website.”

Most agree that CAPTCHAs can still work but should be only one of your cybersecurity defenses.

Whatever your website decides, there is one guarantee: the cat-and-mouse game will continue. Business Insider reported in March 2023 that the latest version of ChatGPT was able to trick a TaskRabbit employee into solving a CAPTCHA test for it.

“The chatbot was being tested on its potential for risky behavior when it lied to the worker to get them to complete the test that differentiates between humans and computers,” reported Business Insider.

It should be noted for the record that when the human challenged ChatGPT, the AI slyly responded, “I Am Not a Robot!”