My name is Serhii Kalachnikov, and I am a software engineer here at Fulcrum. In my free time I study the capabilities of Node.js, TypeScript, and NestJS. I’m also a strong believer in automation (Puppeteer/Selenium/Postman/Click`em), which is quite useful in my day-to-day workflow.
If a task is done more than 3 times a day, a tool that does the job in one click is simply indispensable.
In this post I am going to tell you about a recaptcha bypass that saves me a lot of time. The article will be useful for those who have many repetitive tasks that can be automated. It will also be of interest to security specialists and developers of high-load systems: building a protection strategy is much easier when you know the bots’ workflow and capabilities.
Why Automate Recaptcha Bypass in the First Place
Cloudflare recently estimated that humanity wastes about 500 years every day proving we are human: find all the buses, click on the traffic lights, add 2 and 7.
An average user sees a captcha at least once every 10 days. An IT specialist, on the other hand, often has to face it hundreds of times per day. Some companies outsource their recaptcha solving, and there are whole departments that do just that: pass recaptchas on behalf of the client company’s employees.
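The daily figure can be sanity-checked with simple arithmetic. The sketch below combines the "once every 10 days" rate from this article with two inputs that are my assumptions, taken from Cloudflare's public estimate as I recall it: about 4.6 billion internet users and roughly 32 seconds per captcha.

```python
# Back-of-the-envelope estimate of human time spent on captchas per day.
INTERNET_USERS = 4.6e9        # assumption: approximate number of internet users
DAYS_BETWEEN_CAPTCHAS = 10    # from the article: one captcha every 10 days
SECONDS_PER_CAPTCHA = 32      # assumption: average time to solve one captcha
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_wasted_per_day(users: float, days_between: float, seconds_each: float) -> float:
    """Total solving time accumulated in a single day, expressed in years."""
    captchas_per_day = users / days_between
    return captchas_per_day * seconds_each / SECONDS_PER_YEAR


years = years_wasted_per_day(INTERNET_USERS, DAYS_BETWEEN_CAPTCHAS, SECONDS_PER_CAPTCHA)
print(f"{years:.0f} years of human time per day")
```

With these inputs the result is a little under 500 years, in the same ballpark as the quoted figure.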
Another case is when you work on an app or web platform. You have to think about the risk of bot attacks, so you have to develop security systems that protect the app without annoying the users.
I’ve been researching the automation of these processes for the last few years. At first it was for a pet project dealing with test automation. Later, at Fulcrum, we were looking for ways to use it for training artificial neural networks. As a result, we developed our own software to identify people wearing masks on the streets. Then we adapted it for recaptcha bypass.
Types of Captcha
The first thing I did was define the captcha types:
- Graphic - the ones where the user has to choose the correct elements among the offered numbers or images
- Anchor - where the website hands out the address and key of a captcha that is created separately from the website
- Program - new captchas which can be passed by the browser. Ideally, the user doesn’t even see them.
The graphic captcha is the simplest type. We don’t see it too often these days, and it is falling further out of use. But some websites still have it, so we can’t dismiss it completely.
These captchas are built directly into the website and exchange keys with the main server. When a user enters the website, they receive a key and solve the captcha with it. Then the user returns the answer together with the key.
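From the website's side, the key exchange described above could look roughly like the sketch below. It is not how any particular captcha provider works: the function names, the key format, and the use of an HMAC to bind the key to the expected answer are all my assumptions, chosen only to illustrate "receive a key, return the answer with the key".

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; it never leaves the server.
SERVER_SECRET = secrets.token_bytes(32)


def _mac(nonce: str, answer: str) -> str:
    """MAC that ties a one-off nonce to the expected answer."""
    message = f"{nonce}:{answer}".encode()
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def issue_challenge(expected_answer: str) -> str:
    """Create the key the visitor receives together with the captcha."""
    nonce = secrets.token_hex(8)
    return f"{nonce}.{_mac(nonce, expected_answer)}"


def verify(key: str, answer: str) -> bool:
    """Check the answer the visitor sends back together with the key."""
    nonce, _, mac = key.partition(".")
    return hmac.compare_digest(mac, _mac(nonce, answer))


key = issue_challenge("7g4k")
print(verify(key, "7g4k"), verify(key, "wrong"))
```

The sketch is stateless and does nothing to stop a key from being reused; a real system would also expire keys and remember which ones were already spent.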
A graphic captcha can also be an anchor one. For example, a user sees a number of images and one of them is upside down. To pass the captcha, the user has not only to identify the “defective” image but also to turn it into the correct position.
One of the most popular examples of this type of captcha is the one provided by Google. I wouldn’t recommend it, though: it’s very easy to bypass with the simplest of neural networks. The pictures in its database are out of date and can be easily replicated. For example, you can take pictures of the traffic lights in your area and “teach” the neural network to recognize them and pass the captcha.
KeyCaptcha is another example. To pass it, the user needs to assemble a small jigsaw puzzle. It's not hard to bypass either: all you need to do is teach the neural network to discern the categories assigned to the images.
Somewhat harder anchor captchas are the ones that offer a number of images and ask the user to perform an action with them: find the upside-down one, find a colorful image among grey ones, or choose the image that doesn’t fit.
How to Bypass Different Types of Captcha
To automate recaptcha bypass we need to train a neural network. There are a number of tools on the market that help with that, as well as with automated testing of a product’s protection. One captcha bypass with such a tool can cost less than a cent, but 3-5 thousand bot entries to a website can add up to a hefty sum. For my automation project I chose CapMonster for $10 per month. I also pay $10-20 per month for a proxy.
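To put these numbers side by side, here is a small cost comparison. The per-solve price of exactly one cent is an assumed upper bound for "less than a cent", and the sketch says nothing about how many solves a subscription actually covers; amounts are kept in whole cents to avoid rounding noise.

```python
# Assumed upper bound for "less than a cent" per solved captcha.
PRICE_PER_SOLVE_CENTS = 1

# Monthly figures from the article, in cents.
SUBSCRIPTION_CENTS = 10 * 100
PROXY_CENTS_RANGE = (10 * 100, 20 * 100)


def pay_per_solve_usd(solves: int, price_cents: int = PRICE_PER_SOLVE_CENTS) -> float:
    """Cost of paying for every solved captcha individually."""
    return solves * price_cents / 100


def flat_monthly_usd(proxy_cents: int, subscription_cents: int = SUBSCRIPTION_CENTS) -> float:
    """Cost of the subscription plus a proxy for one month."""
    return (subscription_cents + proxy_cents) / 100


for solves in (3000, 5000):
    print(f"{solves} solves, pay-per-solve: up to ${pay_per_solve_usd(solves):.2f}")

low, high = (flat_monthly_usd(p) for p in PROXY_CENTS_RANGE)
print(f"subscription + proxy: ${low:.2f}-${high:.2f} per month")
```

At this assumed price a single run of 3-5 thousand entries already costs about as much as a month of the flat setup, or more.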
To bypass the captcha types mentioned above I wrote a rather simple piece of software. It registers or logs in to a website and does the job itself. The code is 3800 lines long. I then uploaded it to the CapMonster software suite.
It took one day of training and 2000 images for the neural network to learn to bypass 98% of graphic captchas. With more than 2000 images the result became worse. But for bot farms even 93% is a good result.
One of the hardest anchor captchas is SolveMedia. It has several levels: easy, average, hard and very hard. The captcha has interwoven symbols that are very difficult to decipher, and even the average level is often too hard for users. I was able to teach the neural network to bypass it with a 61% success rate.
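A 61% success rate per attempt is more useful than it sounds once retries are allowed. The sketch below assumes every attempt is independent and has the same success probability, which is a simplification: real captchas may get harder or block the client after failures.

```python
def success_within(tries: int, p: float) -> float:
    """Probability of at least one success in `tries` independent attempts."""
    return 1 - (1 - p) ** tries


def expected_tries(p: float) -> float:
    """Average number of attempts until the first success (geometric distribution)."""
    return 1 / p


P_SOLVEMEDIA = 0.61  # per-attempt success rate from the article

for k in (1, 2, 3):
    print(f"within {k} tries: {success_within(k, P_SOLVEMEDIA):.1%}")
print(f"expected tries: {expected_tries(P_SOLVEMEDIA):.2f}")
```

Under these assumptions, two attempts succeed about 85% of the time and three about 94% of the time.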
But the pictures were not the hardest part to work with. The hardest was the anchor itself. It took me a while, but in the end it turned out to be a common captcha that was solved in 2-3 tries. You can find the bypass of this anchor here. It’s written in the «Click`Em Project» language but can easily be adapted for other programming languages.
Finally, the program captchas can be passed by most browsers. If there is an issue and the browser can’t solve the captcha, the browser is blocked and the user is asked to pass a regular one. The blocking factors are numerous:
- screen resolution;
- number of entries to the website;
- IP address;
- time zone, etc.
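For those building the protection side, the factors listed above can be combined into a simple risk score that decides when to fall back to a visible captcha. Everything in this sketch is hypothetical: the field names, thresholds, and weights are illustrative and do not describe how any real captcha provider scores visitors.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    screen_width: int
    screen_height: int
    visits_last_hour: int
    ip_flagged: bool             # e.g. the address is on a proxy or abuse list
    browser_utc_offset_min: int  # time zone reported by the browser
    ip_utc_offset_min: int       # time zone inferred from the IP address


def risk_score(v: Visit) -> int:
    """Sum of illustrative weights for each suspicious signal."""
    score = 0
    if v.screen_width < 320 or v.screen_height < 240:
        score += 1  # implausibly small screen resolution
    if v.visits_last_hour > 100:
        score += 2  # too many entries to the website
    if v.ip_flagged:
        score += 2  # suspicious IP address
    if v.browser_utc_offset_min != v.ip_utc_offset_min:
        score += 1  # browser time zone does not match the IP's
    return score


def needs_visible_captcha(v: Visit, threshold: int = 2) -> bool:
    """Fall back to a regular captcha when the score reaches the threshold."""
    return risk_score(v) >= threshold


regular = Visit(1920, 1080, 3, False, 120, 120)
suspicious = Visit(1920, 1080, 400, True, 0, 120)
print(needs_visible_captcha(regular), needs_visible_captcha(suspicious))
```

A single weak signal, such as a time zone mismatch, stays below the threshold here so that ordinary users, travellers for example, are not annoyed without reason.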
To help the browser solve these captchas more easily you can use additional settings: add an extra proxy, change the IP address, or swap the browser.
Which Captcha is the Most Effective
Cloudflare, which switched to hCaptcha last year, was the hardest for me to bypass. You can see the Cloudflare captcha on market.dota2.net, for example. The browser pauses for 4 seconds before entering the website. Overall, this captcha is easy to use and integrate, and it filters most of the unwanted traffic.
It took me about a week to break it. The effectiveness is ensured by two types of captcha used simultaneously: the program and the anchor ones. I had to create a server and add a 3-second delay before the captcha was solved, because a human being would need a few seconds to understand the task before them. The complexity of solving it is ensured by a mass of mixed words. You can find a repository with an hCaptcha solve hack here.
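The need for a delay suggests that answers arriving too quickly look suspicious. A defender could implement such a check along the lines below. This is only a sketch: I don't know how hCaptcha actually measures timing, and the 3-second threshold is simply borrowed from the delay mentioned above.

```python
# Assumed threshold: a human needs a few seconds to read and understand the task.
MIN_HUMAN_SOLVE_SECONDS = 3.0


def solved_too_fast(issued_at: float, answered_at: float,
                    minimum: float = MIN_HUMAN_SOLVE_SECONDS) -> bool:
    """True when the answer arrived sooner than a human plausibly could respond.

    Both timestamps are in seconds and must come from the server's own clock.
    """
    return (answered_at - issued_at) < minimum


print(solved_too_fast(100.0, 100.4))  # answered after 0.4 s
print(solved_too_fast(100.0, 104.2))  # answered after 4.2 s
```

On its own this check is easy to defeat by waiting, so it only makes sense as one signal among several.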
The modern world keeps inventing new ways to verify users’ identity, from biometric authentication to 2FA. Captchas become more and more complex because the bots become smarter. IT companies and developers inevitably have to find new ways to speed up captcha solving to stay effective. Automation is a great way to start.
FAQ: Recaptcha Bypass
What is recaptcha?
It’s a system meant to help websites distinguish between humans and bots trying to access them.
How does recaptcha work?
It asks the user to choose a picture or type a hard-to-decipher word: something that, presumably, only a human being can do.
How to remove recaptcha?
A user can’t remove recaptcha if it’s switched on by the website owner.
What is invisible recaptcha?
It’s a program captcha that can be solved by the browser itself, ideally without the user even seeing it.
How to bypass recaptcha?
Try the ways described in this article.