How to Solve Amazon Captcha When Scraping

If you've ever searched on Amazon, you've probably been asked to enter a CAPTCHA and wondered why. For an individual shopper that is a minor annoyance, but for businesses entering new markets and sellers trying to grow sales, collecting Amazon data is critical. As soon as you scale your scrapers from a couple of pages to even a few dozen, CAPTCHAs become a nightmare. This article shows a simple, effective way to solve CAPTCHAs while scraping Amazon product data, which can give you a competitive advantage in your industry.

Understanding Captcha

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. It is a security measure used by websites to differentiate between real human users and automated bots. CAPTCHAs typically involve presenting users with challenges that are easy for humans to solve but difficult for computers. These challenges can include tasks such as identifying distorted letters or numbers, selecting specific images from a set, or solving simple puzzles. By requiring users to successfully complete a CAPTCHA, websites can ensure that interactions on their platforms are performed by humans and not automated programs.

Amazon's Captcha Measures against Scraping

Amazon, a popular e-commerce platform, takes various measures to protect the integrity of its website, including CAPTCHAs that detect and block automated scraping. When scraping Amazon, you may encounter CAPTCHA challenges that must be solved before you can continue. Amazon currently uses two main CAPTCHA types: FunCaptcha and image-to-text (ImageToText).

FunCaptcha is a type of CAPTCHA developed by Arkose Labs. Unlike traditional CAPTCHAs, FunCaptcha uses interactive puzzles and games to differentiate between humans and bots. These puzzles are designed to be engaging for humans but difficult for bots to solve.

Image-to-text, also known as optical character recognition (OCR), is a technology that converts printed or handwritten text within an image into machine-readable text. It involves using algorithms and computer vision techniques to analyze the visual patterns and structures of characters in an image and translate them into editable and searchable text.

Solving Amazon's Captchas with CapSolver

CapSolver is a widely used, enterprise-grade CAPTCHA-solving service that specializes in Amazon CAPTCHAs and emphasizes accuracy and speed. The steps below show how to solve each Amazon CAPTCHA type.

Solving Amazon FunCaptcha

Create Task

Create a task with the createTask method.

Task Object Structure

Properties	Type	Required	Description
type	String	Required	`FunCaptchaTaskProxyLess`
websiteURL	String	Required	URL of the page that uses FunCaptcha; generally a fixed value. (Ex: https://google.com)
websitePublicKey	String	Required	The domain public key, rarely updated. (Ex: E8A75615-1CBA-5DFF-8031-D16BCF234E10)
funcaptchaApiJSSubdomain	String	Optional	A special subdomain of funcaptcha.com, from which the JS captcha widget should be loaded. Most FunCaptcha installations work from shared domains.
data	String	Optional	Additional parameter that some FunCaptcha implementations require. Use this property to send the "blob" value as a stringified JSON value, for example: {"blob":"HERE_COMES_THE_blob_VALUE"}
proxy	String	Optional	Proxy to use for the task. See the CapSolver documentation on using proxies.
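Because `data` is a string that itself contains JSON, the blob is serialized twice: once into the `data` string and again when the whole request is encoded. The following is a minimal sketch of building the task object from the table above. The helper name `build_funcaptcha_task` and the example URL are illustrative assumptions, not part of the CapSolver API.

```python
import json


def build_funcaptcha_task(website_url, public_key, blob=None, subdomain=None):
    """Build the `task` object from the table above (hypothetical helper)."""
    task = {
        "type": "FunCaptchaTaskProxyLess",
        "websiteURL": website_url,
        "websitePublicKey": public_key,
    }
    if subdomain is not None:
        task["funcaptchaApiJSSubdomain"] = subdomain
    if blob is not None:
        # `data` must be a string holding JSON, not a nested object.
        task["data"] = json.dumps({"blob": blob})
    return task


task = build_funcaptcha_task(
    "https://example.com",  # placeholder page URL
    "E8A75615-1CBA-5DFF-8031-D16BCF234E10",
    blob="HERE_COMES_THE_blob_VALUE",
)
print(json.dumps(task, indent=2))
```

Optional fields are left out entirely when they are not supplied, rather than sent as empty strings.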

Example Request

POST https://api.capsolver.com/createTask
Host: api.capsolver.com
Content-Type: application/json

{
    "clientKey": "YOUR_API_KEY_HERE",
    "task": {
        "type":"FunCaptchaTaskProxyLess", //Required
        "websiteURL":"", //Required
        "websitePublicKey":"", //Required
        "data": "{\"blob\": \"flaR60YY3tnRXv6w.l32U2KgdgEUCbyoSPI4jOxU...\"}" // Optional
    }
}

After you submit the task, a successful response contains a task ID. If you did not receive a task ID, check the returned errorCode against CapSolver's full list of errors.

Example Response

{
    "errorId": 0,
    "status": "idle",
    "taskId": "61138bb6-19fb-11ec-a9c8-0242ac110006"
}
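The request and response above could be driven from Python roughly as follows. This is a sketch using only the standard library. The function names `post_json` and `create_task` are illustrative, and the `errorCode` and `errorDescription` fields are assumed to be present on failures, as they are in the ImageToText example response in this document.

```python
import json
import urllib.request

CREATE_TASK_URL = "https://api.capsolver.com/createTask"


def post_json(url, payload, timeout=30):
    """POST a JSON body and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Note: HTTP error statuses raise urllib.error.HTTPError here.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def create_task(client_key, task, post=post_json):
    """Submit a task and return its taskId, raising if the API reports an error."""
    reply = post(CREATE_TASK_URL, {"clientKey": client_key, "task": task})
    if reply.get("errorId", 0) != 0:
        raise RuntimeError(
            f"createTask failed: {reply.get('errorCode')} "
            f"{reply.get('errorDescription')}"
        )
    return reply["taskId"]


# Example call (performs a real network request, so it is left commented out):
# task_id = create_task("YOUR_API_KEY_HERE", {"type": "FunCaptchaTaskProxyLess",
#                                             "websiteURL": "", "websitePublicKey": ""})
```

The `post` parameter exists so the transport can be swapped out, for instance for a stub in tests or for a different HTTP client.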

Getting Result

Use the getTaskResult method to get the recognition result.

Depending on system load, results arrive within 1 to 20 seconds.

Example Request

POST https://api.capsolver.com/getTaskResult
Host: api.capsolver.com
Content-Type: application/json

{
    "clientKey": "YOUR_API_KEY",
    "taskId": "61138bb6-19fb-11ec-a9c8-0242ac110006"
}

Example Response

{
    "errorId": 0,
    "solution": {
        "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "token": "3AHJ_q25SxXT-pmSeBXjzScW-EiocHwwpwqtk1QXlJnGnU......"
    },
    "status": "ready"
}
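Since the result can take up to about 20 seconds, a client usually polls getTaskResult until the status is "ready". Below is a minimal polling sketch. The function name `wait_for_solution`, the one-second interval, and the 60-second timeout are assumptions. `post` stands for any callable that takes a URL and a payload and returns the decoded JSON reply. Any status other than "ready" is treated here as still pending, which is also an assumption.

```python
import time

GET_RESULT_URL = "https://api.capsolver.com/getTaskResult"


def wait_for_solution(client_key, task_id, post,
                      interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll getTaskResult until the task is ready, then return its `solution`."""
    deadline = time.monotonic() + timeout
    while True:
        reply = post(GET_RESULT_URL, {"clientKey": client_key, "taskId": task_id})
        if reply.get("errorId", 0) != 0:
            raise RuntimeError(f"getTaskResult failed: {reply.get('errorCode')}")
        if reply.get("status") == "ready":
            return reply["solution"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not ready after {timeout} seconds")
        # Not ready yet: wait before asking again.
        sleep(interval)
```

For the FunCaptcha task, the returned dictionary holds the `token` and `userAgent` fields shown in the example response.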

Solving Amazon ImageToText

Create Task

Create the task with the createTask method.

Task Object Structure

Note that this type of task returns the task execution result directly after createTask, rather than getting it asynchronously through getTaskResult.

Properties	Type	Required	Description
type	String	Required	ImageToTextTask
websiteURL	String	Optional	Page source url to improve accuracy
body	String	Required	Base64-encoded content of the image (no newlines, and no `data:image/*;base64,` prefix)
module	String	Optional	Specifies the module. Currently, the supported modules are common and queueit
score	Float	Optional	`0.8 ~ 1`. Required matching degree. If the recognition score is not within this range, no balance is deducted
case	Boolean	Optional	Whether recognition is case sensitive
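The `body` value has to be bare Base64: a single line with no data-URI prefix. A short sketch of preparing it, either from raw image bytes or from a data URI copied out of a page, could look like this. Both helper names are illustrative.

```python
import base64


def image_to_body(image_bytes):
    """Encode raw image bytes as single-line Base64 text."""
    return base64.b64encode(image_bytes).decode("ascii")


def strip_data_uri(value):
    """Drop a leading 'data:image/...;base64,' prefix and any whitespace or newlines."""
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    return "".join(value.split())


# The first four bytes of a JPEG file, used here as a tiny stand-in for an image.
print(image_to_body(b"\xff\xd8\xff\xe0"))
print(strip_data_uri("data:image/jpeg;base64,/9j/\n4A=="))
```

`base64.b64encode` never inserts line breaks, so its output can be used as `body` directly.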

Example Request

POST https://api.capsolver.com/createTask
Host: api.capsolver.com
Content-Type: application/json

{
  "clientKey": "YOUR_API_KEY",
  "task": {
    "type": "ImageToTextTask",
    "websiteURL": "https://xxxx.com",
    // Optional: OCR module for single-image recognition.
    // Defaults to common.
    "module": "queueit",
    // base64 encoded image
    "body": "/9j/4AAQSkZJRgABA......"
  }
}

Example Response

{
  "errorId": 0,
  "errorCode": "",
  "errorDescription": "",
  "status": "ready",
  "solution": {
    "text": "44795sds"
  },
  "taskId": "2376919c-1863-11ec-a012-94e6f7355a0b"
}
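Because ImageToTextTask returns its result directly from createTask, there is no polling step: the recognized text can be read from the first reply. The sketch below builds the task object and extracts the text. The helper names `build_image_task` and `extract_text` are illustrative, and treating any status other than "ready" as a failure is an assumption.

```python
def build_image_task(body, module="common", website_url=None):
    """Build an ImageToTextTask object from a Base64 image body."""
    task = {"type": "ImageToTextTask", "body": body, "module": module}
    if website_url is not None:
        task["websiteURL"] = website_url
    return task


def extract_text(reply):
    """Return the recognized text from a createTask reply, or raise on failure."""
    if reply.get("errorId", 0) != 0:
        raise RuntimeError(
            f"ImageToTextTask failed: {reply.get('errorCode')} "
            f"{reply.get('errorDescription')}"
        )
    if reply.get("status") != "ready":
        raise RuntimeError(f"unexpected status: {reply.get('status')!r}")
    return reply["solution"]["text"]


# The example response shown above, used here in place of a live API reply.
example_reply = {
    "errorId": 0,
    "errorCode": "",
    "errorDescription": "",
    "status": "ready",
    "solution": {"text": "44795sds"},
    "taskId": "2376919c-1863-11ec-a012-94e6f7355a0b",
}
print(extract_text(example_reply))
```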

ERIZOAT/Amazon-Captcha-Solving
