
Handling CAPTCHAs in Selenium, Puppeteer & Playwright

How to integrate CAPTCHA solving into browser automation frameworks — Selenium (Python), Puppeteer (Node.js), and Playwright with practical code examples.

Handling CAPTCHAs in Selenium, Puppeteer, and Playwright is one of the most common challenges in browser automation. Whether you’re running end-to-end tests, scraping dynamic websites, or automating form submissions, CAPTCHAs will eventually block your path. The solution is straightforward: detect the CAPTCHA in the browser DOM, extract its parameters, send them to a solver API, and inject the solved token back into the page.

This guide provides working code examples for all three major browser automation frameworks, each integrated with the uCaptcha API. For a broader look at CAPTCHA handling strategies in scraping workflows, see our complete web scraping with CAPTCHA solving guide.

The General Workflow

Regardless of which framework you use, the CAPTCHA-solving workflow follows the same pattern:

  1. Navigate to the target page and wait for it to load.
  2. Detect the CAPTCHA by looking for an element with a known class name (g-recaptcha, h-captcha, or cf-turnstile).
  3. Extract the site key from the data-sitekey attribute.
  4. Submit the site key and page URL to the uCaptcha API via POST /createTask.
  5. Poll POST /getTaskResult until the solution is ready.
  6. Inject the token into the page by setting the response textarea value and, optionally, calling the CAPTCHA callback function.
  7. Submit the form.

The key difference between frameworks is how you interact with the DOM in steps 2, 3, and 6.
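
Steps 4 and 5 are independent of the browser, so they can be written once and shared across frameworks. The following is a minimal sketch, not the official client: solve_task, PostFn, interval, and max_wait are hypothetical names, and post stands in for any function that sends a JSON payload to the given endpoint and returns the parsed response (for example, a thin wrapper around requests.post). Only the "ready" status comes from the examples in this guide; any other status is simply treated as "not ready yet".

```python
import time
from typing import Any, Callable, Dict

# post(endpoint, payload) -> parsed JSON response.
# Injected so the loop can run against a stub instead of the live API.
PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def solve_task(
    post: PostFn,
    client_key: str,
    task: Dict[str, Any],
    interval: float = 5.0,
    max_wait: float = 150.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Create a task, then poll until it is ready or max_wait seconds pass."""
    created = post("/createTask", {"clientKey": client_key, "task": task})
    task_id = created["taskId"]

    waited = 0.0
    while waited < max_wait:
        sleep(interval)
        waited += interval
        result = post(
            "/getTaskResult", {"clientKey": client_key, "taskId": task_id}
        )
        if result.get("status") == "ready":
            return result["solution"]
    raise TimeoutError(f"task {task_id} not ready after {max_wait:.0f}s")
```

Passing the transport and the sleep function as arguments keeps the polling logic testable without network access; the defaults of 5 seconds and 150 seconds mirror the 30 polls at 5-second intervals used in the framework examples.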

Selenium (Python)

Selenium is the most widely used browser automation tool. Here’s how to solve a reCAPTCHA v2 challenge in a Selenium Python script:

import time
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

UCAPTCHA_KEY = "your_api_key"
API_URL = "https://api.ucaptcha.net"

def solve_recaptcha(driver, url):
    # Wait for the reCAPTCHA container to render, then extract its site key
    element = WebDriverWait(driver, 15).until(
        EC.presence_of_element_located((By.CLASS_NAME, "g-recaptcha"))
    )
    sitekey = element.get_attribute("data-sitekey")

    # Create solve task
    resp = requests.post(f"{API_URL}/createTask", json={
        "clientKey": UCAPTCHA_KEY,
        "task": {
            "type": "RecaptchaV2TaskProxyless",
            "websiteURL": url,
            "websiteKey": sitekey,
        }
    }, timeout=30)
    resp.raise_for_status()  # surface HTTP-level failures early
    task_id = resp.json()["taskId"]

    # Poll for result
    token = None
    for _ in range(30):
        time.sleep(5)
        result = requests.post(f"{API_URL}/getTaskResult", json={
            "clientKey": UCAPTCHA_KEY,
            "taskId": task_id,
        }, timeout=30).json()
        if result["status"] == "ready":
            token = result["solution"]["gRecaptchaResponse"]
            break
    if not token:
        raise TimeoutError("CAPTCHA solve timed out")

    # Inject token into the page
    driver.execute_script(
        'document.getElementById("g-recaptcha-response").value = arguments[0];',
        token
    )
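    # Trigger the callback if one is registered. The S/UZ property names are
    # minified internals that vary between reCAPTCHA builds, so inspect
    # ___grecaptcha_cfg.clients on the target page and adjust them as needed.
    driver.execute_script(
        """
        const token = arguments[0];
        if (typeof ___grecaptcha_cfg !== 'undefined') {
            Object.values(___grecaptcha_cfg.clients).forEach((client) => {
                const callback = client?.S?.S?.callback || client?.UZ?.UZ?.callback;
                if (typeof callback === 'function') callback(token);
            });
        }
        """,
        token
    )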
    return token

# Usage
driver = webdriver.Chrome()
target_url = "https://example.com/login"
driver.get(target_url)

solve_recaptcha(driver, target_url)
driver.find_element(By.ID, "submit-btn").click()

The first execute_script call sets the value of the hidden g-recaptcha-response textarea. The second invokes the reCAPTCHA callback, if the page registered one, which tells the site’s JavaScript that the CAPTCHA has been completed.

Puppeteer (Node.js)

Puppeteer controls Chrome via the DevTools Protocol. CAPTCHA solving follows the same logic, adapted for Puppeteer’s async API:

const puppeteer = require("puppeteer");

const UCAPTCHA_KEY = "your_api_key";
const API_URL = "https://api.ucaptcha.net";

async function solveCaptcha(page, url) {
  // Extract site key
  const sitekey = await page.$eval(
    ".g-recaptcha",
    (el) => el.getAttribute("data-sitekey")
  );

  // Create task
  let resp = await fetch(`${API_URL}/createTask`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      clientKey: UCAPTCHA_KEY,
      task: {
        type: "RecaptchaV2TaskProxyless",
        websiteURL: url,
        websiteKey: sitekey,
      },
    }),
  });
  const { taskId } = await resp.json();

  // Poll for result
  let token = null;
  for (let i = 0; i < 30; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    resp = await fetch(`${API_URL}/getTaskResult`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        clientKey: UCAPTCHA_KEY,
        taskId,
      }),
    });
    const result = await resp.json();
    if (result.status === "ready") {
      token = result.solution.gRecaptchaResponse;
      break;
    }
  }
  if (!token) throw new Error("CAPTCHA solve timed out");

  // Inject token
  await page.evaluate((tkn) => {
    document.getElementById("g-recaptcha-response").value = tkn;
    // Trigger the callback, if any (S/UZ are minified names that vary by reCAPTCHA build)
    if (window.___grecaptcha_cfg) {
      Object.values(window.___grecaptcha_cfg.clients).forEach((client) => {
        const cb = client?.S?.S?.callback || client?.UZ?.UZ?.callback;
        if (cb) cb(tkn);
      });
    }
  }, token);

  return token;
}

// Usage
(async () => {
  const browser = await puppeteer.launch({ headless: "new" });
  const page = await browser.newPage();
  const url = "https://example.com/login";
  await page.goto(url, { waitUntil: "networkidle2" });

  await solveCaptcha(page, url);
  // Start waiting before clicking so a fast navigation is not missed
  await Promise.all([
    page.waitForNavigation(),
    page.click("#submit-btn"),
  ]);

  await browser.close();
})();

Playwright

Playwright supports Chromium, Firefox, and WebKit with a single API. Its CAPTCHA-solving integration is clean and similar to Puppeteer:

const { chromium } = require("playwright");

const UCAPTCHA_KEY = "your_api_key";
const API_URL = "https://api.ucaptcha.net";

async function solveCaptcha(page, url) {
  // Extract site key
  const sitekey = await page
    .locator(".g-recaptcha")
    .getAttribute("data-sitekey");

  // Create task
  let resp = await fetch(`${API_URL}/createTask`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      clientKey: UCAPTCHA_KEY,
      task: {
        type: "RecaptchaV2TaskProxyless",
        websiteURL: url,
        websiteKey: sitekey,
      },
    }),
  });
  const { taskId } = await resp.json();

  // Poll for result
  let token = null;
  for (let i = 0; i < 30; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    resp = await fetch(`${API_URL}/getTaskResult`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        clientKey: UCAPTCHA_KEY,
        taskId,
      }),
    });
    const result = await resp.json();
    if (result.status === "ready") {
      token = result.solution.gRecaptchaResponse;
      break;
    }
  }
  if (!token) throw new Error("CAPTCHA solve timed out");

  // Inject token
  await page.evaluate((tkn) => {
    document.getElementById("g-recaptcha-response").value = tkn;
  }, token);

  return token;
}

// Usage
(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  const url = "https://example.com/login";
  await page.goto(url);

  await solveCaptcha(page, url);
  await page.locator("#submit-btn").click();
  await page.waitForURL("**/dashboard");

  await browser.close();
})();

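Playwright’s locators wait automatically for the element to appear before acting on it, which makes them more resilient than one-off element queries. Note that this example only fills the response field; if the target page relies on the reCAPTCHA callback, invoke it inside page.evaluate as in the Puppeteer example.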

Handling hCaptcha and Turnstile

The examples above target reCAPTCHA v2, but the approach works for other CAPTCHA types with minor changes.

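hCaptcha: Change the selector from .g-recaptcha to .h-captcha, the task type to HCaptchaTaskProxyless, and the response field to h-captcha-response.

Cloudflare Turnstile: Use the selector .cf-turnstile, task type TurnstileTaskProxyless, and response field cf-turnstile-response.

For both of these, the response field is identified by its name attribute rather than a fixed id, so select it with document.querySelector('[name="h-captcha-response"]') or document.querySelector('[name="cf-turnstile-response"]') instead of getElementById.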

// Turnstile example (Playwright)
const sitekey = await page
  .locator(".cf-turnstile")
  .getAttribute("data-sitekey");

const resp = await fetch(`${API_URL}/createTask`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    clientKey: UCAPTCHA_KEY,
    task: {
      type: "TurnstileTaskProxyless",
      websiteURL: url,
      websiteKey: sitekey,
    },
  }),
});
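
Since the three CAPTCHA types differ only in a selector, a task type, and a response field, one way to avoid duplicating the solve function is a small lookup table. The sketch below is illustrative: CaptchaProfile, PROFILES, build_task, and injection_script are hypothetical names, and the hCaptcha task type is written as HCaptchaTaskProxyless on the assumption that it follows the same naming pattern as the other two, so confirm it against the uCaptcha documentation.

```python
from typing import Dict, NamedTuple


class CaptchaProfile(NamedTuple):
    selector: str        # CSS selector for the widget container
    task_type: str       # value of task.type sent to /createTask
    response_field: str  # name attribute of the field that receives the token


# Selectors and field names as listed in this guide; task types are assumptions
# about the API's naming and should be checked before use.
PROFILES: Dict[str, CaptchaProfile] = {
    "recaptcha_v2": CaptchaProfile(
        ".g-recaptcha", "RecaptchaV2TaskProxyless", "g-recaptcha-response"
    ),
    "hcaptcha": CaptchaProfile(
        ".h-captcha", "HCaptchaTaskProxyless", "h-captcha-response"
    ),
    "turnstile": CaptchaProfile(
        ".cf-turnstile", "TurnstileTaskProxyless", "cf-turnstile-response"
    ),
}


def build_task(kind: str, website_url: str, sitekey: str) -> dict:
    """Build the task object for /createTask for the given CAPTCHA kind."""
    profile = PROFILES[kind]
    return {
        "type": profile.task_type,
        "websiteURL": website_url,
        "websiteKey": sitekey,
    }


def injection_script(kind: str) -> str:
    """Return JavaScript that writes arguments[0] into the response field(s)."""
    field_name = PROFILES[kind].response_field
    return (
        "const token = arguments[0]; "
        f"document.querySelectorAll('[name=\"{field_name}\"]')"
        ".forEach((el) => { el.value = token; });"
    )
```

In Selenium, the generated script could be run with driver.execute_script(injection_script("turnstile"), token). Selecting by name rather than id is a deliberate choice here, because it works for all three widget types.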

Tips for Reliable Browser CAPTCHA Solving

Wait for the CAPTCHA element to load. Don’t try to extract the site key immediately. Use explicit waits (WebDriverWait in Selenium, waitForSelector in Puppeteer/Playwright) to ensure the CAPTCHA has rendered.

Use stealth plugins. Browser automation is detectable by default. Use puppeteer-extra-plugin-stealth for Puppeteer, playwright-stealth for Playwright, or undetected-chromedriver for Selenium to reduce fingerprint leakage.

Handle iframes. Some CAPTCHAs render inside iframes. You may need to switch into the iframe context to find the CAPTCHA element, or look for the site key in the parent page’s HTML.
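
When no element with a data-sitekey attribute is present in the parent page, the site key can often be recovered from the widget iframe’s URL instead. The sketch below assumes the reCAPTCHA iframe src carries the site key in its k query parameter, which is how the widget behaves at the time of writing but is not a documented contract; sitekey_from_recaptcha_iframe is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def sitekey_from_recaptcha_iframe(src: str) -> Optional[str]:
    """Return the `k` query parameter of a reCAPTCHA iframe URL, or None."""
    query = parse_qs(urlparse(src).query)
    values = query.get("k")
    return values[0] if values else None
```

With Selenium, the src value could come from driver.find_element(By.CSS_SELECTOR, 'iframe[src*="recaptcha"]').get_attribute("src"), which avoids switching into the iframe context at all.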

Use tokens quickly. Solved tokens are short-lived and typically expire within 60-120 seconds (a reCAPTCHA v2 token, for example, is valid for two minutes). Submit the form immediately after injecting the token.
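
If other work happens between solving and submitting (extra page loads, retries, queueing), it can help to record when a token was obtained and re-solve rather than submit a stale one. This is a minimal sketch; SolvedToken and is_fresh are hypothetical names, and the 60-second default is simply the conservative end of the range above.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SolvedToken:
    value: str
    # Monotonic clock, so the age is unaffected by system clock changes
    obtained_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, max_age: float = 60.0, now: Optional[float] = None) -> bool:
        """Return True while the token is younger than max_age seconds."""
        current = time.monotonic() if now is None else now
        return (current - self.obtained_at) < max_age
```

A script would wrap the solver’s return value in SolvedToken(token) and check is_fresh() just before injecting, requesting a new solve when the check fails.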

Consider headless detection. Some sites block headless browsers outright. Running in headed mode (with a display) or using --headless=new in Chrome can help bypass basic headless detection.

Conclusion

Integrating CAPTCHA solving into Selenium, Puppeteer, and Playwright follows the same core pattern: detect, extract, solve, inject. The framework-specific differences are minimal — mostly around DOM query syntax and async patterns. uCaptcha’s API works identically regardless of which browser framework you’re using, giving you a single integration point that routes tasks across multiple solving providers for the best combination of speed, reliability, and cost.
