Can't click interactive turnstile with Camoufox like patchright #150

Closed
D4Vinci opened this issue Dec 30, 2024 · 22 comments
Labels
detection issue Potential leak in Camoufox.

Comments

@D4Vinci
Contributor

D4Vinci commented Dec 30, 2024

Hey mate, there is a new script that can click the interactive Turnstile captcha: https://github.com/Xewdy444/CF-Clearance-Scraper
I tested it on https://sergiodemo.com/security/challenge/legacy-challenge and it worked.
The main version uses Patchright and there's another one that uses nodriver. Both can only bypass the captcha in headed mode.

I tried to make a version that does this with Camoufox, but it doesn't work. Here are my findings:

  1. Because the script clicks at hard-coded coordinates on the page, it needs the browser to open with the default viewport/window size of Playwright's Chrome.
    I looked it up and found that it's OS-dependent; on my macOS it's {'width': 1280, 'height': 720}. I used Camoufox's window=(width, height) argument, but the window kept opening at a different size every time.
    So I also passed the screen argument, like this: screen=Screen(max_width=width, min_width=width, max_height=height, min_height=height). The window now opens at the same size every time, but something still doesn't look right.
    So I added another test:
    print(self.page.viewport_size)
    print(self.page.evaluate("({ width: window.innerWidth, height: window.innerHeight })"))
    and it turns out the height in the second print is always wrong!
  2. The page takes longer to load than with Patchright, and the captcha spinner takes about twice as long to disappear, so something seems really off with the page on Camoufox. I think the browserforge fingerprints are missing something. I was using browserforge in Scrapling to inject headers into raw Playwright, and I had many issues: normal websites without any protection were not loading, and reCAPTCHA was not loading either. In the end, it turned out that the header format used by browserforge is very old and differs from the one Chrome uses now, so simply disabling it made all those websites load correctly.
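
The size check from point 1 can be wrapped in a small helper. This is only a sketch: `viewport_mismatch` and `check_viewport` are hypothetical names, it assumes Playwright's async API (the prints above use the sync one), and the numbers in the usage note are made up for illustration.

```python
from typing import Dict, Mapping


def viewport_mismatch(reported: Mapping[str, int], measured: Mapping[str, int]) -> Dict[str, int]:
    """Return measured - reported for every dimension that differs."""
    return {
        key: measured[key] - reported[key]
        for key in ("width", "height")
        if measured[key] != reported[key]
    }


async def check_viewport(page) -> Dict[str, int]:
    """Compare Playwright's idea of the viewport with what the page itself sees."""
    reported = page.viewport_size  # None when no fixed viewport is set
    measured = await page.evaluate(
        "({ width: window.innerWidth, height: window.innerHeight })"
    )
    if reported is None:
        return {}
    return viewport_mismatch(reported, measured)
```

For example, `viewport_mismatch({"width": 1280, "height": 720}, {"width": 1280, "height": 635})` returns `{"height": -85}`, which makes it easy to shift hard-coded click coordinates by the same amount or to spot that the window chrome is eating part of the height.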

With Camoufox, the solve_challenge function doesn't work at all no matter how I adapt it; it doesn't detect any of the elements.

I guess the whole problem comes down to Camoufox taking too much control over what's happening, without giving the user many options to control the behavior. I don't know.

Last month I made a script that clicks the interactive Turnstile captcha the same way, but using OpenCV to find the coordinates and Camoufox as the browser. I had a big issue with the mouse click not registering, and I think the same thing is happening here. Maybe Camoufox handles nested iframes very differently from Patchright?

@D4Vinci D4Vinci added the detection issue Potential leak in Camoufox. label Dec 30, 2024
@D4Vinci
Contributor Author

D4Vinci commented Jan 4, 2025

@daijro
To narrow the issue down further: this script moves the mouse to the captcha correctly every time, but the click doesn't happen:

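import asyncio
from camoufox.async_api import AsyncCamoufox


async def main():
    # Headed mode with a fixed window size, so the hard-coded click coordinates line up
    async with AsyncCamoufox(headless=False, humanize=True, window=(1280, 720)) as browser:
        page = await browser.new_page()
        await page.goto('https://sergiodemo.com/security/challenge/legacy-challenge')
        await page.wait_for_load_state('domcontentloaded')
        await page.wait_for_load_state('networkidle')

        await asyncio.sleep(5)  # give the Turnstile widget time to render
        await page.mouse.click(210, 290)  # hard-coded position of the captcha
        # Read input in a worker thread so the event loop is not blocked
        await asyncio.to_thread(input, 'Press enter to close')
        # The browser is closed when the `async with` block exits

if __name__ == "__main__":
    asyncio.run(main())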

@D4Vinci
Contributor Author

D4Vinci commented Jan 4, 2025

@daijro
A friend of mine managed to make the click work by simply doing this:

import asyncio
from camoufox.async_api import AsyncCamoufox


async def handle_route(route):
    # Re-fetch every request and fulfill it with the body only;
    # the original response headers and status are not forwarded
    response = await route.fetch()
    await route.fulfill(body=await response.body())


async def main():
    async with AsyncCamoufox(headless=False, humanize=True, window=(1280, 720)) as browser:
        page = await browser.new_page()
        await page.route("**/*", handle_route)  # intercept all requests
        await page.goto('https://sergiodemo.com/security/challenge/legacy-challenge')
        await page.wait_for_load_state('domcontentloaded')
        await page.wait_for_load_state('networkidle')

        await asyncio.sleep(5)  # give the Turnstile widget time to render
        await page.mouse.click(210, 290)  # hard-coded position of the captcha
        await page.wait_for_timeout(30000)
        # Read input in a worker thread so the event loop is not blocked
        await asyncio.to_thread(input, 'Press enter to close')
        # The browser is closed when the `async with` block exits

if __name__ == "__main__":
    asyncio.run(main())

Do you have any idea WTF is going on? 😄

@D4Vinci
Contributor Author

D4Vinci commented Jan 4, 2025

@daijro I have improved the script a lot, so it no longer uses hard-coded constants or sleeps. Still, it doesn't make sense that routing makes the captcha clickable, so I will leave the issue open for you to look at. Maybe it's a bug that needs fixing.

@netdev1

netdev1 commented Jan 4, 2025

I'm just guessing, but could it be:

  • the adblock extension that's enabled by default blocking some requests, and route.fulfill fixing that
  • requests timing out for some other reason, and route.fulfill changing the default timeouts

Either way, try logging all requests to see if anything fails.
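
One way to do that logging, as a rough sketch: collect failed requests and error responses through Playwright's `requestfailed` and `response` page events. The `RequestLog` class and its field names are hypothetical; only the event names and the `method`, `url`, `failure` and `status` attributes come from Playwright's Python API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RequestLog:
    """Collects failed requests and error responses for later inspection."""
    failed: List[Tuple[str, str, str]] = field(default_factory=list)
    bad_status: List[Tuple[int, str]] = field(default_factory=list)

    def on_request_failed(self, request) -> None:
        # request.failure holds the error text, or None if unavailable
        self.failed.append((request.method, request.url, request.failure or "unknown"))

    def on_response(self, response) -> None:
        # Record only responses with a 4xx/5xx status
        if response.status >= 400:
            self.bad_status.append((response.status, response.url))

    def attach(self, page) -> None:
        # Register both handlers on a Playwright page
        page.on("requestfailed", self.on_request_failed)
        page.on("response", self.on_response)
```

Calling `log = RequestLog(); log.attach(page)` before `page.goto(...)` and printing `log.failed` and `log.bad_status` afterwards should show whether anything is blocked or times out with and without the route handler.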

@D4Vinci
Contributor Author

D4Vinci commented Jan 4, 2025

@netdev1 I tried disabling UBO (uBlock Origin) earlier and it made no difference, so it's not that. The second point makes sense, because before adding that route the Turnstile page sometimes seemed laggy on Camoufox.

@D4Vinci
Contributor Author

D4Vinci commented Jan 7, 2025

@daijro I will close this ticket since my issue is solved, so you have one less ticket to worry about. I think you should have a look at the route solution when you have time, though, as it's really weird behaviour from Camoufox.

@daijro
Owner

daijro commented Jan 22, 2025

> @daijro I will close this ticket as my issue is solved so you have one less ticket to worry about but I think you should have a look on the route solution when you have time as it's really weird behaviour from Camoufox.

Thanks! Sorry for the long delay. I just got back from a long break (had exams, was sick for a bit, and have also been working on a new project for Camoufox). I'll start working on bug fixes this week. 👍

Using .route() changes the behavior of network caching in the browser: resources like JavaScript, CSS files, and images are fetched from the network on every request rather than being stored in and loaded from the cache. I think this issue happens because of changes introduced in FF133 that cause caching to affect the behavior of iframes (Camoufox is a couple of versions ahead of the base Playwright FF release). Unfortunately, Playwright hasn't updated their Firefox fork since November 13th. I'll implement the route solution you provided in the Python library as a temporary fix (thank you, btw!).

@D4Vinci
Contributor Author

D4Vinci commented Jan 22, 2025

Welcome back bro @daijro

@jezonek

jezonek commented Jan 23, 2025

Hi guys, I tried to replicate the code example and got stuck in an endless verification loop. Did you encounter the same behavior?

@daijro
Owner

daijro commented Jan 24, 2025

Looks like a new commit for the latest FF was pushed to Playwright's repo a few days ago (it seems to have a new maintainer now):
microsoft/playwright@a121f85

I'll implement it into Camoufox after class and check if this issue still exists.

@D4Vinci
Contributor Author

D4Vinci commented Jan 24, 2025

Seems good @daijro curious to see how it goes!

@daijro
Owner

daijro commented Jan 25, 2025

Hello,

I've merged the latest changes from Playwright's upstream patches, and it turns out their commit is still not compatible with FF133+ (they're still patching an old release from Oct 21, 2024). However, I found an FF commit made 4 days later that could have caused the regression in how elements within shadow roots or frames are located and interacted with. I will look into a potential workaround, or try reverting this commit to see if that fixes the issue.

@daijro
Owner

daijro commented Jan 25, 2025

Hello,

I figured out that the reason this issue doesn't happen with the route solution is that the response headers aren't being passed to route.fulfill. A handler that does pass them looks like this:

async def handle_route(route):
    response = await route.fetch()
    await route.fulfill(
        body=await response.body(),
        headers=response.headers,  # Missing in example
        status=response.status
    )

Specifically, Turnstile can be clicked because the Cross-Origin-Opener-Policy (COOP) header is being removed. It seems that Playwright's click function does not support the latest security changes in FF133+.
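
To check that explanation in isolation, a handler could forward everything except the COOP header. This is an untested sketch under that assumption: `without_headers` is a hypothetical helper, and whether dropping only this one header is enough to make the click register is exactly what it would need to confirm.

```python
from typing import Dict, Iterable, Mapping


def without_headers(headers: Mapping[str, str], drop: Iterable[str]) -> Dict[str, str]:
    """Copy headers, leaving out the given names (case-insensitive match)."""
    blocked = {name.lower() for name in drop}
    return {key: value for key, value in headers.items() if key.lower() not in blocked}


async def handle_route(route):
    # Fetch the real response, then pass it on with status and headers intact,
    # except for Cross-Origin-Opener-Policy
    response = await route.fetch()
    await route.fulfill(
        status=response.status,
        headers=without_headers(response.headers, ["cross-origin-opener-policy"]),
        body=await response.body(),
    )
```

Registering it with `await page.route("**/*", handle_route)` as in the earlier scripts keeps the rest of the response unchanged, so any difference in behavior can be attributed to the one missing header.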

A workaround could be to set this preference, which disables COOP handling:

firefox_user_prefs={
    'browser.tabs.remote.useCrossOriginOpenerPolicy': False,
}
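
Put together with the earlier test script, the preference might be passed like this. It is a sketch that assumes `AsyncCamoufox` forwards `firefox_user_prefs` to the browser launch; `launch_kwargs` and `COOP_PREF` are hypothetical names, and `camoufox` is imported inside `main()` only so that the helper can be used without the package installed.

```python
import asyncio
from typing import Any, Dict

COOP_PREF = "browser.tabs.remote.useCrossOriginOpenerPolicy"


def launch_kwargs(disable_coop_handling: bool = True) -> Dict[str, Any]:
    """Build the launch options used in this thread, optionally with COOP handling off."""
    kwargs: Dict[str, Any] = {"headless": False, "humanize": True, "window": (1280, 720)}
    if disable_coop_handling:
        kwargs["firefox_user_prefs"] = {COOP_PREF: False}
    return kwargs


async def main() -> None:
    from camoufox.async_api import AsyncCamoufox  # lazy import, see note above

    async with AsyncCamoufox(**launch_kwargs()) as browser:
        page = await browser.new_page()
        await page.goto("https://sergiodemo.com/security/challenge/legacy-challenge")
        await page.wait_for_load_state("networkidle")
        await page.mouse.click(210, 290)  # same hard-coded position as above
        await page.wait_for_timeout(30000)
```

Run it with `asyncio.run(main())`. No route handler is needed in this variant, since the point is to test the preference on its own.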

However, anti-bots could potentially detect this, though I haven't seen such detection used in a production environment (all major anti-bot providers & Camoufox testing sites still pass with it disabled). I will consider adding a COOP toggle in the Python library until Playwright bumps to FF133+ 👍

@D4Vinci
Contributor Author

D4Vinci commented Jan 26, 2025

Nice job @daijro ! I have just tested it on the same script without routing and it worked!

@sebhansen

Seems like there is a new issue with this?
I've tried the exact same setup as you guys, with the new routing, and it doesn't click at all.

import asyncio
from camoufox.async_api import AsyncCamoufox


async def handle_route(route):
    response = await route.fetch()
    await route.fulfill(
        body=await response.body(),
        headers=response.headers,  # Missing in example
        status=response.status
    )


async def main():
    async with AsyncCamoufox(headless=False, humanize=True, window=(1280, 720)) as browser:
        page = await browser.new_page()
        await page.route("**/*", handle_route)
        await page.goto('https://sergiodemo.com/security/challenge/legacy-challenge')
        await page.wait_for_load_state(state="domcontentloaded")
        await page.wait_for_load_state('networkidle')

        await asyncio.sleep(5)
        await page.mouse.click(210, 290)
        await page.wait_for_timeout(30000)
        input('Press enter to close')
        await browser.close()

if __name__ == "__main__":
    asyncio.run(main())

@sebhansen

Is this still working for you @D4Vinci ?

@daijro
Owner

daijro commented Jan 29, 2025

@sebhansen Hello, try passing disable_coop=True, you should be able to click it then.

@sebhansen

@daijro it's giving this error: TypeError: BrowserType.launch() got an unexpected keyword argument 'disable_coop'

I have to pass this instead: firefox_user_prefs = {'browser.tabs.remote.useCrossOriginOpenerPolicy': False}

@daijro
Owner

daijro commented Jan 30, 2025

> @daijro it's giving this error: TypeError: BrowserType.launch() got an unexpected keyword argument 'disable_coop'
>
> I have to pass this instead: firefox_user_prefs = {'browser.tabs.remote.useCrossOriginOpenerPolicy': False}

I think your version is out of date. Try running pip install -U camoufox 👍

@sebhansen

That's odd. I ran python -m camoufox fetch and checked the version, but apparently that didn't do the trick. Oh well, it works now.

Now I just have an issue with being sent straight back to the same challenge page after clicking: #170
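
Since the fix here was to update the Python package with pip install -U camoufox, it can help to print the installed package version before testing a new keyword argument. A minimal sketch using only the standard library; `installed_version` is a hypothetical helper, and it says nothing about which release first accepts disable_coop.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("camoufox package:", installed_version("camoufox") or "not installed")
```

This reports the version of the Python library as pip sees it, which is the part that pip install -U camoufox updates.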

daijro added a commit that referenced this issue Feb 5, 2025
@daijro
Owner

daijro commented Feb 5, 2025

Hello,

A fix for this issue has been added in v135.0-beta.21 (without having to disable COOP).

Camoufox won't have any issues interacting with cross-origin iframes anymore.

@D4Vinci
Contributor Author

D4Vinci commented Feb 6, 2025

Awesome work man @daijro you are the best!


5 participants