Implement stealth mode #142
Would be great if it could pass those tests
Also it would probably make sense to add the intoli's checks to the specs.
@route Any thoughts on adding this in? We've been using ferrum for a while now and started getting blocked on one of the sites. I'm happy to take a cut at implementing this if you want to outline some of your thoughts on how you envision doing it. I studied the source code for about an hour tonight just thinking through some options here.
Hi @brettallred,
This would be so wonderful! 🙏 I'm not a maintainer here, but I would like to see stealth mode as an integrated extension. My idea would be:
Specs
Implementation of the extension itself (there are good references out there):
Outside of the specs, you could also check the reCAPTCHA score to see how well the scripts work.
Summary of a possible solution (TL;DR):
Again, this is just an idea and I'm not the maintainer here, so please take it with a grain of salt. PS: Updating the stealth extension could even be a GitHub Action later on.
I just wanted to pass along a small note that the move @alexanderadam proposed is absolutely feasible. Absurdly so. I've always been a bit intimidated by wrangling the JS/extension side of things, so I kind of brushed that last comment off, assuming additional wiring would need to happen. Tonight I stumbled back into it and noted in particular
First off, thank you @alexanderadam for your detailed note. I saw it this spring, but like I said... I didn't understand its proposed simplicity. Second, I wanted to report these findings just in case they inspire someone else.
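For readers wondering what the "integrated extension" idea might look like in practice, here is a minimal sketch. It assumes the evasions have already been bundled into a single JavaScript file (the path is hypothetical) and relies on Ferrum's `extensions` option, which preloads JS files into each page. The helper `stealth_extension_options` is made up for illustration and is not part of Ferrum; the comment above does not spell out its exact wiring, so treat this as one possible approach rather than the one it describes.

```ruby
# Hypothetical helper (not a Ferrum API): builds the options hash that asks
# Ferrum to preload a bundled stealth script into every page it opens.
def stealth_extension_options(script_path)
  # Fail early with a clear message instead of launching a browser
  # that silently runs without the evasions.
  unless File.file?(script_path)
    raise ArgumentError, "stealth script not found: #{script_path}"
  end

  { extensions: [script_path] }
end

# Usage (requires the ferrum gem and a local Chrome/Chromium install;
# "vendor/stealth.min.js" is an assumed location for the bundled script):
#   require "ferrum"
#   browser = Ferrum::Browser.new(stealth_extension_options("vendor/stealth.min.js"))
#   browser.go_to("https://bot.sannysoft.com")
#   browser.quit
```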
According to these webpages: the tests at bot.sannysoft.com and www.nowsecure.nl pass with this browser configuration: `browser = Ferrum::Browser.new(browser_path: BROWSER_PATH, headless: false, browser_options: { "disable-blink-features": "AutomationControlled" })` I haven't yet found a way to pass them in headless mode.
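The configuration above can be wrapped in a small helper so the flags are reusable and easy to inspect. This is only a sketch: `stealth_browser_options` is a hypothetical name, not something Ferrum provides, and the browser path is optional here on the assumption that Ferrum can locate a local Chrome on its own.

```ruby
# Hypothetical helper (not a Ferrum API): returns the options from the
# comment above. The Chromium switch disable-blink-features=AutomationControlled
# stops the browser from advertising automation via navigator.webdriver.
def stealth_browser_options(browser_path: nil, headless: false)
  options = {
    headless: headless,
    browser_options: { "disable-blink-features": "AutomationControlled" }
  }
  # Only set browser_path when the caller supplies one.
  options[:browser_path] = browser_path if browser_path
  options
end

# Usage (requires the ferrum gem and a local Chrome/Chromium install):
#   require "ferrum"
#   browser = Ferrum::Browser.new(stealth_browser_options)
#   browser.go_to("https://bot.sannysoft.com")
#   browser.screenshot(path: "sannysoft.png") # inspect the results by eye
#   browser.quit
```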
Isn't this a problem better solved at the Chromium level? I read this article recently, seems like there are improvements in an upcoming version of Chrome: https://antoinevastel.com/bot%20detection/2023/02/19/new-headless-chrome.html I'd close this issue, out of scope for Ferrum.
It is, but Ferrum itself can still provide some guidance and scripts to make automation harder to detect from the beginning.
Is there documentation on how to get the new headless mode in Ferrum?
Have you found a way to pass them in headless mode?
You can enable the new headless mode in Chromium by modifying the browser options: `Ferrum::Browser.new(browser_options: { "headless": "new" })`
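A sketch of how that option might be combined with other flags and then sanity-checked. Both helper names are hypothetical (not Ferrum APIs). The probe only looks at `navigator.webdriver`, which is one common detection signal and far from the full set of checks that sites like bot.sannysoft.com run, so a `false` result here does not mean the browser is undetectable.

```ruby
# Hypothetical helper: options for Chromium's "new" headless mode, merged
# with any extra command-line flags the caller wants to pass along.
def new_headless_options(extra_flags = {})
  { browser_options: { "headless": "new" }.merge(extra_flags) }
end

# Hypothetical probe: true when the page reports navigator.webdriver as true.
# `browser` can be anything that responds to #evaluate, such as a
# Ferrum::Browser instance.
def webdriver_flag_exposed?(browser)
  browser.evaluate("navigator.webdriver") == true
end

# Usage (requires the ferrum gem and a local Chrome/Chromium install):
#   require "ferrum"
#   opts = new_headless_options("disable-blink-features": "AutomationControlled")
#   browser = Ferrum::Browser.new(opts)
#   browser.go_to("https://example.com")
#   puts webdriver_flag_exposed?(browser)
#   browser.quit
```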
It doesn't work, because there's a lot more work to be done (see #379).
Sick, this works great, got all tests to pass, and CF unblocked, thanks again.
For anyone interested, I wrote up the tips from this thread + many others into an article: Stealthly Browsing and Scraping with Ferrum It covers the tips from @ttilberg on integrating
@harrison-broadbent Excellent article, I wasn't aware of ferrum before reading it. It looks like the puppeteer-extra-plugin-stealth has not been updated in over two years. Is that a potential issue, or does it simply not require updates often?
It’s the case that the maintainer took his valuable work private. It’s a lot of work to maintain this tool, and it’s profitable to leverage in consulting engagements, so I can’t blame him. It also gives the cat a big advantage in the age-old cat-and-mouse game when the best evasions are public. That said, the work that is public is still quite valuable and also helps me get around certain challenges more frequently.
Thanks @ttilberg that is great to know 🙏
The article is very interesting, thank you!
I hate botting.
@joshfester @akavitaliy thank you both! And I agree with what @ttilberg said: despite being out of date (compared to the state of the art), I believe the evasions are still valuable, particularly when scraping average / lightly-protected websites.
https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth