feat(plugin-stealth): Add support for UA hints #413

Merged
merged 8 commits into from
Feb 2, 2021
Changes from 2 commits
@@ -3,10 +3,10 @@
const { PuppeteerExtraPlugin } = require('puppeteer-extra-plugin')

/**
* Fixes the UserAgent info (composed of UA string, Accept-Language, Platform).
* Fixes the UserAgent info (composed of UA string, Accept-Language, Platform, and UA hints).
*
* If you don't provide any values this plugin will default to using the regular UserAgent string (while stripping the headless part).
* Default language is set to "en-US,en", default platform is "win32".
* Default language is set to "en-US,en", the other settings match the UserAgent string.
*
* By default puppeteer will not set an `Accept-Language` header in headless:
* It's (theoretically) possible to fix that using either `page.setExtraHTTPHeaders` or a `--lang` launch arg.
@@ -28,14 +28,13 @@ const { PuppeteerExtraPlugin } = require('puppeteer-extra-plugin')
*
* // Stealth plugins are just regular `puppeteer-extra` plugins and can be added as such
* const UserAgentOverride = require("puppeteer-extra-plugin-stealth/evasions/user-agent-override")
* // Define custom UA, locale and platform
* const ua = UserAgentOverride({ userAgent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)", locale: "de-DE,de;q=0.9", platform: "Win32" })
* // Define custom UA and locale
* const ua = UserAgentOverride({ userAgent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)", locale: "de-DE,de" })
* puppeteer.use(ua)
*
* @param {Object} [opts] - Options
* @param {string} [opts.userAgent] - The user agent to use (default: browser.userAgent())
* @param {string} [opts.locale] - The locale to use in `Accept-Language` header and in `navigator.languages` (default: `en-US,en;q=0.9`)
* @param {string} [opts.platform] - The platform to use in `navigator.platform` (default: `Win32`)
* @param {string} [opts.locale] - The locale to use in `Accept-Language` header and in `navigator.languages` (default: `en-US,en`)
*
*/
class Plugin extends PuppeteerExtraPlugin {
@@ -50,21 +49,104 @@ class Plugin extends PuppeteerExtraPlugin {
get defaults() {
return {
userAgent: null,
locale: 'en-US,en',
platform: 'Win32'
locale: 'en-US,en'
}
}

async onPageCreated(page) {
// Determine the full user agent string, strip the "Headless" part
const ua =
this.opts.userAgent ||
(await page.browser().userAgent()).replace('HeadlessChrome/', 'Chrome/')

// Full version number from Chrome
const uaVersion = ua.includes('Chrome/')
? ua.match(/Chrome\/([^\s]+)/)[1]
: (await page.browser().version()).match(/\/([^\s]+)/)[1]

// Get platform identifier (short or long version)
const _getPlatform = (extended = false) => {
if (ua.includes('Mac OS X')) {
return extended ? 'Mac OS X' : 'MacIntel'
} else if (ua.includes('Android')) {
return 'Android'
} else if (ua.includes('Linux')) {
return 'Linux'
} else {
return extended ? 'Windows' : 'Win32'
}
}

// Source in C++: https://source.chromium.org/chromium/chromium/src/+/master:chrome/browser/chrome_content_browser_client.cc;l=1187-1238
const _getBrands = () => {
const seed = uaVersion.split('.')[0] // the major version number of Chrome

const order = [
[0, 1, 2],
[0, 2, 1],
[1, 0, 2],
[1, 2, 0],
[2, 0, 1],
[2, 1, 0]
][seed % 6]
const escapedChars = [' ', ' ', ';']

const greaseyBrand = `${escapedChars[order[0]]}Not${
escapedChars[order[1]]
}A${escapedChars[order[2]]}Brand`

const greasedBrandVersionList = []
greasedBrandVersionList[order[0]] = {
brand: greaseyBrand,
version: '99'
}
greasedBrandVersionList[order[1]] = {
brand: 'Chromium',
version: seed
}
greasedBrandVersionList[order[2]] = {
brand: 'Google Chrome',
version: seed
}

return greasedBrandVersionList
}

// Return OS version
const _getPlatformVersion = () => {
if (ua.includes('Mac OS X')) {
return ua.match(/Mac OS X ([^_]+)/)[1]
} else if (ua.includes('Android')) {
return ua.match(/Android ([^;]+)/)[1]
} else if (ua.includes('Windows')) {
return ua.match(/([\d|.]+);/)[1]
} else {
return ''
}
}

// Get architecture, this seems to be empty on mobile and x86 on desktop
const _getPlatformArch = () => (_getMobile() ? '' : 'x86')

// Return the Android model, empty on desktop
const _getPlatformModel = () =>
_getMobile() ? ua.match(/Android.*?;.*?\/([^;]+)/)[1] : ''

const _getMobile = () => ua.includes('Android')

const override = {
userAgent:
this.opts.userAgent ||
(await page.browser().userAgent()).replace(
'HeadlessChrome/',
'Chrome/'
),
userAgent: ua,
acceptLanguage: this.opts.locale || 'en-US,en',
platform: this.opts.platform || 'Win32'
platform: _getPlatform(),
userAgentMetadata: {
brands: _getBrands(),
fullVersion: uaVersion,
platform: _getPlatform(true),
platformVersion: _getPlatformVersion(),
architecture: _getPlatformArch(),
model: _getPlatformModel(),
mobile: _getMobile()
}
}

this.debug('onPageCreated - Will set these user agent options', {
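
For reference, the brand-list logic in `_getBrands` above can be pulled out into a standalone sketch. The names `getBrands` and `formatSecChUa` are illustrative and not part of the plugin, and the header formatting is an assumption modelled on the `sec-ch-ua` value asserted in this PR's test. The sketch shows how the Chrome major version selects one of six orderings, so the position and spelling of the greased brand change between releases.

```javascript
// Standalone sketch of the brand-list logic from the diff above.
// `getBrands` and `formatSecChUa` are hypothetical names, not plugin exports.
function getBrands(majorVersion) {
  const seed = Number(majorVersion)

  // Six permutations of three slots; the major version picks one of them.
  const order = [
    [0, 1, 2],
    [0, 2, 1],
    [1, 0, 2],
    [1, 2, 0],
    [2, 0, 1],
    [2, 1, 0]
  ][seed % 6]
  const escapedChars = [' ', ' ', ';']

  // The same permutation also decides which separator characters go where.
  const greasedBrand =
    `${escapedChars[order[0]]}Not${escapedChars[order[1]]}A${escapedChars[order[2]]}Brand`

  const brands = []
  brands[order[0]] = { brand: greasedBrand, version: '99' }
  brands[order[1]] = { brand: 'Chromium', version: String(seed) }
  brands[order[2]] = { brand: 'Google Chrome', version: String(seed) }
  return brands
}

// Assumed serialization of the brand list into a `sec-ch-ua` header value.
function formatSecChUa(brands) {
  return brands
    .map(({ brand, version }) => `"${brand}";v="${version}"`)
    .join(', ')
}

const header88 = formatSecChUa(getBrands(88))
console.log(header88)
// "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"
```

For major version 88, `88 % 6` is 4, which selects the ordering `[2, 0, 1]`, so the greased brand lands in the last slot.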
@@ -120,22 +120,42 @@ test('stealth: navigator.languages with custom locale', async t => {
t.deepEqual(lang, 'de-DE')
})

test('stealth: navigator.platform with default platform', async t => {
const puppeteer = addExtra(vanillaPuppeteer).use(Plugin())
const browser = await puppeteer.launch({ headless: true })
const page = await browser.newPage()
test('stealth: test if UA hints are correctly set', async t => {
const puppeteer = addExtra(vanillaPuppeteer).use(
Plugin({
userAgent:
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36'
})
)

const platform = await page.evaluate(() => navigator.platform)
t.true(platform === 'Win32')
})
const browser = await puppeteer.launch({
headless: false, // only works on headful
args: ['--enable-features=UserAgentClientHint']
})

test('stealth: navigator.platform with custom platform', async t => {
const puppeteer = addExtra(vanillaPuppeteer).use(
Plugin({ platform: 'MyFunkyPlatform' })
const majorVersion = parseInt(
(await browser.version()).match(/\/([^\.]+)/)[1]
)
const browser = await puppeteer.launch({ headless: true })
if (majorVersion < 88) {
return t.true(true) // Skip test on browsers that don't support UA hints
}

const page = await browser.newPage()

const platform = await page.evaluate(() => navigator.platform)
t.true(platform === 'MyFunkyPlatform')
await page.goto('https://headers.cf/headers/?format=raw')
const firstLoad = await page.content()
t.true(
firstLoad.includes(
`sec-ch-ua: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"`
)
)

await page.reload()
const secondLoad = await page.content()
t.true(secondLoad.includes('sec-ch-ua-mobile: ?0'))
t.true(secondLoad.includes('sec-ch-ua-full-version: "88.0.4324.96"'))
t.true(secondLoad.includes('sec-ch-ua-arch: "x86"'))
t.true(secondLoad.includes('sec-ch-ua-platform: "Windows"'))
t.true(secondLoad.includes('sec-ch-ua-platform-version: "10.0"'))
t.true(secondLoad.includes('sec-ch-ua-model: ""'))
})
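
The header values asserted in the test above all follow from the UA string passed to the plugin. A minimal sketch of that derivation, assuming a hypothetical helper `deriveUaMetadata` that is not part of the plugin, is shown below. It reuses the platform checks and the Windows version regex from the diff; the Mac and Android version lookups and the Android model lookup are deliberately left out, and the fallback to `browser.version()` is replaced by an empty string.

```javascript
// Sketch only: `deriveUaMetadata` is a hypothetical helper, not a plugin export.
function deriveUaMetadata(ua) {
  const mobile = ua.includes('Android')

  // Android is checked before Linux because Android UA strings also contain "Linux".
  let navigatorPlatform = 'Win32'
  let platform = 'Windows'
  if (ua.includes('Mac OS X')) {
    navigatorPlatform = 'MacIntel'
    platform = 'Mac OS X'
  } else if (mobile) {
    navigatorPlatform = platform = 'Android'
  } else if (ua.includes('Linux')) {
    navigatorPlatform = platform = 'Linux'
  }

  // Full Chrome version, e.g. "88.0.4324.96" (empty if the UA has no Chrome token).
  const versionMatch = ua.match(/Chrome\/([^\s]+)/)
  // Only the Windows branch of the OS version lookup is reproduced here.
  const windowsMatch = ua.includes('Windows') ? ua.match(/([\d|.]+);/) : null

  return {
    navigatorPlatform,
    platform,
    platformVersion: windowsMatch ? windowsMatch[1] : '',
    architecture: mobile ? '' : 'x86',
    mobile,
    fullVersion: versionMatch ? versionMatch[1] : ''
  }
}

const windowsMeta = deriveUaMetadata(
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36'
)
console.log(windowsMeta)
```

With the UA string used in the test, this yields platform `Windows`, platform version `10.0`, architecture `x86`, and full version `88.0.4324.96`, which line up with the `sec-ch-ua-*` assertions.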
@@ -4,21 +4,20 @@

#### Table of Contents

- [class: Plugin](#class-plugin)
- [class: Plugin](#class-plugin)

### class: [Plugin](https://github.com/berstend/puppeteer-extra/blob/e6133619b051febed630ada35241664eba59b9fa/packages/puppeteer-extra-plugin-stealth/evasions/user-agent-override/index.js#L41-L77)
### class: [Plugin](https://github.com/berstend/puppeteer-extra/blob/f96d8b0cedfe93b2867fcdd2049364a242bdc036/packages/puppeteer-extra-plugin-stealth/evasions/user-agent-override/index.js#L40-L159)

- `opts` **[Object](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object)?** Options (optional, default `{}`)
- `opts.userAgent` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)?** The user agent to use (default: browser.userAgent())
- `opts.locale` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)?** The locale to use in `Accept-Language` header and in `navigator.languages` (default: `en-US,en;q=0.9`)
- `opts.platform` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)?** The platform to use in `navigator.platform` (default: `Win32`)
- `opts` **[Object](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object)?** Options (optional, default `{}`)
- `opts.userAgent` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)?** The user agent to use (default: browser.userAgent())
- `opts.locale` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)?** The locale to use in `Accept-Language` header and in `navigator.languages` (default: `en-US,en`)

**Extends: PuppeteerExtraPlugin**

Fixes the UserAgent info (composed of UA string, Accept-Language, Platform).
Fixes the UserAgent info (composed of UA string, Accept-Language, Platform, and UA hints).

If you don't provide any values this plugin will default to using the regular UserAgent string (while stripping the headless part).
Default language is set to "en-US,en", default platform is "win32".
Default language is set to "en-US,en", the other settings match the UserAgent string.

By default puppeteer will not set an `Accept-Language` header in headless:
It's (theoretically) possible to fix that using either `page.setExtraHTTPHeaders` or a `--lang` launch arg.
@@ -32,23 +31,19 @@ as it will reset the language and platform values you set with this plugin.
Example:

```javascript
const puppeteer = require('puppeteer-extra')
const puppeteer = require("puppeteer-extra")

const StealthPlugin = require('puppeteer-extra-plugin-stealth')
const StealthPlugin = require("puppeteer-extra-plugin-stealth")
const stealth = StealthPlugin()
// Remove this specific stealth plugin from the default set
stealth.enabledEvasions.delete('user-agent-override')
stealth.enabledEvasions.delete("user-agent-override")
puppeteer.use(stealth)

// Stealth plugins are just regular `puppeteer-extra` plugins and can be added as such
const UserAgentOverride = require('puppeteer-extra-plugin-stealth/evasions/user-agent-override')
// Define custom UA, locale and platform
const ua = UserAgentOverride({
userAgent: 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)',
locale: 'de-DE,de;q=0.9',
platform: 'Win32'
})
const UserAgentOverride = require("puppeteer-extra-plugin-stealth/evasions/user-agent-override")
// Define custom UA and locale
const ua = UserAgentOverride({ userAgent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)", locale: "de-DE,de" })
puppeteer.use(ua)
```

---
* * *