Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record third-party pings (navigator.sendBeacon) as tracking #2024

Closed
wants to merge 15 commits into from

Conversation

ghostwords
Copy link
Member

@ghostwords ghostwords commented May 15, 2018

Fixes a part of #367 and #1808. This catches Google Analytics when it's configured to send pings (more commonly, it drops pixels (#794); pixel detection can be added in a separate PR).

Let's see what other new trackers Privacy Badger will catch if it treats all third-party pings as tracking. Besides Google Analytics, I noticed Clicktale, DoubleVerify and mPulse Beacon (akstat.io).

Sites that I see use GA with pings/the beacon API:

  • github.com
  • lidl.de (Chrome-only for some reason)
  • stackoverflow.com
  • superuser.com
  • telegraph.co.uk

@ghostwords ghostwords requested a review from bcyphers May 15, 2018 22:26
pingAccounting: function (details) {
const fqdn = (new URI(details.url)).host,
origin = window.getBaseDomain(fqdn),
tab_origin = tabOrigins[details.tabId];
Copy link
Member Author

@ghostwords ghostwords May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably susceptible to navigation-related misattribution (tabOrigins has the new site's URL instead of the site that originated the ping; similar to #1997), but this may be OK for now (problem exists elsewhere, no need to block on the fix if the new feature works well enough).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an issue for the ping attribute on hyperlinks. For example, visit google.com in chrome, make a search, and click on any outbound link. For me, most of the time clicking on foo.com adds that domain as an entry for google.com in the snitch_map. Since the ping request fires off simultaneously with the user navigating away from Google, this bug is much more likely to come up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but is that a blocking issue, practically speaking? Privacy Badger is very likely to learn to block google.com anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in general the navigation bug means we are likely to record first-party pings as third-party tracking. Or third-party pings as third-party on the wrong party.

Copy link
Contributor

@bcyphers bcyphers May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a big deal for google in particular, but my concern is that it will happen to any site that uses the ping attr on links. Maybe that's not a bad thing? If we decide that we want to consider ping on outgoing links to be a tracking action, then this could be a feature, not a bug.

The wrong-third-party scenario would end up messing up the snitch_map for a tracking domain, but not the tracking domain itself, right?

Copy link
Member Author

@ghostwords ghostwords May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this navigation-related attribution bug doesn't affect the tracking domain, just where it was seen. We could attribute the tracking domain to the wrong site domain, the one the user just navigated to. #1997 is waiting on information from Chromium devs at the moment.

@ghostwords
Copy link
Member Author

As this adds a new input to our learning engine, this should help when cookie learning has been curtailed (#1851) for whatever reason.

@ghostwords
Copy link
Member Author

Beacons are one of three transport methods for Google Analytics.

I'm very curious to see if Badger will catch Google Analytics in its pretrained data set with this PR.

Copy link
Contributor

@bcyphers bcyphers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should decide whether we want to count ping requests on a first-party site as tracking actions, or wait until the tabOrigin issue is resolved.

@ghostwords ghostwords force-pushed the record-pings-as-tracking branch 4 times, most recently from 56df177 to bec65aa Compare September 17, 2018 20:24
@ghostwords ghostwords force-pushed the record-pings-as-tracking branch from bec65aa to c40bf55 Compare September 17, 2018 20:57
Copy link
Member Author

@ghostwords ghostwords left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should account for incognito. ✔️ (Can we reuse heuristicBlockingAccounting or do we need a new listener?)

This records all pings/beacons as tracking (regardless of payload), but there are non-tracking uses (gorhill/uBlock#1493 (comment), https://bugs.chromium.org/p/chromium/issues/detail?id=611453). (We already block "tab-less" pings (gorhill/uBlock#1884 (comment)).

May want to update documentation wherever we mention supported detection methods (popup tooltip, design doc, various FAQ entries). Previously: #2403

Should rebase https://github.com/EFForg/privacybadger/tree/explain-blocking on master and update to log pings/beacons after this is merged.

@ghostwords ghostwords force-pushed the record-pings-as-tracking branch from c6772fb to cad6368 Compare October 16, 2018 22:37
@ghostwords ghostwords mentioned this pull request Dec 4, 2018
2 tasks
Conflicts:
	src/js/heuristicblocking.js
	tests/selenium/pbtest.py
Conflicts:
	src/js/heuristicblocking.js
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants