-
-
Notifications
You must be signed in to change notification settings - Fork 388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Record third-party pings (navigator.sendBeacon) as tracking #2024
Conversation
pingAccounting: function (details) { | ||
const fqdn = (new URI(details.url)).host, | ||
origin = window.getBaseDomain(fqdn), | ||
tab_origin = tabOrigins[details.tabId]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably susceptible to navigation-related misattribution (tabOrigins
has the new site's URL instead of the site that originated the ping; similar to #1997), but this may be OK for now (problem exists elsewhere, no need to block on the fix if the new feature works well enough).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is an issue for the ping
attribute on hyperlinks. For example, visit google.com in chrome, make a search, and click on any outbound link. For me, most of the time clicking on foo.com
adds that domain as an entry for google.com
in the snitch_map
. Since the ping request fires off simultaneously with the user navigating away from Google, this bug is much more likely to come up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but is that a blocking issue, practically speaking? Privacy Badger is very likely to learn to block google.com
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in general the navigation bug means we are likely to record first-party pings as third-party tracking. Or third-party pings as third-party on the wrong party.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not a big deal for google in particular, but my concern is that it will happen to any site that uses the ping
attr on links. Maybe that's not a bad thing? If we decide that we want to consider ping
on outgoing links to be a tracking action, then this could be a feature, not a bug.
The wrong-third-party scenario would end up messing up the snitch_map
for a tracking domain, but not the tracking domain itself, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this navigation-related attribution bug doesn't affect the tracking domain, just where it was seen. We could attribute the tracking domain to the wrong site domain, the one the user just navigated to. #1997 is waiting on information from Chromium devs at the moment.
As this adds a new input to our learning engine, this should help when cookie learning has been curtailed (#1851) for whatever reason. |
Beacons are one of three transport methods for Google Analytics. I'm very curious to see if Badger will catch Google Analytics in its pretrained data set with this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should decide whether we want to count ping
requests on a first-party site as tracking actions, or wait until the tabOrigin
issue is resolved.
Obsoleted by #2047
56df177
to
bec65aa
Compare
bec65aa
to
c40bf55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should account for incognito. ✔️ (Can we reuse heuristicBlockingAccounting
or do we need a new listener?)
This records all pings/beacons as tracking (regardless of payload), but there are non-tracking uses (gorhill/uBlock#1493 (comment), https://bugs.chromium.org/p/chromium/issues/detail?id=611453). (We already block "tab-less" pings (gorhill/uBlock#1884 (comment)).
May want to update documentation wherever we mention supported detection methods (popup tooltip, design doc, various FAQ entries). Previously: #2403
Should rebase https://github.com/EFForg/privacybadger/tree/explain-blocking on master and update to log pings/beacons after this is merged.
c40bf55
to
91eeb6c
Compare
9bdd0a7
to
c6772fb
Compare
c6772fb
to
cad6368
Compare
Conflicts: src/js/heuristicblocking.js tests/selenium/pbtest.py
Conflicts: src/js/heuristicblocking.js
Fixes a part of #367 and #1808. This catches Google Analytics when it's configured to send pings (more commonly, it drops pixels (#794); pixel detection can be added in a separate PR).
Let's see what other new trackers Privacy Badger will catch if it treats all third-party pings as tracking. Besides Google Analytics, I noticed Clicktale, DoubleVerify and mPulse Beacon (akstat.io).
Sites that I see use GA with pings/the beacon API: