Consider building a pairwise whitelist for Tracking Protection breakage #53

Open
englehardt opened this issue Jan 30, 2019 · 1 comment

Comments

@englehardt

Right now, tracking protection breakage is handled using the "Content" category of the Tracking Protection lists. That is, when a tracking domain is found to cause an unacceptable level of breakage, it is moved to the "Content" category of the Disconnect list, which is only blocked when the "Strict" version of the Disconnect list is enabled. This has a number of downsides. The main one is that severe breakage on a few popular sites may lead to a tracking domain being whitelisted on all sites.

We should instead consider pairwise whitelisting in these scenarios. That is, when breakage is discovered that would normally lead to a domain being moved from the Basic list to the Strict list, we instead add a pairwise whitelist for that domain. For example, if we discover that tracker.example breaks news.example and video.example, then we only whitelist tracker.example on those two sites when in the "Basic" mode of protection. The platform should already have support for this with our entity lists.
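A minimal sketch of the proposed decision in "Basic" mode, assuming a simple in-memory blocklist and a set of (site, tracker) pairs. The names `BASIC_BLOCKLIST`, `PAIRWISE_ALLOW`, and `should_block` are hypothetical and only illustrate the idea; this is not how the platform's URL classifier is implemented.

```python
# Hypothetical data for illustration only; not taken from the real lists.
BASIC_BLOCKLIST = {"tracker.example"}

# Each pair is (top-level site, tracker domain) where blocking is known
# to cause unacceptable breakage.
PAIRWISE_ALLOW = {
    ("news.example", "tracker.example"),
    ("video.example", "tracker.example"),
}


def should_block(top_level_site: str, resource_domain: str) -> bool:
    """Return True if the resource should be blocked in Basic mode."""
    if resource_domain not in BASIC_BLOCKLIST:
        # Not a listed tracker, so there is nothing to block.
        return False
    # A listed tracker is blocked unless this exact pair is allowlisted.
    return (top_level_site, resource_domain) not in PAIRWISE_ALLOW


if __name__ == "__main__":
    for site in ("news.example", "video.example", "shop.example"):
        print(site, should_block(site, "tracker.example"))
```

With this shape, tracker.example stays blocked everywhere except on the two sites where breakage was observed, instead of being unblocked on all sites.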

@groovecoder

This is a great idea. The entitylist is already produced in a pairwise format:

https://github.com/mozilla-services/shavar-list-creation/blob/4b37cac21ba4d93e89aa2b74e9f8856cb81bda5b/lists2safebrowsing.py#L335-L340

That is, the entitylist entry that allows fbcdn.net resources on facebook.com is facebook.com?resource=fbcdn.net

So I think we can do this entirely server-side, in the list content, by:

  1. Changing "entitylist" into "allowlist"
  2. Creating an "allow set" of pairwise rules
  3. Putting the appropriate properties and resources into the allow set

E.g., we could add this:

"CNN": {
  "properties": [
    "cnn.com"
  ],
  "resources": [
    "youtube.com"
  ]
}

We would not need to change the URL classifier logic at all. That "allow set" would effectively generate a line in the allowlist, cnn.com?resource=youtube.com, which would start allowing youtube.com resources on cnn.com.
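As a rough sketch of how an allow set could expand into allowlist lines, assuming each set has "properties" and "resources" keys as in the example above and that every property is paired with every resource. The function name `pairwise_rules` is hypothetical, and this is not the actual code from lists2safebrowsing.py, which may filter or canonicalize entries differently.

```python
import json

# Hypothetical allow set, mirroring the "CNN" example above.
ALLOW_SETS = json.loads("""
{
  "CNN": {
    "properties": ["cnn.com"],
    "resources": ["youtube.com"]
  }
}
""")


def pairwise_rules(allow_sets: dict) -> list:
    """Expand allow sets into sorted 'property?resource=resource' lines."""
    rules = set()
    for entity in allow_sets.values():
        for prop in entity.get("properties", []):
            for res in entity.get("resources", []):
                # One line per (property, resource) pair.
                rules.add(f"{prop}?resource={res}")
    return sorted(rules)


if __name__ == "__main__":
    print("\n".join(pairwise_rules(ALLOW_SETS)))
```

For the "CNN" set this yields the single line cnn.com?resource=youtube.com; a set with several properties or resources would yield one line per combination.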
