Implement referrer policy #10311

Open
paulrouget opened this issue Mar 31, 2016 · 83 comments · Fixed by #11468
Labels
A-content/dom Interacting with the DOM from web content A-network B-meta This issue tracks the status of multiple, related pieces of work

Comments

@paulrouget
Contributor

See https://w3c.github.io/webappsec-referrer-policy/

Necessary for #10309 (comment)

@jdm jdm added A-content/dom Interacting with the DOM from web content A-network labels Mar 31, 2016
@jdm
Member

jdm commented Mar 31, 2016

Tests: tests/wpt/web-platform-tests/referrer-policy/

@jdm
Member

jdm commented Mar 31, 2016

The goal of this specification is to allow webpages to control the behaviour of the Referer header (sic) when network requests are initiated.

The basic pieces here are the following:

  • create an enum representing a referrer policy in net_traits/lib.rs
  • introduce a place where referrer policies are stored for global environments (ie. Document and WorkerGlobalScope, in document.rs and workerglobalscope.rs respectively)
  • propagate this policy to the actual HTTP network request code in http_loader.rs and modify the Referer header accordingly (via document_loader.rs, fetch the policy from the global environment)
  • introduce per-element attributes for controlling the associated requests, ensure those get propagated accordingly (eg. htmlscriptelement.rs, htmliframeelement.rs, etc.)
  • integrate with the Fetch implementation (components/net/fetch/methods.rs) and write unit tests (tests/unit/net/fetch.rs)
  • support setting the initial global policy via the Referrer-Policy HTTP header in the initial network response for a worker/document (in http_loader.rs, add a new field to the Metadata type and extract that when creating the new Document in script_thread.rs)
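A minimal sketch of the first two pieces, assuming the variant names of the ReferrerPolicy enum quoted at the end of this thread. `GlobalEnvironment` is a hypothetical stand-in for Document or WorkerGlobalScope, not an existing Servo type:

```rust
/// Referrer policies; variant names follow the `ReferrerPolicy` enum
/// quoted later in this thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReferrerPolicy {
    NoReferrer,
    NoReferrerWhenDowngrade,
    Origin,
    SameOrigin,
    StrictOrigin,
    StrictOriginWhenCrossOrigin,
    OriginWhenCrossOrigin,
    UnsafeUrl,
}

/// Hypothetical stand-in for a global environment (Document or
/// WorkerGlobalScope) that stores its own referrer policy.
pub struct GlobalEnvironment {
    url: String,
    /// `None` until some delivery mechanism (header, meta element) sets it.
    referrer_policy: Option<ReferrerPolicy>,
}

impl GlobalEnvironment {
    pub fn new(url: &str) -> Self {
        GlobalEnvironment { url: url.to_string(), referrer_policy: None }
    }

    pub fn url(&self) -> &str {
        &self.url
    }

    pub fn set_referrer_policy(&mut self, policy: ReferrerPolicy) {
        self.referrer_policy = Some(policy);
    }

    pub fn referrer_policy(&self) -> Option<ReferrerPolicy> {
        self.referrer_policy
    }
}
```

Keeping the stored policy as an Option leaves room for "no policy delivered yet", which the network code can resolve to a default later.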

@rebstar6
Contributor

I'm going to work on this issue!

@Manishearth Manishearth added the C-assigned There is someone working on resolving the issue label Mar 31, 2016
@jdm
Member

jdm commented Mar 31, 2016

I've added some file pointers to my previous comment. Let us know if anything's unclear!

@rebstar6
Contributor

rebstar6 commented Apr 4, 2016

I'm a little confused on scope - the spec refers to the "settings object" as having a referrer policy - is this specific to the Document? What do you mean above by "global environment"?

Also, I've worked with the document file a little, but not workerglobalscope - can you give a quick explanation of what that file is and how that and document interact (if they interact).

@jdm
Member

jdm commented Apr 4, 2016

A "global environment" refers to the root object for executing JavaScript code. You can think of this as a per-tab global object. The "settings object" is a concept that we don't have an equivalent for in Servo yet, so we'll treat it as "data that resides in the global object". WorkerGlobalScope is the global object for a web worker; you should feel free to ignore workers at first and focus on how the specification applies to ordinary documents.

@rebstar6
Contributor

rebstar6 commented Apr 5, 2016

On the policy propagation piece - I'm not totally clear what has access to what.

Looking at the files, what seems to be happening is that when you click on a link that should load a new page, the document_loader 'prepare_async_load' method gets called, and that, in theory, is where things can be passed between the document and the http_loader methods (most notably, load). If I have a document with some referrer policy, the load method would need both the current document's referrer policy and the current document's URL (and maybe some other stuff) in order to properly set the referer header. Right now, load doesn't have any access to the current doc(?). Is this right?

@jdm
Member

jdm commented Apr 5, 2016

That is correct; the desired referrer policy would need to be passed as part of the LoadData. Passing it to the PendingAsyncLoad constructor via the methods in Document that invoke the DocumentLoader code would make sense to me.

@rebstar6
Contributor

rebstar6 commented Apr 5, 2016

But beyond that, if I just have the referrer policy, I still can't set the header because I also need the previous document's URL to populate it - the LoadData currently just has the URL to be loaded, right?

@jdm
Member

jdm commented Apr 5, 2016

Yep. We should probably add that to LoadData as well!

@rebstar6
Contributor

rebstar6 commented Apr 6, 2016

Alright, so just wanna confirm I'm on the right track here: rebstar6@8c91f8a

I added Option for referrer policy and referrer URL to both PendingAsyncLoad and LoadData. From document, they get passed 1st to PendingAsyncLoad which passes them forward to LoadData (where they will be picked up by the http_loader when setting headers). With the exception of the prepare_async_load/load_async methods, these are None.

Just to confirm - document.rs has both prepare_async_load and load_async, each of which call their respective methods in document_loader...but the document_loader's load_async calls prepare_async_load. Is the prepare getting called twice? What is the point of prepare_async_load within the document class?
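The plumbing described in this comment can be sketched roughly as follows. The struct names come from the thread, but the fields, constructor, and `into_load_data` method are assumptions made for illustration and do not reflect Servo's real definitions:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReferrerPolicy {
    NoReferrer,
    NoReferrerWhenDowngrade,
    Origin,
    SameOrigin,
    StrictOrigin,
    StrictOriginWhenCrossOrigin,
    OriginWhenCrossOrigin,
    UnsafeUrl,
}

/// Simplified stand-in for the data handed to the HTTP loader.
#[derive(Debug, PartialEq)]
pub struct LoadData {
    pub url: String,
    /// Policy of the document that initiated the load, if known.
    pub referrer_policy: Option<ReferrerPolicy>,
    /// URL of the document that initiated the load, if known.
    pub referrer_url: Option<String>,
}

/// Simplified stand-in for a request that has been prepared but not started.
pub struct PendingAsyncLoad {
    url: String,
    referrer_policy: Option<ReferrerPolicy>,
    referrer_url: Option<String>,
}

impl PendingAsyncLoad {
    pub fn new(
        url: &str,
        referrer_policy: Option<ReferrerPolicy>,
        referrer_url: Option<String>,
    ) -> Self {
        PendingAsyncLoad { url: url.to_string(), referrer_policy, referrer_url }
    }

    /// Forwards the referrer information unchanged into the LoadData
    /// (hypothetical helper).
    pub fn into_load_data(self) -> LoadData {
        LoadData {
            url: self.url,
            referrer_policy: self.referrer_policy,
            referrer_url: self.referrer_url,
        }
    }
}
```

Call sites that cannot supply referrer information yet would pass None for both values, ideally with a TODO.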

@jdm
Member

jdm commented Apr 6, 2016

Yes, that looks like it's on the right track. The LoadData constructors in XMLHttpRequest and ScriptThread::start_page_load should have non-None values, but those can be dealt with later as long as there's a TODO marking them.

As for the second question, prepare doesn't get called twice for the same value. The difference between prepare_async_load and load_async is that the former returns a structure that encapsulates a network request that can be started at some point in the future, while the latter initiates a network request immediately.
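To illustrate the distinction, here is a toy sketch. `Loader`, `PendingLoad`, and the `started` log are hypothetical; a "network request" is simulated by pushing the URL onto a Vec:

```rust
/// A request that has been prepared but not yet started.
pub struct PendingLoad {
    url: String,
}

impl PendingLoad {
    /// Starts the request; simulated here by recording the URL in `log`.
    pub fn load(self, log: &mut Vec<String>) {
        log.push(self.url);
    }
}

/// Toy loader that keeps a record of every request it has started.
pub struct Loader {
    pub started: Vec<String>,
}

impl Loader {
    pub fn new() -> Self {
        Loader { started: Vec::new() }
    }

    /// Returns a value describing the request; nothing is started yet.
    pub fn prepare_async_load(&self, url: &str) -> PendingLoad {
        PendingLoad { url: url.to_string() }
    }

    /// Prepares the request and starts it immediately.
    pub fn load_async(&mut self, url: &str) {
        let pending = self.prepare_async_load(url);
        pending.load(&mut self.started);
    }
}
```

In this shape, load_async calling prepare_async_load internally is just code reuse: each request is still prepared exactly once.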

@rebstar6
Contributor

rebstar6 commented Apr 6, 2016

Alright cool. Also, where does fetch come into play (and what is it, exactly?)?

The next thing I was going to try was to set the Referer header within modify_request_headers based on the policy and URLs. However, I'm unsure if this should actually go into the fetch code instead (or both, if they're separate things)?

@jdm
Member

jdm commented Apr 6, 2016

The code in components/net/fetch/ implements the Fetch specification and is designed to replace the ad-hoc network implementation that currently exists in stuff like http_loader.rs. It's only used in unit tests so far; we're going to switch over to using it after it's more complete. So yes, doing it in both would be most correct (and you'll need to add similar referrer data to Request as you did LoadData, since they don't overlap).

@rebstar6
Contributor

rebstar6 commented Apr 9, 2016

@SimonSapin - I mentioned this in IRC but I don't think you were on then. At some point for this issue, I will need to strip the username/password, fragment, and in some cases path and query from a URL (see https://w3c.github.io/webappsec-referrer-policy/#strip-url). I was told that you wrote (and are working on) rust-url. Do you have any suggestions about the best way to approach this? Are there changes to the crate that might make some of this easier? Any help appreciated!

@rebstar6
Contributor

rebstar6 commented Apr 9, 2016

@jdm - I've noticed that there are a bunch of requests on a page load where http_loader is capturing the referrer_policy/url. However, some are excluded (and tests fail as a result). For example, I added prints in http_loader for referrer_url and the url being loaded. If I do ./mach run https://www.wikipedia.org/, some assets receive the referrer and some do not. Right now, I have no policy logic in place, so every request should have the referrer added if there is one:

URL "https://www.wikipedia.org/"
REFERRER UNKNOWN
URL "https://www.wikipedia.org/portal/wikipedia.org/assets/js/abtesting-c8078be6c2.js"
REFERRER: "https://www.wikipedia.org/"
URL "https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia_wordmark.png"
REFERRER UNKNOWN
URL "https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png"
REFERRER UNKNOWN
URL "https://www.wikipedia.org/portal/wikipedia.org/assets/img/sprite-bookshelf_icons.png?18d8b7f58860758a154b3f1b0d329784d0f4235a"
REFERRER UNKNOWN
URL "https://www.wikipedia.org/portal/wikipedia.org/assets/js/index-1d96a05608.js"
REFERRER: "https://www.wikipedia.org/"
(... and some more, mixed with/without referrer)

Then note, if I click the 'English' link, the next URL to load has no referrer, which I think may be the cause of failing tests for me:

URL "https://en.wikipedia.org/"
REFERRER UNKNOWN
URL "https://en.wikipedia.org/wiki/Main_Page"
REFERRER UNKNOWN
URL "https://en.wikipedia.org/w/load.php?debug=false&lang=en&modules=ext.gadget.DRN-wizard%2CReferenceTooltips%2Ccharinsert%2Cfeatured-articles-links%2CrefToolbar%2Cswitcher%2Cteahouse%7Cext.tmh.thumbnail.styles%7Cext.uls.nojs%7Cext.visualEditor.desktopArticleTarget.noscript%7Cext.wikimediaBadges%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.raggett%2CsectionAnchor%7Cmediawiki.skinning.interface%7Cskins.vector.styles&only=styles&skin=vector"
REFERRER: "https://en.wikipedia.org/wiki/Main_Page"
URL "https://en.wikipedia.org/w/load.php?debug=false&lang=en&modules=site&only=styles&skin=vector"
REFERRER: "https://en.wikipedia.org/wiki/Main_Page"
(and more again, assets within the english wiki page)

The point of this is, the referrer_url is definitely not making the transition into http_loader in all cases after this logic is in place: rebstar6@d8c06b4 . Any idea what's failing here? Is this maybe related to the ScriptThread::start_page_load you mentioned above?

@jdm
Member

jdm commented Apr 9, 2016

The image loader code isn't providing a referrer because the current design of the image cache makes it impossible to do so correctly. We can ignore any failures to do with images, accordingly. The failure for the top-level Document is start_page_load, as you correctly intuited.

@SimonSapin
Member

@rebstar6 Yes I’m the main author and maintainer of rust-url. You can find the API documentation for the Url struct in the version of rust-url used by Servo at https://servo.github.io/rust-url/url/struct.Url.html

As you can see there, each component is stored separately. You can reset them with code like:

if let Some(relative) = url.relative_scheme_data_mut() {
    // String::clear sets the length of a string to zero, effectively removing its content.
    relative.username.clear();
    relative.password = None;
}
url.fragment = None;

I also have PR #9840 that updates Servo to a version of rust-url where the Url struct is completely different. But a number of things still need to happen before it can be merged, so maybe your PR will be merged first and you won’t have to worry about it. If #9840 is merged first you’ll have to adapt your PR to the new API. In that API the Url struct will then have a single String internally but as a private field. To manipulate it, go through the setter methods:

url.set_fragment(None);
let _ = url.set_password(None);
let _ = url.set_username("");

Since strings are contiguous in memory, touching something in the middle that changes the length means the rest of the string afterwards needs to be moved. For that reason it’s slightly more efficient to remove components at the end of the URL first.

let _ = is used to ignore the returned Result<(), ()>. This result can signal an error, since some URLs like data:text/html,<p>Example can’t have a username or password, but since you were trying to remove them anyway that’s OK. (Incidentally, this is the case where the previous code would not enter the if let block.)
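For reference, the stripping that the spec's "strip url" step asks for can be sketched without depending on either rust-url API. `UrlParts`, `strip_for_referrer`, and `serialize` below are hypothetical and only model a URL whose components are stored separately. The spec's handling of local schemes is left out:

```rust
/// Hypothetical URL model with each component stored separately.
#[derive(Clone, Debug, PartialEq)]
pub struct UrlParts {
    pub scheme: String,
    pub username: String,
    pub password: Option<String>,
    pub host: String,
    pub path: String,
    pub query: Option<String>,
    pub fragment: Option<String>,
}

/// Returns a copy with credentials and fragment removed. When `origin_only`
/// is true, the path is reset to "/" and the query is dropped as well.
pub fn strip_for_referrer(url: &UrlParts, origin_only: bool) -> UrlParts {
    let mut stripped = url.clone();
    stripped.fragment = None;
    if origin_only {
        stripped.query = None;
        stripped.path = "/".to_string();
    }
    stripped.password = None;
    stripped.username.clear();
    stripped
}

/// Serializes the components back into a string.
pub fn serialize(url: &UrlParts) -> String {
    let mut out = format!("{}://", url.scheme);
    if !url.username.is_empty() || url.password.is_some() {
        out.push_str(&url.username);
        if let Some(password) = &url.password {
            out.push(':');
            out.push_str(password);
        }
        out.push('@');
    }
    out.push_str(&url.host);
    out.push_str(&url.path);
    if let Some(query) = &url.query {
        out.push('?');
        out.push_str(query);
    }
    if let Some(fragment) = &url.fragment {
        out.push('#');
        out.push_str(fragment);
    }
    out
}
```

With separately stored components the order of removal does not matter; the end-first advice above applies to the single-String representation.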

Feel free to ask more questions here or on IRC. I’m in the central Europe time zone, but if you mention my nickname on IRC while I’m away I’ll still see it later and can answer then (or maybe someone else can answer), so leave your IRC client online if you can.

@rebstar6
Contributor

@jdm a few more questions...

  1. Everything in tests/wpt/web-platform-tests/referrer-policy/ seems to be testing img, iframe, script tags, and xhr requests (not <a>, or fetch). I'm not seeing anything that tests the basic case (I'm on a page and click a link). Is this group of tests just geared towards these cases for some reason? Is there anything elsewhere I can use (she asks hopefully)?
  2. I followed the script methods backwards as best I could, and settled on load_url in window.rs as the place that the referrer info can be passed along (load_url function - see rebstar6@f48fff9 if curious). This seems to work for my wikipedia case above. Is it too broad though (read: does window.load_url() get called in cases where we may not want that referrer info sent, like if you type a url into the browser? With my change, it will always send the referrer info for whatever that window's current document is)

@jdm
Member

jdm commented Apr 12, 2016

  1. Unfortunately, since the test harness must be loaded in the top level document, any code that tests navigation must rely on an iframe so the tested document doesn't replace the test harness. That being said, it may be possible to test some cases by having the framed document call a method (eg. parent.navigation_complete(document.referrer)) instead of using postMessage, which we don't support.
  2. That's an interesting question. I'll have to get back to you on that.

@rebstar6
Contributor

rebstar6 commented Apr 18, 2016

Gotcha. I'm definitely to the point where having some tests to code from would be helpful. If you could give me a little more guidance on how to edit what we have (or create new?), that would be great.

I think I have the Referer header setting piece mostly in place (read: http_loader will set the Referer header given some policy and URL provided by the calling document). This assumes a policy is given, though; I'm not yet pulling the policy from anywhere. I could, at a minimum, write some unit tests for this to verify it, but those will be moot if we can get the full HTML tests working.

Also, thinking in terms of eventual review, is there any way you want me to split this up for PRs? I don't want it to get too big and stuck in review. (code at https://github.com/rebstar6/servo/tree/referrerPolicy)

@jdm
Member

jdm commented Apr 18, 2016

Whoops, didn't get back to you earlier. window.load_url looks like a fine choke point.

@jdm
Member

jdm commented Apr 18, 2016

In terms of tests, we can't write unit tests for network code that interacts with code in script, so the best we can do there is write tests for the network code given LoadData with certain characteristics. Those would be tests in unit/net/http_loader.rs that verify that the expected Referer header is present in the request that would be sent to the server. For those iframe WPT tests, I'll take a look and see if there's a way to copy and rewrite some of them so they don't time out.
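The logic such unit tests would exercise can be sketched as a pure function from (policy, referrer URL, target URL) to an optional Referer value. This is a simplified reading of the spec, not Servo's http_loader code. `referer_header_value` and `origin_of` are hypothetical names, URLs are assumed to have the form scheme://host/path with credentials and fragment already stripped, and "downgrade" is approximated as https to non-https:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReferrerPolicy {
    NoReferrer,
    NoReferrerWhenDowngrade,
    Origin,
    SameOrigin,
    StrictOrigin,
    StrictOriginWhenCrossOrigin,
    OriginWhenCrossOrigin,
    UnsafeUrl,
}

/// Returns "scheme://host[:port]" for URLs of the assumed form.
fn origin_of(url: &str) -> &str {
    match url.find("://") {
        Some(i) => {
            let authority_start = i + 3;
            match url[authority_start..].find('/') {
                Some(j) => &url[..authority_start + j],
                None => url,
            }
        }
        None => url,
    }
}

/// Computes the Referer header value, or None if no header should be sent.
pub fn referer_header_value(
    policy: ReferrerPolicy,
    referrer: &str,
    target: &str,
) -> Option<String> {
    let same_origin = origin_of(referrer) == origin_of(target);
    // Simplification: the spec speaks of "potentially trustworthy" URLs.
    let downgrade = referrer.starts_with("https://") && !target.starts_with("https://");
    let origin_only = || format!("{}/", origin_of(referrer));
    let full = || referrer.to_string();

    match policy {
        ReferrerPolicy::NoReferrer => None,
        ReferrerPolicy::NoReferrerWhenDowngrade => {
            if downgrade { None } else { Some(full()) }
        }
        ReferrerPolicy::Origin => Some(origin_only()),
        ReferrerPolicy::SameOrigin => {
            if same_origin { Some(full()) } else { None }
        }
        ReferrerPolicy::StrictOrigin => {
            if downgrade { None } else { Some(origin_only()) }
        }
        ReferrerPolicy::StrictOriginWhenCrossOrigin => {
            if same_origin {
                Some(full())
            } else if downgrade {
                None
            } else {
                Some(origin_only())
            }
        }
        ReferrerPolicy::OriginWhenCrossOrigin => {
            if same_origin { Some(full()) } else { Some(origin_only()) }
        }
        ReferrerPolicy::UnsafeUrl => Some(full()),
    }
}
```

A unit test would then build a request for each policy and compare the header that is actually sent against the value this kind of function predicts.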

@jdm
Member

jdm commented Apr 18, 2016

As for reviewing, if you'd like to split it up like:

  • enabling relevant tests with lots of initial failures
  • data types, fields for storing relevant data
  • passing around referrer policy and referrer values, storing appropriately
  • setting referrer based on policy, updating expected test results

That should make each part straightforward to review.

@rebstar6
Contributor

So I guess the question is, is there any point in writing some unit tests and PR-ing what I have now so I can get an early review on it?

That would include:

  • passing policy from document to httploader (think of it as 1-way. The policy will still have to flow backwards, as it can be set via a header)
  • setting referrer based on policy
  • Note: the policy being passed is set to 'no-referrer' for all documents, so this code should have no impact in production, but is in place for later

The unit tests, as you suggested, would actually validate what I have so far - given a policy, does http_loader set the right header? The iframe tests that we want to edit become necessary only in the next piece, which is setting the policy as delivered by: https://w3c.github.io/webappsec-referrer-policy/#referrer-policy-delivery

@rebstar6
Contributor

Awesome! PR out - #11238

@rebstar6
Contributor

rebstar6 commented Jun 2, 2016

FYI - once meta referrer is in, the tests definitely need more changes for the other delivery policies (more so than we thought...). Referrer-Policy header is the obvious next step, but the tests use Content-Security-Policy, which seems to have been the old way of things at some point.

Not sure how easy a switch this is - I may be able to just change referrer_policy/generic/tools/generate.py for http-csp to add Referrer-Policy not Content-Security-Policy, but that feels hacky. Hrmm.

@jdm
Member

jdm commented Jun 2, 2016

That is likely the best way to do it, assuming that w3c/webappsec-referrer-policy@fc55d91#commitcomment-17717693 is correct. We should make changes like that in the upstream repository to make sure the right eyes see it.

bors-servo pushed a commit that referenced this issue Jun 2, 2016
Implement meta referrer policy delivery (3)

- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #10311 (github issue number if applicable).
- [X] There are tests for these changes
bors-servo pushed a commit that referenced this issue Jun 3, 2016
Implement meta referrer policy delivery (3)
@paulrouget
Contributor Author

@rebstar6 @jdm once meta referrer PR (3) lands, is it enough to ask the duckduckgo folks to set servo as "trusted"? See what they say here: #10309 (comment)

@jdm jdm reopened this Jun 3, 2016
@rebstar6
Contributor

rebstar6 commented Jun 3, 2016

@paulrouget - I think? You can certainly ask, and if they want more, hopefully they'll say so (and looks like you have anyway).

To summarize, thus far we have implemented the meta-referrer delivery piece of this for Documents only (not workers), and using the policies in the w3c tests (and spec at the time the DuckDuckGo issue happened). The spec changed, so there is an issue to update the policies to the latest, which should be trivial once the tests are changed (#11384).

FWIW, we maintain a default of 'No-Referrer' (rather than blank) for now, so that the Referer header isn't sent unless a policy is specified in the right meta element - this is safer than the alternative, since we have not implemented the other delivery methods.

Other delivery types are a TODO, though those also require test changes (seems like those are pretty out of date).

@rebstar6
Contributor

rebstar6 commented Jun 6, 2016

Test change for header delivery: web-platform-tests/wpt#3116
Broader test change for cross-origin (not crossorigin): web-platform-tests/wpt#3115 (while I was in there..)

Not sure what review is like on w3c, but those are in there. Once the 1st is approved, work can start on referrer policy delivery via the Referrer-Policy header

@rebstar6
Contributor

This set of more manageable issues should cover the remaining work (at least, as of today's spec):

There may also be some fetch work when the other delivery policies are implemented (?).

I'm stepping off this as much with my job starting Monday, so these issues are free for the taking! I can definitely answer questions though for whoever picks it up :)

@jdm jdm added B-meta This issue tracks the status of multiple, related pieces of work and removed C-assigned There is someone working on resolving the issue labels Jun 27, 2016
@nox
Contributor

nox commented Oct 1, 2017

We still need to fix #11861 and #11863.

gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 1, 2019
…r=jdm

PR1 for servo/servo#10311

This puts the code and data structures in place to set the Referer header based on the Referrer Policy for a given document. Note that document::get_referrer_policy() always returns the 'No Referrer' option, so for now, this should have no impact on production code, and that policy requires that the Referer header is not added.

Later PRs will determine the policy and edit that get_referrer_policy() accordingly.

Source-Repo: https://github.com/servo/servo
Source-Revision: 34900814fca3b21fbb27bed58d4f4af8a8e307e9

UltraBlame original commit: d088446dcbc0a71d1b2cade87d0ec28deb233de1
@Darkspirit
Contributor

  • w3c/webappsec-referrer-policy#67 (Remove "Determine policy for token") removed the "Determine policy for token" algorithm that this function refers to:
    /// <https://w3c.github.io/webappsec-referrer-policy/#determine-policy-for-token>
    pub fn determine_policy_for_token(token: &str) -> Option<ReferrerPolicy> {
        match_ignore_ascii_case! { token,
            "never" | "no-referrer" => Some(ReferrerPolicy::NoReferrer),
            "default" | "no-referrer-when-downgrade" => Some(ReferrerPolicy::NoReferrerWhenDowngrade),
            "origin" => Some(ReferrerPolicy::Origin),
            "same-origin" => Some(ReferrerPolicy::SameOrigin),
            "strict-origin" => Some(ReferrerPolicy::StrictOrigin),
            "strict-origin-when-cross-origin" => Some(ReferrerPolicy::StrictOriginWhenCrossOrigin),
            "origin-when-cross-origin" => Some(ReferrerPolicy::OriginWhenCrossOrigin),
            "always" | "unsafe-url" => Some(ReferrerPolicy::UnsafeUrl),
            "" => Some(ReferrerPolicy::NoReferrer),
            _ => None,
        }
    }
  • w3c/webappsec-referrer-policy#125 (Default to 'strict-origin-when-cross-origin'.) changed the default policy from NoReferrerWhenDowngrade to StrictOriginWhenCrossOrigin.
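One possible shape for an updated helper, offered only as a sketch: it assumes the legacy aliases ("never", "default", "always") are no longer matched here and that an empty or unrecognized token falls back to the new default. `policy_from_token`, `DEFAULT_POLICY`, and `effective_policy` are hypothetical names, and the standard library is used in place of the match_ignore_ascii_case! macro:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReferrerPolicy {
    NoReferrer,
    NoReferrerWhenDowngrade,
    Origin,
    SameOrigin,
    StrictOrigin,
    StrictOriginWhenCrossOrigin,
    OriginWhenCrossOrigin,
    UnsafeUrl,
}

/// Assumed default after w3c/webappsec-referrer-policy#125.
pub const DEFAULT_POLICY: ReferrerPolicy = ReferrerPolicy::StrictOriginWhenCrossOrigin;

/// Maps a policy token to a policy, ignoring ASCII case.
/// Legacy aliases and the empty string are deliberately not matched.
pub fn policy_from_token(token: &str) -> Option<ReferrerPolicy> {
    match token.to_ascii_lowercase().as_str() {
        "no-referrer" => Some(ReferrerPolicy::NoReferrer),
        "no-referrer-when-downgrade" => Some(ReferrerPolicy::NoReferrerWhenDowngrade),
        "origin" => Some(ReferrerPolicy::Origin),
        "same-origin" => Some(ReferrerPolicy::SameOrigin),
        "strict-origin" => Some(ReferrerPolicy::StrictOrigin),
        "strict-origin-when-cross-origin" => Some(ReferrerPolicy::StrictOriginWhenCrossOrigin),
        "origin-when-cross-origin" => Some(ReferrerPolicy::OriginWhenCrossOrigin),
        "unsafe-url" => Some(ReferrerPolicy::UnsafeUrl),
        _ => None,
    }
}

/// Resolves a token to the policy that would apply, using the default
/// when the token is empty or unrecognized (an assumption of this sketch).
pub fn effective_policy(token: &str) -> ReferrerPolicy {
    policy_from_token(token).unwrap_or(DEFAULT_POLICY)
}
```

Whether legacy values still need to be honoured for the meta element is a separate question that this sketch does not address.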
