
Deferred fetching #1647

Open · wants to merge 92 commits into main

@noamr noamr commented May 2, 2023

Add a JS-exposed function to request a deferred fetch, called fetchLater.

A deferred fetch would be invoked in one of two scenarios:

  • The document is destroyed (the fetch group is terminated)
  • A given period of time has passed.

A few constraints:

  • Request body streams are not allowed
  • This is only allowed in documents, and only reporting to potentially-trustworthy URLs
  • Deferred fetch requests are limited to 64KB per origin. Exceeding this would immediately throw.
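
As a rough illustration of the behaviour described above, the following sketch wraps a call to fetchLater with feature detection and handles the synchronous throw when the quota is exceeded. The helper name `scheduleDeferredFetch`, the `scope` parameter (standing in for `window`), and the `activateAfter` option name are assumptions made for this example and are not stated in this description.

```javascript
// Sketch only: `scope` stands in for `window` so the helper can be exercised
// outside a browser. `activateAfter` is an assumed option name for the delay.
function scheduleDeferredFetch(scope, url, body, delayMs) {
  if (typeof scope.fetchLater !== "function") {
    return { scheduled: false, reason: "unsupported" };
  }
  try {
    // The body should be a string, Blob, or similar; streams are not allowed.
    const result = scope.fetchLater(url, {
      method: "POST",
      body,
      activateAfter: delayMs,
    });
    return { scheduled: true, result };
  } catch (error) {
    // Exceeding the quota is described as throwing immediately.
    return { scheduled: false, reason: error.name };
  }
}
```

In a document this might be called as `scheduleDeferredFetch(window, "https://example.com/log", payload, 60000)`, where the URL is a placeholder.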

The quota algorithm is a bit intricate, but its default should be somewhat reasonable for all but advanced cases.

  • A top-level document has 640kb of quota for deferred fetching. The cap is important to avoid consuming a lot of bandwidth after a tab has been closed. By default, this quota is shared with the top-level document's same-origin, same-agent descendants. The same-agent restriction avoids race conditions, because same-agent frames are guaranteed to call fetchLater in sequence.
  • By default, 128kb out of the 640kb quota is reserved for cross-origin or cross-agent iframes. The deferred-fetch-minimal permissions policy controls this reservation, and the top-level document can release the reserved quota by disabling that policy.
  • Any document can delegate 64kb out of its reserved quota to cross-origin or cross-agent subframes by explicitly enabling the deferred-fetch permissions policy.
  • Reserving some of the quota to a cross-origin or cross-agent subframe happens when the frame is being navigated by the container, e.g. setting src on an iframe. It is not guaranteed that the subframe would actually be able to use that
    quota, as it might end up navigating to a same-origin URL or disable the feature in its own permissions policy. However, the container's document only cares about the initial reserved value for subframes it doesn't have direct access to.
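
The default numbers in the list above can be put into a small arithmetic sketch. This is a simplified model, not the spec algorithm: the function name `topLevelSharedQuota` and its parameters are hypothetical, and it assumes each delegated subframe subtracts a flat 64kb.

```javascript
const KB = 1024;
const TOP_LEVEL_QUOTA = 640 * KB;     // total quota for a top-level document
const MINIMAL_POOL = 128 * KB;        // reserved while deferred-fetch-minimal is enabled
const DELEGATED_PER_FRAME = 64 * KB;  // delegated per subframe via deferred-fetch

// Hypothetical helper: bytes left for the top-level document and its
// same-origin, same-agent descendants under the simplified model.
function topLevelSharedQuota({ minimalPolicyEnabled = true, delegatedFrames = 0 } = {}) {
  const reservedForMinimal = minimalPolicyEnabled ? MINIMAL_POOL : 0;
  const delegated = delegatedFrames * DELEGATED_PER_FRAME;
  return Math.max(0, TOP_LEVEL_QUOTA - reservedForMinimal - delegated);
}
```

With the defaults this gives 512kb of shared quota, which matches the overall figure mentioned later in the review thread.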

See WICG/pending-beacon#70

(See WHATWG Working Mode: Changes for more details.)



@annevk annevk left a comment


  1. "inactive timeout" -> "deferred delay". At least that seems clearer to me.
  2. I think fetchLater was quite nice in that it sorts the same as fetch. fetchDeferred could also work, though is a bit harder to spell. I don't think we need request in the name.
  3. It's not clear how the name fetch group states get activated. That needs some kind of additional PR against HTML I suppose?
  4. I think we should describe the API in the same section as the fetch method. Could be called "Fetch methods" then.
  5. Deferred fetching itself could then precede the "Fetch API" section. Maybe it could even be a subsection of "Fetching" though I don't mind a new top-level section.


noamr commented May 8, 2023

  1. "inactive timeout" -> "deferred delay". At least that seems clearer to me.

But this is deferred delay specifically in the case of inactivity. "inactivity deferred delay"?

  2. I think fetchLater was quite nice in that it sorts the same as fetch. fetchDeferred could also work, though is a bit harder to spell. I don't think we need request in the name.

I was thinking about "what are we doing right now?" which is requesting/scheduling a deferred fetch for later. But fetchLater is also fine with me, it's very easy to remember.

  3. It's not clear how the name fetch group states get activated. That needs some kind of additional PR against HTML I suppose?

Yes, HTML would activate/deactivate in the BFCache code path. Need to prepare a PR for that but wanted to see that I'm on the right track first.

  4. I think we should describe the API in the same section as the fetch method. Could be called "Fetch methods" then.

Will do

  5. Deferred fetching itself could then precede the "Fetch API" section. Maybe it could even be a subsection of "Fetching" though I don't mind a new top-level section.

OK


mingyc commented May 9, 2023

cc @mingyc @fergald @yoavweiss @clelland latest API shape proposal for PendingBeacon


mingyc commented May 16, 2023

Deferred fetch body sizes are limited to 64KB per origin. Exceeding this would immediately reject with a QuotaExceeded.

Another note about the "origin" of a beacon request: there was some previous discussion about using the 3P storage partitioning key (not the origin, which is stricter) to decide whether pending beacon requests in a page are sendable, in terms of privacy concerns; see WICG/pending-beacon#30 (comment) and the comments below it. I am not sure how this should be specced.


noamr commented May 16, 2023

Deferred fetch body sizes are limited to 64KB per origin. Exceeding this would immediately reject with a QuotaExceeded.

Another note about the "origin" of a beacon request: there was some previous discussion about using the 3P storage partitioning key (not the origin, which is stricter) to decide whether pending beacon requests in a page are sendable, in terms of privacy concerns; see WICG/pending-beacon#30 (comment) and the comments below it. I am not sure how this should be specced.

OK, perhaps the 64kb constraint can be per network partition key rather than origin.
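
To make the difference concrete, here is a minimal sketch of a 64kb ledger that is indifferent to what the key is, so the same accounting could be keyed by origin or by a serialized partition key. The class `DeferredFetchLedger`, its methods, and the use of a plain `Error` named `QuotaExceededError` are all hypothetical and only illustrate the idea.

```javascript
const LEDGER_LIMIT_BYTES = 64 * 1024;

// Hypothetical bookkeeping for pending deferred-fetch bodies.
class DeferredFetchLedger {
  constructor(limitBytes = LEDGER_LIMIT_BYTES) {
    this.limitBytes = limitBytes;
    this.pendingByKey = new Map();
  }

  // `key` may be an origin or a serialized partition key; the ledger does not care.
  reserve(key, bodyBytes) {
    const used = this.pendingByKey.get(key) ?? 0;
    if (used + bodyBytes > this.limitBytes) {
      const error = new Error(`deferred-fetch quota exceeded for ${key}`);
      error.name = "QuotaExceededError";
      throw error;
    }
    this.pendingByKey.set(key, used + bodyBytes);
  }

  // Called once a pending request has been sent or aborted.
  release(key, bodyBytes) {
    const used = this.pendingByKey.get(key) ?? 0;
    this.pendingByKey.set(key, Math.max(0, used - bodyBytes));
  }
}
```

Switching from per-origin to per-partition accounting would then only change how the caller derives `key`.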


@mingyc mingyc left a comment


PTAL, added some more comments.


@mingyc mingyc left a comment


noamr@ PTAL. I've added some more questions and comments. Thanks a lot for your help!


@mingyc mingyc left a comment


Thank you noamr@! I've added some last comments, and also listed below the features from the original proposal that are still missing. Please let us know if they are suitable here.

  1. "Do we want to enforce HTTPS request?" (WICG/pending-beacon#27) suggests allowing only secure requests for the new API. Should we enforce HTTPS-only requests on fetchLater()?
  2. Should this spec mention Permissions Policy (https://www.w3.org/TR/permissions-policy/)? In "[Fetch-Based API] Permissions Policy" (WICG/pending-beacon#77), the suggestion is to allow the API by default. But we might want to provide a way to manage third-party iframes' usage.
  3. "Consider to support retry mechanism" (WICG/pending-beacon#40): should this spec mention retry when fetchLater() fails to send/commit?
  4. The original PendingBeacon proposal also includes "Crash recovery" (WICG/pending-beacon#34); I'm not sure how it can be incorporated into the Fetch spec.

mingyc added a commit to WICG/pending-beacon that referenced this pull request Jul 4, 2023
This PR adds overview and example codes for the draft `fetchLater()` API spec from whatwg/fetch#1647

The API will address the discussions in #70 #72 #73 #74 #75 #76.

noamr commented Jul 4, 2023

Thank you noamr@! I've added some last comments, and also listed below the features from the original proposal that are still missing. Please let us know if they are suitable here.

  1. "Do we want to enforce HTTPS request?" (WICG/pending-beacon#27) suggests allowing only secure requests for the new API. Should we enforce HTTPS-only requests on fetchLater()?

Probably a good idea, from the point of view of enabling new features only for secure requests.

  2. Should this spec mention Permissions Policy (https://www.w3.org/TR/permissions-policy/)? In "[Fetch-Based API] Permissions Policy" (WICG/pending-beacon#77), the suggestion is to allow the API by default. But we might want to provide a way to manage third-party iframes' usage.

I don't think we should integrate with permissions policy. But we should allow the user agent to deny a fetchLater and throw an error immediately. I'll add that to the PR.

  3. "Consider to support retry mechanism" (WICG/pending-beacon#40): should this spec mention retry when fetchLater() fails to send/commit?

Perhaps consider adding this later?

  4. The original PendingBeacon proposal also includes "Crash recovery" (WICG/pending-beacon#34); I'm not sure how it can be incorporated into the Fetch spec.

I don't think that changes anything in the spec.


mingyc commented Jul 18, 2023

Deferred fetch body sizes are limited to 64KB per origin. Exceeding this would immediately reject with a QuotaExceeded.

Another note about the "origin" of a beacon request: there was some previous discussion about using the 3P storage partitioning key (not the origin, which is stricter) to decide whether pending beacon requests in a page are sendable, in terms of privacy concerns; see WICG/pending-beacon#30 (comment) and the comments below it. I am not sure how this should be specced.

OK, perhaps the 64kb constraint can be per network partition key rather than origin.

@noamr Following up on the sendable beacon discussion:

As mentioned in WICG/pending-beacon#30 (comment), there were discussions around whether a beacon (or deferred request) should be sent when the network changes. I tried to summarize them in this PR (WICG/pending-beacon@feb3cf9). Basically, to process a beacon request when BackgroundSync is off, we need to check whether there is another open document (tab/frame/etc.) with the same storage partitioning key as the current document, to avoid unexpectedly sending the request after the network changes.

Do you think the above makes sense to be integrated into Fetch spec?

mingyc added a commit to mingyc/pending-beacon that referenced this pull request Jul 26, 2023
mingyc added a commit to WICG/pending-beacon that referenced this pull request Jul 26, 2023
fetch.bs Outdated
<p>To <dfn export>reserve deferred-fetch quota</dfn> for a <a>navigable container</a>
<var>container</var> given an <a for=/>origin</a> <var>originToNavigateTo</var>:

<p class=note>This is called when <var>container</var> and the document that initiated the
@mingyc mingyc Dec 20, 2024

This is called when container and the document that initiated the navigation (the "source document") are same origin.

It potentially reserves either 64kb or 8kb of quota for the frame, if it is not same origin with its parent and the permissions policy allow.

Could you please verify the triggering condition? The first sentence mentions only the same-origin case, while the second sentence still refers to the frame not being same origin with its parent.

Even if the "not same origin" condition only applies to the 8kb case, when should it be triggered? Is it when a different-origin container gets created, or when it starts to navigate? But a different-origin container gets its initial policy copied from the parent container, which it should not modify?

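Given the following example:

<!-- a.com/index.html -->
<script>
// Create the iframe first; at this point it is not attached to the document.
const iframe = document.createElement("iframe");
// An absolute URL is used so the frame is cross-origin (a bare "b.com" would resolve relative to a.com).
iframe.src = "https://b.com/";
// Append the iframe to the document.
document.body.appendChild(iframe);
</script>

<!-- b.com/index.html -->
<script>
fetchLater("https://b.com/");
</script>
  • When the iframe is created, it does not have any parent container.
  • When iframe.src is set to b.com, it navigates to a cross-origin destination, with FramePolicy copied from the parent document (?)
  • When should the iframe trigger the reserve deferred-fetch quota algorithm?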

Contributor Author

Sorry, the informative note was a bit misleading.
This is called when the iframe is being navigated, provided that the sourceDocument of the navigation is the container document.

In the above case, navigation would only take place when the iframe is connected to the page. This happens here, in the iframe's post-connection steps.
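
The ordering described in this reply can be sketched with a toy model. It is not the HTML navigation algorithm: the `ToyIframe` class and its log messages are hypothetical, and the only behaviour modelled is that setting `src` on a disconnected iframe does nothing until the element is connected, at which point the reservation happens as part of the container-initiated navigation.

```javascript
// Toy model of an iframe element; only tracks connection state and src.
class ToyIframe {
  constructor(log) {
    this.log = log;
    this.src = null;
    this.connected = false;
  }

  setSrc(url) {
    this.src = url;
    // A disconnected iframe does not navigate yet.
    if (this.connected) this.navigate();
  }

  // Stands in for the iframe's post-connection steps.
  connect() {
    this.connected = true;
    if (this.src !== null) this.navigate();
  }

  navigate() {
    // Quota is reserved by the container as the navigation starts.
    this.log.push(`reserve quota for ${new URL(this.src).origin}`);
    this.log.push(`navigate to ${this.src}`);
  }
}
```

Running `setSrc` before `connect` leaves the log empty until `connect` is called, which mirrors the createElement / set src / appendChild sequence in the question.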

@mingyc mingyc Jan 8, 2025

This is called when the iframe is being navigated,

It might still be ambiguous. This algorithm later relies on inherited policies, which are also only available on navigation.

Combined with my question in this commit, could you please clarify?

provided that the sourceDocument of the navigation is the container document.

What's the container document here, is it iframe or the main document?

To reserve deferred-fetch quota for a navigable container container given an origin originToNavigateTo:

Trying to summarize the triggering condition and place. This algorithm:

  • Is triggered by a navigable container
    • Given the above case, navigable container = iframe?
  • Should be triggered on navigation, when the source document of the navigation is the navigable's parent document.
    • Is source document of the navigation = iframe's parent document = main document?
    • Is navigable = iframe?
    • Is navigable's parent document = main document?
  • Combining the above, this algorithm should be run by the iframe when its source document is the main document and when it is navigating its content document?
    • But inherited policies of the content document will only be available during navigation...
    • If this is run by content document after it obtains inherited policies, content document might not be able to get access to iframe if it is cross-origin.

Contributor Author

This is called when the iframe is being navigated,

It might still be ambiguous. This algorithm later relies on inherited policies, which are also only available on navigation.

Combined with my question in this commit, could you please clarify?

provided that the sourceDocument of the navigation is the container document.

What's the container document here, is it iframe or the main document?

Main document

To reserve deferred-fetch quota for a navigable container container given an origin originToNavigateTo:

Trying to summarize the triggering condition and place. This algorithm:

  • Is triggered by a navigable container

    • Given the above case, navigable container = iframe?

Yes, like the iframe element.

  • Should be triggered on navigation, when the source document of the navigation is the navigable's parent document.

    • Is source document of the navigation = iframe's parent document = main document?

Yes, when the iframe is being navigated by its container, e.g. by setting the iframe's src.

  • Is navigable = iframe?

Sort of. It's like the iframe's contents.

  • Is navigable's parent document = main document?

Yes, in this case

  • Combining the above, this algorithm should be run by the iframe when its source document is the main document and when it is navigating its content document?

Yes

  • But inherited policies of the content document will only be available during navigation...

Not exactly. It should run the https://w3c.github.io/webappsec-permissions-policy/#algo-define-inherited-policy-in-container algorithm which only needs the container and the target origin.

  • If this is run by content document after it obtains inherited policies, content document might not be able to get access to iframe if it is cross-origin.

See previous response

@mingyc mingyc Jan 10, 2025

Thanks for answering!

Getting back to the above example, where a cross-origin iframe is created first and appended later.

  1. I tried the potentially free deferred-fetch quota algorithm, which gets executed on Document creation. It looks like, when run with the example, the iframe will clear its navigable container's (which is the iframe?) deferred-fetch quota because of this condition:

if document’s node navigable’s container document is not null, and its origin is same origin with document,

SameOrigin(iframe, main document) = SameOrigin("about:blank", "a.com") = true

  2. Later, on beginning navigation to b.com, reserve deferred-fetch quota is called, and the iframe would theoretically get deferred-fetch-minimal. Is this correct?

  3. However, when calling get the available deferred-fetch quota for any request made within the iframe, it will always return 0 from a condition not listed in Step 5:

  • isTopLevel == false && deferredFetchAllowed == false && deferredFetchMinimalAllowed == false (also asked here)

I suspect the policy doesn't get properly passed between processes, i.e. the policy updated at (2) does not propagate to the remote frame. Maybe this is not a spec issue, but I am not sure...


<li><p>If <var>quota</var> is equal or less than 0, then return 0.

<li><p>If <var>quota</var> is less than <var>quotaForRequestOrigin</var>, then return

If quota is less than quotaForOrigin, then return quota.

(Discussed offline in chat, but copying here)

Given the following examples:

// root (a.com) -> frame-1 (a.com) -> frame-2 (b.com)
//              -> frame-3 (b.com) -> frame-4 (a.com)

I'd like to verify quotas for the above 5 documents:

root: 64kb? (the result I got, but isn't 8kb taken by frame-2?)
frame-1: 54kb
frame-2: 8kb
frame-3: 8kb
frame-4: 0kb?

The questionable part is root and frame-1, they can't get more than 64kb quota due to the quoted return step at the beginning.

Contributor Author

Right, for a specific request they can never get more than 64kb at this point, even though their overall quota would be 512kb.


Should frame-1 get 64kb or 54kb?

Contributor Author

64kb
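
The per-request cap discussed in this thread can be sketched as a clamp. The first two branches follow the quoted spec steps; the final `return quotaForRequestOrigin` is an assumption inferred from the reply that a single request can never get more than 64kb. The function name `availableForRequest` is hypothetical.

```javascript
const PER_REQUEST_ORIGIN_CAP = 64 * 1024;

// Hypothetical helper mirroring the quoted steps for a single request.
function availableForRequest(quota, quotaForRequestOrigin = PER_REQUEST_ORIGIN_CAP) {
  // "If quota is equal or less than 0, then return 0."
  if (quota <= 0) return 0;
  // "If quota is less than quotaForRequestOrigin, then return quota."
  if (quota < quotaForRequestOrigin) return quota;
  // Assumption: otherwise the per-origin cap applies.
  return quotaForRequestOrigin;
}
```

Under this model, root and frame-1 (sharing 512kb overall) each see 64kb for one request, while a frame holding an 8kb reservation sees 8kb.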

delegate quota to the navigated frame, and the reserved quota would only apply in that case, and
would be ignored if it ends up being shared. If quota was reserved and the document ends up being
<a>same origin</a> with its parent, the quota would be
<a data-lt="potentially free deferred-fetch quota">freed</a>.
@mingyc mingyc Jan 8, 2025

If quota was reserved and the document ends up being
same origin with its parent, the quota would be
freed

Could the triggering condition of the free step be more explicit? Does the algorithm look like the following?

  1. container document (iframe) navigates
  2. container document inherits policies
  3. container document reserves 64kb or 8kb
  4. document (iframe's content document) gets created
  5. document frees quota

Contributor Author

It will be explicit in the HTML spec. I'll prepare a PR.
What you wrote is correct.
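
The confirmed sequence (navigate, inherit policy, reserve 64kb or 8kb, create document, possibly free) could be sketched as two small functions. This is a simplified model under stated assumptions: `container` is a plain object with hypothetical `documentOrigin` and `reservedQuota` fields, `policy` is a hypothetical pair of booleans standing in for the inherited deferred-fetch and deferred-fetch-minimal policies, and origins are compared as strings.

```javascript
const FULL_RESERVATION = 64 * 1024;
const MINIMAL_RESERVATION = 8 * 1024;

// Runs when the container starts navigating its frame (steps 1 to 3).
function reserveQuota(container, originToNavigateTo, policy) {
  if (originToNavigateTo === container.documentOrigin) {
    container.reservedQuota = 0; // same-origin frames share the parent's quota
  } else if (policy.deferredFetch) {
    container.reservedQuota = FULL_RESERVATION;
  } else if (policy.deferredFetchMinimal) {
    container.reservedQuota = MINIMAL_RESERVATION;
  } else {
    container.reservedQuota = 0;
  }
}

// Runs when the frame's document is created (steps 4 and 5).
function potentiallyFreeQuota(container, createdDocumentOrigin) {
  // After a redirect the document may end up same origin with its parent.
  if (createdDocumentOrigin === container.documentOrigin) {
    container.reservedQuota = 0;
  }
}
```

For example, a reservation made for a navigation to b.com is dropped if a redirect lands the frame back on the container's own origin.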

set <var>document</var>'s <a>node navigable</a>'s <a>navigable container</a>'s
<a>reserved deferred-fetch quota</a> to 0.

<p class=note>This is called when a {{Document}} is created. It ensures that same-origin nested

This is called when a Document is created. It ensures that same-origin nested documents don’t reserve quota,

Is it true that a cross-origin document will not have access to its navigable container's quota? If so, does this algorithm effectively only apply to same-origin content documents?

Contributor Author

Correct

noamr added a commit to whatwg/html that referenced this pull request Jan 8, 2025
The logic for deferred fetching (the `fetchLater` function), as
defined in the fetch spec, specifies a "quota" which is shared
between a document and its direct same-origin descendants.

For this logic to work in a secure way, the quota needs to be:
- reserved when a frame-initiated navigation starts. This way,
  the container document can only reserve quota based on URLs
  it knows it navigates to.
- freed if the document ends up being same origin with its
  container, upon document creation.
  This ensures quota is handled correctly in the case of
  redirects.

This PR adds those two calls:
- Call "reserve" on navigation, based on `sourceDocument`.
- Call "potentially free" on document creation.

Depends on whatwg/fetch#1647, where
the quota logic itself is defined.
would be ignored if it ends up being shared. If quota was reserved and the document ends up being
<a>same origin</a> with its parent, the quota would be
the container and its navigable, if allowed permissions policy. It is not observable to the
cotnainer document whether the reserved quota was used in practice. This algorithm assumes that the

cotnainer

typo

<a>same origin</a> with its parent, the quota would be
the container and its navigable, if allowed permissions policy. It is not observable to the
cotnainer document whether the reserved quota was used in practice. This algorithm assumes that the
container's document might delegate quota to the navigated frame, and the reserved quota would only

might delegate quota to the navigated frame

What is the frame here? Is it the container's navigable?

<a>allowed to use</a> the <a>policy-controlled feature</a> "{{PermissionsPolicy/deferred-fetch}}";
otherwise false.

<li><p>Let <var>deferredFetchMinimalAllowed</var> be true if <var>controlDocument</var> is

deferredFetchMinimalAllowed is true

Is it possible to bypass all these 5 conditions without a value for quota, i.e. isTopLevel == false && deferredFetchAllowed == false && deferredFetchMinimalAllowed == false?
