Track request body size in XHR and Fetch instrumentations #4706

Merged: 41 commits, Nov 14, 2024

Conversation

MustafaHaddara
Contributor

@MustafaHaddara MustafaHaddara commented May 14, 2024

Which problem is this PR solving?

The Fetch and XHR instrumentations expose http.response_content_length attributes but do not expose http.request_content_length attributes. This PR adds the http.request_content_length attribute to outgoing requests that have a body (e.g. POST, PATCH, DELETE).

Short description of the changes

Ideally, there would be some browser API we could just read for this (similar to how we get the response content length via the PerformanceObserver API). However, no such API exists.

Second best would be if we could read the content-length request header. Unfortunately, the XMLHttpRequest API does not offer any way to read request headers. Even where we can read request headers (i.e. with the fetch API), this header seems to be set automatically by the browser just before it sends the request, outside of user-space.

So, we have to compute the body length on our own. This PR implements that.

Detailed Description

The first few commits (e349fa4...eaf9786) are refactorings/updates, mainly to unit tests, to enable changes and tests that follow.

The primary changes are contained in these commits:

  • d6149ca adds getXHRBodyLength and getFetchBodyLength utils to the opentelemetry-sdk-trace-web package.
    • getFetchBodyLength needs to call getXHRBodyLength, otherwise I would have defined these in their respective packages.
  • d97b02b calls getXHRBodyLength from the XHR instrumentation package and adds unit tests for the XHR instrumentation
  • 860557e calls getFetchBodyLength from the Fetch instrumentation package and adds unit tests for the Fetch instrumentation
  • bee76c8 makes this functionality opt-in

The getXHRBodyLength function is mostly straightforward; the XHR API is not too complicated and is fairly self-explanatory.
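
As a rough illustration of the XHR case, a helper of this kind might look like the sketch below. This is not the code from the PR: the name `estimateXhrBodyLength` and the handling of each body type are assumptions made for this example, and the FormData branch ignores multipart boundaries and headers, so its result is only an estimate.

```typescript
// Hypothetical helper, not the PR's getXHRBodyLength implementation.
type XhrBody =
  | string
  | Blob
  | ArrayBuffer
  | ArrayBufferView
  | URLSearchParams
  | FormData
  | null
  | undefined;

const xhrTextEncoder = new TextEncoder();

// UTF-8 byte length of a string (string.length counts UTF-16 code units instead).
function utf8Length(text: string): number {
  return xhrTextEncoder.encode(text).byteLength;
}

function estimateXhrBodyLength(body: XhrBody): number | undefined {
  if (body === null || body === undefined) return undefined;
  if (typeof body === 'string') return utf8Length(body);
  if (body instanceof Blob) return body.size;
  if (body instanceof ArrayBuffer) return body.byteLength;
  if (ArrayBuffer.isView(body)) return body.byteLength;
  if (body instanceof URLSearchParams) return utf8Length(body.toString());
  if (body instanceof FormData) {
    // Sums keys and values only; the real multipart encoding adds more bytes.
    let size = 0;
    body.forEach((value, key) => {
      size += utf8Length(key);
      size += typeof value === 'string' ? utf8Length(value) : value.size;
    });
    return size;
  }
  return undefined;
}
```

Each branch reads only a size property or a serialized copy of the data, so the body passed to `send()` is left untouched.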

On the other hand, the getFetchBodyLength function is more complex. The root of the problem is that the fetch API doesn't expose clean ways for us to get the body content. In places where it is possible, it is often consumable only once, and often as a Promise that resolves to the body content. I had to take care to not consume the actual body content; we do not want this instrumentation to interfere with actual requests. It is possible that a bug in this implementation would result in the bodies of fetch requests getting consumed by this instrumentation and then not actually included in the network request.
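
To make the "do not consume the body" concern concrete, here is a minimal sketch (not the PR's implementation) that measures a `Request` body by reading a clone, which leaves the original unread. The function name `measureRequestBody` is made up for illustration. Reading the clone buffers the whole body in memory, which a real instrumentation may want to avoid, and `clone()` throws if the body has already been used.

```typescript
// Hypothetical sketch: measure a fetch Request body without consuming it.
async function measureRequestBody(
  request: Request
): Promise<number | undefined> {
  // No body (e.g. a GET request): nothing to measure.
  if (request.body === null) return undefined;
  // clone() tees the underlying stream, so reading the clone
  // leaves the original request's body unread.
  const copy = request.clone();
  const buffer = await copy.arrayBuffer();
  return buffer.byteLength;
}
```

After the returned promise resolves, `request.bodyUsed` on the original is still `false`, so the request can be sent as usual.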

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Added unit tests to opentelemetry-sdk-trace-web, opentelemetry-instrumentation-xml-http-request, and opentelemetry-instrumentation-fetch

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added

@MustafaHaddara MustafaHaddara requested a review from a team May 14, 2024 16:00
@MustafaHaddara MustafaHaddara force-pushed the request-body-size branch 3 times, most recently from a4cb688 to 14c9323 Compare May 15, 2024 14:54
@MustafaHaddara MustafaHaddara force-pushed the request-body-size branch 2 times, most recently from 0988e15 to c67e910 Compare May 22, 2024 19:45

codecov bot commented May 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.21%. Comparing base (f1ef596) to head (ed6f37f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4706   +/-   ##
=======================================
  Coverage   93.21%   93.21%           
=======================================
  Files         315      315           
  Lines        8096     8096           
  Branches     1622     1622           
=======================================
  Hits         7547     7547           
  Misses        549      549           

@MustafaHaddara MustafaHaddara force-pushed the request-body-size branch 2 times, most recently from 7351d79 to 62d5b82 Compare June 12, 2024 20:11
@MustafaHaddara MustafaHaddara requested review from scheler and MSNev June 12, 2024 20:14
@MustafaHaddara
Contributor Author

@scheler @MSNev I've made the changes we discussed and resolved merge conflicts. Please let me know if you have any other questions.

Comment on lines 271 to 276
size += key.length;
if (value instanceof Blob) {
size += value.size;
} else {
size += value.length;
}

@tbrockman tbrockman Sep 6, 2024

Sorry for the 👻 here, just been a bit busy with life and quitting #dayjob!

Just noticed, should these also use getByteLength?

Feels silly to suggest it given that FormData size varies (as you mentioned, browser/platform-specific implementation differences from things like boundaries and such, I just checked out Firefox for example), so I understand if it seems unnecessary at this point since it'd just be shaving a tiny bit of hypothetical inaccuracy off an estimate that will inherently be incorrect under the circumstances that it applies to (and feel free to ignore).

Suggested change
- size += key.length;
- if (value instanceof Blob) {
- size += value.size;
- } else {
- size += value.length;
- }
+ size += getByteLength(key.length);
+ if (value instanceof Blob) {
+ size += value.size;
+ } else {
+ size += getByteLength(value.length);
+ }
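
For context on why the distinction matters: `String.prototype.length` counts UTF-16 code units, while a string body goes over the wire as bytes (UTF-8). The sketch below assumes a `getByteLength`-style helper built on `TextEncoder`; the helper shown here, `byteLengthOf`, is written for illustration and may differ from the one in the PR.

```typescript
const utf8Encoder = new TextEncoder();

// Illustrative stand-in for a getByteLength-style helper.
function byteLengthOf(text: string): number {
  return utf8Encoder.encode(text).byteLength;
}

const asciiKey = 'name';
const accentedValue = 'na\u00EFve'; // "naïve" with a precomposed ï
const ghostEmoji = '\uD83D\uDC7B'; // U+1F47B as a surrogate pair

console.log(asciiKey.length, byteLengthOf(asciiKey)); // 4 4
console.log(accentedValue.length, byteLengthOf(accentedValue)); // 5 6
console.log(ghostEmoji.length, byteLengthOf(ghostEmoji)); // 2 4
```

For ASCII-only keys and values the two measures agree, which is why the difference only shows up with non-ASCII form fields.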

}

if (typeof body === 'string') {
return body.length;

@tbrockman tbrockman Sep 6, 2024

@MustafaHaddara For the most part I think that would be absolutely true! But I also wouldn't underestimate the potential for people to send large JSON payloads (either intentionally or unintentionally).

As a contemporary example, OpenAI's chat API accepts user image input as "[e]ither a URL of the image or the base64 encoded image data.", so you can imagine that for a large image this might amount to some meaningful overhead if someone sends their images inline.

Given some of the other inherent issues for calculating content length size with FormData, would you be open to allowing users to specify their own optional getXHRBodyLength (or maybe exposed as calculateBodyLength) function?

For context, I'm hoping to use this functionality in Browser Extension for OpenTelemetry, and I'd like to limit the overhead as much as possible when it already involves injecting big blobs of JavaScript into pages which might be making requests that extension users may not have much control over. This way, I can supply my own implementation (which may be less maintainable/correct/costly) and also experiment with browser-specific calculations, without any maintenance burden being placed on this project.

@JamieDanielson
Member

JamieDanielson commented Sep 25, 2024

from comment thread above, moving here for easier finding:

In the semantic conventions tooling meeting this morning a couple of things came up which affect what this PR should be doing

Follow-up notes from JS SIG meeting discussion:

  • This PR just uses http.request.body.size. The older http.request_content_length attribute was not emitted by these instrumentations before anyway, so this should be fine.
  • This PR does have an opt-in mechanism, and it is disabled by default.
  • Because this opt-in mechanism for certain http attributes includes more than just this attribute and this instrumentation, it may be preferred to move this configuration flag elsewhere so it can be shared by multiple instrumentations. For example, Java added an experimental-opt-in flag that includes these.
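
As an illustration of the opt-in behavior described in these notes, the sketch below records `http.request.body.size` only when a flag is enabled. The config shape, the flag name `measureRequestSize`, the `SpanLike` interface, and the `recordRequestBodySize` function are all assumptions made for this example and are not taken from the PR.

```typescript
// Hypothetical config and span shapes, for illustration only.
interface BodySizeConfig {
  measureRequestSize?: boolean; // assumed flag name; off unless set to true
}

interface SpanLike {
  setAttribute(key: string, value: number): void;
}

function recordRequestBodySize(
  span: SpanLike,
  config: BodySizeConfig,
  computeSize: () => number | undefined
): void {
  // Skip the (potentially costly) size computation unless the user opted in.
  if (config.measureRequestSize !== true) return;
  const size = computeSize();
  if (size !== undefined) {
    span.setAttribute('http.request.body.size', size);
  }
}
```

Passing the size computation in as a callback keeps the flag check separate from how the size is measured, so the enable/disable mechanism could move elsewhere later without touching the measuring logic.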

@dyladan @MSNev did I capture that correctly? And if so, what are the next steps here? I suspect we'll keep much of this logic already written in this PR and just change the enable/disable mechanism.

@MustafaHaddara
Contributor Author

@dyladan @MSNev I just wanted to follow up on @JamieDanielson's message above-- what would you like me to do with this PR?

@MustafaHaddara MustafaHaddara requested a review from a team as a code owner October 16, 2024 15:06
@MustafaHaddara
Contributor Author

We discussed this PR at the client-side SIG yesterday. The two remaining discussion threads:

from @JamieDanielson:

Because this opt-in mechanism for certain http attributes includes more than just this attribute and this instrumentation, it may be preferred to move this configuration flag elsewhere so it can be shared by multiple instrumentations. For example, Java added an experimental-opt-in flag that includes these.

My position is that we can update the opt-in mechanism in the future, if we decide to go that route and build one flag to control multiple instrumentations.

As for @tbrockman's memory concerns:

But I also wouldn't underestimate the potential for people to send large JSON payloads (either intentionally or unintentionally).

[...]

Given some of the other inherent issues for calculating content length size with FormData, would you be open to allowing users to specify their own optional getXHRBodyLength (or maybe exposed as calculateBodyLength) function?

I'm not too worried about memory concerns since this entire instrumentation is opt-in, and I think making the suggested change is something that we can discuss in a follow up issue or PR.

@pichlermarc pichlermarc requested a review from a team October 23, 2024 09:53
@MustafaHaddara
Contributor Author

@pichlermarc I've responded to your comments and made the changes you requested. Are you able to take a look?

Member

@pichlermarc pichlermarc left a comment

Let's duplicate the utils for now so that they don't end up being public in a stable package. We can discuss combining the packages after this PR is merged. 👍

Other than that, the PR looks good. 🙂

packages/opentelemetry-sdk-trace-web/src/utils.ts (outdated, resolved)
@MustafaHaddara
Contributor Author

Sounds good @pichlermarc !

Thanks for the help! Everything should be good to go now.

Member

@pichlermarc pichlermarc left a comment

looks good, thanks 🙂

@pichlermarc pichlermarc added this pull request to the merge queue Nov 14, 2024
Merged via the queue into open-telemetry:main with commit c78a02f Nov 14, 2024
19 checks passed
Labels
spec-feature This is a request to implement a new feature which is already specified by the OTel specification
7 participants