-
Notifications
You must be signed in to change notification settings - Fork 804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track request body size in XHR and Fetch instrumentations #4706
base: main
Are you sure you want to change the base?
Track request body size in XHR and Fetch instrumentations #4706
Conversation
a4cb688
to
14c9323
Compare
0988e15
to
c67e910
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #4706 +/- ##
=======================================
Coverage 93.17% 93.17%
=======================================
Files 315 315
Lines 8086 8086
Branches 1617 1617
=======================================
Hits 7534 7534
Misses 552 552 |
7351d79
to
62d5b82
Compare
62d5b82
to
4540ad0
Compare
26ea752
to
92da8c9
Compare
- webworkers don't have DOMParser or Document APIs - hard-coding an expected length for Document length is fragile (and varies across platforms)
92da8c9
to
bee76c8
Compare
function getFormDataSize(formData: FormData): number { | ||
let size = 0; | ||
for (const [key, value] of formData.entries()) { | ||
size += key.length; | ||
if (value instanceof Blob) { | ||
size += value.size; | ||
} else { | ||
size += value.length; | ||
} | ||
} | ||
return size; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an improvement over the old implementation (which was stringifying the FormData, and that wouldn't actually stringify file objects correctly), but it still doesn't calculate the exact FormData size. Browsers will serialize FormData in their own way (ex including a separator for each of the fields, etc.)
I will keep investigating for more accurate approaches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this seems good enough for now, and maybe if anyone is sufficiently motivated they can contribute the changes to calculate it more precisely for their platform 😬
cada7e9
to
8c42240
Compare
8c42240
to
d011829
Compare
I went ahead and used the new (unstable) attribute; let me know if you want me to revert that. @tbrockman @MSNev @JamieDanielson I've applied all of the requested changes and this PR should be ready for another review. Let me know if I missed anything. |
size += key.length; | ||
if (value instanceof Blob) { | ||
size += value.size; | ||
} else { | ||
size += value.length; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the 👻 here, just been a bit busy with life and quitting #dayjob!
Just noticed, should these also use getByteLength
?
Feels silly to suggest it given that FormData
size varies (as you mentioned, browser/platform-specific implementation differences from things like boundaries and such, I just checked out Firefox for example), so I understand if it seems unnecessary at this point since it'd just be shaving a tiny bit of hypothetical inaccuracy off an estimate that will inherently be incorrect under the circumstances that it applies to (and feel free to ignore).
size += key.length; | |
if (value instanceof Blob) { | |
size += value.size; | |
} else { | |
size += value.length; | |
} | |
size += getByteLength(key.length); | |
if (value instanceof Blob) { | |
size += value.size; | |
} else { | |
size += getByteLength(value.length); | |
} |
} | ||
|
||
if (typeof body === 'string') { | ||
return body.length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MustafaHaddara For the most part I think that would be absolutely true! But I also wouldn't underestimate the potential for people to send large JSON payloads (either intentionally or unintentionally).
As a contemporary example, OpenAI's chat API accepts user image input as "[e]ither a URL of the image or the base64 encoded image data.", so you can imagine that for a large image this might amount to some meaningful overhead if someone sends their images inline.
Given some of the other inherent issues for calculating content length size with FormData
, would you be open to allowing users to specify their own optional getXHRBodyLength
(or maybe exposed as calculateBodyLength
) function?
For context, I'm hoping to use this functionality in Browser Extension for OpenTelemetry, and I'd like to limit the overhead as much as possible when it already involves injecting big blobs of Javascript into pages which might be making requests that extension users may not have much control over. This way, I can supply my own implementation (which may be less maintainable/correct/costly) and also experiment with browser-specific calculations, without any maintenance burden being placed on this project.
from comment thread above, moving here for easier finding:
Follow-up notes from JS SIG meeting discussion:
@dyladan @MSNev did I capture that correctly? And if so, what are the next steps here? I suspect we'll keep much of this logic already written in this PR and just change the enable/disable mechanism. |
@dyladan @MSNev I just wanted to follow up on @JamieDanielson's message above-- what would you like me to do with this PR? |
We discussed this PR at the client-side SIG yesterday. The two remaining discussion threads: from @JamieDanielson:
My position is that we can update the opt-in mechanism in the future, if we decide to go that route and build one flag to control multiple instrumentations. As for @tbrockman 's memory concerns:
I'm not too worried about memory concerns since this entire instrumentation is opt-in, and I think making the suggested change is something that we can discuss in a follow up issue or PR. |
| `http.scheme` | The URI scheme identifying the used protocol | | ||
| `http.url` | Full HTTP request URL | | ||
| `http.method` | HTTP request method | | ||
| `http.request.body.size` | Uncompressed size of the request body, if any body exists | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This instrumentation implement Semantic Conventions v1.7 so we'd have to emit http.request_content_length
instead to avoid inconsistency in the emitted telemetry.
I know we mentioned earlier that this should be fine but we have rejected/requested changes on PRs (for example #4962) in the past for similar kinds of inconsistency.
Requested changes: using http.request_content_length
over the latest semantic conventions (I think it was mentioned that the attribute did not exist in the semconv package in an earlier comment, if that's indeed the case we should introduce a local constant in the respective instrumentation package to use it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm fair point. I think this should actually be http.request_content_length_uncompressed
though? Since we measure the raw payload size, before it gets gzip'd by the browser.
I've used that attribute name, let me know if I should switch to the compressed attribute instead.
@@ -125,6 +129,155 @@ export function addSpanNetworkEvents( | |||
} | |||
} | |||
|
|||
function _getBodyNonDestructively(body: ReadableStream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that the reason this requires changes to @opentelemetry/sdk-trace-web
is that we're trying to:
- reduce added bundle size for when both instrumentations are added
- reduce duplicated code and tests for the same functionality.
I wonder though: is the added public API surface in a stable package worth the trade-off? I think since it's only duplicated twice the vast majority of the end-users will never use this function directly, I'd much rather have duplicated code than the added public stable API that we have to keep stable in perpetuity.
Q: given that these instrumentations share code so directly, and the concepts they instrument are so intertwined, would it make sense for these instrumentations to be located in a single package instead of two different ones? Both of the packages are experimental
so moving them around is not completely impossible. I'm also open to other ideas 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that the reason this requires changes to @opentelemetry/sdk-trace-web is that we're trying to:
- reduce added bundle size for when both instrumentations are added
- reduce duplicated code and tests for the same functionality.
Yeah that's exactly correct. I can duplicate the logic across both packages if you'd prefer that.
I think it could make sense to unify the two packages. Both packages deal with network instrumentation and, given the state of the web, I would assume that most projects have a mix of fetch and XHR calls in their code. I would be surprised if there were a lot of people intentionally using only one of the 2 instrumentations.
But unifying them is a higher level decision that I'll defer to you and the other maintainers :)
@pichlermarc I've responded to your comments and made the changes you requested. Are you able to take a look? |
…o request-body-size
Which problem is this PR solving?
The Fetch and XHR instrumentations expose
http.response_content_length
attributes but do not exposehttp.request_content_length
attributes. This PR adds thehttp.request_content_length
attributes to outgoing requests that have a body (ex. POST, PATCH, DELETE, etc.)Short description of the changes
Ideally, there would be some browser API would could just read for this (similar to how we get the response content length via the PerformanceObserver API). However, no such API exists.
Second best would be if we could read the
content-length
request header. Unfortunately, the XMLHTTPRequest API does not offer any way to read request headers. Even if we could (ie. with the fetch API), this header seems to be set automatically by the browser before it actually sends the request, outside of user-space.So, we have to compute the body length on our own. This PR implements that.
Detailed Description
The first few commits (e349fa4...eaf9786) are refactorings/updates, mainly to unit tests, to enable changes and tests that follow.
The primary changes are contained in these 3 commits:
getXHRBodyLength
andgetFetchBodyLength
utils to theopentelemetry-sdk-trace-web
package.getFetchBodyLength
needs to callgetXHRBodyLength
, otherwise I would have defined these in their respective packages.getXHRBodyLength
from the XHR instrumentation package and adds unit tests for the XHR instrumentationgetFetchBodyLength
from the Fetch instrumentation package and adds unit tests for the Fetch instrumentationThe
getXHRBodyLength
function is mostly straightforward; the XHR API is not too complicated and is fairly self-explanatory.On the other hand, the
getFetchBodyLength
function is more complex. The root of the problem is that the fetch API doesn't expose clean ways for us to get the body content. In places where it is possible, it is often consumable only once, and often as aPromise that resolves to the body content. I had to take care to not consume the actual body content; we do not want this instrumentation to interfere with actual requests. It is possible that a bug in this implementation would result in the bodies on fetch requests getting consuming by this instrumentation and then not actually included in the network request.Type of change
How Has This Been Tested?
opentelemetry-sdk-trace-web
,opentelemetry-instrumentation-xml-http-request
, andopentelemetry-instrumentation-fetch
Checklist: