Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow worker to load from a URL instead of a blob #6739

Closed
billyvg opened this issue Jan 11, 2023 · 14 comments · Fixed by #9409
Closed

Allow worker to load from a URL instead of a blob #6739

billyvg opened this issue Jan 11, 2023 · 14 comments · Fixed by #9409
Labels
Package: replay Issues related to the Sentry Replay SDK Type: Improvement

Comments

@billyvg
Copy link
Member

billyvg commented Jan 11, 2023

The issue with using blobs to load our Worker is that stricter CSPs will not allow it. You cannot control the "source" of the blob, so allowing our worker blob would allow all blobs.

Instead, the proposed solution would be to release the worker to our CDN and add an option to the replay SDK to load from the CDN instead of a blob. This would mean that users could then allow worker-src for our CDN.

@billyvg billyvg added the Package: replay Issues related to the Sentry Replay SDK label Jan 11, 2023
@mydea
Copy link
Member

mydea commented Jan 13, 2023

I think the way forward for this is, as a first step, to probably land #6665.
Then the next step could be to make this release to the CDN.
And the final step is to figure out a way to opt in to either using this, or the blob - ideally with tree shaking (?). Maybe the default should be to use the CDN, and users can opt-in to use the blob. WDYT? cc @Lms24

@mydea
Copy link
Member

mydea commented Jan 17, 2023

OK, so we've been talking about this some more, and would propose the following changes:

  1. We extract the worker into @sentry-internal/replay-worker (first step, no actual changes for usage)
  2. We change the worker to output an actual JS file (not a string form)
  3. We deploy this to the CDN
  4. We make the replay package use the CDN version exclusively. When that fails, we fall back to no compression

Advantages:

  • CSP works better. Users can whitelist the Sentry CDN specifically
  • We reduce the size of the replay bundle, making the worker size impact async (the worker itself is already async, so usage should not be affected) & hopefully reducing LCP as well

If there are no objections against this, we'll start work on this cc @Lms24 @bruno-garcia ?

@bruno-garcia
Copy link
Member

My impression is that this seems rather complex. Relying on the CDN might always be a failure mode in some scenarios, so we'll be relying on the fallback to no compression anyway but always trying to fetch the resource from the CDN.
Or maybe I'm just overthinking here.

On the other hand the advantages seem to pay off the added complexity.

How impactful is this issue right now? Does this affect a large number of users?

@billyvg
Copy link
Member Author

billyvg commented Jan 18, 2023

Right now? I would say not too impactful, but this is just a guess. we would have to see if we can determine the # of uncompressed vs compressed segments.

However, I think this is a better solution and worth the complexity as it gives the users a more secure way to load our worker.

@mydea
Copy link
Member

mydea commented Feb 9, 2023

So, we have started working on this. However, there is a core question:

  1. We could try to get this shipped before GA. This is rather unlikely to happen, as we'd need to really rush this.
  2. We can ship it after GA, but it can kind of be argued that this is a breaking change? It shouldn't affect that many people, and we do have a good fallback (=we just use uncompressed payloads if the worker fails), but it could be a bit unexpected.
  3. Or we wait for v8 to ship this change, but this is an undefined time horizon as of now.

What do you think, how should we approach this?

One alternative version for 2. would be to update the docs & changelog to indicate that you need to whitelist our CDN url for compression to work (even though the actual code hasn't landed yet), then we can do this change after GA and technically not be breaking. cc @billyvg @Lms24 @bruno-garcia

@billyvg
Copy link
Member Author

billyvg commented Feb 9, 2023

Let's not try to rush this for GA, that's too risky. Waiting for v8 is too long of a wait. I think updating our docs to call this out for GA is a great idea and releasing the change asap after GA.

The security implications of allowing all blobs to be run in a worker are great enough to justify the potential breaking changes IMO.

@billyvg
Copy link
Member Author

billyvg commented Feb 10, 2023

@mydea started a draft PR here: getsentry/sentry-docs#6269

@mydea
Copy link
Member

mydea commented Feb 10, 2023

Yeah, I agree and this would also be my vote!

@Lms24
Copy link
Member

Lms24 commented Feb 10, 2023

As we talked about this yesterday, I also agree that it's okay to do this between GA and v8. Especially because it's "soft-breaking" in the sense that we'd fall back to sending uncompressed payloads.

@billyvg billyvg changed the title [Replay] Allow worker to load from a URL instead of a blob Allow worker to load from a URL instead of a blob Feb 15, 2023
@mydea
Copy link
Member

mydea commented Mar 23, 2023

So after some research, I figured out that actually it is impossible to serve a web worker from a different origin 😬 so we cannot serve this from a CDN.

The only thing that would be possible is to serve the worker from the CDN, and have a minimal wrapper served as a blob:

importScripts('https://our-cdn-url.com/replay-worker.js');

This should work, and would reduce our bundle size, but would not address the CSP issue - actually, it would make it worse because users would need to whitelist both the blob as well as the CDN URL.

@tj-kev
Copy link

tj-kev commented Apr 5, 2023

Is there any update on the preferred solution for this? I have a strict CSP and cannot allow "blob:" as this could potentially open up areas of attack. I know there is a fallback but I am now reluctant to turn on CSP reporting to sentry as this will create a lot of entries.

@mydea
Copy link
Member

mydea commented Apr 5, 2023

Hi @tj-kev!

Sadly, as of now we do not have a good solution for this other than to disable compression (new Replay({ useCompression: false })).

We are exploring other ways to work around this without having to disable compression - see e.g. #7755

I will close this issue for now as I believe loading the worker from a URL is not the solution for this problem.

@mydea mydea closed this as not planned Won't fix, can't repro, duplicate, stale Apr 5, 2023
@rdsedmundo
Copy link

I think this should remain open, it's a valid issue and disabling compression is not a good call.

Datadog added support for this on their SDK per user request by allowing one to self-host the Worker: DataDog/browser-sdk#1578 (comment)

@mydea mydea reopened this Oct 19, 2023
@mydea
Copy link
Member

mydea commented Oct 19, 2023

I guess we could allow to define a custom URL, but like for datadog, it would be up to users to implement a compression worker. We could provide a blueprint for this, I guess 🤔 cc @billyvg

mydea added a commit that referenced this issue Oct 31, 2023
…9409)

This PR does two things:

1. Allow to configure a `workerUrl` in replay config, which is expected
to be an URL of a self-hosted worker script.
a. Added an example worker script, which is a built version of the
pako-based compression worker
a. Users can basically host this file themselves and point to it in
`workerUrl`, as long as it is on the same origin as the website itself.
  a. We can eventually document this in docs
1. Allows to configure `__SENTRY_EXCLUDE_REPLAY_WORKER__` in your build
to strip the default included web worker. You can configure this if
you're disabling compression anyhow, or if you want to configure a
custom web worker as in the above step.

Fixes #6739, and
allows to reduce bundle size further. Once merged/released we can also
add this to the bundler plugins `bundleSizeOptimizations` options.

Note that we _do not recommend_ to disable the web worker completely. We
only recommend to tree shake the worker code if you provide a custom
worker URL - else, replay payloads will not be compressed, resulting in
much larger payloads sent over the network, which is bad for your
applications performance.

Also note that when providing a custom worker, it is your own
responsibility to keep it up to date - we try to keep the worker
interface stable, and the worker is generally not updated often, but you
should still check regularly when updating the SDK if the example worker
has changed.

---------

Co-authored-by: Billy Vong <billyvg@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Package: replay Issues related to the Sentry Replay SDK Type: Improvement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants