assets + stylesheet assets #1475

Open
wants to merge 176 commits into
base: master

Contributor

@eoghanmurray eoghanmurray commented May 14, 2024

Medium to large enhancement to rrweb, building upon #1239

Please review #1437 first as this also builds upon that PR (which can be merged before #1239)


Asset Events

Assets are a new type of event that embodies a serialized version of an HTTP resource captured during snapshotting. Some examples are images, media files and stylesheets. Resources can be fetched externally (from cache) in the case of a href, or internally for blob: urls and same-origin stylesheets. Asset events are emitted after either a FullSnapshot or an IncrementalSnapshot (mutation), and although they may have a later timestamp, during replay they are rebuilt as part of the snapshot that they are associated with. For example, where a stylesheet is referenced at the time of a FullSnapshot but hasn't been downloaded yet, there can be a subsequent mutation event with a later timestamp which, along with the asset event, can recreate the experience of a network-delayed load of the stylesheet.

Assets to mitigate stylesheet processing cost

In the case of stylesheets, rrweb does some record-time processing in order to serialize the CSS rules. This had a negative effect on initial page loading times and on how quickly the FullSnapshot was taken (see https://pagespeed.web.dev/). These are now taken out of the main thread and processed asynchronously, to be emitted later (up to processStylesheetsWithin ms). There is no corresponding delay on the replay side so long as the stylesheet has been successfully emitted.
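As a rough illustration of the deferral described above, the sketch below runs a stylesheet-processing task either synchronously or after a bounded delay. `scheduleStylesheetProcessing` and its `defer` parameter are hypothetical names, not rrweb's API, and `setTimeout` is only a stand-in: the PR description does not say which scheduling primitive is actually used.

```typescript
// Hypothetical sketch, not rrweb's actual implementation.
type Task = () => void;

// Runs `task` synchronously when the limit is zero or negative, otherwise
// hands it to `defer` with a deadline. Returns the deadline used (0 = sync).
function scheduleStylesheetProcessing(
  task: Task,
  processStylesheetsWithin: number,
  isInlineStyle: boolean,
  // Assumption: setTimeout as the default scheduler, firing at the deadline.
  defer: (fn: Task, ms: number) => void = (fn, ms) => {
    setTimeout(fn, ms);
  },
): number {
  if (processStylesheetsWithin <= 0) {
    task(); // synchronous processing, as described for zero/negative values
    return 0;
  }
  // Inline <style> elements get half the budget, per the option description.
  const deadline = isInlineStyle
    ? processStylesheetsWithin / 2
    : processStylesheetsWithin;
  defer(task, deadline);
  return deadline;
}
```

The `defer` argument is injectable so the timing logic can be checked without waiting on real timers.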

Asset Capture Configuration

The captureAssets configuration option allows you to customize the asset capture process. It is an object with the following properties:

  • objectURLs (default: true): This property specifies whether to capture same-origin blob: assets using object URLs. Object URLs are created using the URL.createObjectURL() method. Setting objectURLs to true enables the capture of object URLs.

  • origins (default: false): This property determines which origins to capture assets from. It can have the following values:

    • false or []: Disables capturing any assets apart from object URLs, stylesheets (unless set to false) and images (if that setting is turned on).
    • true: Captures assets from all origins.
    • [origin1, origin2, ...]: Captures assets only from the specified origins. For example, origins: ['https://s3.example.com/'] captures all assets from the origin https://s3.example.com/.

  • images (default: false, or true if inlineImages is true in the rrweb.record config): When set, this option turns on asset capturing for all images irrespective of their origin. Unless this configuration option is explicitly set to false, images may still be captured if their src url matches the origins setting above.

  • stylesheets (default: 'without-fetch'): When set to true, this turns on capturing of all stylesheets and style elements via the asset system irrespective of origin. The default of 'without-fetch' is designed to match the previous inlineStylesheet behaviour, whereas the true value allows stylesheets which are otherwise inaccessible due to CORS restrictions to be captured via a fetch call, which will normally use the browser cache. Unless this is explicitly set to false, a stylesheet will be captured if it matches the origins config above.

  • stylesheetsRuleThreshold (default: 0): Only invoke the asset system for stylesheets with more than this number of rules. Defaults to zero (rather than, say, 100) as it only looks at the 'outer' rules (e.g. a sheet could have a single media rule which nests 1000s of sub rules). This default may be increased based on feedback.

  • processStylesheetsWithin (default: 2000): This property defines the maximum time in milliseconds that the browser should delay before processing stylesheets. Inline <style> elements will be processed within half this value. Lower this value if you wish to improve the odds that short 'bounce' visits will emit the asset before the visitor unloads the page. Set to zero or a negative number to process stylesheets synchronously, which can cause poor scores on e.g. https://pagespeed.web.dev/ ("Third-party code blocked the main thread").
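To make the option list concrete, here is a minimal sketch of a captureAssets object using the defaults described above (with origins set to the example origin). The `CaptureAssetsConfig` interface and the `originMatches` helper are hypothetical: the type is a local mirror of the listed properties, and the helper only illustrates the documented origins semantics, not how rrweb implements the check.

```typescript
// Local, hypothetical type mirroring the options listed in the PR description.
interface CaptureAssetsConfig {
  objectURLs?: boolean;
  origins?: boolean | string[];
  images?: boolean;
  stylesheets?: boolean | 'without-fetch';
  stylesheetsRuleThreshold?: number;
  processStylesheetsWithin?: number;
}

const captureAssets: CaptureAssetsConfig = {
  objectURLs: true, // capture same-origin blob: assets
  origins: ['https://s3.example.com/'], // only this origin (plus the cases below)
  images: false,
  stylesheets: 'without-fetch', // matches previous inlineStylesheet behaviour
  stylesheetsRuleThreshold: 0,
  processStylesheetsWithin: 2000, // ms; inline <style> uses half of this
};

// Hypothetical helper: does `url` fall under the `origins` setting?
function originMatches(
  url: string,
  origins: boolean | string[] = false,
): boolean {
  if (origins === true) return true; // all origins
  if (origins === false || origins.length === 0) return false; // none
  const origin = new URL(url).origin;
  return origins.some((o) => new URL(o).origin === origin);
}
```

Presumably the object is passed alongside the other recording options (the description mentions the rrweb.record config), but the exact call shape is not shown in the PR text.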

changeset-bot bot commented May 14, 2024

🦋 Changeset detected

Latest commit: 2641cde

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 19 packages
| Name | Type |
| --- | --- |
| rrweb-snapshot | Major |
| rrweb | Major |
| rrdom | Major |
| @rrweb/types | Major |
| @rrweb/rrweb-plugin-canvas-webrtc-record | Major |
| @rrweb/rrweb-plugin-canvas-webrtc-replay | Major |
| @rrweb/rrweb-plugin-console-record | Major |
| @rrweb/rrweb-plugin-console-replay | Major |
| @rrweb/rrweb-plugin-sequential-id-record | Major |
| @rrweb/rrweb-plugin-sequential-id-replay | Major |
| rrdom-nodejs | Major |
| rrweb-player | Major |
| @rrweb/all | Major |
| @rrweb/replay | Major |
| @rrweb/record | Major |
| @rrweb/packer | Major |
| @rrweb/utils | Major |
| @rrweb/web-extension | Major |
| rrvideo | Major |

) {
  let cssText = stringifyCssRules(styleRules);
  if (cssText) {
    cssText = absoluteToStylesheet(cssText, sheetBaseHref);
Contributor

For a style tag that imports a stylesheet which itself has an import rule, the href here would be incorrect for any urls imported in the nested stylesheet.

In the example below I would expect absoluteToStylesheet to return something like url(\"https://local.pendo.io:8081/browser-tests/temp-resources/Vorname.otf\")

but instead I get something like url(\"https://local.pendo.io:8081/browser-tests/guides/Vorname.ttf\")
[Three screenshots attached: 2024-07-11 at 1:43:34 PM, 1:42:29 PM and 1:42:58 PM]

Contributor Author

Oh that's a great catch; would it be possible to port over a simplified version of that test?

I can't quite figure out whether this was an issue before (and I haven't come across the code to import nested stylesheets during this PR — if you know where that is, please let me know!)

Contributor

Digging a little more, I honestly think this is an existing issue. From my understanding, when we call absoluteToStylesheet we make the assumption that any URLs inside of a style tag must share the same filepath as the document the style tag is on (sheetBaseHref in the PR here, getHref() on master).

Unless you'd like to handle it here, I would be more than happy to open up a bug to continue digging into this and possibly even take a crack at fixing this.

Contributor

Ah I apologize, I misspoke. I still think the issue has been around for a while, but it is surfaced here because we started checking all style tags instead of just empty ones. Just wanted to clarify!

[Two screenshots attached: 2024-07-11 at 2:57:09 PM and 2:58:40 PM]

Contributor Author

We've always been inlining <style> tags, just that previously the serialization happened in serializeTextNode, whereas this PR consolidates all the logic up to the serializeElementNode level.

OK so I believe this all happens because we can recurse into rule.styleSheet when we encounter an @import rule ... but we call absoluteToStylesheet with a single outer href after the stringification.

I reckon this is a long standing issue, so best to open a new issue, or submit a test case as a new PR independently of this one to see if it fails on current master.

It'd be easier (for me) to fix it after merging this PR due to the changes to the code (e.g. stringifyStylesheet has been renamed to stringifyCssRules) however we can do the fix in both branches if this one takes too long to merge.

If I've identified the problem correctly, the fix would be to pass down a current baseHref to stringifyCssRules, so that it does the url() rewrite during stringification, and also so that we can change the href as we recurse through the tree.
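A minimal sketch of the fix proposed above, assuming simplified stand-ins for the CSSOM types so it runs without a DOM. `SketchRule`, `SketchSheet`, `absolutify` and `stringifyRules` are hypothetical names, and the url() regex is deliberately naive (it ignores whitespace and escapes); this is not the code that was eventually merged.

```typescript
// Simplified stand-ins for CSSRule / CSSImportRule / CSSStyleSheet.
interface SketchRule {
  cssText: string;
  styleSheet?: SketchSheet; // only present on @import rules
}
interface SketchSheet {
  href: string | null; // absolute URL of the imported sheet, if any
  rules: SketchRule[];
}

// Rewrite url(...) references in `cssText` relative to `baseHref`.
function absolutify(cssText: string, baseHref: string): string {
  return cssText.replace(
    /url\((['"]?)([^'")]+)\1\)/g,
    (_match: string, quote: string, ref: string) =>
      `url(${quote}${new URL(ref, baseHref).href}${quote})`,
  );
}

// Stringify rules, switching the base href whenever we recurse into an
// imported stylesheet, so nested urls resolve against the right file.
function stringifyRules(rules: SketchRule[], baseHref: string): string {
  return rules
    .map((rule) => {
      if (rule.styleSheet) {
        const nestedBase = rule.styleSheet.href ?? baseHref;
        return stringifyRules(rule.styleSheet.rules, nestedBase);
      }
      return absolutify(rule.cssText, baseHref);
    })
    .join('');
}
```

The point of doing the rewrite during stringification, rather than once afterwards with a single outer href, is that each rule is resolved against the sheet it actually came from.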

Contributor

I think you nailed it and that we're on the same page here. Making that change, I'm getting the correct expected response. I've opened up a PR, but still need to create a proper test case.

eoghanmurray added a commit that referenced this pull request Aug 6, 2024
Support a contrived/rare case where a <style> element has multiple text node children (this is usually only possible to recreate via javascript append) ... this PR fixes cases where there are subsequent text mutations to these nodes; previously these would have been lost

* In this scenario, a new CSS comment may now be inserted into the captured `_cssText` for a <style> element to show where it should be broken up into text elements upon replay: `/* rr_split */`
* The new 'can record and replay style mutations' test is the principal way to test the problematic scenarios, and is a detailed 'catch-all' test with many checks to cover most of the ways things can fail
* There are new tests for splitting/rebuilding the css using the rr_split marker
* The prior 'dynamic stylesheet' route is now the main route for serializing a stylesheet; dynamic stylesheets were missed out in #1533 but that case is now covered with this PR

This PR was originally extracted from #1475 so the initial motivation was to change the approach on stringifying <style> elements to do so in a single place. This is also the motivating factor for always serializing <style> elements via the `_cssText` attribute rather than in its childNodes; in #1475 we will be delaying populating `_cssText` for performance and instead recording them as assets.

Thanks for the detailed review to Justin Halsall <[email protected]> & Yun Feng <https://github.com/YunFeng0817>
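The `/* rr_split */` marker mentioned in the commit message above can be pictured with the naive sketch below. It assumes the captured text is exactly the concatenation of the text nodes' contents, which the real code cannot rely on; `markCssSplits` and `splitCssText` are hypothetical helper names.

```typescript
// The CSS comment named in the commit message.
const RR_SPLIT = '/* rr_split */';

// Record side (naive): join each child text node's content with the marker.
function markCssSplits(textNodeContents: string[]): string {
  return textNodeContents.join(RR_SPLIT);
}

// Replay side (naive): recover one chunk per original text node.
function splitCssText(cssText: string): string[] {
  return cssText.split(RR_SPLIT);
}
```

Because the marker is a valid CSS comment, a replayer that ignores it would still end up with parseable CSS.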
@eoghanmurray eoghanmurray force-pushed the stylesheet-assets branch 2 times, most recently from c593165 to efaae6d Compare August 23, 2024 15:48
Contributor

@Juice10 Juice10 Oct 4, 2024

Note to self, we should pull these type changes into their own PR, to ease the release & maintenance of this PR. Especially where things have been moved from rrweb-snapshot to @rrweb/types

eoghanmurray and others added 17 commits October 14, 2024 16:19
… a `true` (ignore by default) - an empty (or maybe partial) captureAssets config where origins was unspecified was producing the error
    // presence of onAssetDetected means we should get
    // rr_captured_href (with contents promised later - i.e. using rrweb/record)

Not sure when this test started failing
… be a fixup to 'Capture <style> element css via an asset event...'
…esume that as the machines are quicker there, the tests complete before stylesheets are processed
…apshots until we receive the stylesheet assets to avoid a flash of unstyled content (fouc)
eoghanmurray and others added 9 commits October 25, 2024 18:51
…n the recording, and the ordering of asset arrival in the replayer was not regular. This change also means we only wait for stylesheets (which have a `timeout` associated with their status), which is good as other assets can be handled asynchronously by the replayer asset manager without any negative effects (images get a placeholder image, whereas a missing stylesheet affects the entire page rendering)
… I'm about to remove the reset in favour of keeping assets around
…multi-page session).

 - instead of resetting between FullSnapshots, we instead record assets against a timestamp
 - prefer assets with timestamps subsequent to a snapshot (as if a reset happened)
 - if none can be found, find the most recent prior asset for a url

The asset manager now requires a rudimentary idea of where the replayer is at prior to applying an asset
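The lookup rules in the commit message above might be sketched as follows. `StoredAsset` and `pickAsset` are hypothetical names, and two details are assumptions the commit message does not settle: an asset with the same timestamp as the snapshot counts as "subsequent", and among several subsequent assets the earliest one is chosen.

```typescript
// Hypothetical record of an asset kept by the replayer's asset manager.
interface StoredAsset {
  url: string;
  timestamp: number;
  payload: string;
}

// Pick the asset to apply for `url` when rebuilding the snapshot taken at
// `snapshotTimestamp`.
function pickAsset(
  assets: StoredAsset[],
  url: string,
  snapshotTimestamp: number,
): StoredAsset | undefined {
  const sameUrl = assets.filter((a) => a.url === url);
  // 1. Prefer assets recorded at or after the snapshot (earliest first).
  const after = sameUrl
    .filter((a) => a.timestamp >= snapshotTimestamp)
    .sort((a, b) => a.timestamp - b.timestamp);
  if (after.length > 0) return after[0];
  // 2. Otherwise fall back to the most recent prior asset for this url.
  const before = sameUrl.sort((a, b) => b.timestamp - a.timestamp);
  return before[0]; // undefined when the url was never captured
}
```

This is why the asset manager needs to know roughly where the replayer is: the same url can map to different payloads at different points in a multi-page session.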
…e types from rrweb-snapshot to @rrweb/types' but I don't know why it wasn't needed before
…that we don't delay fullsnapshot rendering waiting for them
ShayMalchi added a commit to SaolaAI/rrweb that referenced this pull request Nov 6, 2024
* Skip mask check on leaf elements (rrweb-io#1512)

* Minor fixup for rrweb-io#1349; the 'we can avoid the check on leaf elements' optimisation wasn't being applied as `n.childNodes` was always truthy even when there were no childNodes.

Changing it to `n.childNodes.length` directly there (see rrweb-io#1402) actually caused a bug as during a mutation, we serialize the text node directly, and need to jump to the parentElement to do the check.
This is why I've reimplemented this optimisation inside `needMaskingText` where we already had an `isElement` test

Thanks to @Paulhejia (https://github.com/Paulhejia/rrweb/) for spotting that `Boolean(n.childNodes)` is always true.
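The truthiness pitfall described above can be reproduced without a DOM. In this sketch `NodeLike` is a hypothetical stand-in for a node whose childNodes list always exists, even when empty; the two helpers are illustrative and are not the functions in rrweb.

```typescript
// Stand-in for a DOM node: childNodes is always an object, possibly empty.
interface NodeLike {
  childNodes: ArrayLike<unknown>;
}

const leaf: NodeLike = { childNodes: [] };

// Buggy check: an empty list is still an object, so this is never true.
const isLeafBuggy = (n: NodeLike): boolean => !Boolean(n.childNodes);

// Intended check: look at the number of children instead.
const isLeaf = (n: NodeLike): boolean => n.childNodes.length === 0;
```

With the buggy version the "skip the check on leaf elements" fast path is simply never taken, which matches the commit's note that the optimisation wasn't being applied.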

* Assuming all jest should have been removed in rrweb-io#1033 (rrweb-io#1511)

* all references to jest should have been removed in rrweb-io#1033
* clarify that `cross-env` is used to ensure that environmental variables get applied on Windows (previous usage of cross-env was removed in rrweb-io#1033)

* Fix async assertions in test files (rrweb-io#1510)

* fix: await assertSnapshot in test files for async assertions

* Fix maskInputFn is ignored during the creation of the full snapshot (rrweb-io#1386)

Fix that the optional `maskInputFn` was being accidentally ignored during the creation of the full snapshot

* Improve development tooling (rrweb-io#1516)

- Running `yarn build` in a `packages/*/` directory will trigger build of all dependencies too, and cache them if possible.
- Fix for `yarn dev` breaking for `rrweb` package whenever changing files in `rrweb` package
- Update typescript, turbo, vite and vite-plugin-dts
- Require `workspaces-to-typescript-project-references` from `prepublish`

* Version Packages (alpha) (rrweb-io#1513)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Keep all packages in sync

* feat: add new css parser - postcss (rrweb-io#1458)

* feat: add new css parser

* make selectors change

* selectors and tests

* media changes

* remove old css references

* better variable name

* use postcss and port tests

* fix media test

* inline plugins

* fix failing multiline selector

* correct test result

* move tests to correct file

* cleanup all tests

* remove unused css-tree

* update bundle

* cleanup dependencies

* revert config files to master

* remove d.ts files

* update snapshot

* reset rebuilt test

* apply fuzzy css matching

* remove extra test

* Fix imports

* Newer versions of nwsapi break rrdom-nodejs tests.
Example:
 FAIL  test/document-nodejs.test.ts > RRDocument for nodejs environment > RRDocument API > querySelectorAll
TypeError: e[api] is not a function
 ❯ byTag ../../node_modules/nwsapi/src/nwsapi.js:390:37
 ❯ Array.<anonymous> ../../node_modules/nwsapi/src/nwsapi.js:327:113
 ❯ collect ../../node_modules/nwsapi/src/nwsapi.js:1578:32
 ❯ Object._querySelectorAll [as select] ../../node_modules/nwsapi/src/nwsapi.js:1533:36
 ❯ RRDocument.querySelectorAll src/document-nodejs.ts:96:24

* Migrate from jest to vitest

* Order of selectors has changed with postcss

* Remove unused eslint

---------

Co-authored-by: Justin Halsall <[email protected]>

* fix: console assert only logs when arg 0 is falsy (rrweb-io#1530)

* fix: console assert only logs when arg 0 is falsy

* [Feature] Include takeFullSnapshot function in rrweb (rrweb-io#1527)

* export takeFullSnapshot function in rrweb

* chore: reduce flakey test due to '[vite] connected' message (rrweb-io#1525)

* fix: duplicate textContent for style element cause incremental style mutation invalid (rrweb-io#1417)

fix style element corner case
 - historically we have recorded duplicated css content in certain cases (demonstrated by the attached replayer test). This fix ensures that the replayer doesn't doubly add the content, which can cause problems when further mutations occur
---------
Review and further tests contributed by: Eoghan Murray <[email protected]>

* Added support for deprecated addRule & removeRule methods (rrweb-io#1515)

* Added support for deprecated addRule & removeRule methods

* Respect addRule default value

* fix: nested stylesheets should have absolute URLs (rrweb-io#1533)

* Replace relative URLs with absolute URLs when stringifying stylesheets

* Add test to show desired behavior for imported stylesheets from separate directory

* Rename `absoluteToStylesheet` to `absolutifyURLs` and call it once after stringifying imported stylesheet

* Don't create the intermediary array of the spread operator

* Formalize that `stringifyRule` should expect a sheet href

* Ensure a <style> element can also import and gets its url absolutized

* Handle case where non imported stylesheet has relative urls that need to be absolutified

* Clarify in test files where jpegs are expected to appear in absolutified urls

* Move absolutifyURLs call for import rules out of trycatch

* Add a benchmarking test for stringifyStylesheet

* Avoid the duplication on how to fall back

---------

Co-authored-by: Eoghan Murray <[email protected]>
Co-authored-by: eoghanmurray <[email protected]>

* Support top-layer <dialog> recording & replay (rrweb-io#1503)

* chore: it's important to run `yarn build:all` before running `yarn dev`

* feat: trigger showModal from rrdom and rrweb

* feat: Add support for replaying modal and non modal dialog elements

* chore: Update dev script to remove CLEAR_DIST_DIR flag

* Get modal recording and replay working

* DRY up dialog test and dedupe snapshot images

* feat: Refactor dialog test to use updated attribute name

* feat: Update dialog test to include rr_open attribute

* chore: Add npm dependency [email protected]

* Add more test cases for dialog

* Clean up naming

* Refactor dialog open code

* Revert changed code that doesn't do anything

* Add documentation for unimplemented type

* chore: Remove unnecessary comments in dialog.test.ts

* rename rr_open to rr_openMode

* Replace todo with a skipped test

* Add better logging for CI

* Rename rr_openMode to rr_open_mode

rrdom downcases all attribute names which made `rr_openMode` tricky to deal with

* Remove unused images

* Move after iframe append based on @YunFeng0817's comment
rrweb-io#1503 (comment)

* Remove redundant dialog handling from rrdom.

rrdom already handles dialog element creation itself

* Rename variables for dialog handling in rrweb replay module

* Update packages/rrdom/src/document.ts

---------

Co-authored-by: Eoghan Murray <[email protected]>

* Added session downloader for chrome extension (rrweb-io#1522)

* Added session downloader for chrome extension

- The session list now has a button to download sessions as .json files for use with rrweb-player
- Improved styling for the delete and download buttons

* Reverse monkey patch built in methods to support LWC (rrweb-io#1509)

* Get around monkey patched Nodes

* inlineImages: Setting of `image.crossOrigin` is not always necessary (rrweb-io#1468)

Setting of the `crossorigin` attribute is not necessary for same-origin images, and causes an immediate image reload (albeit from cache) necessitating the use of a load event listener which subsequently mutates the snapshot.  This change allows us to  avoid the mutation of the snapshot for the same-origin case.

* Modify inlineImages test to remove delay and show that we can inline images without mutation

* Add an explicit test for when the `image.crossOrigin = 'anonymous';` method is necessary.  Uses a combination of about:blank and our test server to simulate a cross-origin context

* Other test changes: there were some spurious rrweb mutations being generated by the addition of the crossorigin attribute that are now eliminated from the rrweb/__snapshots__/integration.test.ts.snap after this PR - this is good

* Move `childNodes` to @rrweb/utils

* Use non-monkey patched versions of the `childNodes`, `parentNode` `parentElement` `textContent` accessors

* Add getRootNode and contains, and add comprehensive todo list

* chore: Update turbo.json tasks for better build process

* Update caniuse-lite

* chore: Update eslint-plugin-compat to version 5.0.0

* chore: Bump @rrweb/utils version to 2.0.0-alpha.15

* delete unused yarn.lock files

* Set correct @rrweb/utils version in package.json

* Migrate over some accessors to reverse-monkey-patched version

* Add missing functions

* Fix illegal invocation error

* Revert closer to what it was.

This feels incorrect to me (Justin Halsall), but some of the tests break without it so I'm restoring this to be closer to its original here:
https://github.com/rrweb-io/rrweb/blame/cfd686d488a9b88dba6b6f8880b5e4375dd8062c/packages/rrweb-snapshot/src/snapshot.ts#L1011

* Reverse monkey patch all methods LWC hijacks

* Make tests more stable

* Safely handle rrdom nodes in hasShadowRoot

* Remove duplicated test

* Use variable `serverURL` in test

* Use monorepo default browserlist

* Fix typing issue for new typescript

* Remove unused package

* Remove unused code

* Add prefix to reverse-monkey-patched methods to make them more explicit

* Add default exports to @rrweb/utils

---------

Co-authored-by: Eoghan Murray <[email protected]>

* Version Packages (alpha) (rrweb-io#1526)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Single style capture (rrweb-io#1437)

* Simplify the hover replacement function (rrweb-io#1535)

Simplify the hover replacement function, which has been borrowed from postcss-pseudo-classes

Note: 'parses nested commas in selectors correctly' was failing after this PR, however I don't think that the previous behaviour was desirable, so have added a new test to formalize this expectation

* fix some typos in optimize-storage.md (rrweb-io#1565)

* fix some typos in optimize-storage.md

* Update docs/recipes/optimize-storage.md

* Create metal-mugs-mate.md

---------

Co-authored-by: Justin Halsall <[email protected]>

* fix(rrdom): Ignore invalid DOM attributes when diffing (rrweb-io#1561)

* fix(rrdom): Ignore invalid DOM attributes when diffing (rrweb-io#213)

We encountered an issue where replays with invalid attributes (e.g.
`@click`) would break rendering the replay after seeking. The exception
bubbles up to
[here](https://github.com/rrweb-io/rrweb/blob/62093d4385a09eb0980c2ac02d97eea5ce2882be/packages/rrweb/src/replay/index.ts#L270-L279),
which means the replay will continue to play, but the replay mirror will
be incomplete.

Closes https://github.com/getsentry/team-replay/issues/458

* add changeset

* fix(snapshot): dimensions for blocked element not being applied (rrweb-io#1331)

fix for replay of a blocked element when using 'fast forward' (rrdom)

 - Dimensions were not being properly applied when you seek to a position in the replay. Need to use `setProperty` rather than trying to set the width/height directly

* ref: isParentRemoved to cache subtree (rrweb-io#1543)

* ref: isParentRemoved to cache subtree
* ref: cache at insertion too
* ref: remove wrapper function

---------

Co-authored-by: Justin Halsall <[email protected]>

* changeset to 2.0.13

* fix snapshot build

---------

Co-authored-by: Eoghan Murray <[email protected]>
Co-authored-by: Justin Halsall <[email protected]>
Co-authored-by: Alexey Babik <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: David Newell <[email protected]>
Co-authored-by: Paul D'Ambra <[email protected]>
Co-authored-by: Christopher Arredondo <[email protected]>
Co-authored-by: Yun Feng <[email protected]>
Co-authored-by: minja malešević <[email protected]>
Co-authored-by: Jeff Nguyen <[email protected]>
Co-authored-by: eoghanmurray <[email protected]>
Co-authored-by: Arun Kunigiri <[email protected]>
Co-authored-by: Riadh Mouamnia <[email protected]>
Co-authored-by: Billy Vong <[email protected]>
Co-authored-by: Jonas <[email protected]>
Co-authored-by: Shay Malchi <[email protected]>
4 participants