Faster SSR #5701

Rich-Harris · 2020-11-20T04:20:45Z

Before submitting the PR, please make sure you do the following

It's really useful if your PR references an issue where it is discussed ahead of time. In many cases, features are absent for a reason. For large changes, please create an RFC: https://github.com/sveltejs/rfcs
This message body should clearly illustrate what problems it solves.
~~Ideally, include a test that fails without this PR but passes with it.~~

Inspired by ryansolid/dom-expressions#27, this replaces the current escape function, used in SSR to escape <, " and & characters, with a much faster version. It's not the same function as in that PR; it's a smaller and (in my measurements) faster one that skips ahead to the next escapable character rather than incrementing one character at a time.

It has quite a remarkable impact on performance. Using the marko-js/isomorphic-ui-benchmarks repo, we see the following results:

| Benchmark | 3.29.7 | This PR |
| --- | --- | --- |
| search-results | 2,496 ops/sec | 4,748 ops/sec (1.90x faster) |
| color-picker | 7,288 ops/sec | 22,196 ops/sec (3.05x faster) |

It no longer escapes > and ' characters because as far as I can tell that's not necessary. No tests needed to change.
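A minimal sketch of the skip-ahead technique described above, assuming a global regular expression whose lastIndex points just past each match. The names ESCAPABLE, ENTITIES and escapeHtml are illustrative assumptions, not the identifiers used in src/runtime/internal/ssr.ts.

```javascript
// Hypothetical names throughout; this is a sketch, not the merged source.
const ESCAPABLE = /[&"<]/g;
const ENTITIES = { '&': '&amp;', '"': '&quot;', '<': '&lt;' };

function escapeHtml(value) {
  const str = String(value);
  // A global regex keeps state between calls, so reset it defensively.
  ESCAPABLE.lastIndex = 0;

  let escaped = '';
  let last = 0;

  // Each successful test() jumps straight to the next escapable character.
  while (ESCAPABLE.test(str)) {
    const i = ESCAPABLE.lastIndex - 1;
    // Copy the untouched run in one go, then append the entity.
    escaped += str.substring(last, i) + ENTITIES[str[i]];
    last = i + 1;
  }

  // Append whatever follows the final match (the whole string if none matched).
  return escaped + str.substring(last);
}

console.log(escapeHtml('a < b & "c"'));
```

Because > and ' are not in the character class, they pass through unchanged, which matches the behaviour described in the PR body.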

Tests

Run the tests with npm test and lint the project with npm run lint

src/runtime/internal/ssr.ts

Conduitry · 2020-11-20T13:47:00Z

The build is currently failing, apparently because of a TypeScript type mismatch.

src/runtime/internal/ssr.ts

pushkine · 2020-11-20T14:16:09Z

The argument is server side, but is there a reason to use numbers instead of booleans anywhere at all?

I doubt the difference of one single easily gzip-able ascii character from !0 to 1 justifies the runtime cost of initializing a 32bit number, and evaluating truthiness is probably magnitudes faster using booleans

The cost is so tiny it shouldn't warrant an argument, yet it also impedes readability in the codebase, hence my question

ehrencrona · 2020-11-20T15:12:22Z

src/runtime/internal/ssr.ts

+
+	while (pattern.test(html)) {
+		const i = pattern.lastIndex - 1;
+		escaped += html.slice(last, i) + escapes[html[i]];


For what it's worth, replacing escapes[html[i]] with const ch = html[i] and then (ch === '&' ? ch === '"' ? '&quot;' : '&amp;' : '&lt;') seems to shave another 10% off the execution time when I test it.

But I'm surprised the regexp performs so badly; interestingly enough, it's the function as second parameter that's the slowdown. html.replace(/&/g, '&amp;').replace(/</g, '&lt;') is pretty much exactly the same speed as the new code.

Rich-Harris · 2020-11-20

Wow. Maybe we should just do that instead then? How are you getting those numbers, and can you share them?

ehrencrona · 2020-11-20

I just took a 500 kb HTML document and ran

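const { readFileSync } = require('fs');
const { performance } = require('perf_hooks');

const html = readFileSync('bigdocument.html').toString();
const start = performance.now();
for (let i = 0; i < 2000; i++) {
  escape(html); // the escape implementation being measured
}
console.log(performance.now() - start, 'ms');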

Our documents of course tend to be smaller so there is a risk this doesn't correspond to real-life performance.

ehrencrona · 2020-11-20

The .replace.replace version runs in 3.5s and the code in the PR with escapes[html[i]] in 3.6s. Replacing the lookup with the ternary operator brings it to about 3.2s, while the original code, with a function as the second parameter to replace, takes 14.5s.
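The two replace-based shapes compared in these timings can be sketched side by side. This is only an illustration of the difference being measured: the function names are hypothetical, and both are limited to & and < to mirror the chained expression quoted in the thread.

```javascript
// Hypothetical names; a sketch of the two shapes, not the Svelte source.
const ENTITIES = { '&': '&amp;', '<': '&lt;' };

// One regex plus a callback as the second argument to replace().
// The callback is invoked once per match.
function escapeWithCallback(html) {
  return html.replace(/[&<]/g, (ch) => ENTITIES[ch]);
}

// Two passes with plain string replacements and no callback.
// '&' has to go first, otherwise the '&' in '&lt;' would be escaped again.
function escapeChained(html) {
  return html.replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

const sample = '<a href="x">Tom & Jerry</a>';
console.log(escapeWithCallback(sample) === escapeChained(sample));
```

Both produce the same output, so any timing gap between them comes from how the replacement is supplied rather than from what is replaced.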

joshgoebel · 2020-11-23

In my testing a switch on character code (charCodeAt) is WAY faster than a simple object lookup. Don't ask me why. For example (from another implementation):

while (match = matchHtmlRegExp.exec(str)) {
  switch (str.charCodeAt(match.index)) {
    case 34: // "
      escape = '&quot;'
      break
    // ...

I forget but it was something like 240 ops/sec vs 280 ops/sec (on a large file). I just submitted a 30% speed improvement patch to replace-html library.

benmccann · 2022-05-06

> (ch === '&' ? ch === '"' ? '&quot;' : '&amp;' : '&lt;')

I believe this is broken and it needs to be (ch === '&' ? '&amp;' : (ch === '"' ? '&quot;' : '&lt;')). It does appear faster than the original version in this PR.
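To see why the first ternary misbehaves, it helps to write out how JavaScript groups it. The sketch below uses two hypothetical helper names, asWritten and corrected, and adds explicit parentheses to the first form to show the grouping the parser already applies.

```javascript
// Hypothetical helper names, for illustration only.

// a ? b ? c : d : e parses as a ? (b ? c : d) : e, so the '"' test
// only runs when ch is already known to be '&' and can never succeed.
const asWritten = (ch) => (ch === '&' ? (ch === '"' ? '&quot;' : '&amp;') : '&lt;');

// The corrected form checks '&' first, then '"', and falls back to '<'.
const corrected = (ch) => (ch === '&' ? '&amp;' : (ch === '"' ? '&quot;' : '&lt;'));

for (const ch of ['&', '"', '<']) {
  console.log(JSON.stringify(ch), asWritten(ch), corrected(ch));
}
```

The two agree for & and <, but the first form returns the entity for < when given a double quote.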

benmccann · 2022-05-06

The switch method is slower than both the ternary and the original PR implementation in my testing with @ehrencrona's benchmark above.

joshgoebel · 2022-05-06

> The switch method is slower than both the ternary and the original PR implementation in my testing

Another thing we have to keep in mind here is that almost 2 years have passed. It's very hard IMHO to micro-tune JS performance (over the long-term). Engines vary, browsers vary, things change with time. The thing that was fastest 2 years ago might bench worse today. The regex engines used in the major browsers are not all created equal.

If we're talking benchmarks, we should be clear about which browser and which engine. In the Svelte context perhaps we only care about Node? Just a consideration.

benmccann · 2022-05-06

Yes, I tested on Node. This is for server-side code, so no need to test in browsers.

Thanks for sharing your findings! It may well have been different in the past like you said and regardless it was valuable to test various ideas and ensure we're doing as well as possible.

joshgoebel · 2022-05-07

Oh for sure... we can certainly chase the peak performance (and probably should)... but we just may have to do so again in another 2 years - and make sure we're always comparing apples to apples. :). If we only need to test on Node that certainly helps a lot. Though it wouldn't surprise me to learn that there were, say, differences between Node on ARM and Node on x86_64...
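One way to keep such comparisons apples to apples is to record the runtime alongside every timing. The following is a rough sketch for Node only; bench, its warm-up count and the sample input are assumptions made for illustration, and the numbers it prints are only meaningful relative to other runs on the same machine.

```javascript
// Hypothetical helper; Node-only because it relies on the global `process`.
function bench(label, fn, input, iterations = 1000) {
  const now = () => Number(process.hrtime.bigint()) / 1e6; // milliseconds

  // Short warm-up so the JIT has a chance to optimise before timing starts.
  for (let i = 0; i < 50; i++) fn(input);

  const start = now();
  for (let i = 0; i < iterations; i++) fn(input);
  const ms = now() - start;

  // Report the engine version and CPU architecture with the result,
  // since both can change which variant wins.
  return { label, ms, node: process.version, arch: process.arch, platform: process.platform };
}

const sample = '<p class="x">a & b</p>'.repeat(100);
const result = bench(
  'chained replace',
  (s) => s.replace(/&/g, '&amp;').replace(/</g, '&lt;'),
  sample
);
console.log(result);
```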

Rich-Harris · 2020-11-20T15:36:59Z

> I doubt the difference of one single easily gzip-able ascii character from !0 to 1 justifies the runtime cost of initializing a 32bit number, and evaluating truthiness is probably magnitudes faster using booleans

According to an untrustworthy benchmark I just cobbled together...

iterations = 1e7;

function number() {
  console.time('number');
  let total = 0;
  let i = iterations;
  while (i--) {
    const n = i % 2 ? 0 : 1;
    if (n) {
      total += 1;
    }
  }
  console.timeEnd('number');

  return total;
}

function boolean() {
  console.time('boolean');
  let total = 0;
  let i = iterations;
  while (i--) {
    const b = i % 2 ? false : true;
    if (b) {
      total += 1;
    }
  }
  console.timeEnd('boolean');

  return total;
}

...the boolean is indeed reliably faster. TIL! My assumption had always been that we'd still be better off, since minifiers typically rewrite true and false as !0 and !1, meaning that you pay for the extra byte and for the coercion, but it turns out not to be the case:

function coerced_number() {
  console.time('coerced_number');
  let total = 0;
  let i = iterations;
  while (i--) {
    const n = i % 2 ? !1 : !0;
    if (n) {
      total += 1;
    }
  }
  console.timeEnd('coerced_number');

  return total;
}

For whatever reason, coerced_number takes about the same time as boolean, and both are reliably faster than number. Weird. So yeah, I guess we probably should use booleans everywhere.

lukeed · 2020-11-20T19:01:03Z

Few minor things:

Boolean coercion is expensive (!0 <<< true)
The original escape_chars dict should probably be created with Object.create(null), with its keys added afterwards. Key lookups are cheap but not free. This falls in line with @ehrencrona's findings
If this approach is to stay, then substring is typically faster than slice
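The second and third suggestions can be illustrated with a short sketch. It shows what a prototype-less lookup table looks like and where substring and slice agree or differ; it makes no performance claim of its own, and plainTable and bareTable are hypothetical names.

```javascript
// Hypothetical names; illustrates behaviour only, not relative speed.

// An object literal inherits from Object.prototype, so lookups for
// unknown keys can fall through to inherited properties.
const plainTable = { '&': '&amp;', '"': '&quot;', '<': '&lt;' };

// Object.create(null) has no prototype chain to walk.
const bareTable = Object.create(null);
bareTable['&'] = '&amp;';
bareTable['"'] = '&quot;';
bareTable['<'] = '&lt;';

console.log(typeof plainTable['constructor']); // inherited from Object.prototype
console.log(typeof bareTable['constructor']);  // nothing to inherit

// For non-negative, in-range indices substring and slice return the same text.
const text = 'a < b';
console.log(text.substring(0, 2) === text.slice(0, 2));

// They differ on negative indices: slice counts from the end,
// substring treats negatives as 0.
console.log('abc'.slice(-1), 'abc'.substring(-1));
```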

stale · 2021-12-07T00:12:42Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

src/runtime/internal/ssr.ts

benmccann

I've updated this PR and addressed all the comments. LGTM now

mrkishi · 2022-05-06T18:27:27Z

It seems Svelte doesn't escape invalid surrogate pairs — I wonder if this should go in this PR?

To be clear, it seems niche enough that it was never an issue, so the alternative could be to just not deal with it.
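For context, a lone (unpaired) surrogate is a UTF-16 code unit in the range U+D800 to U+DFFF that is not part of a valid high/low pair. The sketch below only detects such code units; it is an illustration of the term, not something this PR implements, and hasLoneSurrogate is a hypothetical name.

```javascript
// Hypothetical helper. Without the `u` flag the regex works on UTF-16
// code units, which is what is needed to spot unpaired halves.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/;

function hasLoneSurrogate(value) {
  return LONE_SURROGATE.test(String(value));
}

console.log(hasLoneSurrogate('\uD83D\uDE00')); // a valid pair
console.log(hasLoneSurrogate('\uD83D'));       // high half with no low half
console.log(hasLoneSurrogate('a\uDE00'));      // low half with no high half
```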

benmccann · 2022-05-06T18:32:39Z

I'm not entirely sure I understand the issue, but it sounds outside the scope of this PR

src/runtime/internal/ssr.ts

stalkerg · 2022-05-27T16:37:46Z

Interesting, is it possible to use indexOf instead of RegExp here? Let's see...

intrnl · 2022-07-24T04:18:14Z

This is a very late comment but wouldn't this be even faster if regular expressions are skipped entirely?

https://jsbench.me/g0l5yr4wge/2

EDIT:
I've done more benchmarking on this.
https://gist.github.com/intrnl/07816a7669d0ad4620390bd69e59b9d4

bluwy · 2022-07-25T03:44:13Z

@intrnl Looks like escape_no_re works better for shorter strings, while escape_new is better for longer strings. I guess we'd want to cover cases for longer strings more so escape_new could still be good, though I'm not certain how this would play out in practice.
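As a rough idea of what a regex-free variant can look like, here is a sketch that walks the string by character code. It is not the code from the linked gist or benchmark; escapeNoRegex is a hypothetical name and the implementation is an assumption about the general shape of such an approach.

```javascript
// Hypothetical regex-free variant; not the code from the linked gist.
function escapeNoRegex(value) {
  const str = String(value);
  let out = '';
  let last = 0;

  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    let entity;
    if (code === 38) entity = '&amp;';        // &
    else if (code === 60) entity = '&lt;';    // <
    else if (code === 34) entity = '&quot;';  // "
    else continue;

    // Flush the untouched run before the match, then the entity.
    out += str.substring(last, i) + entity;
    last = i + 1;
  }

  // Nothing matched: return the input without building a new string.
  return last === 0 ? str : out + str.substring(last);
}

console.log(escapeNoRegex('x < y'));
```

A loop like this visits every character, so its cost grows with string length even when nothing needs escaping, which is one plausible reason the trade-off could shift between short and long inputs.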

stalkerg · 2022-07-26T09:38:18Z

@intrnl @bluwy I suppose it will be better to use indexOf instead of iterating over the string.

benmccann · 2022-07-27T04:50:10Z

Maybe the best is to choose a method based on string length? Though I imagine that could vary based on hardware

intrnl · 2022-07-27T05:03:35Z

> @intrnl I suppose it will be better to use indexOf instead of iterating over the string.

I'm not sure about the indexOf approach, as that would mean reiterating through the entire string twice (& and either < or "), will have to see.
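The concern about repeated scans is easier to see in code. Below is a minimal sketch of an indexOf-based variant, restricted to & and < for brevity; escapeWithIndexOf is a hypothetical name and this is only one of several ways the idea could be written.

```javascript
// Hypothetical indexOf-based variant, limited to '&' and '<'.
function escapeWithIndexOf(value) {
  const str = String(value);
  let out = '';
  let last = 0;

  // One independent search per character to escape. If a character never
  // occurs, its indexOf call still walks the whole string once.
  let amp = str.indexOf('&');
  let lt = str.indexOf('<');

  while (amp !== -1 || lt !== -1) {
    // Handle whichever match comes first in the string.
    const useAmp = amp !== -1 && (lt === -1 || amp < lt);
    const i = useAmp ? amp : lt;

    out += str.substring(last, i) + (useAmp ? '&amp;' : '&lt;');
    last = i + 1;

    // Only the search that was just consumed needs to advance.
    if (useAmp) amp = str.indexOf('&', last);
    else lt = str.indexOf('<', last);
  }

  return out + str.substring(last);
}

console.log(escapeWithIndexOf('a<b&c<'));
```

Adding " as a third character would add a third independent search, which is the extra scanning the comment above is worried about.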

I brought this up again mainly because I'm rather interested in seeing what people commonly interpolate: is it short? Is it long? Does it have a ton of instances of &<"? I do agree that this might be something that's not worth fussing a lot over though, as this is too situational 😅

Rich-Harris added 2 commits November 19, 2020 20:30

faster SSR, modelled on ryansolid/dom-expressions#27

aa15e9a

simplify

e58d658

benmccann reviewed Nov 20, 2020

remove redundant parens

57bb045

benmccann reviewed Nov 20, 2020

benmccann reviewed Nov 20, 2020

boolean argument

29d4695

Conduitry reviewed Nov 20, 2020

ehrencrona reviewed Nov 20, 2020

Update src/runtime/internal/ssr.ts

3a94f2a

Co-authored-by: Conduitry <[email protected]>

benmccann added the perf label May 10, 2021

stale bot added the stale-bot label Dec 7, 2021

benmccann removed the stale-bot label Dec 7, 2021

benmccann added 2 commits May 6, 2022 09:30

merge latest changes from master

bc4bda9

fix merge

30df740

benmccann reviewed May 6, 2022

benmccann added 3 commits May 6, 2022 10:59

use ternary

88e2146

slice -> substring

1b0e836

add comment

61e180b

benmccann approved these changes May 6, 2022

mrkishi reviewed May 6, 2022

mrkishi suggested changes May 6, 2022

mrkishi reviewed May 6, 2022

mrkishi approved these changes May 6, 2022

benmccann force-pushed the faster-ssr branch from f09c22c to 61e180b Compare May 6, 2022 22:58

restore String conversion

b32049b

benmccann merged commit e7a2350 into master May 13, 2022

mrkishi mentioned this pull request May 13, 2022

[fix] harden attribute escaping during ssr #7530

Merged

Conduitry deleted the faster-ssr branch May 27, 2022 16:47

cgranttm mentioned this pull request Jul 23, 2022

HTML escaper performance could be improved by adopting the same html escaping code as Svelte and Solid JS angular/angular#46950

Closed

baseballyama mentioned this pull request Mar 30, 2023

fix: escape <textarea value={...}> attribute properly #8434

Merged

Uzume mentioned this pull request Sep 15, 2024

fix: escape < in attribute strings #12989

Merged

ehrencrona Nov 20, 2020
Rich-Harris Nov 20, 2020
ehrencrona Nov 20, 2020
ehrencrona Nov 20, 2020
joshgoebel Nov 23, 2020
benmccann May 6, 2022
benmccann May 6, 2022
joshgoebel May 6, 2022
benmccann May 6, 2022
joshgoebel May 7, 2022
