From e89d0bc43c72c5e6305fbde4a5c3b98dfb426711 Mon Sep 17 00:00:00 2001 From: LinGuang Date: Wed, 31 Jul 2019 22:23:02 +0800 Subject: [PATCH] =?UTF-8?q?postMessage=20=E5=BE=88=E6=85=A2=E5=90=97=3F=20?= =?UTF-8?q?(#6224)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * postMessage 很慢吗? * postMessage 很慢吗? * postMessage 很慢吗? 第一次校对 * postMessage 很慢吗? 第二次校对 * postMessage 很慢吗? 第三次校对 * postMessage 很慢吗? 第四次校对 * Update is-postmessage-slow.md * Update is-postmessage-slow.md --- TODO1/is-postmessage-slow.md | 160 +++++++++++++++++------------------ 1 file changed, 80 insertions(+), 80 deletions(-) diff --git a/TODO1/is-postmessage-slow.md b/TODO1/is-postmessage-slow.md index 62d5f0cb500..2774acc6966 100644 --- a/TODO1/is-postmessage-slow.md +++ b/TODO1/is-postmessage-slow.md @@ -2,51 +2,51 @@ > * 原文作者:[Surma](https://dassur.ma) > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO1/is-postmessage-slow.md](https://github.com/xitu/gold-miner/blob/master/TODO1/is-postmessage-slow.md) -> * 译者: -> * 校对者: +> * 译者:[linxiaowu66](https://github.com/linxiaowu66) +> * 校对者:[MarchYuanx](https://github.com/MarchYuanx), [TiaossuP](https://github.com/TiaossuP) -# Is postMessage slow? +# postMessage 很慢吗? -No, not really. (It depends.) +不,不一定(视情况而定) -What does “slow” mean? [I said it before](/things/less-snakeoil/), and I will say it again: If you didn’t measure it, it is not slow, and even if you measure it, the numbers are meaningless without context. +这里的“慢”是什么意思呢?[我之前在这里提及过](https://dassur.ma/things/less-snakeoil/),在这里再说一遍:如果你不度量它,它并不慢,即使你度量它,但是没有上下文,数字也是没有意义的。 -That being said, the fact that people will not even consider adopting [Web Workers](https://developer.mozilla.org/en-US/docs/Web/API/Worker) because of their concerns about the performance of `postMessage()`, means that this is worth investigating. [My last blog post](/things/when-workers/) on workers got [responses](https://twitter.com/dfabu/status/1139567716052930561) along these lines, too. Let’s put actual numbers to the performance of `postMessage()` and see at what point you risk blowing your budgets. What can you do if vanilla `postMessage()` is too slow for your use-case? +话虽如此,人们甚至不会考虑采用 [Web Workers](https://developer.mozilla.org/en-US/docs/Web/API/Worker),因为他们担心 `postMessage()` 的性能,这意味着这是值得研究的。[我的上一篇博客文章](https://dassur.ma/things/when-workers/)也得到了类似的[回复](https://twitter.com/dfabu/status/1139567716052930561)。让我们将实际的数字放在 `postMessage()` 的性能上,看看你会在什么时候冒着超出承受能力的风险。如果连普通的 `postMessage()` 在你的使用场景下都太慢,那么你还可以做什么呢? -Ready? Go. +准备好了吗?继续往下阅读吧。 -## How postMessage works +## postMessage 是怎么工作的? -Before we start measuring, we need to understand **what** `postMessage()` is and which part of it we want to measure. Otherwise, [we’ll end up gathering meaningless data](/things/deep-copy/) and drawing meaningless conclusions. +在开始度量之前,我们需要了解**什么是** `postMessage()`,以及我们想度量它的哪一部分。否则,[我们最终将收集无意义的数据](https://dassur.ma/things/deep-copy/)并得出无意义的结论。 -`postMessage()` is part of the [HTML spec](https://html.spec.whatwg.org/multipage/) (not [ECMA-262](http://www.ecma-international.org/ecma-262/10.0/index.html#Title)!). As I mentioned in my [deep-copy post](/things/deep-copy/), `postMessage()` relies on structured cloning to copy the message from one JavaScript realm to another. Taking a closer look at [the specification of `postMessage()`](https://html.spec.whatwg.org/multipage/web-messaging.html#message-port-post-message-steps), it turns out that structured cloning is a two-step process: +`postMessage()` 是 [HTML规范](https://html.spec.whatwg.org/multipage/) 的一部分(而不是 [ECMA-262](http://www.ecma-international.org/ecma-262/10.0/index.html#Title)!)正如我在 [deep-copy 一文](https://dassur.ma/things/deep-copy/)中提到的,`postMessage()` 依赖于结构化克隆数据,将消息从一个 JavaScript 空间复制到另一个 JavaScript 空间。仔细研究一下 [`postMessage()` 的规范](https://html.spec.whatwg.org/multipage/webmessaging.html#message-port-post-message-steps),就会发现结构化克隆是一个分两步的过程: -### Structured Clone algorithm +### 结构化克隆算法 -1. Run `StructuredSerialize()` on the message -2. Queue a task in the receiving realm, that will execute the following steps: - 1. Run `StructuredDeserialize()` on the serialized message - 2. Create a `MessageEvent` and dispatch a `MessageEvent` with the deserialized message on the receiving port +1. 在消息上执行 `StructuredSerialize()` +2. 在接收方中任务队列中加入一个任务,该任务将执行以下步骤: + 1. 在序列化的消息上执行 `StructuredDeserialize()` + 2. 创建一个 `MessageEvent` 并派发一个带有该反序列化消息的 `MessageEvent` 事件到接收端口上 -This is a simplified version of the algorithm so we can focus on the parts that matter for this blog post. It’s **technically** incorrect, but catches the spirit, if you will. For example, `StructuredSerialize()` and `StructuredDeserialize()` are not real functions in the sense that they are not exposed via JavaScript ([yet](https://github.com/whatwg/html/pull/3414)). What do these two functions actually do? For now, **you can think of `StructuredSerialize()` and `StructuredDeserialize()` as smarter versions of `JSON.stringify()` and `JSON.parse()`**, respectively. They are smarter in the sense that they handle cyclical data structures, built-in data types like `Map`, `Set` and `ArrayBuffer` etc. But do these smarts come at a cost? We’ll get back to that later. +这是算法的一个简化版本,因此我们可以关注这篇博客文章中重要的部分。虽然这在**技术上**是不正确的,但它却抓住了精髓。例如,`StructuredSerialize()` 和 `StructuredDeserialize()` 在实际场景中并不是真正的函数,因为它们不是通过 JavaScript([不过有一个 HTML 提案打算将它们暴露出去](https://github.com/whatwg/html/pull/3414))暴露出去的。那这两个函数实际上是做什么的呢?现在,**你可以将 `StructuredSerialize()` 和 `StructuredDeserialize()` 视为 `JSON.stringify()` 和 `JSON.parse()` 的智能版本**。从处理循环数据结构、内置数据类型(如 `Map`、`Set`和`ArrayBuffer`)等方面来说,它们更聪明。但是,这些聪明是有代价的吗?我们稍后再讨论这个问题。 -Something that the algorithm above doesn’t spell out explicitly is the fact that **serialization blocks the sending realm, while deserialization blocks the receiving realm.** And there’s more: It turns out that both Chrome and Safari defer running `StructuredDeserialize()` until you actually access the `.data` property on the `MessageEvent`. Firefox on the other hand deserializes before dispatching the event. +上面的算法没有明确说明的是,**序列化会阻塞发送方,而反序列化会阻塞接收方。** 另外还有:Chrome 和 Safari 都推迟了运行 `StructuredDeserialize()`,直到你实际访问了 `MessageEvent` 上的 `.data` 属性。另一方面,Firefox 在派发事件之前会反序列化。 -> **Note:** Both of these behaviors **are** spec-compatible and perfectly valid. [I opened a bug with Mozilla](https://bugzilla.mozilla.org/show_bug.cgi?id=1564880), asking if they are willing to align their implementation, as it puts the developer in control when to take the “performance hit” of deserializing big payloads. +> **注意:** 这两个行为**都是**兼容规范的,并且完全有效。[我在 Mozilla 上提了一个bug](https://bugzilla.mozilla.org/show_bug.cgi?id=1564880),询问他们是否愿意调整他们的实现,因为这可以让开发人员去控制什么时候应该受到反序列化大负载的“性能冲击”。 -With that in mind, we have to make a choice **what** to benchmark: We could measure end-to-end, so measuring how much time it takes to send a message from a worker to the main thread. However, that number would capture the sum of serialization and deserialization, each of which are happening in different realms. Remember: **This whole spiel with workers is motivated by wanting to keep the main thread free and responsive.** Alternatively, we could limit the benchmarks to Chrome and Safari and measure how long it takes to access the `.data` property to measure `StructuredDeserialize()` in isolation, which would exclude Firefox from the benchmark. I also haven’t found a way to measure `StructuredSerialize()` in isolation, short of running a trace. Neither of these choices are ideal, but in the spirit of building resilient web apps, **I decided to run the end-to-end benchmark to provide an **upper bound** for the cost of `postMessage()`.** +考虑到这一点,我们必须选择对**什么**来进行基准测试:我们可以端到端进行度量,所以可以度量一个 worker 发送消息到主线程所花费的时间。然而,这个数字将捕获序列化和反序列化的时间总和,但是它们却分别发生在不同的空间下。记住:**与 worker 的整个通信的都是主动的,这是为了保持主线程自由和响应性。** 或者,我们可以将基准测试限制在 Chrome 和 Safari 上,并单独测量从 `StructuredDeserialize()` 到访问 `.data` 属性的时间,这个需要把 Firefox 排除在基准测试之外。我还没有找到一种方法来单独测量 `StructuredSerialize()`,除非运行的时候调试跟踪代码。这两种选择都不理想,但本着构建弹性 web 应用程序的精神,**我决定运行端到端基准测试,为 `postMessage()` 提供一个上限。** -Armed with a conceptual understanding of `postMessage()` and the determination to measure, I shall use microbenchmarks. Please mind the gap between these numbers and reality. +有了对 `postMessage()` 的概念理解和评测的决心,我将使用 ☠️ 微基准 ☠️。请注意这些数字与现实之间的差距。 -## Benchmark 1: How long does sending a message take? +## 基准测试 1:发送一条消息需要花费多少时间? ![Two JSON objects showing depth and breadth](https://dassur.ma/things/is-postmessage-slow/breadth-depth.svg) -Depth and breadth vary between 1 and 6. For each permutation, 1000 objects will be generated. +深度和宽度在 1 到 6 之间变化。对于每个置换,将生成 1000 个对象。 -The benchmark will generate an object with a specific “breadth” and “depth”. The values for breadth and depth lie between 1 and 6. **For each combination of breadth and depth, 1000 unique objects will be `postMessage()`’d from a worker to the main thread**. The property names of these objects are random 16-digit hexadecimal numbers as a string, the values are either a random boolean, a random float or again a random string in the from of a 16-digit hexadecimal number. **The benchmark will measure the transfer time and calculate the 95th percentile.** +基准将生成具有特定“宽度”和“深度”的对象。宽度和深度的值介于 1 和 6 之间。**对于宽度和深度的每个组合,1000 个唯一的对象将从一个 worker `postMessage()` 到主线程**。这些对象的属性名都是随机的 16 位十六进制数字符串,这些值要么是一个随机布尔值,要么是一个随机浮点数,或者是一个来自 16 位十六进制数的随机字符串。**基准测试将测量传输时间并计算第 95 个百分位数。** -### Results +### 测量结果 ![](https://dassur.ma/things/is-postmessage-slow/nokia2-chrome.svg) @@ -58,46 +58,46 @@ The benchmark will generate an object with a specific “breadth” and “depth ![](https://dassur.ma/things/is-postmessage-slow/macbook-safari.svg) -The benchmark was run on Firefox, Safari and Chrome on a 2018 MacBook Pro, on Chrome on a Pixel 3XL and on Chrome on a Nokia 2. +这一基准测试是在 2018 款的 MacBook Pro上的 Firefox、 Safari、和 Chrome 上运行,在 Pixel 3XL 上的 Chrome 上运行,在 诺基亚 2 上的 Chrome 上运行。 -> **Note:** You can find the benchmark data, to code to generate it and the code for the visualization in [this gist](https://gist.github.com/surma/08923b78c42fab88065461f9f507ee96). Also, this was the first time in my life writing Python. Don’t be too harsh on me. +> **注意:** 你可以在 [gist](https://gist.github.com/surma/08923b78c42fab88065461f9f507ee96) 中找到基准数据、生成基准数据的代码和可视化代码。而且,这是我人生中第一次编写 Python。别对我太苛刻。 -The benchmark data from the Pixel 3 and especially Safari might look a bit suspicious to you. When Spectre & Meltdown were discovered, all browsers disabled [`SharedArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) and reduced the precision of timers like [`performance.now()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/now), which I use to measure. Only Chrome was able to revert some these changes since they shipped [Site Isolation](https://www.chromium.org/Home/chromium-security/site-isolation) to Chrome on desktop. More concretely that means that browsers clamp the precision of `performance.now()` to the following values: +Pixel 3 的基准测试数据,尤其是 Safari 的数据,对你来说可能有点可疑。当 [Spectre & Meltdown](https://zhuanlan.zhihu.com/p/32784852) 被发现的时候,所有的浏览器会禁用 [SharedArrayBuffer](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) 并将我要测量使用的 [performance.now()](https://developer.mozilla.org/en-US/docs/Web/API/Performance/now) 函数实行计时器的精度减少。只有 Chrome 能够还原这些更改,因为它们将[站点隔离](https://www.chromium.org/home/chromisecurity/site-isolation)发布到 Chrome 桌面版。更具体地说,这意味着浏览器将 `performance.now()` 的精度限制在以下值上: -* Chrome (desktop): 5µs -* Chrome (Android): 100µs -* Firefox (desktop): 1ms (clamping can be disabled flag, which I did) -* Safari (desktop): 1ms +* Chrome(桌面版):5µs +* Chrome(安卓系统):100µs +* Firefox(桌面版):1ms(该限制可以禁用掉,我就是禁用掉的) +* Safari(桌面版):1ms -The data shows that the complexity of the object is a strong factor in how long it takes to serialize and deserialize an object. This should not be surprising: Both the serialization and deserialization process have to traverse the entire object one way or another. The data indicates furthermore that the size of the JSON representation of an object is a good predictor for how long it takes to transfer that object. +数据显示,对象的复杂性是决定对象序列化和反序列化所需时间的重要因素。这并不奇怪:序列化和反序列化过程都必须以某种方式遍历整个对象。数据还表明,对象 JSON 化后的大小可以很好地预测传输该对象所需的时间。 -## Benchmark 2: What makes postMessage slow? +## 基准测试 2:什么导致 postMessage 变慢了? -To verify, I modified the benchmark: I generate all permutations of breadth and depth between 1 and 6, but in addition all leaf properties have a string value with a length between 16 bytes and 2KiB. +为了验证这个,我修改了基准测试:我生成了宽度和深度在 1 到 6 之间的所有排列,但除此之外,所有叶子属性都有一个长度在 16 字节到 2 KiB 之间的字符串值。 -### Results +### 测试结果 ![A graph showing the correlation between payload size and transfer time for postMessage](https://dassur.ma/things/is-postmessage-slow/correlation.svg) -Transfer time has a strong correlation with the length of the string returned by `JSON.stringify()`. +传输时间与 `JSON.stringify()` 返回的字符串长度有很强的相关性。 -I think the correlation is strong enough to issue a rule of thumb: **The stringified JSON representation of an object is roughly proportional to its transfer time.** However, even more important to note is the fact that **this correlation only becomes relevant for big objects**, and by big I mean anything over 100KiB. While the correlation holds mathematically, the divergence is much more visible at smaller payloads. +我认为这种相关性足够强,可以给出一个经验法则:**对象的 JSON 字符串化后的大小大致与它的传输时间成正比。** 然而,更需要注意的事实是,**这种相关性只与大对象相关**,我说的大是指超过 100 KiB 的任何对象。虽然这种相关性在数学上是成立的,但在较小的有效载荷下,这种差异更为明显(译者注:怀疑这句话作者应该是写错了,应该表述为差异不明显)。 -## Evaluation: It’s about sending a message +## 评估:发送一条信息 -We have data, but it’s meaningless if we don’t contextualize it. If we want to draw **meaningful** conclusions, we need to define “slow”. Budgets are a helpful tool here, and I will once again go back to the [RAIL](https://developers.google.com/web/fundamentals/performance/rail) guidelines to establish our budgets. +我们有数据,但如果我们不把它上下文化,它就没有意义。如果我们想得出**有意义的**结论,我们需要定义“慢”。预算在这里是一个有用的工具,我将再次回到 [RAIL](https://developer.google.com/web/fundamentals/performance/rail) 指南来确定我们的预期。 -In my experience, a web worker’s core responsibility is, at the very least, managing your app’s state object. State often only changes when the user interacts with your app. According to RAIL, we have 100ms to react to user interactions, which means that **even on the slowest devices, you can `postMessage()` objects up to 100KiB and stay within your budget.** +根据我的经验,一个 web worker 的核心职责至少是管理应用程序的状态对象。状态通常只在用户与你的应用程序交互时才会发生变化。根据 RAIL 的说法,我们有 100 ms 来响应用户交互,这意味着**即使在最慢的设备上,你也可以 `postMessage()` 高达 100 KiB 的对象,并保持在你的预期之内。** -This changes when you have JS-driven animations running. The RAIL budget for animations is 16ms, as the visuals need to get updated every frame. If we send a message from the worker that would block the main thread for longer than that, we are in trouble. Looking at the numbers from our benchmarks, everything up to 10KiB will not pose a risk to your animation budget. That being said, **this is a strong reason to prefer CSS animations and transitions over main-thread JS-driven animations.** CSS animations and transitions runs on a separate thread — the compositor thread — and are not affected by a blocked main thread. +当运行 JS 驱动的动画时,这种情况会发生变化。动画的 RAIL 预算是 16 ms,因为每一帧的视觉效果都需要更新。如果我们从 worker 那里发送一条消息,该消息会阻塞主线程的时间超过这个时间,那么我们就有麻烦了。从我们的基准数据来看,任何超过 10 KiB 的动画都不会对你的动画预算构成风险。也就是说,**这就是我们更喜欢用 CSS animation 和 transition 而不是 JS 驱动主线程绘制动画的一个重要原因。** CSS animation 和 transition 运行在一个单独的线程 - 合成线程 - 不受阻塞的主线程的影响。 -## Must. send. moar. data. +## 必须发送更多的数据 -In my experience, `postMessage()` is not the bottleneck for most apps that are adopting an off-main-thread architecture. I will admit, however, that there might be setups where your messages are either really big or you need to send a lot of them at a high frequency. What can you do if vanilla `postMessage()` is too slow for you? +以我的经验,对于大多数采用非主线程架构的应用程序来说,`postMessage()` 并不是瓶颈。不过,我承认,在某些设置中,你的消息可能非常大,或者需要以很高的频率发送大量消息。如果普通 `postMessage()` 对你来说太慢的话,你还可以做什么? -### Patching +### 打补丁 -In the case of state objects, the objects themselves can be quite big, but it’s often only a handful of deeply nested properties that change. We encountered this problem in [PROXX](https://proxx.app), our PWA Minesweeper clone: The game state consists of a 2-dimensional array for the game grid. Each cell stores whether it’s a mine, if it’s been revealed or if it’s been flagged: +在状态对象的情况下,对象本身可能非常大,但通常只有少数几个嵌套很深的属性会发生变化。我们在 [PROXX](https://proxx.app) 中遇到了这个问题,我们的 PWA 版本扫雷:游戏状态由游戏网格的二维数组组成。每个单元格存储这些字段:是否有雷,以及是被发现的还是被标记的。 ```typescript interface Cell { @@ -109,12 +109,12 @@ interface Cell { } ``` -That means the biggest possible grid of 40 by 40 cells adds up to ~134KiB of JSON. Sending an entire state object is out of the question. **Instead of sending the entire new state object whenever something changes, we chose to record the changes and send a patchset instead.** While we didn’t use [ImmerJS](https://github.com/immerjs/immer), a library for working with immutable objects, it does provide a quick way to generate and apply such patchsets: +这意味着最大的网格( 40 × 40 个单元格)加起来的 JSON 大小约等于 134 KiB。发送整个状态对象是不可能的。**我们选择记录更改并发送一个补丁集,而不是在更改时发送整个新的状态对象。** 虽然我们没有使用 [ImmerJS](https://github.com/immerjs/immer),这是一个处理不可变对象的库,但它提供了一种快速生成和应用补丁集的方法: ```js // worker.js immer.produce(stateObject, draftState => { - // Manipulate `draftState` here + // 在这里操作 `draftState` }, patches => { postMessage(patches); }); @@ -122,11 +122,11 @@ immer.produce(stateObject, draftState => { // main.js worker.addEventListener("message", ({data}) => { state = immer.applyPatches(state, data); - // React to new state + // 对新状态的反应 } ``` -The patches that ImmerJS generates look like this: +ImmerJS 生成的补丁如下所示: ```json [ @@ -147,45 +147,45 @@ The patches that ImmerJS generates look like this: ] ``` -This means that the amount that needs to get transferred is proportional to the size of your changes, not the size of the object. +这意味着需要传输的数据量与更改的大小成比例,而不是与对象的大小成比例。 -### Chunking +### 分块 -As I said, for state objects it’s **often** only a handful of properties that change. But not always. In fact, [PROXX](https://proxx.app) has a scenario where patchsets could turn out quite big: The first reveal can affect up to 80% of the game field, which adds up to a patchset of about ~70KiB. When targeting feature phones, that is too much, especially as we might have JS-driven WebGL animations running. +正如我所说,对于状态对象,**通常**只有少数几个属性会改变。但并非总是如此。事实上,[PROXX](https://proxx.app) 有这样一个场景,补丁集可能会变得非常大:第一个展示可能会影响多达 80% 的游戏字段,这意味着补丁集有大约 70 KiB 的大小。当目标定位于功能手机时,这就太多了,特别是当我们可能运行 JS 驱动的 WebGL 动画时。 -We asked ourselves an architectural question: Can our app support partial updates? Patchsets are a collection of patches. **Instead of sending all patches in the patchset at once, you can “chunk” the patchset into smaller partitions and apply them sequentially.** Send patches 1-10 in the first message, 11-20 on the next, and so on. If you take this to the extreme, you are effectively **streaming** your patches, allowing you to use all the patterns you might know and love from reactive programming. +我们问自己一个架构上的问题:我们的应用程序能支持部分更新吗?Patchsets 是补丁的集合。**你可以将补丁集“分块”到更小的分区中,并按顺序应用补丁,而不是一次性发送补丁集中的所有补丁。** 在第一个消息中发送补丁 1 - 10,在下一个消息中发送补丁 11 - 20,以此类推。如果你将这一点发挥到极致,那么你就可以有效地让你的补丁**流式化**,从而允许你使用你可能知道的设计模式以及喜爱的响应式编程。 -Of course, this can result in incomplete or even broken visuals if you don’t pay attention. However, you are in control of how the chunking happens and could reorder the patches to avoid any undesired effects. For example, you could make sure that the first chunk contains all patches affecting on-screen elements, and put the remaining patches in to a couple of patchsets to give the main thread room to breathe. +当然,如果你不注意,这可能会导致不完整甚至破碎的视觉效果。然而,你可以控制分块如何进行,并可以重新排列补丁以避免任何不希望的效果。例如,你可以确保第一个块包含所有影响屏幕元素的补丁,并将其余的补丁放在几个补丁集中,以给主线程留出喘息的空间。 -We do chunking in [PROXX](https://proxx.app). When the user taps a field, the worker iterates over the entire grid to figure out which fields need to be updated and collects them in a list. If that list grows over a certain threshold, we send what we have so far to the main thread, empty the list and continue iterating the game field. These patchsets are small enough that even on a feature phone the cost of `postMessage()` is negligible and we still have enough main thread budget time to update our game’s UI. The iteration algorithm works from the first tile outwards, meaning our patches are ordered in the same fashion. If the main thread can only fit one message into its frame budget (like on the Nokia 8110), the partial updates disguise as a reveal animation. If we are on a powerful machine, the main thread will keep processing message events until it runs out of budget just by nature JavaScript’s event loop. +我们在 [PROXX](https://proxx.app) 上做分块。当用户点击一个字段时,worker 遍历整个网格,确定需要更新哪些字段,并将它们收集到一个列表中。如果列表增长超过某个阈值,我们就将目前拥有的内容发送到主线程,清空列表并继续迭代游戏字段。这些补丁集足够小,即使在功能手机上, `postMessage()` 的成本也可以忽略不计,我们仍然有足够的主线程预算时间来更新我们的游戏 UI。迭代算法从第一个瓦片向外工作,这意味着我们的补丁以相同的方式排列。如果主线程只能在帧预算中容纳一条消息(就像 Nokia 8110),那么部分更新就会伪装成一个显示动画。如果我们在一台功能强大的机器上,主线程将继续处理消息事件,直到超出预算为止,这是 JavaScript 的事件循环的自然结果。 视频链接:https://dassur.ma/things/is-postmessage-slow/proxx-reveal.mp4 -Classic sleight of hand: In [PROXX], the chunking of the patchset looks like an animation. This is especially visible on low-end mobile phones or on desktop with 6x CPU throttling enabled. +经典手法:在 [PROXX] 中,补丁集的分块看起来像一个动画。这在支持 6x CPU 节流的台式机或低端手机上尤其明显。 -### Maybe JSON? +### 也许应该 JSON? -`JSON.parse()` and `JSON.stringify()` are incredibly fast. JSON is a small subset of JavaScript, so the parser has fewer cases to handle. Because of their frequent usage, they have also been heavily optimized. [Mathias recently pointed out](https://twitter.com/mathias/status/1143551692732030979), that you can sometimes reduce parse time of your JavaScript by wrapping big objects into `JSON.parse()`. **Maybe we can use JSON to speed up `postMessage()` as well? Sadly, the answer seems to be no:** +`JSON.parse()` 和 `JSON.stringify()` 非常快。JSON 是 JavaScript 的一个小子集,所以解析器需要处理的案例更少。由于它们的频繁使用,它们也得到了极大的优化。[Mathias 最近指出](https://twitter.com/mathias/status/1143551692732030979),有时可以通过将大对象封装到 `JSON.parse()` 中来缩短 JavaScript 的解析时间。**也许我们也可以使用 JSON 来加速 `postMessage()` ?遗憾的是,答案似乎是否定的:** -![A graph comparing the duration of sending an object to serializing, sending, and deserializing an object.](https://dassur.ma/things/is-postmessage-slow/serialize.svg) +![将发送对象的持续时间与序列化、发送和反序列化对象进行比较的图](https://dassur.ma/things/is-postmessage-slow/serialize.svg) -Comparing the performance of manual JSON serialization to vanilla `postMessage()` yields no clear result. +将手工 JSON 序列化的性能与普通的 `postMessage()` 进行比较,没有得到明确的结果。 -While there is no clear winner, vanilla `postMessage()` seems to perform better in the best case, and equally bad in the worst case. +虽然没有明显的赢家,但是普通的 `postMessage()` 在最好的情况下表现得更好,在最坏的情况下表现得同样糟糕。 -### Binary formats +### 二进制格式 -Another way to deal with the performance impact of structured cloning is to not use it at all. Apart from structured cloning objects, `postMessage()` can also **transfer** certain types. `ArrayBuffer` is one of these [transferable](https://developer.mozilla.org/en-US/docs/Web/API/Transferable) types. As the name implies, transferring an `ArrayBuffer` does not involve copying. The sending realm actually loses access to the buffer and it is now owned by the receiving realm. **Transferring an `ArrayBuffer` is extremely fast and independent of the size of the `ArrayBuffer`**. The downside is that `ArrayBuffer` are just a continuous chunk of memory. We are not working with objects and properties anymore. For an `ArrayBuffer` to be useful we have to decide how our data is marshalled ourselves. This in itself has a cost, but by knowing the shape or structure of our data at build time we can potentially tap into many optimizations that are unavailable to a generic cloning algorithm. +处理结构化克隆对性能影响的另一种方法是完全不使用它。除了结构化克隆对象外,`postMessage()` 还可以**传输**某些类型。`ArrayBuffer` 是这些[可转换](https://developer.mozilla.org/en-US/docs/Web/API/Transferable)类型之一。顾名思义,传输 `ArrayBuffer` 不涉及复制。发送方实际上失去了对缓冲区的访问,现在是属于接收方的。**传输一个 `ArrayBuffer` 非常快,并且独立于 `ArrayBuffer`的大小。** 缺点是 `ArrayBuffer` 只是一个连续的内存块。我们就不能再处理对象和属性。为了让 `ArrayBuffer` 发挥作用,我们必须自己决定如何对数据进行编组。这本身是有代价的,但是通过了解构建时数据的形状或结构,我们可以潜在地进行许多优化,而这些优化是一般克隆算法无法实现的。 -One format that allows you to tap into these optimizations are [FlatBuffers](https://google.github.io/flatbuffers/). FlatBuffers have compilers for JavaScript (and other languages) that turn schema descriptions into code. That code contains functions to serialize and deserialize your data. Even more interestingly: FlatBuffers don’t need to parse (or “unpack”) the entire `ArrayBuffer` to return a value it contains. +一种允许你使用这些优化的格式是 [FlatBuffers](https://google.github.io/flatbuffers/)。Flatbuffers 有 JavaScript (和其他语言)对应的编译器,可以将模式描述转换为代码。该代码包含用于序列化和反序列化数据的函数。更有趣的是:Flatbuffers 不需要解析(或“解包”)整个 `ArrayBuffer` 来返回它包含的值。 ### WebAssembly -What about everyone’s favorite: WebAssembly? One approach is to use WebAssembly to look at serialization libraries in the ecosystems of other languages. [CBOR](https://cbor.io), a JSON-inspired binary object format, has been implemented in many languages. [ProtoBuffers](https://developers.google.com/protocol-buffers/) and the aforementioned [FlatBuffers](https://google.github.io/flatbuffers/) have wide language support as well. +那么使用每个人都喜欢的 WebAssembly 呢?一种方法是使用 WebAssembly 查看其他语言生态系统中的序列化库。[CBOR](https://cbor.io) 是一种受 json 启发的二进制对象格式,已经在许多语言中实现。[ProtoBuffers](https://developer.google.com/protocol-buffers/) 和前面提到的 [FlatBuffers](https://google.github.io/flatbuffers/) 也有广泛的语言支持。 -However, we can be more cheeky here: We can rely on the memory layout of the language as our serialization format. I wrote [a little example](./binary-state-rust) using [Rust](https://www.rust-lang.org): It defines a `State` struct (symbolic for whatever your app’s state looks like) with some getter and setter methods so I can inspect and manipulate the state from JavaScript. To “serialize” the state object, I just copy the chunk of memory occupied by the struct. To deserialize, I allocate a new `State` object, and overwrite it with the data passed to the deserialization function. Since I’m using the same WebAssembly module in both cases, the memory layout will be identical. +然而,我们可以在这里更厚颜无耻:我们可以依赖该语言的内存布局作为序列化格式。我用 [Rust](https://www.rust-lang.org) 编写了[一个小例子](https://dassur.ma/things/is-postmessage-slow/binary-state-rust):它用一些 getter 和 setter 方法定义了一个 `State` 结构体(无论你的应用程序的状态如何,它都是符号),这样我就可以通过 JavaScript 检查和操作状态。要“序列化”状态对象,只需复制结构所占用的内存块。为了反序列化,我分配一个新的 `State` 对象,并用传递给反序列化函数的数据覆盖它。由于我在这两种情况下使用相同的 WebAssembly 模块,内存布局将是相同的。 -> This is just a proof-of-concept. You can easily tap into undefined behavior if your struct contains pointers (like `Vec` and `String` do). There’s also some unnecessary copying going on. Code responsibly! +> 这只是一个概念的证明。如果你的结构包含指针(如 `Vec` 和 `String`),那么你就很容易陷入未定义的行为错误中。同时还有一些不必要的复制。所以请对代码负责任! ```rust pub struct State { @@ -194,7 +194,7 @@ pub struct State { #[wasm_bindgen] impl State { - // Constructors, getters and setter... + // 构造器, getters and setter... pub fn serialize(&self) -> Vec { let size = size_of::(); @@ -230,29 +230,29 @@ pub fn deserialize(vec: Vec) -> Option { } ``` -> **Note:** [Ingvar](https://twitter.com/rreverser) pointed me to [Abomonation](https://github.com/TimelyDataflow/abomonation), a seriously questionable serialization library that works even **with** pointers. His advice: “Do \[not\] try this!”. +> **注意:** [Ingvar](https://twitter.com/rreverser) 向我指出了 [Abomonation](https://github.com/TimelyDataflow/abomonation),是一个严重有问题的序列化库,虽然可以使用指针的概念。他的建议:“不要使用这个库!”。 -The WebAssembly module ends up at about 3KiB gzip’d, most of which stems from memory management and some core library functions. The entire state object is sent whenever something changes, but due to the transferability of `ArrayBuffers`, this is extremely cheap. In other words: **This technique should have near-constant transfer time, regardless of state size.** It will, however, be more costly to access state data. There’s always a tradeoff! +WebAssembly 模块最终 gzip 格式大小约为 3 KiB,其中大部分来自内存管理和一些核心库函数。当某些东西发生变化时,就会发送整个状态对象,但是由于 `ArrayBuffers` 的可移植性,其成本非常低。换句话说:**该技术应该具有几乎恒定的传输时间,而不管状态大小。** 然而,访问状态数据的成本会更高。总是要权衡的! -This technique also requires that the state struct does not make any use of indirection like pointers, as those values will become invalid when copied to a new WebAssembly module instance. As a result, you will probably struggle to use this approach with higher-level languages. My recommendations are C, Rust and AssemblyScript, as you are in full control and have sufficient insight into memory layout. +这种技术还要求状态结构不使用指针之类的间接方法,因为当将这些值复制到新的 WebAssembly 模块实例时,这些值是无效。因此,你可能很难在高级语言中使用这种方法。我的建议是 C、 Rust 和 AssemblyScript,因为你可以完全控制内存并对内存布局有足够的了解。 -### SABs & WebAssembly +### SAB 和 WebAssembly -> **Heads up:** This section works with `SharedArrayBuffer`, which have been disabled in all browsers except Chrome on desktop. This is being worked on, but no ETA can be given on this. +> **提示:** 本节适用于 `SharedArrayBuffer`,它在除桌面端的 Chrome 外的所有浏览器中都已禁用。这正在进行中,但是不能给出 ETA。 -Especially from game developers, I have heard multiple requests to give JavaScript the capability to share objects across multiple threads. I think this is unlikely to ever be added to JavaScript itself, as it breaks one of the fundamentals assumptions of JavaScript engines. However, there is an exception to this called [`SharedArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) (“SABs”). SABs behave exactly like `ArrayBuffers`, but instead of one realm losing access when being transferred , they can be cloned and **both** realms will have access to the same underlying chunk of memory. **SABs allows the JavaScript realms to adopt a shared memory model.** For synchronization between realms, there’s [`Atomics`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics) which provide Mutexes and atomic operations. +特别是从游戏开发人员那里,我听到了多个请求,要求 JavaScript 能够跨多个线程共享对象。我认为这不太可能添加到 JavaScript 本身,因为它打破了 JavaScript 引擎的一个基本假设。但是,有一个例外叫做 [`SharedArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) ("SABs")。SABs 的行为完全类似于 `ArrayBuffers`,但是在传输时,不像 `ArrayBuffers` 那样会导致其中一方失去访问权, SAB 可以克隆它们,并且**双方**都可以访问到相同的底层内存块。**SABs 允许 JavaScript 空间采用共享内存模型。** 对于多个空间之间的同步,有 [`Atomics`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics) 提供互斥和原子操作。 -With SABs, you’d only have to transfer a chunk of memory once at the start of your app. However, in addition to the binary representation problem, you’d have to use `Atomics` to prevent one realm from reading the state object while the other realm is still writing and vice-versa. This can have a considerable performance impact. +使用 SABs,你只需在应用程序启动时传输一块内存。然而,除了二进制表示问题之外,你还必须使用 `Atomics` 来防止其中一方在另一方还在写入的时候读取状态对象,反之亦然。这可能会对性能产生相当大的影响。 -As an alternative to using SABs and serializing/deserializing data manually, you could embrace **threaded** WebAssembly. WebAssembly has standardized support for threads, but is gated on the availability of SABs. **With threaded WebAssembly you can write code with the exact same patterns you are used to from threaded programming languages.** This, of course, comes at the cost of development complexity, orchestration and potentially bigger and monolithic modules that need to get shipped. +除了使用 SABs 和手动序列化/反序列化数据之外,你还可以使用**线程化**的 WebAssembly。WebAssembly 已经标准化了对线程的支持,但是依赖于 SABs 的可用性。**使用线程化的 WebAssembly,你可以使用与使用线程编程语言相同的模式编写代码**。当然,这是以开发复杂性、编排以及可能需要交付的更大、更完整的模块为代价的。 -## Conclusion +## 结论 -Here’s my verdict: Even on the slowest devices, you can `postMessage()` objects up to 100KiB and stay within your 100ms response budget. If you have JS-driven animations, payloads up to 10KiB are risk-free. This should be sufficient for most apps. **`postMessage()` does have a cost, but not the extent that it makes off-main-thread architectures unviable.** +我的结论是:即使在最慢的设备上,你也可以使用 `postMessage()` 最大 100 KiB 的对象,并保持在 100 ms 响应预算之内。如果你有 JS 驱动的动画,有效载荷高达 10 KiB 是无风险的。对于大多数应用程序来说,这应该足够了。**`postMessage()` 确实有一定的代价,但还不到让非主线程架构变得不可行的程度。** -If your payloads are bigger than this, you can try sending patches or switching to a binary format. **Considering state layout, transferability and patchability as an architectural decision from the start can help your app run on a wider spectrum of devices.** If you feel like a shared memory model is your best bet, WebAssembly will pave that way for you in the near future. +如果你的有效负载大于此值,你可以尝试发送补丁或切换到二进制格式。**从一开始就将状态布局、可移植性和可补丁性作为架构决策,可以帮助你的应用程序在更广泛的设备上运行。** 如果你觉得共享内存模型是你最好的选择,WebAssembly 将在不久的将来为你铺平道路。 -As I already hinted at in [an older blog post](/things/actormodel/) about the Actor Model, I strongly believe we can implement performant off-main-thread architectures on the web **today**, but this requires us leaving our comfort zone of threaded languages and the web’s all-on-main-by-default. We need to explore alternative architectures and models that **embrace** the constraints of the Web and JavaScript. The benefits are worth it. +我已经在[一篇旧的博文](https://dassur.ma/things/actormodel/)上暗示 Actor Model,我坚信我们可以在**如今**的 web 上实现高性能的非主线程架构,但这需要我们离开线程化语言的舒适区以及 web 中那种默认在所有主线程工作的模式。我们需要探索另一种架构和模型,**拥抱** Web 和 JavaScript 的约束。这些好处是值得的。 > 如果发现译文存在错误或其他需要改进的地方,欢迎到 [掘金翻译计划](https://github.com/xitu/gold-miner) 对译文进行修改并 PR,也可获得相应奖励积分。文章开头的 **本文永久链接** 即为本文在 GitHub 上的 MarkDown 链接。