Synchronous instantiation #1976

Open
moshevds opened this issue Jan 23, 2020 · 26 comments

Comments

@moshevds

Motivation

I have been looking into a way to have Rust code initialize my service worker. This would include the addEventListener('fetch', ...); call, but this can only be done during the worker script’s initial evaluation. WebAssembly.instantiate(...) and WebAssembly.instantiateStreaming(...) both span beyond that time window, so new WebAssembly.Instance(...) is required.
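
To make the distinction concrete, here is a minimal sketch of synchronous instantiation. The byte array is a tiny hand-assembled module exporting a hypothetical add function (an assumption for illustration, not wasm-bindgen output). The point is only that new WebAssembly.Module and new WebAssembly.Instance finish before the script's first evaluation ends, whereas WebAssembly.instantiate hands back a promise.

```javascript
// A hand-assembled module exporting add(i32, i32) -> i32.
// Illustration only; real wasm-bindgen output also needs an imports object.
const bytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic + version
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // type section
    0x03, 0x02, 0x01, 0x00,                                           // function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,             // export "add"
    0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code section
]);

// Synchronous path: compile and instantiate without yielding to the event loop.
const syncModule = new WebAssembly.Module(bytes);
const syncInstance = new WebAssembly.Instance(syncModule, {});
const sum = syncInstance.exports.add(2, 3);

// Asynchronous path: only a promise is available during the first evaluation.
const pending = WebAssembly.instantiate(bytes, {});
const isPromise = pending instanceof Promise;
```

With the synchronous path, the exports are usable on the very next line, which is what a listener registered during the initial evaluation would need.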

Proposed Solution

Generated code for the web and no-modules targets should include a way to instantiate synchronously. I have a small change that works: https://github.com/moshevds/wasm-bindgen/commit/1ba370edf16a2f3dd47c25889a07430023eb2a10
But I assume that changing the order of arguments is not a good idea, so I'd like to discuss what options there are.

Alternatives

  • It could be possible to add a completely new target, instead of changing the init function that gen_init builds for the web and no-modules targets.
  • Another option would be to expose the imports object somehow and let the caller ignore init() entirely.
@Pauan
Contributor

Pauan commented Jan 23, 2020

Hmmm, I wonder if the 4 KB size limit applies to web workers and service workers. If not, then we could have a new worker target which uses importScripts and loads everything synchronously.

@alexcrichton
Contributor

Urgh the restrictions of the web sometimes baffle me :(

Is it for certain that some functions can only be called when a script is originally loaded? Are there workarounds that JS has for that perhaps? (this seems like more of a wasm thing rather than a rust/wasm thing almost)

@Pauan
Contributor

Pauan commented Jan 24, 2020

@alexcrichton I can't actually find any information on it. But if the code is loaded asynchronously then I think it's possible for it to miss the install or fetch events (which would be bad).

On the other hand, I assume that in the future service workers will support ES6 modules (and therefore will work with async). So this problem might fix itself in the future.

I just now tried compiling with the Webpack target set to "worker", and it does use importScripts, but it still loads the .wasm asynchronously. So rather than adding a new target, we can instead ask Webpack to change their behavior to load the .wasm file synchronously on the "worker" target.

@Pauan
Contributor

Pauan commented Jan 24, 2020

Okay, I actually went and read the spec. Two important things:

  1. I was right, they will have support for ES6 modules (Run Service Worker 7.12):

    Let evaluationStatus be the result of running the classic script script if script is a classic script, otherwise, the result of running the module script script if script is a module script.

    That means it's possible to use navigator.serviceWorker.register("/foo.js", { type: "module" }) to load a service worker as an ES6 module. So after wasm-esm-integration arrives, the problem will be solved.

  2. Right now non-ES6 modules do indeed need to run synchronously (Run Service Worker 7.16):

    Note: If the global object’s associated list of event listeners does not have any event listener added at this moment, the service worker’s set of event types to handle remains an empty set. The user agents are encouraged to show a warning that the event listeners must be added on the very first evaluation of the worker script.

So, the conclusion is: this problem will be eventually fixed by the browsers, but right now it is necessary for .wasm files to be loaded synchronously.
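
The spec note can be illustrated with a toy model. This is not browser code: FakeWorkerScope and runWorkerScript are hypothetical names, and the model only assumes what the note says, namely that the set of event types is fixed from whatever listeners exist when the first synchronous evaluation finishes.

```javascript
// Toy model of the "set of event types to handle"; not a real browser API.
class FakeWorkerScope {
    constructor() {
        this.listeners = new Map();
    }
    addEventListener(type, handler) {
        if (!this.listeners.has(type)) this.listeners.set(type, []);
        this.listeners.get(type).push(handler);
    }
}

// Runs the script once and records which event types had a listener
// at the moment the synchronous evaluation finished.
function runWorkerScript(script) {
    const scope = new FakeWorkerScope();
    script(scope);
    return new Set(scope.listeners.keys());
}

const handledTypes = runWorkerScript((self) => {
    // Added during the first evaluation: counted.
    self.addEventListener("install", () => {});

    // Added after an asynchronous step (standing in for async wasm loading):
    // the snapshot has already been taken by the time this callback runs.
    Promise.resolve().then(() => {
        self.addEventListener("fetch", () => {});
    });
});
```

In this model handledTypes contains "install" but not "fetch", which mirrors why a listener added after an asynchronous wasm load would be too late.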

@Pauan
Contributor

Pauan commented Jan 24, 2020

Hmm, I just realized that the .wasm fetching must also happen synchronously, not just the instantiation.

So we have these three options:

  1. Do nothing, wait for either esm-integration or top-level await to be standardized and implemented.

    Esm-integration is still a ways off, but top-level await is stage 3 and should hopefully arrive pretty soon, so this is actually a reasonable option!

    Also, it's possible to work around this problem by setting the event listeners in JS, like this:

    import init, { install, fetch } from "./foo.js";
    
    const wasm = init();
    
    self.addEventListener('install', (event) => {
        event.waitUntil(wasm.then(() => {
            return install(event);
        }));
    });
    
    self.addEventListener('fetch', (event) => {
        event.respondWith(wasm.then(() => {
            return fetch(event);
        }));
    });

    This should work today, no changes needed in the browsers or wasm-bindgen.

  2. Convince Webpack to add in synchronous fetch + synchronous instantiation for the "worker" target.

    It's reasonable for us to ask, but they might reject it, and it will take time for it to be implemented.

  3. Create a new wasm-bindgen target called worker (or possibly a flag?) which loads the wasm like this:

    var request = new XMLHttpRequest();
    request.open("GET", url, false);
    request.responseType = "arraybuffer";
    request.send(null);
    
    if (request.status === 200) {
        var module = new WebAssembly.Module(request.response);
        var instance = new WebAssembly.Instance(module, imports);
    
        wasm = instance.exports;
        init.__wbindgen_wasm_module = module;
        wasm.__wbindgen_start();
    
    } else {
        // Handle errors somehow
    }

    I have verified that this works, and it does not have the 4 KB limit that .wasm files normally have.

1 is the best long-term option (and it has a workaround which works today).

2 is reasonable, but it ties us even more strongly to Webpack.

3 is the worst in my opinion, because it is now possible to load workers with ES6 modules, so I don't think we should be spending too much effort on supporting classic-script style workers.

So I think we should go with option 1 (do nothing and let the user work around it by setting the event listeners in JS).

And after top-level await arrives it will get even nicer, the user can then replace the workaround with this:

import init from "./foo.js";
await init();

@moshevds
Author

@Pauan and @alexcrichton, thank you for the thorough responses.

Regarding the workaround:
This makes an assumption about the application that is not always correct:

The respondWith() method of FetchEvent prevents the browser's default fetch handling, and allows you to provide a promise for a Response yourself. (MDN)

The browser's default fetch handling is sometimes what a service worker explicitly wants.

It might be possible to use waitUntil() instead of respondWith(). But that will still pause the network communication in ways that may be undesirable. It may be possible to work around that by careful use of the install and activate events. I suspect that this is not trivial.
Another problem is that the number of desired handlers might change during application development. Adding addEventListener("push", ...); support in the future, for example.
All of this adds rather a lot of JavaScript code that needs to be maintained, which is the very thing I am trying to avoid :-)

That said, there is some precedent for the workaround: this is exactly how Cloudflare handles Rust in the example and boilerplate code of their "Cloudflare Workers", a product that is heavily based on the Service Workers API.

Regarding the "perhaps we can wait for browser support" option:
It may be useful to give an outline of the requirements that I work with. I suspect they are more luxurious than those of the typical web developer.
Being able to work offline is a nice-to-have for all devices, but it will become a requirement for the tablets we distribute ourselves. We intend to use Cloudflare Workers (or similar) to provide a fallback for all devices that do not have the features we want to use in our service worker. The tablets we distribute will need to be supported for a long time, and it will be hard to get some users to do a browser update.

So, as long as there is one tablet browser, in addition to Cloudflare, that can run our code, my requirements are met. Today, Cloudflare does not support esm-integration or top-level await either, so the waiting option does not really appeal to me.

Regarding Webpack:
I prefer to skip a Webpack build phase, and prefer to simply use importScripts or concatenate a small bit of JavaScript to the file. But if using Webpack will result in a working service worker without JavaScript code that we need to maintain, I will take it.

Regarding the XMLHttpRequest call:
It would make sense to include this code because the asynchronous instantiation can also download the wasm file. But I would not object to doing this myself as part of the loader that calls init(). Unlike the addEventListener calls, there is no additional application logic involved here.

Regarding loading workers with ES6 modules:
Doing navigator.serviceWorker.register("/foo.js", { type: "module" }) is one of the things that I was testing yesterday. This didn't work in Firefox but I didn't try Chrome. However, it is good to see that it is indeed part of the Service Worker API, and I wasn't imagining that this is something that should have worked.

How to proceed?
A flag seems to be the least effort to implement. Do you agree with that? If there are any objections to doing it as a flag, I think having a worker target would be another good option. I will probably be able to implement the flag as part of my project, but implementing a complete wasm-bindgen target might be a bit much for me to do. Obviously, I will gladly test such a target if someone else volunteers to implement it.

@Pauan
Contributor

Pauan commented Jan 24, 2020

@moshevds As far as I know, the browser's default fetch behavior is to simply call self.fetch(event.request), so you can do that in the Rust code.

However, as an alternative, you mentioned waitUntil, which should work as well:

import init, { install, fetch } from "./foo.js";

const wasm = init();

self.addEventListener('install', (event) => {
    event.waitUntil(wasm.then(() => {
        return install(event);
    }));
});

self.addEventListener('fetch', (event) => {
    event.waitUntil(wasm.then(() => {
        return fetch(event);
    }));
});

Now the Rust fetch function can choose to call event.respondWith, or not.

As for "network pauses", that won't happen, because it won't instantiate the service worker until the install event has finished, and the install event won't finish until after the .wasm file is loaded. And after the .wasm file is loaded, waitUntil won't pause anymore. So that's very similar to the behavior of top-level await, or synchronous loading.

Given that, you might be able to safely remove the waitUntil from the fetch event (though it shouldn't hurt anything if you leave it in).

To be clear, the "pausing" is only on the Promise microtask queue, it does not pause the macrotask queue, so there is no extra latency caused by it (assuming the .wasm file is already loaded). So that means that even though it's async, there's no pausing or delay, it is processed immediately.

As for maintenance, you only need to maintain a tiny .js glue file (as I've shown above) which only installs the event listeners and nothing else, everything else (including business logic) can remain in Rust. The amount of code and maintenance burden is incredibly small (a dozen or so lines which rarely change).

With regard to "waiting for browser support", that's only for the "define everything in Rust" approach. With the workaround of defining the event listeners in JS it works today, no need to wait.

A flag seems to be the least effort to implement. Do you agree with that?

That's not the case. The biggest problem is that we have to maintain the flag for a potentially long time, and the flag would be obsolete very quickly, since it would be for classic-style workers (not ES6 modules).

Whereas the workaround I have posted requires almost no effort from anybody (including you), and is forwards compatible with the "correct" solution (top-level await).

@moshevds
Author

The pause that I meant was the one where there is no install handler at all. In that case, all requests will pause until the wasm instance is available. I agree that this can be prevented with an install handler, and I agree that microtask pausing is not an important concern.

I disagree, however, that it is a trivial glue in the way that you propose.

  • Since Cloudflare does not have the install event, I have no intention to do any real work in it. But the glue will have to also include a (practically empty) Rust fn to resolve it successfully.
  • If we ever want to use the push or message events (or features that don't exist yet), the responsible thing would be to do a feature detection before adding a listener, something that would require even more JavaScript. For push and message, just adding the listener will probably work. But considered strictly, this introduces undefined behavior in environments that don't support it (such as Cloudflare).
  • Just as wasm-bindgen will have to maintain any feature indefinitely, we will also have to maintain the workaround indefinitely. The fact that browsers will have these features soon, does not mean that our users will have these features soon. The current reality is that our users keep their device until it breaks, and will not reliably do updates.
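
For reference, the feature detection mentioned in the second bullet might look roughly like the sketch below. It assumes that a supporting environment exposes the matching "on<event>" property on the global scope, which may not hold everywhere; listenIfSupported and mockScope are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: registers a listener only if the scope exposes the
// matching "on<event>" property. That check is an assumption; a given
// environment may need a different test.
function listenIfSupported(scope, name, handler) {
    if (("on" + name) in scope) {
        scope.addEventListener(name, handler);
        return true;
    }
    return false;
}

// Mock scope standing in for `self`: it knows "fetch" but not "push".
const registered = [];
const mockScope = {
    onfetch: null,
    addEventListener(name) {
        registered.push(name);
    },
};

const fetchAdded = listenIfSupported(mockScope, "fetch", () => {});
const pushAdded = listenIfSupported(mockScope, "push", () => {});
```

In a real worker the first argument would be self, and the handler would forward the event to the Rust side.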

Again, I agree that the workaround is totally doable, just not that there are no real considerations when implementing it. I'm also not saying that it would be a large burden. But for me, it would be a worse option compared to internally patching wasm-bindgen (what I currently did). In turn, this is worse than having it work out of the box.

If a consensus is reached that supporting synchronous instantiation is not worth the effort, is supporting a "bare imports object" target something that could be considered instead?

@Pauan
Contributor

Pauan commented Jan 24, 2020

@moshevds That sounds like a bug in Cloudflare, the install event is the perfect place to put in initialization like Wasm.

In any case, even without the install event, it will only pause the first time the .wasm is loaded, after that it won't pause. That is exactly the same thing that would happen with top-level await or synchronous loading.

You seem to have two contradictory requirements:

  • You want to write all your code in Rust, not JS. This requires loading the .wasm file before doing anything else.

  • You want to handle fetch requests before the .wasm file is loaded.

So I'm rather confused: with synchronous loading it will also pause until the .wasm file is loaded before it handles any fetch events. The only way to avoid that pause is to write the fetch handler entirely in JS (no calls to Rust). Maybe you could explain more about your precise requirements.

But for me, it would be a worse option compared to internally patching wasm-bindgen (what I currently did). In turn, this is worse than having it work out of the box.

I'm still not sure how maintaining a wasm-bindgen fork with this code...

var request = new XMLHttpRequest();
request.open("GET", url, false);
request.responseType = "arraybuffer";
request.send(null);

if (request.status === 200) {
    var module = new WebAssembly.Module(request.response);
    var instance = new WebAssembly.Instance(module, imports);

    // This needs to be updated when wasm-bindgen changes
    wasm = instance.exports;
    init.__wbindgen_wasm_module = module;
    wasm.__wbindgen_start();

} else {
    throw new Error("Network request failed with status: " + request.status);
}
importScripts("./path/to/wasm.js");
wasm_bindgen("./path/to/wasm_bg.wasm");

...is less burdensome compared to maintaining code like this:

// This could even be moved into a utility .js file so it doesn't clutter up your code
function events(...names) {
    importScripts("./path/to/wasm.js");
    const wasm = wasm_bindgen("./path/to/wasm_bg.wasm");

    names.forEach((name) => {
        const f = wasm_bindgen[name];

        // Add in whatever feature detection code you need here
        self.addEventListener(name, (event) => {
            event.waitUntil(wasm.then(() => f(event)));
        });
    });
}

// This is the only part you actually need to change, the rest stays the same
events("fetch", "push", "message");

The maintenance burden is literally just 1 line of code (adding/removing names to the events(...) call).

But I don't really understand your requirements, so I'm probably missing something.

Also, to be clear, your patch doesn't work, because it's still loading the .wasm file asynchronously. You need to use synchronous XMLHttpRequest in order for it to load synchronously (which is what the above code does). So it's a much bigger change than your patch, it means completely replacing the fetching and instantiation logic.

is supporting a "bare imports object" target something that could be considered instead?

I think that's reasonable, yeah. In fact, we could just expose the imports object by default, so you could just access wasm_bindgen.imports or something like that. @alexcrichton what do you think?

@moshevds
Author

Hi @Pauan, I'm sorry about the confusion. As I understand it, default fetch behaviour is inhibited the moment the first evaluation ends with an event listener added. In the synchronous case, this happens much later (after the wasm is fully instantiated) and therefore is not a problem.

My patch really does work. The wasm module is provided as an argument to init(). Within Cloudflare, the module is constructed by Cloudflare and it can be passed that way. Within browsers, I can do this either with your XMLHttpRequest code or (the way I did it while testing) embedded as a blob within the JavaScript file.
On a tangent: I actually think using XMLHttpRequest has other potential issues in the case where a service worker is abruptly terminated and restarted while in the activated life-cycle stage, but without internet. The embedded blob approach has another useful characteristic in that it is analogous to the way Cloudflare does it (by providing a pre-compiled wasm module).
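
A minimal sketch of the embedded-blob idea, under stated assumptions: the base64 string is a tiny hand-assembled module exporting a hypothetical add function (not real wasm-bindgen output), and decodeBase64 is a helper name invented here. Nothing touches the network, so the module can be compiled and instantiated during the first evaluation even when offline.

```javascript
// Base64 of a tiny hand-assembled module exporting add(i32, i32) -> i32.
// Illustration only; a real build would embed the wasm-bindgen output.
const embeddedWasm = "AGFzbQEAAAABBwFgAn9/AX8DAgEABwcBA2FkZAAACgkBBwAgACABags=";

// Decodes base64 to bytes without any network access.
// atob exists in workers and recent Node; Buffer is a Node-only fallback.
function decodeBase64(text) {
    const binary = typeof atob === "function"
        ? atob(text)
        : Buffer.from(text, "base64").toString("binary");
    const out = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) {
        out[i] = binary.charCodeAt(i);
    }
    return out;
}

// Compile and instantiate synchronously from the embedded bytes.
const embeddedModule = new WebAssembly.Module(decodeBase64(embeddedWasm));
const embeddedInstance = new WebAssembly.Instance(embeddedModule, {});
const answer = embeddedInstance.exports.add(40, 2);
```

A compiled WebAssembly.Module like embeddedModule is the kind of value that could then be handed to a loader that accepts a pre-compiled module.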

The reason I find all of that less burdensome to maintain is that registering the events in javascript has what can be called a much larger interface surface area. Especially in the examples that I mentioned, those few lines of code interact with details of the service worker API + Cloudflare's rendition of that API + wasm-bindgen + whatever business logic I would have that requires the event handlers.
The synchronous instantiation, on the other hand, just requires me to provide a compiled wasm module, and I can get a wasm instance with it. Even if it involves more code and occasionally recompiling wasm-bindgen, I won't have to think as hard about what I am doing 🙂.

I'm not saying those interacting interfaces go away. But with Rust and wasm-bindgen, we have awesome tools that help with some of these issues. And I would prefer not giving those up, when I don't have to.
If the imports object is exposed (or there is a synchronously instantiating target), I will be able to figure out what needs to be done with those event listeners with the tools that I prefer to do that with.

@moshevds
Author

It occurred to me that normal workers have this same issue. The only way a browser guarantees that no messages between the worker and the main thread are lost is:

  • The worker sets up its listener during the first synchronous block of execution.
  • The worker never sends a message before it has received the first message from the main thread.
  • The main thread sets up its listener before sending the first message.

I am now wondering:
If a future browser has Interface Types and ES Module Integration, does new Worker('foo.wasm', {type: 'module'}); work?
(Provided that foo.wasm has WASM_INTERFACE_TYPES=1 and #[wasm_bindgen(start)].)

Would it make sense for wasm-bindgen to have a target that is equivalent to that, but works today?

@Pauan
Contributor

Pauan commented Jan 26, 2020

As I understand it, default fetch behaviour is inhibited the moment the first evaluation ends with an event listener added. In the synchronous case, this happens much later (after the wasm is fully instantiated) and therefore is not a problem.

Ahh, I see, that is a good point, I hadn't considered that.

However, after running some tests, it seems that synchronous loading also blocks fetch requests. The reason this happens is: after the service worker is registered, all fetch requests will be routed to it. So the browser will wait for the service worker to load before handling any fetch requests.

So there really is basically no difference between synchronous loading and asynchronous loading, as far as I can tell.

Here is the code I'm using to test this.

My patch really does work.

It only works if the user handles all loading themself. Calling init("./path/to/wasm", true) will not be synchronous, which is quite confusing. So to make it follow user expectations, it would have to completely change the loading logic to use XMLHttpRequest.

The embedded blob approach has another useful characteristic in that it is analogous to the way Cloudflare does it (by providing a pre-compiled wasm module).

That's an even bigger change, which also bloats up the file, since base64 encoding increases the file size by ~37%.
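
The exact figure depends on the encoding details. Plain base64 without line breaks turns every 3 bytes into 4 characters, so the raw growth before compression is about a third. A quick way to check this, with base64Length as a hypothetical helper and the file size chosen purely for illustration:

```javascript
// Length of padded base64 output for `byteLength` input bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteLength) {
    return 4 * Math.ceil(byteLength / 3);
}

// Relative growth for a 300 KB .wasm file (size chosen for illustration).
const rawSize = 300 * 1024;
const encodedSize = base64Length(rawSize);
const overhead = encodedSize / rawSize - 1;
```

This measures only the uncompressed size; what is actually sent over the wire also depends on whether the server compresses the response.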

Especially in the examples that I mentioned, those few lines of code interact with details of the service worker API + Cloudflare's rendition of that API + wasm-bindgen + whatever business logic I would have that requires the event handlers.

But you have to deal with all of that no matter what. So you're either dealing with it in your Rust code (which is actually clunkier than JS for defining event listeners and feature detection), or you're dealing with it in JS.

I fully understand the desire to do everything in Rust (I create SPAs and Chrome/Firefox extensions in pure Rust), but I think in this case the amount of JS code is quite small, easy to maintain, and cleanly separated from Rust, because all of the logic is still defined in Rust.

If a future browser has Interface Types and ES Module Integration, does new Worker('foo.wasm', {type: 'module'}); work?

Yes that will work.

Would it make sense for wasm-bindgen to have a target that is equivalent to that, but works today?

Webpack has support for esm-integration, so you can use that. It's not really possible to implement it in wasm-bindgen, since it requires rewriting the .wasm file and any .js files which import the .wasm file, so it requires some sort of bundler (like Webpack).

After top-level await lands, we will be able to output .js files which seamlessly load the .wasm file internally, and so the end result will be similar to esm-integration.

@moshevds
Author

So there really is basically no difference between synchronous loading and asynchronous loading, as far as I can tell.

Here is the code I'm using to test this.

What browser did you test this with? I am seeing the effect that I was expecting in Firefox and Chrome:
If you reload the page during the first minute (while loadSync is active), this works normally. If you reload the page during the second minute (while loadingWasm is active), the reload pauses.

N.B.: The timed fetches (on the first load) are not handled by the service worker if the service worker does not claim them, and this example does not claim.

It only works if the user handles all loading themself.

That's right. I am not proposing that patch as a pull request, but as a proof of concept that synchronous instantiation works and is useful.

The embedded blob approach has another useful characteristic in that it is analogous to the way Cloudflare does it (by providing a pre-compiled wasm module).

That's an even bigger change, which also bloats up the file, since base64 encoding increases the file size by ~37%.

More like 3% for real-world (and therefore compressed) scenarios.

I think in this case the amount of JS code is quite small, easy to maintain, and cleanly separated from Rust, because all of the logic is still defined in Rust.

This is reasonable to make as a personal consideration in relation to a specific use-case, imho. But I do not agree that it is true generally, and I do not agree that it is true for my use-case.

I have been reading some of the discussions regarding the interplay of async loading and service workers. Many of the assumptions we shared turn out to be wrong. See for example the detailed discussion about how to disallow top-level await in service workers.

Honestly, information like this makes me much more hesitant to just wing it using some javascript code. It is easy to make incorrect assumptions like this. If I do this in Rust, I'll be able to analyse it a bit easier, and make clear test-cases for any caveats that I find.

If a future browser has Interface Types and ES Module Integration, does new Worker('foo.wasm', {type: 'module'}); work?

Yes that will work.

Some people in the discussion that I linked raise doubts about this, unfortunately.

@Pauan
Contributor

Pauan commented Jan 26, 2020

I am seeing the effect that I was expecting in Firefox and Chrome: If you reload the page during the first minute (while loadSync is active), this works normally.

It works normally because the service worker hasn't been loaded yet (note how it doesn't log RESPONDING). Service workers are only loaded after you close the tab and re-open it, they are not loaded in already-existing tabs:

"The Service worker will now control pages, but only those opened after the register() is successful. i.e. a document starts life with or without a Service worker and maintains that for its lifetime. So documents will have to be reloaded to actually be controlled."

The way to test it is:

  1. Load the page and wait for 60+ seconds (so that way the service worker gets registered).

  2. Close the tab.

  3. Wait for the service worker to unload (you can view this in chrome://serviceworker-internals in Chrome or about:debugging#/runtime/this-firefox in Firefox).

  4. Re-open the tab. The page itself will take 60 seconds to load, because the synchronous loading of the service worker is blocking the page from loading.

You really do have to close the tab and re-open it, merely refreshing isn't enough.

The timed fetches (on the first load) are not handled by the service worker if the service worker does not claim them, and this example does not claim.

You seem to be misunderstanding how service workers work. I don't blame you, I also had similar misunderstandings until I looked into it more deeply.

You seem to think that the browser works like this:

  1. When the page is loaded, it will register the service worker.

  2. The browser then loads the service worker, and in the meantime all fetch events are handled normally.

  3. When the service worker is finished loading, the browser then routes fetch events to the service worker.

That is not how service workers operate. Instead, they operate like this:

  1. When the page is loaded, it will register the service worker.

  2. The browser then runs the service worker and determines whether the fetch event is defined or not.

  3. But the service worker isn't loaded in the page, because the page always uses the service worker which was registered at the time the page was opened, not the newly installed service worker.

  4. You close the tab and re-open it.

  5. It now loads the registered service worker. The browser already knows that the previously-registered service worker registered a fetch event, so the browser blocks all fetches (even the fetch for the current page!) until the service worker has loaded.

Similarly, when you upgrade the service worker, the current pages will continue to use the old service worker. It is only after they are closed and re-opened that they will start using the new service worker.

Or to put it another way, the lifetime of the service worker is intrinsically tied to the lifetime of the page. The browser will never "hot-swap" service workers into existing pages.

Therefore, the browser always knows ahead of time whether the service worker "claims" fetch or not. The browser knows this even before the service worker is loaded! So the browser will block all fetches until the service worker is loaded.

See for example the detailed discussion about how to disallow top-level await in service workers.

That's very interesting, I hadn't seen that. I'm really perplexed at their decision. It essentially seems to be, "well, somebody might write some slow async code, and that will delay the service worker from being installed!" But... you can already do that with the install event.

And in any case, trying to stop bad developers from writing bad code doesn't work. Bad (or malicious) developers will just find other things to exploit/abuse. I can fully understand putting in some sort of quota, like "any service worker that takes longer than 30 seconds is killed". But that's not what they're doing, they're just flat-out banning all await, even very fast awaits.

Judging from that thread, it also looks like they want to ban sync XMLHttpRequest in service workers, and since they seem to be trying to ban all long-running synchronous things, they'll probably end up banning synchronous WebAssembly too.

So then your only option would be to write the event listeners in JS. This just makes me even more convinced that wasm-bindgen shouldn't have a worker target or synchronous loading.

@moshevds
Author

A quick reading of your explanation is indeed how I understood it to be the case: That sync loading will block a fetch when the service worker is in the activated life-cycle stage, but not loaded. This is one of the situations that I was referring to in an earlier post.

And that is indeed a situation that has to be taken into account. But I will also have to take into account that I will claim the currently loaded page (using clients.claim()), because this will enable a user to disable their internet connection while the application remains loaded. So that really is something that has to be considered.

I am not particularly worried about loading times. But I am worried about situations where the application gets stuck loading a network resource that is required for offline use. Such as what your example can get into.

If it would be the case that I am confused about how this works, that would be an even more clear reason that I should not trust myself to do this in JavaScript. Don't you agree with that?

Regarding the banning of synchronous WebAssembly:
Synchronous WebAssembly works today. It would be really weird if it gets banned in the future (for any non-security-related reason).

@Pauan
Contributor

Pauan commented Jan 26, 2020

So that really is something that has to be considered.

Oh, interesting, I didn't know that claim() is a thing. And you are right that it has to be considered, but the solution in that case is caching, which is exactly what service workers are designed to do, so just cache the .wasm file:

// This can be used to fetch + cache *any* file
async function load(url) {
    const cache = await caches.open("offline");

    try {
        const response = await fetch(url);

        if (response.ok) {
            // Cache a clone: put() consumes the body, and the caller still reads it
            await cache.put(url, response.clone());
            return response;

        } else {
            throw new Error("Invalid status code: " + response.status);
        }

    } catch (e) {
        const response = await cache.match(url);

        if (response) {
            return response;

        } else {
            throw e;
        }
    }
}

async function loadWasm(url) {
    const response = await load(url);
    const bytes = await response.arrayBuffer();
    await wasm_bindgen(bytes);
}

The above code only needs to be written once, then stuffed into some library or utility .js file.

Now you can use it like this:

const wasm = loadWasm("./path/to/wasm_bg.wasm");

If it would be the case that I am confused about how this works, that would be an even more clear reason that I should not trust myself to do this in Javascript. Don't you agree with that?

The behavior of service workers is the same in both Rust and JS. The APIs are the same. The types are the same. web-sys is only a very thin layer on top of JS, so you will need an understanding of JS in order to do things in Rust.

I agree that Rust is a much nicer language than JavaScript, but the more that I look at this the more it seems that a small amount of JS glue is necessary.

If you're not comfortable with writing JS code, then you might want to look into using a JS library which is designed to make service workers easier. That way it's somebody else writing the code (and hopefully doing it correctly).

Synchronous WebAssembly works today. It would be really weird if it gets banned in the future (for any non-security-related reason).

Synchronous XMLHttpRequest was available for decades, but years ago it was banned in the main thread, and they are now considering banning it in service workers.

And synchronous WebAssembly is already banned on the main thread, and is heavily discouraged. They have a long history of banning synchronous things for performance reasons (not security).

And given how they banned top-level await even though they know that makes it impossible to use WebAssembly as a service worker, I would be more surprised if they don't ban synchronous WebAssembly.

@moshevds
Author

On a technical level, I really object to doing part of my caching logic in Javascript. I have outlined a system to do our caching and prefetching, and I want to build a prototype in Rust. Imho, it would be really bad to have a second caching layer that works differently and is written in Javascript.

I think we need to reconsider the core question:
Is there a trivial Javascript workaround for the use-case that wasm-bindgen doesn't currently support?

I totally agree that all of this can be done in Javascript. In fact, everything that can be done with wasm-bindgen, can be done with Javascript. But when should we consider a workaround to be trivial?

My answer:
I really don't mind some Javascript code if it works correctly and keeps out of my way. That would be trivial. IMHO:

  • Using any library other than wasm-bindgen automatically counts as non-trivial.
  • Doing anything that someone at my company would reasonably expect automated testing for counts as non-trivial.
  • Something that restricts the options that I have for writing the business logic counts as non-trivial.

This whole discussion comes across as if initially I only had a small problem because I couldn't do the API call that I wanted using Rust, and a solution was easily found. But now wasm-bindgen is saying "Just use Javascript for your project, it's easy enough". I really just want to discuss whatever you think should be done in my case. I am assuming that "don't use wasm-bindgen" isn't the final answer if we carefully consider all the options. Right?

PS: You are correct that freezing the browser window is not a security issue. The 4K limit is, I assume, also to prevent freezing the browser window. Workers don't freeze the browser window and do not have that limit. I would consider it really weird if they remove a feature without a similarly good reason. But indeed, not all good reasons to remove a feature are security related 🙂.

@Pauan
Contributor

Pauan commented Jan 26, 2020

But now wasm-bindgen is saying "Just use Javascript for your project, it's easy enough"

That is not the case. All of the JS code I have discussed has been solely about 1) loading the .wasm, and 2) adding the event listeners. The vast majority (99+%) of your code can still remain in Rust.

I agree that unfortunately the "loading the .wasm" part has gotten more complicated, due to your need for caching.

So I have a question for you: assuming a hypothetical ideal situation where you can load the .wasm file without needing JS glue code... how would you handle the .wasm caching without using JS?

Also, how would you expect wasm-bindgen to help? We're certainly not going to be adding in ad-hoc caching into wasm-bindgen, that's clearly the responsibility of the end user, since different users will have different caching needs and use cases.

I really just want to discuss whatever you think should be done in my case. I am assuming that "don't use wasm-bindgen" isn't the final answer if we carefully consider all the options. Right?

I've already pretty clearly described the workaround for this problem. It's a few dozen lines of JS code which basically never needs to change, you just stash it into a file and call it a day. I don't think you're going to get much better than that, because of the strict limitations of service workers (which we can't do anything about).

The only other option is for wasm-bindgen to expose the imports object (and maybe a couple other things), and then you write some code to load everything synchronously (which will probably stop working at some point in the future). And you'll still have to add in caching, because the synchronous fetch can still fail. And the caching is inherently asynchronous (it uses Promises). So in the end it'll be just as complicated or more complicated than the workaround.

To be clear, this is not wasm-bindgen's fault, it is caused by restrictions in service workers. And they have made it crystal clear that they have no desire to support .wasm in service workers. Therefore, hacky workarounds are needed. Hacky workarounds are just a daily unavoidable necessity in the JS ecosystem. There's not much we can do to fix that.

Workers don't freeze the browser window and do not have that limit. I would consider it really weird if they remove a feature without a similarly good reason.

They have already removed top-level await (which doesn't freeze the browser) in workers simply because some people might use it for "slow operations", and in the thread you linked they clearly stated a desire to remove synchronous XMLHttpRequest.

Whether you think that's reasonable or not, you should tell them, not me. I don't make those decisions.

@moshevds
Author

I don't think this discussion is bringing any of us closer to a solution. It might still be worthwhile to discuss more technical details about my project, but it seems a bit strange that we are almost discussing what kind of code I would accept into our codebase, instead of what kind of code you would accept into wasm-bindgen.
(I have said before that I dislike the JS event listeners because of the interface surface area that it adds, and I dislike this new caching code even more because it adds tight coupling between Javascript and Rust via the CacheStorage object. Let's discuss this separately if you are interested in knowing more about why I think this is not good.)

If "no trivial workaround possible" is the bar for considering possible solutions, then discussing what a trivial workaround entails seems more sensible to me.

Anyway:

assuming a hypothetical ideal situation where you can load the .wasm file without needing JS glue code... how would you handle the .wasm caching without using JS?

The browser handles it. The Service Worker spec calls this the script resource map. Basically, the browser will retain a copy of the service worker script and its importScripts, separate from its normal cache handling.

Also, how would you expect wasm-bindgen to help?

Any solution that doesn't impose WebAssembly.instantiate on the caller is fine.

We're certainly not going to be adding in ad-hoc caching into wasm-bindgen, that's clearly the responsibility of the end user, since different users will have different caching needs and use cases.

This is exactly one of my arguments for why this kind of solution isn't a trivial workaround at all 🙃.

And then you write some code to load everything synchronously (which will probably stop working at some point in the future).

This is a serious response: wasm-to-js transpiler or polyfill.

(The performance of this will probably still be better than hand-written JS. But performance is not my concern at all, and not the reason why I want to use Rust here.)

Also: I think such a change is exceedingly unlikely to happen anyway.

To be clear, this is not wasm-bindgen's fault, it is caused by restrictions in service workers.

I actually don't agree with you that those restrictions are ill-considered.

If by "it is not wasm-bindgen's fault" you mean that it is outside of what wasm-bindgen aims to solve because the browsers are doing something wrong here: I also don't think that is the case.
But I don't see how we can get agreement about that, if we only discuss whether I'll accept something into our codebase, or not.

@Pauan
Contributor

Pauan commented Jan 27, 2020

The browser handles it.

Okay, let me clarify: how would you handle it in the situation where wasm-bindgen synchronously loads the .wasm file? Since we can't fix the browsers, that's the best that wasm-bindgen could do right now.

This is a serious response: wasm-to-js transpiler or polyfill.

Of course you're free to use a wasm-to-js transpiler on your own project (we even have a tutorial for it). But a wasm-to-js transpiler won't work for new wasm-only features (which we plan to support), so we're unlikely to add that into wasm-bindgen, so you'll have to use an external tool for that.

And I'm not sure what you mean by "polyfill", what would it be polyfilling?

If by "it is not wasm-bindgen's fault" you mean that it is outside of what wasm-bindgen aims to solve because the browsers are doing something wrong here: I also don't think that is the case.

No, that is not what I mean. What I mean is that it is basically impossible for wasm-bindgen to solve this.

Let me recap the situation:

  1. Service workers must register the event listeners synchronously. There is no way around this requirement.

  2. The W3C has decided to ban top-level await, which means that async APIs cannot be used to register the event listeners.

    That also means that esm-integration in service workers will not work, since that relies on top-level await. So you cannot load .wasm files directly in service workers.

  3. Therefore, the only way to register the event listeners with .wasm is to use sync XHR + sync WebAssembly.

    But they have already stated a desire to remove sync XHR, and likely will remove sync WebAssembly as well in the future.

    To be clear, even if they don't remove sync WebAssembly, they still plan to remove sync XHR, and if either of those two things are banned then you cannot load .wasm synchronously.

    Therefore, synchronous loading doesn't work, not because of wasm-bindgen's choice, but because of the browsers choosing to ban both top-level await and sync XHR. We have no control over that.

  4. Therefore the only possible synchronous solution is to encode the .wasm file as base64 and embed it directly inside the .js file.

    However, that still relies on sync WebAssembly working, which I doubt it will in the future. We're not really comfortable with creating a new flag/target to support a new compilation format, especially when there's a good chance that compilation format will stop working in the future.

  5. Therefore, the most future-proof option (both for you and for wasm-bindgen) is to use some small JS glue code which creates the event listeners, asynchronously loads the .wasm file, and uses waitUntil in the event listeners.

    This is guaranteed to work under all conditions, both now and in the future.
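The glue described in point 5 could look roughly like the sketch below. It is not code from wasm-bindgen: `registerListeners` is a hypothetical helper, `scope` is an assumed parameter standing in for the service worker global (`self`), `loadWasm` is assumed to return a Promise of the module's exports, and `handle_fetch` is a made-up name for an exported Rust function.

```javascript
// Sketch: register listeners synchronously, load the .wasm lazily.
// `scope` stands in for the service worker global (`self`);
// `loadWasm` is any function returning a Promise of the module's exports.
function registerListeners(scope, loadWasm) {
    let ready = null;

    // Start loading at most once and share the same Promise afterwards.
    function ensureLoaded() {
        if (ready === null) {
            ready = loadWasm();
            // If loading fails, forget the Promise so a later event can retry.
            ready.catch(() => { ready = null; });
        }
        return ready;
    }

    // Both addEventListener calls run during the script's initial evaluation.
    scope.addEventListener("install", (event) => {
        event.waitUntil(ensureLoaded());
    });

    scope.addEventListener("fetch", (event) => {
        // `handle_fetch` is a hypothetical export implemented in Rust.
        event.respondWith(
            ensureLoaded().then((exports) => exports.handle_fetch(event.request))
        );
    });
}
```

The listeners exist before any asynchronous work starts, and every event waits on the same loading Promise, so the actual request handling can stay in Rust.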

So, if you want support for this in wasm-bindgen, you should talk to the W3C (probably in the thread you linked), explaining your use case and why you want to use .wasm in service workers.

You can also ask them if they plan to support sync WebAssembly in service workers. If they are willing to support sync WebAssembly in service workers, then we can discuss changing wasm-bindgen to base64-encode the .wasm file.

Note that none of this is within wasm-bindgen's control, we are at the mercy of the browsers. If they choose to ban sync WebAssembly then there is quite literally nothing we can do, even if our lives depended on it.

You seem to think that this is a matter of wasm-bindgen being stubborn, but that is not the case. It is the W3C's decision whether to ban sync WebAssembly or not, so you need to argue with them, not me.

@moshevds
Author

You seem to think that this is a matter of wasm-bindgen being stubborn, but that is not the case.

Thank you. I don't think you are stubborn necessarily, but I do think that you are misunderstanding some of what I am saying. Because you keep assuming requirements for my use-case that I don't see as requirements at all. I hope we can still achieve a shared understanding here.

It is the W3C's decision whether to ban sync WebAssembly or not, so you need to argue with them, not me.

Sure. But I don't read a decision to ban WebAssembly in those discussions. I would agree that more communication between people working on WebAssembly specs and Service Worker specs might be helpful, but that is about it.
I can certainly understand hesitation from wasm-bindgen to implement something before there is a clearly indicated direction from the W3C working groups. If that turns out to be the hold-up, that might indeed incentivize me to ask them for such a thing.

The browser handles it.

Okay, let me clarify: how would you handle it in the situation where wasm-bindgen synchronously loads the .wasm file? Since we can't fix the browsers, that's the best that wasm-bindgen could do right now.

This might be related, but it is not at all what I am asking.

Right now, the async .wasm loading that wasm-bindgen implements is also broken in Service Workers. Because the fetch in init() will never work in offline mode.

This means that manually providing a WebAssembly.Module is required for all Service Workers. (Even when doing the event listeners in JS.) I am fine with that, and I don't think it needs to be fixed for my project.

If you think this is something that needs to be fixed in the async case, then I would propose that whatever solution you come up with for async should work analogously for sync. But, just like you, I see no clear path for doing so.

(One possible option could be to allow the caller to specify a fetch-compatible function that takes into account whatever caching the user wants to do. The analogous sync solution to that would be to allow it to return any thenable, not just promises.)
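To make the thenable idea concrete, here is a minimal sketch under stated assumptions. Nothing in it is wasm-bindgen API: `syncThenable`, `init`, and `loadBytes` are hypothetical names, and `ADD_WASM` is a tiny hand-assembled module used only so the example is self-contained.

```javascript
// Bytes of a tiny hand-assembled module exporting `add(i32, i32) -> i32`.
const ADD_WASM = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic, version
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type section
    0x03, 0x02, 0x01, 0x00,                                     // function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add"
    0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code
]);

// Hypothetical helper: a thenable whose then() runs its callback immediately
// instead of deferring it to a later microtask.
function syncThenable(value) {
    return {
        then(onFulfilled) {
            const next = onFulfilled(value);
            // Pass through thenables returned by the callback; wrap plain values.
            return next && typeof next.then === "function" ? next : syncThenable(next);
        },
    };
}

// Hypothetical init: `loadBytes` is the caller-supplied, fetch-like function.
// Only .then() is used (never `await`), so a synchronous thenable stays synchronous.
function init(loadBytes, imports = {}) {
    return loadBytes().then((bytes) => {
        const module = new WebAssembly.Module(bytes);
        return new WebAssembly.Instance(module, imports).exports;
    });
}

let wasm = null;
init(() => syncThenable(ADD_WASM)).then((exports) => { wasm = exports; });
// With a synchronous thenable, `wasm` is already populated at this point.
```

Passing `() => Promise.resolve(ADD_WASM)` instead would go through the same `init` but complete asynchronously, which is the symmetry the suggestion relies on.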

And I'm not sure what you mean by "polyfill", what would it be polyfilling?

The WebAssembly polyfill that everybody says is a good idea, but doesn't actually exist today.

Let me recap the situation:

  1. Service workers must register the event listeners synchronously. There is no way around this requirement.

We have agreement here.

  2. The W3C has decided to ban top-level await, which means that async APIs cannot be used to register the event listeners.

Even if it were allowed, registering probably needs to happen before the first await. IMHO, this is a sensible requirement for Service Workers.

That also means that esm-integration in service workers will not work, since that relies on top-level await. So you cannot load .wasm files directly in service workers.

This is indeed unfortunate, and something that the working groups should perhaps work on.

  3. Therefore, the only way to register the event listeners with .wasm is to use sync XHR + sync WebAssembly.

Or have the WebAssembly.Module magically appear like Cloudflare does (or less magically using an embedded blob). If you mean an actual remote file specifically: That will never work in offline mode.

To be clear, even if they don't remove sync WebAssembly, they still plan to remove sync XHR, and if either of those two things are banned then you cannot load .wasm synchronously.

I can't even do that now because it wouldn't work in offline mode.

Therefore, synchronous loading doesn't work, not because of wasm-bindgen's choice, but because of the browsers choosing to ban both top-level await and sync XHR. We have no control over that.

And again, I am not asking for synchronous loading. That won't work just like asynchronous loading won't work either. What I am asking for is synchronous instantiation.
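For readers unfamiliar with the distinction, synchronous instantiation means going from already-available bytes to a usable instance without any Promise. The sketch below shows only the underlying WebAssembly API; `instantiateSync` is a hypothetical helper, and `ADD_WASM` is a tiny hand-assembled module standing in for real wasm-bindgen output (which would also need the generated imports object that this issue asks to have exposed).

```javascript
// Bytes of a tiny hand-assembled module exporting `add(i32, i32) -> i32`.
// In practice the bytes would come from wherever the host makes them available.
const ADD_WASM = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic, version
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type section
    0x03, 0x02, 0x01, 0x00,                                     // function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add"
    0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code
]);

// Hypothetical helper: compile and instantiate without any Promise.
function instantiateSync(bytes, imports = {}) {
    const module = new WebAssembly.Module(bytes);             // synchronous compile
    return new WebAssembly.Instance(module, imports).exports; // synchronous instantiate
}

const wasm = instantiateSync(ADD_WASM);
// `wasm.add` is callable here, still within the script's initial evaluation.
```

Because nothing is awaited, an exported function could in principle call `addEventListener` before the initial evaluation ends, which is the window the original request is about.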

  4. Therefore the only possible synchronous solution is to encode the .wasm file as base64 and embed it directly inside the .js file.

There are other options, but the same is true for any asynchronous solution. There are a few more ways to store data locally with async APIs, but the problem remains the same.

However, that still relies on sync WebAssembly working, which I doubt it will in the future. We're not really comfortable with creating a new flag/target to support a new compilation format, especially when there's a good chance that compilation format will stop working in the future.

I have not seen an indication that this is the case. I actually take the 4K limit and the fact that it is only applied specifically to the main thread, as an indication that support for synchronous instantiation is unlikely to go away from Workers.
This makes sense, because workers are always background work anyway. And asynchronous makes much less sense for background work compared to interactive work.

Anyway: If this is a serious concern that you have, what kind of assurances would you look for to consider this worthwhile to support? From the working groups? From browser creators? I am willing to go and ask for some kind of assurance if there is a specific roadblock here.

  5. Therefore, the most future-proof option (both for you and for wasm-bindgen) is to use some small JS glue code which creates the event listeners, asynchronously loads the .wasm file, and uses waitUntil in the event listeners.
    This is guaranteed to work under all conditions, both now and in the future.

Offline mode is quite an important aspect of my requirements, and that won't work the way you propose here. This is unrelated to the sync/async discussion.

So, if you want support for this in wasm-bindgen, you should talk to the W3C (probably in the thread you linked), explaining your use case and why you want to use .wasm in service workers.

I'm not sure if that issue is the correct place, but contacting some of those people seems like a sensible option.

I would like to first see if you and I can get on the same page about this. It seems that should be possible 🙂.

@Pauan
Contributor

Pauan commented Jan 27, 2020

Because you keep assuming requirements for my use-case that I don't see as requirements at all. I hope we can still achieve a shared understanding here.

The only assumption that I made in my previous post is that you want to be able to define the event listeners in Rust. You have stated that requirement multiple times. Everything else is just hard facts based on that single requirement.

If that is not a requirement, then I have already provided excellent solutions, here and here.

I would agree that more communication between people working on WebAssembly specs and Service Worker specs might be helpful, but that is about it.

This is a web issue, the W3C spec is a super-set of other specs, so they can make these decisions without any changes to the WebAssembly spec, no communication is needed.

The WebAssembly polyfill that everybody says is a good idea, but doesn't actually exist today.

Ah, I see, well that's just wasm2js (which exists).

Because the fetch in init() will never work in offline mode.
[...]
Offline mode is quite an important aspect of my requirements, and that won't work the way you propose here. This is unrelated to the sync/async discussion.

No, offline mode works just fine with the caching code I posted before. The caching happens automatically and transparently.

Or have the WebAssembly.Module magically appear like Cloudflare does (or less magically using an embedded blob).
[...]
And again, I am not asking for synchronous loading. That won't work just like asynchronous loading won't work either. What I am asking for is synchronous instantiation.

Nothing is magic. I covered that option in point 4 (embedding as base64), but that still requires sync WebAssembly.
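The base64 option from point 4 could be sketched as follows. This is not something wasm-bindgen generates: `WASM_BASE64`, `decodeBase64`, and `instantiateEmbedded` are hypothetical names, and the embedded string is a tiny hand-assembled module exporting `add(i32, i32) -> i32`, standing in for a real .wasm file that a build step would have encoded.

```javascript
// A tiny module exporting `add`, embedded as base64 (a stand-in for the real
// .wasm file that a build step would encode into the .js file).
const WASM_BASE64 = "AGFzbQEAAAABBwFgAn9/AX8DAgEABwcBA2FkZAAACgkBBwAgACABags=";

// Decode base64 into bytes synchronously; atob() exists in workers and modern Node.
function decodeBase64(text) {
    const binary = atob(text);
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return bytes;
}

// No network request is involved, but this still depends on the synchronous
// WebAssembly.Module and WebAssembly.Instance constructors being allowed.
function instantiateEmbedded(base64, imports = {}) {
    const module = new WebAssembly.Module(decodeBase64(base64));
    return new WebAssembly.Instance(module, imports);
}

const { add } = instantiateEmbedded(WASM_BASE64).exports;
```

The trade-off is a larger script (base64 adds roughly a third to the file size) in exchange for having the module available offline as part of the script itself.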

There are other options, but the same is true for any asynchronous solution.

No, that is the only synchronous solution. No other synchronous solution works (because sync XHR is planned to be banned, and wasm2js won't work for new wasm features).

And since the event listeners must be defined synchronously, therefore it is the only solution that fulfills your requirement of defining the event listeners in Rust.

But I don't read a decision to ban WebAssembly in those discussions.

It's heavily implied from their decision to ban sync XHR and top-level await, but I agree we don't know for sure, which is what's making me uneasy: we would want some sort of assurance that sync WebAssembly will work in the future.

I have not seen an indication that this is the case. I actually take the 4K limit and the fact that it is only applied specifically to the main thread, as an indication that support for synchronous instantiation is unlikely to go away from Workers.

But they restricted sync XHR in service workers, even though service workers run off of the main thread. It's clear that they are treating service workers differently from regular workers, so you can't assume that "because it works in Workers it'll work in Service Workers".

If what you are saying was true, then they would have no reason to ban top-level await, yet they did ban top-level await! So the fact that they banned top-level await and sync XHR means that it's very likely that they will ban sync WebAssembly as well. One of their decisions heavily implies the other decision. I cannot say this any more clearly.

In any case, this is all speculation, the best course of action is to ask them.

Anyway: If this is a serious concern that you have, what kind of assurances would you look for to consider this worthwhile to support? From the working groups? From browser creators? I am willing to go and ask for some kind of assurance if there is a specific roadblock here.

What matters is the W3C, since they create the spec that the browsers follow. Also, all the browsers have representatives in the W3C, so asking the W3C is basically the same as asking the browsers themselves.

@moshevds
Author

moshevds commented Jan 27, 2020

The only assumption that I made in my previous post is that you want to be able to define the event listeners in Rust. You have stated that requirement multiple times.

This is correct.

Everything else is just hard facts based on that single requirement.

I'm not sure what you are referring to. If I understand you correctly, you had 2 core points:

  • That your code that doesn't meet my 1 requirement, is still good enough for my use case.
  • That WebAssembly support will disappear from Service Workers.

The first is more of an opinion that I happen to disagree with. The second is a prediction that I happen to think is unlikely. (You also now indicate that this one is speculation on both our ends, and I fully agree with that notion.)

Both are fine, we don't have to agree about those points if we can find a way forward towards discussing what solutions there are that do meet "my requirement" (or my use-case more generally).
I think we have now found one: That we can continue to discuss possible wasm-bindgen changes if we get some assurance that WebAssembly.Instance (and WebAssembly.Module) will not go away.

I'll ask that somewhere at the W3C, and report back.

@Pauan
Contributor

Pauan commented Jan 27, 2020

That your code that doesn't meet my 1 requirement, is still good enough for my use case.

My point has been that you might not have a choice, due to browser restrictions, so your requirements might have to change. We can't do anything about that since we don't make decisions for the W3C.

You also now indicate that this one is speculation on both our ends, and I fully agree with that notion.

I have never said or implied that it was a fact, I always made it clear that it was my prediction. But it's a prediction that is based on my deep understanding of both the browsers and how the W3C operates, it isn't just a blind guess.

I'll ask that somewhere at the W3C, and report back.

I went ahead and asked already: w3c/ServiceWorker#1499

As I predicted, they are currently leaning toward blocking it, though we'll see how things pan out.

@moshevds
Author

Great! I think we are on the same page about those particular points.

I have added my questions at the issue you opened. I think we can continue here when/if we get a clear picture about how the Service Worker people think this should be implemented.

Thanks.

@YaaMe

YaaMe commented Aug 11, 2021

I've been following Cloudflare Workers + wasm_bindgen too. This is quite a long story with a "to be continued" tag. I got a lot of information from the discussion, thanks.
By the way, the Cloudflare Workers Rust template was updated recently, and the code below now works (top-level await is still banned):

const { hello } = wasm_bindgen;
const instance = wasm_bindgen(hello);
instance.then(() => hello())

So synchronous instantiation is no longer a requirement in Cloudflare Workers (one point to Pauan :)
