Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New "setup" lifecycle for service worker #1576

Closed
hanguokai opened this issue Apr 6, 2021 · 36 comments
Closed

New "setup" lifecycle for service worker #1576

hanguokai opened this issue Apr 6, 2021 · 36 comments

Comments

@hanguokai
Copy link
Member

hanguokai commented Apr 6, 2021

Summary: Current, service worker only has install and active lifecycle. I suggest adding a setup lifecycle event, like setup() method in unit testing. When SW wake up by events, it first waits for the setup method complete, then start to process events. setup method only execute once when SW wake up.

Maybe you know service worker now used to process Chrome extension's background event, e.g. browser tabs updates.

var initWaitList = [];
var start = false;
var cache;
function init() {
  return new Promise(async function(resolve, reject) {
    if(cache) {
      resolve();
    } else {
      initWaitList.push(resolve);
      if(!start) {
        start = true;
        cache = await readDataFromStorage();
        for(let r of initWaitList) {
          r();
        }
        initWaitList = [];
      }
    }
  });
}

chrome.tabs.onUpdated.addListener(async function(parameters) {
  await init();
  // then use cache data process events
});

What I understand is that service worker usually keep alive for a while(e.g. 10 seconds) when no task, so use global variable as cache can improve performance, because maybe there will be more events triggered. Cache is the same life cycle with service worker. In above example code, it is a data that async read from storage(IDB or chrome.storage). In my environment(chrome extension), service worker will wake up by multiple events in very short time. The above code(init method) is used to prevent from reading data from storage multiple times, reading multiple times is not necessary.

If service worker support a setup stage before process events, developers don't need to consider concurrent situation before it start to process events.

@wanderview
Copy link
Member

I'm not sure I understand. Can't your snippet above be used today without the browser supporting an additional "setup" phase on worker launch? The top level evaluation of the script is in many ways this "setup" phase. Its just synchronous and not async today.

@hanguokai
Copy link
Member Author

hanguokai commented Apr 6, 2021

Above snippet is current implementation without "setup" phase.

If service worker support "setup" phase, the snippet will look like below, very simple:

var cache;

self.addEventListener('setup', async event => {
  cache = await readDataFromStorage();
});

chrome.tabs.onUpdated.addListener(function(parameters) {
  // use cache data process events
});

@wanderview
Copy link
Member

That is more concise, but doesn't seem that much better than:

var cache;
var init_promise = async function() {
  cache = await readDataFromStorage();
}();

self.addEventListener('fetch', evt =>{
  evt.respondWith(async function() {
    await init_promise;
    // use cache data
  }());
});

@wanderview
Copy link
Member

To be clear, there is an expensive cost to adding more phases to service worker startup in terms of both c++ development time, spec writing, and also runtime performance. For example, this phase could delay processing events, etc. I think we would need a fairly large benefit or problem solved to take on something like this. I hope that makes sense.

@asutherland
Copy link

asutherland commented Apr 6, 2021

It's also worth calling out that WebExtensions can layer whatever additional semantics and state machines it wants onto its own API surfaces. There's nothing stopping WebExtensions from, specified in its own standard, dispatching a functional event "webextSetup" which must complete before WebExtensions starts dispatching events at the tabs API.

I think the major issue is the "fetch" event, and at least in terms of Firefox's planned implementation of WebExt Service Workers, that won't be going through the normal fetch specification and "handle fetch" series of hooks. Instead WebExtensions will directly dispatch a fetch event at the ServiceWorker. The WebExtensions network layer ("channel" in Firefox/Necko) has complete control over when it dispatches this event and can internally wait for "webextSetup" or whatever it wants to happen before handing them off to the ServiceWorkers machinery.

Obviously, all of this could change with any standardization effort for WebExtensions, but @wanderview's complexity concerns would still hold in that case. (And in general I do expect there to be some semantic impedance mismatches resulting from WebExtensions' creation and evolution to use ServiceWorkers outside of any public standards efforts.)

@hanguokai
Copy link
Member Author

My initial problem is that if multiple events come at the same time, await readDataFromStorage() will be called multiple times if I don't use my initial snippet in this post.

Your solution, in that condition, the init_promise will be awaited multiple times. await a resolved promise is Ok, although not common.

Use your method, my code will look like:

var cache;
var init_promise = async function() {
  cache = await readDataFromStorage();
}();

chrome.tabs.onUpdated.addListener(function(parameters) {
  await init_promise();
  // use cache data process events
});

Your solution is more simple than my initial solution. But if service worker support 'setup' phase, no need to use an init_promise, and everything's ready, that is more easy to understand.

I think 'setup phase' like 'top level await' in service worker, they do the same thing. For example, the code can like below

var cache = await readDataFromStorage();
// then listen events

So 'setup phase' vs 'top level await', which is better?

@hanguokai
Copy link
Member Author

@asutherland For Chrome extensions, there are many kinds of events, not just one fetch event. And you usually need to read data from storage(e.g. user settings) to process the event.

@asutherland
Copy link

@hanguokai Understood and agreed.

My general point here is that there's no need for the ServiceWorkers spec to add explicit waits to its event dispatching semantics along the lines of Step 19 in Handle Fetch that says If activeWorker’s state is "activating", wait for activeWorker’s state to become "activated". That is something that can entirely be dealt with by the WebExtensions consumption of ServiceWorkers in WebExtensions space if that's something WebExtensions want.

I understand the case you are making for use case commonality for (web) Service Workers, but I share @wanderview's complexity concerns and a general performance concern, as the general feedback we have received is that ServiceWorkers need to have the lowest latency possible, and introducing an intentional pipeline stall that precludes parallelism seems counter to that (and the UX that users want, as opposed to the DX that developers want). WebExtensions operate in a somewhat different problem space where there's inherent user intent via the act of installation that justifies speculatively spinning up ServiceWorkers in response to potential actions like moving towards/hovering on a button and/or starting them at browser startup and keeping them alive for extended periods of time that would not make sense for websites.

I mentioned "fetch" because it was my impression that WebExtensions re-uses "fetch" rather than having introduced a distinct event type with its own semantics. I may be wrong about this.

@hanguokai
Copy link
Member Author

No matter if service worker provides 'setup phase' or 'top level await', thank @wanderview for your simpler solution.

Here I post a more complete code snippet, which include updating the cache. Maybe other extension developers will refer to this solution.

// I use uppercase to identify global variables
var Cache;
function init() {
  return new Promise(function(resolve, reject) {
    chrome.storage.sync.get(null, function(data) {
      // setup cache, for example:
      Cache = data;
      resolve();
    });
  });
}
var InitPromise = init();

// listen browser events, for example:
chrome.tabs.onUpdated.addListener(async function(parameters) {
  await InitPromise;
  // now you can use cache to process the event
});

// if data changed, e.g. user changed settings, you should update cache.
// listen messages or listen storage change event. for example:
chrome.runtime.onMessage.addListener(function(request, sender, sendResponse) {
  if(request.type == "update") {
    InitPromise = init();
  }
});

@jakearchibald
Copy link
Contributor

@hanguokai

await a resolved promise is Ok, although not common.

I think it's pretty common pattern, and it's one of the reasons promises don't expose their state. It's encouraged to "just await the promise", rather than if (resolved) doThing else awaitThingThenDoThing. Eg, the example here https://www.npmjs.com/package/idb#examples, where dbPromise will be already-settled for most of the calls.

@jakearchibald
Copy link
Contributor

@hanguokai I think this pattern is more promise-like:

const dataPromise = new Promise((resolve, reject) => {
  chrome.storage.sync.get(null, (data) => resolve(data));
});

// listen browser events, for example:
chrome.tabs.onUpdated.addListener(async (parameters) => {
  const data = await dataPromise;
});

This also means a given function is working with the same data throughout its execution. Whereas in your model, Cache could change half-way through a function, if init() is called by some other function.

@hanguokai
Copy link
Member Author

hanguokai commented Apr 7, 2021

const dataPromise = new Promise(...) works only if you don't need to update data. But in my case, if user change data, I need to update cache. So InitPromise would be re-assigned.

@jakearchibald
Copy link
Contributor

Yes, if it needs changing:

let dataPromise;

function updateDataPromise() {
  dataPromise = new Promise((resolve, reject) => {
    chrome.storage.sync.get(null, (data) => resolve(data));
  });
}

updateDataPromise();

// listen browser events, for example:
chrome.tabs.onUpdated.addListener(async (parameters) => {
  const data = await dataPromise;
});

@hanguokai
Copy link
Member Author

These codes are equivalent, just the style is different. If you wait for data, then resolve(data). If you wait for the preparation to be completed, then resolve().

@hanguokai
Copy link
Member Author

await a resolved promise is pretty common pattern

Perhaps you're right. This is simpler than maintaining a waiting queue (like my initial solution).

@hanguokai
Copy link
Member Author

hanguokai commented Apr 7, 2021

These codes are equivalent, just the style is different. If you wait for data, then resolve(data). If you wait for the preparation to be completed, then resolve().

Let me explain more. In my real code, I parse the whole data, and set two global variables. So I think of the init method as a setup phrase, not a single data. If it is a single data, your code is more reasonable.

@jakearchibald
Copy link
Contributor

In your example the init method has side effects that might land half-way through a function, which could cause unexpected bugs. If that's intended, I'd refactor a bit to make it obvious that it's the expected behaviour.

@hanguokai
Copy link
Member Author

Right, data may be inconsistent when updating. It depends on how the data is used. If use cache data synchronously at once after await the promise, no more async process before using it, it won't have a problem. Use old data before update, use updated data after update, this is acceptable as cache. If that code as a general example for others to copy, that may not be appropriate.

@jakearchibald
Copy link
Contributor

In that case I'd have addToCache and getFromCache functions that do the promise juggling so it doesn't need to be done in each event.

@tophf
Copy link

tophf commented Apr 7, 2021

When ES modules will be enabled for service workers by default, which can happen pretty soon, it'll be possible to use top-level await to read the storage at the beginning of the script in a "blocking" fashion so it completes before the subsequent code in the script runs.

@jakearchibald
Copy link
Contributor

We don't allow top-level await in a service worker scripts because that's usually an anti-pattern.

@tophf
Copy link

tophf commented Apr 7, 2021

We don't allow top-level await in a service worker scripts because that's usually an anti-pattern.

Is it spec-compliant? What will happen when a service worker module statically imports a module that uses a top-level await inside?

@jakearchibald
Copy link
Contributor

It will throw. Top level await in dynamically imported scripts is fine.

@jakearchibald
Copy link
Contributor

See #1407

@tophf
Copy link

tophf commented Apr 7, 2021

Thanks. Looks like it's still in the process of standardizing. I hope it'll be allowed in an extension service worker because those concerns don't seem to apply to a ManifestV3 browser extension - it doesn't have any blocking events like fetch, moreover an extension is a fully local and privileged thing, explicitly installed by the user to extend the browser not like a hidden service worker installed in drive by mode by a random site.

@jakearchibald
Copy link
Contributor

Is there a spec for extension service worker?

@tophf
Copy link

tophf commented Apr 7, 2021

I've never seen it published but maybe it exists privately, try asking someone from chromium. All I know is that it's just a standard service worker (without lifecycle events) and custom code for extension-specific stuff.

@tomayac
Copy link
Contributor

tomayac commented Apr 7, 2021

Is there a spec for extension service worker?

Extension service workers are supposed to be just "regular" service workers according to the docs.

@jakearchibald
Copy link
Contributor

But they don't have events that are performance sensitive?

@tophf
Copy link

tophf commented Apr 7, 2021

ManifestV3 has removed all synchronous (blocking) events. All events are asynchronous now so there's no difference (performance-wise) where the time will be spent to "set up" the state. Although I don't know if it'll be technically possible for extension API listeners to support the top-level await as currently they must be re-registered in the initial event loop task... Well, we'll see soon.

@jakearchibald
Copy link
Contributor

@tophf huh, so there's no way to intercept and modify things like requests in ManifestV3?

@tophf
Copy link

tophf commented Apr 7, 2021

Only via declarativeNetRequest API. Processing is performed inside the browser so the capabilities are quite limited.

@jakearchibald
Copy link
Contributor

Seems like the extension service worker is different enough that it could have its own rules, if it wanted. Or, if it wanted to stick with service worker rules, this format works in both:

const setup = Promise.all([
  asyncThing1(),
  asyncThing2(),
]);

addEventListener('fetch', (event) => {
  event.respondWith((async () => {
    const [thing1, thing2] = await setup;
    // …
  })());
});

The above pattern has advantages: Particular events wouldn't need to wait for setup if they don't need it, or setup requirements could be more granular, where some events don't need to await everything in the setup.

@hanguokai
Copy link
Member Author

Seems like the extension service worker is different enough

Although service worker was originally inspired by the extension event page(background page), the problems they face and the context they run differ greatly.

The conclusion is that use a 'setup promise' implement the 'setup phase' in service worker currently. It seems this trick would be used to resolve 'top level await' problem too.

@dotproto
Copy link
Member

Seems like the extension service worker is different enough that it could have its own rules, if it wanted.

In general the Chrome extensions platform aims to be 'the web plus additional capabilities.' I don't think we'd want to do something like change top level await semantics in extension service workers as that could introduce a surprising set incompatibilities between these environments.

While I haven't personally tested it, extension service workers should be able to intercept fetch events for their own origin using standard service worker fetch interception just as you would on the open web.

@hanguokai
Copy link
Member Author

For normal website service worker, it only focuses on three events: self.addEventListener( install | activate | fetch, handler). But they have no meaning for extension service worker. Extensions focus on (lots of) extension events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants