Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency: StorageArea callback order #292

Open
lapcat opened this issue Oct 13, 2022 · 23 comments
Open

Inconsistency: StorageArea callback order #292

lapcat opened this issue Oct 13, 2022 · 23 comments

Comments

@lapcat
Copy link

lapcat commented Oct 13, 2022

In Chromium and Firefox, StorageArea.get() and StorageArea.set() have FIFO behavior. In Safari, they do not.

The MDN documentation assumes FIFO:

// store the objects
browser.storage.local.set({kitten, monster})
  .then(setItem, onError);

browser.storage.local.get("kitten")
  .then(gotKitten, onError);
browser.storage.local.get("monster")
  .then(gotMonster, onError);

The sample code on MDN works as expected in Firefox:

OK content.js:2:11
Moggy has 2 eyes content.js:6:11
Kraken has 10 eyes content.js:10:11

But not in Safari:

OK (content.js, line 2)
Unhandled Promise Rejection: TypeError: undefined is not an object (evaluating 'item.kitten.name') gotKitten (content.js:6)
Unhandled Promise Rejection: TypeError: undefined is not an object (evaluating 'item.monster.name') gotMonster (content.js:10)

(It's not clear from the console log, but gotKitten and gotMonster are actually called before setItem.)

The Chrome documentation also seems to assume FIFO:

chrome.storage.local.set({key: value}, function() {
  console.log('Value is set to ' + value);
});

chrome.storage.local.get(['key'], function(result) {
  console.log('Value currently is ' + result.key);
});

If you use for example const value = "testing"; then the sample code works in Chrome:

content.js:4 Value is set to testing
content.js:8 Value currently is testing

But not in Safari:

Value currently is undefined (content.js, line 8)
Value is set to testing (content.js, line 4)

The Safari team has said that this works as designed, and there's no guarantee of order, but Safari is inconsistent with Chrome and Firefox, and Safari is broken with the instructional sample code for the API on MDN. This situation can be confusing and unexpected to extension developers.

Whatever the community decides about this issue, the documentation and sample code ought to be updated to make the behavior clear, specifying whether it's undefined or defined FIFO.

@xeenon
Copy link
Collaborator

xeenon commented Oct 13, 2022

In Safari the storage is multi-threaded, database backed — truly async. Given the async nature of all of these calls, I don't think FIFO should be guaranteed.

This is a simple fix on the extension side by doing:

await browser.storage.local.set({kitten, monster});
let kitten = await browser.storage.local.get("kitten");
let monster = await browser.storage.local.get("monster");

Or:

browser.storage.local.set({kitten, monster}).then(() => {
    setItem();
    browser.storage.local.get("kitten").then(gotKitten, onError);
    browser.storage.local.get("monster").then(gotMonster, onError);
}, onError);

@xeenon xeenon added inconsistency Inconsistent behavior across browsers discussion Needs further discussion agenda Discuss in future meetings labels Oct 13, 2022
@lapcat
Copy link
Author

lapcat commented Oct 23, 2022

A few notes before the issue is discussed by the group.

I ported some working code from Chrome to Safari. I didn't discover the bug in my Safari extension for some time, because the code was more complex than the simple sample code given above, and the bug was more subtle. Consequently, the fix was somewhat different than the one suggested.

The issues as I see them:

  1. The MDN API documentation presumes a certain behavior, and the sample code from the documentation is in fact broken in Safari.
  2. There are hidden pitfalls in porting extension code from Chrome and Firefox to Safari, because Safari does not conform to the same behavior. The issue can of course be worked around, but only if extension developers are aware of the issue, which I don't think they are.
  3. Typically, extension developers are only interested in StorageArea.set() failures, so it's unclear whether calling StorageArea.get() in a callback is or ought to be standard practice, given that the value is already available to the extension at the initial call site. The concern is really when the set and the get occur in different functions and contexts.

@erosman
Copy link

erosman commented Oct 24, 2022

re: FIFO (first in, first out) of sequentially queued async operations of the same API (i.e. storage)

StorageArea.remove() & StorageArea.clear() are also noteworthy.

// normally there isn't any conflict & FIFO shouldn't matter in remove()
browser.storage.local.remove("kitten");
browser.storage.local.set({monster});

// although not logical but just in case, where FIFO does matter
let kitten = { name: 'Lucy' };
browser.storage.local.remove("kitten");
kitten = { name: 'Balla' };
browser.storage.local.set({kitten, monster});

// clearing storage and starting fresh, where FIFO does matter
browser.storage.local.clear();
browser.storage.local.set({kitten, monster});

@carlosjeurissen
Copy link
Contributor

carlosjeurissen commented Oct 25, 2022

The storage methods are asynchronous by design. So the Safari behaviour is expected and if the implementation of Chrome and Firefox changes, it could behave exactly like Safari without any note to developers. It comes down to this:

Say you have some code like this:

console.log('a');

browser.storage.local.set({
  something: true
}, () => {
  console.log('b');
});

console.log('c');

It would display this in the console:
"a", "c", "b".

This is something mdn could be more clear on when it comes to their example code by using the example from @xeenon.

The Safari behaviour also seems to result in faster responses for these API calls as it does not have to keep a queue in the background.

An easy fix would be to keep a local reference to the stored value like such:

let currentValue = "value";

function doSomethingWithStorage () {
  console.log(currentValue);
}

async function restoreValue () {
  const result = await browser.storage.local.get(['key']);
  currentValue = result.key;
}

function setNewValue (newValue) {
  currentValue = newValue;
  browser.storage.local.set({key: newValue});
}

function changeAndRefresh (newValue) {
  setNewValue(newValue);
  doSomethingWithStorage();
}

async function init () {
  await restoreValue();
  doSomethingWithStorage();
}

@lapcat
Copy link
Author

lapcat commented Oct 25, 2022

The storage methods are asynchronous by design. So the Safari behaviour is expected and if the implementation of Chrome and Firefox changes, it could behave exactly like Safari without any note to developers. It comes down to this:

Say you have some code like this:

console.log('a');

browser.storage.local.set({
  something: true
}, () => {
  console.log('b');
});

console.log('c');

It would display this in the console: "a", "c", "b".

Everyone understands that the storage methods are async, and that this sample code would output "a", "c", b". So it doesn't actually come down to this. The console.log('c'); command is not async or using the storage API, so the example here erases the crucial elements to the bug.

@carlosjeurissen
Copy link
Contributor

@lapcat let me clarify by adding the get request here.

console.log('a');

browser.storage.local.set({
  something: true
}, () => {
  console.log('b');
});

console.log('c');

browser.storage.local.get("something", () => {
  console.log('d');
});

console.log('e');

From what is mentioned in the post. Chrome and Firefox would output:
"a", "c", "e", "b", "d"

While Safari outputs:
"a", "c", "e", "d", "b"

The point I was trying to make is by design, the APIs are asynchronous so if "b" gets returned before or after "d" is undetermined and something the programmer can not rely on.

@lapcat
Copy link
Author

lapcat commented Oct 25, 2022

The point I was trying to make is by design, the APIs are asynchronous

As I said in my last comment, everyone knows this. There's no confusion about what an async callback is. We're all professionals here.

so if "b" gets returned before or after "d" is undetermined and something the programmer can not rely on.

This does not follow. There are async queues that are still FIFO. It's commonplace in computing, and what I'm trying to say is that it's an implicit assumption of the API that's actually already documented to an extent.

@hanguokai
Copy link
Member

From the perspective of database consistency, consistency means AFTER a successful write, update or delete of a Record, any read request immediately receives the latest value of the Record.

So what means browser.storage updated successfully? I think when the callback is called and there is no runtime.lastError (or the promise fulfilled, not reject), the storage is updated successfully. After this, it is guaranteed to read the updated data.

For Chrome's document, I think it is just an example for how to use get and set. The docs explicitly say it may fail, and set runtime.lastError. For MDN, the docs says the return value is a Promise that will be fulfilled if the operation succeeded. If the operation failed, the promise will be rejected with an error message.

@lapcat
Copy link
Author

lapcat commented Oct 25, 2022

From the perspective of database consistency, consistency means AFTER a successful write, update or delete of a Record, any read request immediately receives the latest value of the Record.

The set() and get() calls are both async (and could both fail), so "immediately" doesn't really apply here.

According to the principle that the developer can't depend in any way on the order of the callbacks, there's no guarantee that the value isn't stale even if get() is called inside a set() callback, because there's no guarantee that other set() calls and successful callbacks occur between the time that get() is called and its callback is called.

Safari's implementation is multi-threaded, but with multi-threaded code where no ordering is guaranteed, multiple competing threads have the ability to acquire a lock, and while the lock is held by a thread, no other thread can change the shared data. The extension storage API is not locking, though, which makes it difficult for the developer to reason about the consistency of the storage, and why FIFO makes sense.

@hanguokai
Copy link
Member

there's no guarantee that the value isn't stale even if get() is called inside a set() callback, because there's no guarantee that other set() calls and successful callbacks occur between the time that get() is called and its callback is called.

Maybe. At least, the callback means the set operation is complete. So if not in set's callback, there is no guaranteed the set operation is complete. If no other set operation, get() in set's callback is guaranteed to read the updated data. If this does not satisfy your needs, I think browser.storage is not suitable. Try localStorage or IndexedDB.

@bershanskiy
Copy link
Member

bershanskiy commented Oct 25, 2022

Try localStorage or IndexedDB.

I would caution against this, since localStorage/sessionStorage and IndexedDB have their own incompatibilities:

  • they are limited only to document contexts, so they are not available in MV3 background service workers. One might have to port MV2 extension to MV3 at some point (e.g., by summer of 2023 or earlier if they really care about CWS listing); then they will have to move data or use tricks like accessing data only via popup UIs or iframes within pages, which is inconvenient.
  • localStorage/sessionStorage and IndexedDB are not controlled by storage/unlimitedStorage permissions and are instead just regular Cookie/"cache" storage. This means they are unavailable if user blocks cookies in third-party contexts or even first-party contexts. In particular, DevTolls pages are considered third-party to the DevTools themselves. Also, localStorage/sessionStorage and IndexedDB may suddenly be cleared by the user manually or by browser if it attempts to reclaim some storage.

@bershanskiy
Copy link
Member

@lapcat If you need operation order coherence like you seem to describe, the best thing to do is to store your data in variables local to the current execution context (instant synchronous access) and then just write them into StorageArea asynchronously. In general, you might have a problem of very frequent storage writes (e.g., on sync storage). I have seen lots of people using simple debouncing to work around this. There is also this monstrosity which guarantees instant writing without data races: https://github.com/darkreader/darkreader/blob/main/src/utils/state-manager-impl.ts

@lapcat
Copy link
Author

lapcat commented Oct 25, 2022

@lapcat If you need operation order coherence like you seem to describe

Well, this issue is simply a report of web browser inconsistency, not a support request or a feature request.

I didn't even know that I "need" operation order coherence, because it just happens automatically in Chrome and Firefox. It wasn't until porting code to Safari that I eventually discovered a problem. I'm aware of Safari's unique behavior now, so I can deal with it in my own code, but I'm not sure how many other developers are aware. I don't think that advice is on topic here; the only questions are whether the browser implementations should be consistent, and whether the MDN documentation should change.

@hanguokai
Copy link
Member

  1. In context of "Try localStorage or IndexedDB", I said "if this does not satisfy your needs", i.e. under extreme conditions (get one updated data when there are multiple set at the same time). That condition definitely does not exist in most cases.

  2. IndexedDB is available in both window and worker context.

  3. I don't mention sessionStorage, that data is not persistent.

localStorage/sessionStorage and IndexedDB may suddenly be cleared by the user manually or by browser if it attempts to reclaim some storage.

Are you sure for this. Browsers usually don't clear extensions data by default.

  1. Why localStorage or IndexedDB are related to cookie in extension pages?

@erosman
Copy link

erosman commented Oct 25, 2022

From what I understand, this topic has not been about how to use async functions, or how to use storage API, or how to use alternative API for storage.

Topic is simply enquiring about FIFO (in the same API i.e. storage) as outlined in MDN examples (and in test).

  • Does FIFO exist in storage API or not?
  • Is its existence (or lack thereof) consistent across browsers?
  • Should documentation reflect its existence (or lack thereof)?

Simple FIFO Test:

  • Does 'John' get saved first or 'Lucy'?
  • What is the eventual value of 'name'?
    • If FIFO exits, it will be 'Lucy'
    • If FIFO doesn't exist, it can be either
browser.storage.local.set({name: 'John'});
browser.storage.local.set({name: 'Lucy'});

@hanguokai
Copy link
Member

FIFO implies that all operations(get/set/remove) are in a queue waiting for previous operations to complete. No document say that. So I support the second comment in this issue. The callback is used to guaranteed the operation is complete. I suggest improving the documentation to clarify this.

@erosman
Copy link

erosman commented Oct 25, 2022

FIFO implies that all operations(get/set/remove) are in a queue waiting for previous operations to complete.

That depends on whether there is a locking mechanism or not. I am only guessing but often databases employ locking mechanism to prevent data corruption when multiples processes attempt to write to the same data location. The lock is released when a process is completed therefore, they end up in a queue.

@carlosjeurissen
Copy link
Contributor

It seems logical any write operations (set, remove, clear) should be FIFO by using a queue. I checked in all browsers and this seems to be the case right now. It would be good to get a confirmation on this from the browser vendors.

The question then is, should read operations (get) be on the same queue. Having FIFO for get as well:
Pros: potential simpler extension code and less bugs as some devs may expect FIFO on get calls
Cons: slower responses as it needs to consider queues and wait on items on the queue

Personally I would be in favour of having the get operations not using a queue as it leads to higher performance in extensions.

@lapcat
Copy link
Author

lapcat commented Oct 26, 2022

The question then is, should read operations (get) be on the same queue.

I think the question is whether the web browsers should be consistent with each other and with the documentation.

Personally I would be in favour of having the get operations not using a queue as it leads to higher performance in extensions.

Has anyone actually measured a performance difference, and if so, how much is it? The call is async either way.

In any case, it seems unlikely that both Chrome and Firefox would change their implementations to match Safari. And maybe Safari won't change its implementation to match Chrome and Firefox either. But at the very least, the API documentation is currently misleading.

@hanguokai
Copy link
Member

Back to the topic, I suggest only clearing up this misunderstanding in the documentation so that developers don't assume any order of execution results. If order is important, use await statement or nested callback.

@carlosjeurissen
Copy link
Contributor

carlosjeurissen commented Oct 27, 2022

We discussed this during 2022-10-27 meeting. There was consensus on having no guarantee on the order of the calls except for write-calls.

So it is probably best to proceed with updating the documentation in mdn.

Can browser vendors confirm the write-calls actually do use a queue?

@carlosjeurissen carlosjeurissen removed discussion Needs further discussion agenda Discuss in future meetings labels Oct 27, 2022
@carlosjeurissen carlosjeurissen removed the inconsistency Inconsistent behavior across browsers label Oct 27, 2022
@Sxderp
Copy link

Sxderp commented Nov 2, 2022

If you need operation order coherence like you seem to describe, the best thing to do is to store your data in variables local to the current execution context (instant synchronous access) and then just write them into StorageArea asynchronously.

I just want to point out that this will not hold true forever. With the move to "ServiceWorkers" caching data like this will be invalid. When the "ServiceWorker" hits the "execution limit" it will get killed and you lose all your state. This is actually a major reason that many are upset about the "ServiceWorker" switch (go read the other issues if you want more context).

@Rob--W
Copy link
Member

Rob--W commented Nov 10, 2022

Can browser vendors confirm the write-calls actually do use a queue?

The typical implementation is for an API call to be processed in the order of invocation. If order is important, you should await the result. That also ensures that if an error occurs at one of the calls, that there is a predictable failure mode.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants