Skip to content
This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Clarification / spec text correction wrt when is the "KeepDuringJob" Set cleared? #39

Closed
marjakh opened this issue Oct 29, 2018 · 21 comments

Comments

@marjakh
Copy link
Contributor

marjakh commented Oct 29, 2018

Can it be cleared between microtasks?

The current spec text talks about associating the set with Job (i.e.,microtask?). If we implement it like this, the WeakRef can be cleared between microtasks.

On the other hand, the slides (slide 17) say "A program cannot observe a target getting reclaimed within the execution of a job (turn of the event loop)" (not microtask queue).

E.g.,

Promise.resolve().then(() => { wr = wf.makeRef(...); }).then(() => { wr.deref(); });

... can the deref() now return undefined?

@gsathya
Copy link
Member

gsathya commented Oct 29, 2018

I thought about this a little more and I'm not sure what the right answer should be.

If we compare the behavior of the synchronous case:

let wr = wf.makeRef(...);
wr.deref(); // <-- this can never fail

with the promise version:

Promise.resolve()
  .then(() => wf.makeRef())
  .then(wr => wr.deref()) <-- this was in the same turn, no reason to fail!

then it looks clear to me that the deref shouldn't fail.

But if you consider async functions:

async function foo() {
  let wr = wf.makeRef(...);
  wr.deref(); <-- never fails
}

async function foo2() {
  let wr = wf.makeRef(...);
  await fetch(...);
  wr.deref(); <-- could potentially fail
}

It's not super clear to me now that we have value in waiting until the end of the turn because you can never depend on this unless you're sure that your promise chain is always resolved in the same turn.

Maybe making this have consistent behavior (ie, "wait until end of turn") that we can reason about is worthwhile in itself.

@bmeurer
Copy link

bmeurer commented Oct 30, 2018

How about keeping the set until the end of the current turn, i.e. only clear it after all microtasks from the current turn are processed and execution goes back to the main event loop? That seems (a) more consistent - i.e. less surprising - and (b) more efficient.

@bmeurer
Copy link

bmeurer commented Oct 30, 2018

@hannespayer just mentioned that there was agreement earlier this year, that the KeepDuringJob set is cleared after all the microtasks are processed. So in terms of the HTML spec, that means we introduce a new step into the perform a microtask checkpoint algorithm right before or after the Cleanup Indexed Database transactions step.

@tschneidereit
Copy link
Member

It's almost certainly best to keep references strong until the end of the current task, not microtask. The reason is slightly more subtle, though, and it won't help with @gsathya's async function example.

It won't help, because the fetch could take arbitrarily long. In fact, IINM it's guaranteed to not happen before the next task.

There are two reasons for making references strong after deref (or construction of a WeakRef):

  1. To guard developers against intermittent breakage caused by variable GC timing
  2. To avoid programs breaking from slight changes in GC timing between engines or versions of an engine

Of these, the second one is more crucial, because it could become a hazard for future optimizations.

We could theoretically do things like having suspended async function frames keep all WeakRefs deref'd in—or under—them strong. In practice, I think that'd almost certainly lead to leaks and cause more issues than it'd solve.

I have some ideas for how devtools can help with the hazards here, but I don't think they're relevant to the spec.

joyeecheung pushed a commit to joyeecheung/v8 that referenced this issue Nov 6, 2018
- Add the WeakRef class and its deref() function.

- Add WeakFactory.prototype.makeRef

- Implement the "keep during job" behavior for WeakRef constructor and deref().

- Here we keep the targets alive longer than until the end of the job
  (microtask), contradicting the spec. However, this is probably the indended
  behavior, see tc39/proposal-weakrefs#39 .

BUG=v8:8179

Change-Id: I41990d41ac1799e34f675d8431b9a7aa7ed3d48d
Reviewed-on: https://chromium-review.googlesource.com/c/1306435
Commit-Queue: Marja Hölttä <[email protected]>
Reviewed-by: Benedikt Meurer <[email protected]>
Reviewed-by: Sathya Gunasekaran <[email protected]>
Reviewed-by: Ulan Degenbaev <[email protected]>
Cr-Commit-Position: refs/heads/master@{#57242}
@littledan
Copy link
Member

The current spec text tries to indicate that this group is cleared after all microtasks are run (though it could use some editorial cleanups).

@domenic, We've been discussing the hazards of using the "microtask checkpoint" timing for too much, and although I'm currently leaning towards "task" timing for finalization callbacks, I'm pretty persuaded by @gsathya @bmeurer and @tschneidereit 's arguments above. What do you think?

@domenic
Copy link
Member

domenic commented Apr 17, 2019

I don't really see any arguments above the address my concerns, or explain why after-microtasks is better than task timing. Maybe because I just am not immersed enough in the space to understand these examples.

To reiterate, there is a continual desire from specs to go "after everything else", which is problematic, and will lead to constant one-upmanship. We made this mistake once with IndexedDB transactions, which manage to go last right now. You're proposing usurping their place in the "goes last" pecking order, and so I think at the very least you'd need to coordinate with the IndexedDB spec authors and implementers to see if that's OK.

I then encourage you to imagine the future, after we've built up this two-specs-strong "goes last after microtasks" precedent. When a third spec wants to join your ranks, and go last, you'll need to negotiate with them as to whether they're allowed to go after group clearing, or before, and how their goals of going-last conflict with your goals of going-last.

Using tasks, i.e. a FIFO queue with optional UA-defined switching via between task sources, seems like a much better strategy to me.

I don't intend to block on these concerns, as I don't have the context necessary to understand the issues deeply. I just want to lay them out, and get some assurance that the champions have considered them, and are OK with signing up for negotiating with the IDB spec authors/implementors, as well as any future spec authors who want to build on their "my spec goes last" precedent.

@littledan
Copy link
Member

littledan commented Apr 17, 2019

@domenic I agree with you that we should call finalizers callbacks in a task. I wanted to thank you for explaining the motivation of this, and I plan to make a PR to update this specification to that timing.

This thread is about clearing the list of objects that is kept alive. E.g., if you deref a WeakRef once, it will still be possible to deref it later within the same little piece of code. The list is cleared at some point in time, when we want to relinquish things to the GC. If we say clearing the list happens at a microtask checkpoint, it prioritizes releasing the memory as high as possible, which seems appropriate.

Microtask checkpoint timing rather than task timing is more motivated by "going earlier" rather than "going last". Releasing the references when we get out of a synchronous code chunk seems like the right time to me; I am not sure why this should wait for a task. Maybe if we said it was a task on its own task source, this would allow implementations to do the same thing, since they would be allowed to schedule this first, but it seems more obscure.

@domenic
Copy link
Member

domenic commented Apr 17, 2019

Thanks for explaining. I can't see any observable difference between queuing a task and using after-microtask timing in this case.

@mhofman
Copy link
Member

mhofman commented Apr 17, 2019

In the following example, wouldn't there be a difference:

async function foo(weakRef) {
  if (!weakRef.deref()) return;
  weakRef.deref().bar();
  await new Promise.resolve();
  // could fail if agent released target after previous microtask and collected it immediately, 
  // but not if held by agent until a new full task.
  weakRef.deref().bar(); 
}

@domenic
Copy link
Member

domenic commented Apr 17, 2019

Your "could fail" comment would be true with between-microtask timing. I'm referring to the lack of difference between queuing-a-task timing and after-microtask timing.

@mhofman
Copy link
Member

mhofman commented Apr 17, 2019

Right, my bad.

@littledan, Regarding the order of the steps in DoAgentFinalization, I'm wondering why the agent.[[KeptAlive]] list in emptied after queueing cleanup jobs.

Imagine the following:

let fg1 = new FinalizationGroup(items => console.log("In 1", ...items));
let fg2 = new FinalizationGroup(items => console.log("In 2", ...items));

let wr = new WeakRef({});
fg1.register(wr.deref(), "weakRef held object");
fg2.register(wr.deref(), "weakRef held object");
fg1.register({}, "regular object");

Now let's assume the GC is capable of immediately collecting young "regular" object, before the DoFinalizationJob is executed.

Step 2 will enqueue a FinalizationGroupCleanupJob for fg1, but not fg2 since the agent is still holding the WeakRef's target.

Step 3 releases the held object, and the GC is capable of immediately collecting it as well.

When FinalizationGroupCleanupJob is executed for fg1, the cleanupCallback's iterator will contain the holdings for both cells: the "regular" object that triggered the cleanup, and the "weakRef held" object. fg2's cleanup won't be executed until another turn.

// "In 1" "weakRef held object" "regular object"
// Task turn
// "In 2" "weakRef held object"

If reversing step 2 and 3, the engine would be allowed to execute the cleanup jobs for both groups. Did I miss something?

@littledan
Copy link
Member

@domenic If we put it on its own task source, I agree that there is no observable difference. If it's on a task source with other things, then it might mandate holding this alive longer.

@annevk
Copy link
Member

annevk commented May 9, 2019

I realized there's an observable difference, UI input events coming from the user (i.e., not synthetic events). For that case there's a single task with a microtask checkpoint after each event listener callback.

It's not clear to me how this affects the rationale in #39 (comment).

(It does not seem to desirable to me to create a special case for this case, for what it's worth.)

@littledan
Copy link
Member

littledan commented May 9, 2019

I'd rather stick with microtask checkpoint timing. Till's rationale didn't seem to rest on this distinction. Microtask checkpoint timing may be how some JavaScript implementations are planning on making this actually work, and it seems sensible to me. I don't think web browsers are planning on actually queueing a task to clear this list.

@domenic
Copy link
Member

domenic commented May 9, 2019

My main worry in that case is that the JavaScript engine teams are not sufficiently coordinating with the IndexedDB teams at each browser, and perhaps with the more general web platform team responsible for the event loop. Decisions which make sense locally within one engineering context can be more troubling in the larger context.

If folks could work toward reassuring me that they've had those conversations, and that support from TC39 members for a given semantic translates to support from their larger browser teams for that same semantic, I'd be much more comfortable.

@ajklein
Copy link

ajklein commented May 9, 2019

Sorry I'm late to this thread, but is there any conceptual advantage to microtask over task timing? I've been hearing plenty of complaints from folks working on scheduling APIs for the web platform about large microtask queues making it hard to intelligently schedule work (since they run as an uninterruptible block).

@littledan
Copy link
Member

Maybe @inexorabletash could help us understand these IndexedDB things, and point us to others who would also be good to consult. I'd suggest clearing the list after IndexedDB commit, but more IndexedDB expertise would be helpful.

As far as the "megatasks" scheduling concern, I'm not sure that clearing a list is the same kind of thing as scheduling a bunch of CPU-intensive JS that would be better to deprioritize. Holding these references alive longer than necessary would increase memory pressure, and the operation is cheap computationally (I think), so it makes sense to prioritize.

Conceptually, if microtask checkpoints happen when the JS stack is empty (in other words, when returning totally out of JS), that's a natural point to stop holding WeakRefs which have already been deref'd alive. The rationale for holding them alive is to prevent inconsistency when a piece of JS code is dereferencing them, and to avoid garbage collection from inserting itself into the middle of JS code. So, when the JS callstack empties, it's a great time to cut that loose.

@inexorabletash
Copy link

inexorabletash commented May 9, 2019

Background on IndexedDB:

IDB transactions do an inactive→active→inactive transition during event dispatch. They autocommit if they have no further work when becoming inactive. One pain point for developers w/r/t microtasks was understanding if microtasks execute within the event dispatch i.e. does the inactive transition occur before or after microtasks run. We standardized that ages ago (within, so after), browsers have converged and libraries now depend on this.

There are also two IDB-specific inter-spec hooks:

  1. The explicit hook to clearing transactions at the end of a "turn"
  2. Observing exceptions thrown during event dispatch

The first was mentioned above; it's needed because transactions are created active synchronously, and there's no end-of-dispatch to make them inactive, so the end-of-turn cleanup hook is used. The second just captures a boolean so that exceptions thrown during event dispatch can affect the commit behavior.

I don't think any of these three requirements (requiring microtasks to run within a dispatch; hook for when a turn ends; hook for knowing if exceptions were thrown during dispatch) end up being sensitive to timing of weakref clearing. Work that follows on from these points is asynchronous and necessarily completes in a different turn. So I don't think the ordering between a "cleanup weakrefs" and the "cleanup IDB transactions" steps in "perform a microtask checkpoint" would be observable.

@ajklein
Copy link

ajklein commented May 9, 2019

@littledan Thanks for the clarification, I knew I was stepping into the middle of a conversation, should have read a lot closer. If this is just about clearing the references, not about running finalizers, microtask timing is OK with me at least WRT blocking the thread.

@littledan
Copy link
Member

Yes, to clarify, the timing for running finalizers in HTML is a separate queued task per FinalizationGroup; see #96 (comment) .

@littledan
Copy link
Member

We've decided to let the host decide when to schedule clearing the KeepDuringJob set, in #86 . HTML does this as part of its "microtask checkpoint" algorithm.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants