Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cosmic-swingset): poll timer after swingset quiescent #7423

Merged
merged 1 commit into from
Apr 17, 2023

Conversation

mhofman
Copy link
Member

@mhofman mhofman commented Apr 14, 2023

refs: #6964

Description

Drain the kernel queue before polling the timer

Security Considerations

None

Scaling Considerations

This should prevent userland logic creating faulty timers to compound their problems by not advancing the time and invoking new callbacks while previous timer triggered logic is still running.

Documentation Considerations

None

Testing Considerations

Unsure how to test this. As with most cosmic-swingset changes, we rely mostly on integration tests.

@mhofman mhofman requested a review from warner April 14, 2023 19:57
Copy link
Member

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure @btulloh and @rowgraus are cool with this change, as it means things like time-triggered liquidations will be deferred in favor of pre-existing work (actions that have already started, but haven't finished yet, not new actions coming from the queue).

Assuming that, I'm happy with this change as long as you remove the if (addedToQueue) condition.

`polled; blockTime:${blockTime}, h:${blockHeight}; ADDED =`,
addedToQueue,
);
if (addedToQueue) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to runSwingset() independently of the return value of timer.poll().

I made timer.poll return "there is work to be done" so that solo apps could do less work in the no-work-to-be-done case. But I remember we had determinism problems (maybe during replay/restart?) when we allowed it to avoid work in a consensus-based kernel, so we stopped using the return value.

I'll look for the relevant issue/change, but to be safe, I'd recommend removing the if (addedToQueue) condition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7703aa7#diff-cb472ee053cfbc36ebbae1abba0c95dc1f6c2775c90dc563e0b8f8307ac48e4fR134 is one relevant fix (three years ago, wow). At that time, we did the timer.poll() during BEGIN_BLOCK, and ran the kernel immediately iff it added anything. That was broken because running the kernel is also what committed the state, and a timer.poll will change state (recording the new time) even if nothing is triggered.

I'll keep poking around for others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just re-read the timer-device.js code, and now I think making c.run() conditional on the return value of timer.poll() is probably ok. The important constraint is to always always commit the kernel DB. timer.poll() has state side-effects that need to be committed, even if it doesn't result in work that needs to be done (so no controller.run()), I think that was the mistake we made last time. I'm not 100% sure how those state changes (kvStore entries for device-timer's state.now) get incorporated into the crankhash/activityhash if we don't do controller.run, that's the one remaining thing I need to investigate

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least swingstore exports are not flushed until a crank ends, which means we must attempt to run the kernel even if nothing was queued to generate an empty crank. It's likely activityHash would also not get updated correctly without it.

So yeah we must run unconditionally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(for my reference later: "swingstore exports" refers to the "swingstore export" feature we added to support state-sync snapshots: the "Export Data" associated with swing-store changes are streamed to a callback function, but we deliberately delay those updates if we think we might unwind the changes associated with a delivery. In this case, these changes were being delayed until some subsequent actual delivery. This also raises the concern that the first delivery of the subsequent run might get rewound, thus losing the unrelated device state changes that were made before c.run() started)

@warner warner added the cosmic-swingset package: cosmic-swingset label Apr 14, 2023
@mhofman mhofman added the automerge:rebase Automatically rebase updates, then merge label Apr 17, 2023
@mhofman mhofman force-pushed the mhofman/6964-timer-when-kernel-quiescent branch from 38a50db to c6df9ad Compare April 17, 2023 18:42
@mergify mergify bot merged commit 226b7f5 into master Apr 17, 2023
@mergify mergify bot deleted the mhofman/6964-timer-when-kernel-quiescent branch April 17, 2023 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge:rebase Automatically rebase updates, then merge cosmic-swingset package: cosmic-swingset
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants