Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip unnecessary event check #4517

Closed

Conversation

leviok
Copy link
Contributor

@leviok leviok commented Oct 21, 2020

Deal state checking currently takes a long time. With an average of 5 blocks per epoch, this can significantly reduce time-consuming.

As I mentioned in PR and slack. I had to turn off deal reception twice due to the issue.

After this modification, my miner can handle deal state correctly now. But this is still temporary. as number of deals increased, event checking will fail to keep up with the chain again.

additional info:

  1. My miner currently has 59086 deals
    image
  2. The performance of deal event checking with PR is below. It's running on a 3970x with nvme driver.
events  events/events_called.go:136     checkStateChanges, elapsed: 16231
events  events/events_called.go:136     checkStateChanges, elapsed: 16351

@whyrusleeping
Copy link
Member

@leviok is the problem here when we expand tipsets? like, theres a 5 block tipset, and you accept one block at a time? I think this makes sense, but @magik6k is the right person to give the real 👍

@leviok
Copy link
Contributor Author

leviok commented Oct 23, 2020

@whyrusleeping @magik6k @hannahhoward
nope, the problem is a little complicated. let me try to explain it more clearly.

when new tipset incoming, it triggers headchange in events from chainnotify api.

every deal has a matcher to check in this loop. This is a very time-consuming operation. even use this PR, it still costs about 16s with 59086 deals.

if every block (which is 5 in a tipset) costs 16s, it will heap the buffer in chain notify up, and cause tipset cache can't keep up with the chain.(entire headchange should complete less than 30s in a epoch)
then, it will lead to check failed here. This will interrupt the deal processing and cause the piece data to be cleaned up. Then the sector that has entered the sealing will not be able to pack because the piece data cannot be found.

I investigated some code above.
deal committed checking, queueForConfidence, checkNewCalls, applyWithConfidence, applyTimeouts.
it seems like all of them only dependency ts parent state or ts height. so i skipped the processing when last ts has the same height and the same stateroot as new ts. it maybe skipped the other 4 times of processing in a tipset.

but the event checking and handleing still need more optimize. it is still the bottleneck of deal processing.

Copy link
Contributor

@magik6k magik6k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good change, but it looks like some tests need to be fixed

@dirkmc
Copy link
Contributor

dirkmc commented Oct 23, 2020

OnDealSectorCommitted is called by markets from VerifyDealActivated. It should only be actively checking the state for as long as it takes for the deal to be committed to a sector and to become active.

@leviok with regards to the number of deals you mentioned above 59,086: Approximately how many of these deals are in the process of becoming active at the same time?

@leviok
Copy link
Contributor Author

leviok commented Oct 23, 2020

@dirkmc

events  events/events_called.go:398     matcher count: 34929
events  events/events_called.go:136     checkStateChanges, elapsed: 15895
events  events/events_called.go:398     matcher count: 34929
events  events/events_called.go:136     checkStateChanges, elapsed: 15898

@leviok
Copy link
Contributor Author

leviok commented Oct 23, 2020

every deal checking need to load state of f05 here. i think cache it or something else changes with the same effect make sense.

@dirkmc
Copy link
Contributor

dirkmc commented Oct 27, 2020

I think the issue here is the way that the events API state matching works:

  • StateChanged registers a listener for a state change, with a matcher that compares TipSets
    eg the matcher for OnDealExpiredOrSlashed checks if the deal state has changed for a particular deal ID
  • Each time there is a new TipSet, the events API
    • Runs all matchers against the previous and current TipSets to see if something changed.
    • If something changed, schedules an event to fire at the confidence epoch.

There are a couple of problems:

  • Matchers never get removed, so even after OnDealExpiredOrSlashed completes, the matcher still keeps running every time there is a new tipset.
    We should remove matchers once the event fires or times out.
  • Each matcher is run independently, so for example the matcher for OnDealExpiredOrSlashed drills down to the same place in the state tree for every deal.
    We should use some mechanism that allows us to just add deal IDs to an existing matcher, or uses caching to avoid repeatedly loading the state.

@magik6k
Copy link
Contributor

magik6k commented Jan 30, 2021

Fixed in #4623, and then improved in a bunch of other PRs

@magik6k magik6k closed this Jan 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants