
ledger: shorter deltas lookback implementation (320 rounds project) #4003

Merged
92 commits merged into master on Jul 6, 2022

Conversation

@algorandskiy (Contributor) commented May 17, 2022

Summary

Main

Reduce the deltas size from 320 to 8 rounds by introducing a new online accounts tracker that preserves the history of online state (online/offline status, stake, protocol versions) for at least MaxBalLookback = 2 x SeedRefreshInterval x SeedLookback rounds back from Latest. The new data are stored in new tables and are excluded from catchpoints.
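
As an aside, a minimal sketch of the round arithmetic behind these figures. The concrete parameter values below (SeedLookback = 2, SeedRefreshInterval = 80) are assumed current consensus values and are not restated in this description; MaxAcctLookback = 8 comes from the commit list further down, and only the relationship matters here.

package main

import "fmt"

func main() {
	// Assumed consensus values; the PR relies only on the relationship
	// MaxBalLookback = 2 x SeedRefreshInterval x SeedLookback.
	const (
		seedLookback        = 2  // SeedLookback
		seedRefreshInterval = 80 // SeedRefreshInterval
		maxAcctLookback     = 8  // MaxAcctLookback, the new in-memory deltas depth
	)
	maxBalLookback := 2 * seedRefreshInterval * seedLookback
	fmt.Println("online account history kept on disk:", maxBalLookback, "rounds") // 320
	fmt.Println("deltas kept in memory after this PR:", maxAcctLookback, "rounds") // 8
}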

Additional

  1. TxTail now stores its data in a database table as well, to avoid loading full blocks on startup.
  2. TxTail stores MaxTxnLife + DeeperBlockHeaderHistory blocks.
  3. TxTail caches up to MaxTxnLife + DeeperBlockHeaderHistory recent block headers.
  4. Catchpoint generation is done in two stages: a data file at round X-320 and the catchpoint itself at round X (see the sketch after this list).
  5. The voters subtracker can extend the history persisted in the DB beyond MaxBalLookback.
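
To make the retention windows in items 2 through 4 concrete, here is a small sketch of the arithmetic. MaxTxnLife = 1000 and DeeperBlockHeaderHistory = 1 are assumptions (consistent with the 1001 rounds mentioned in review below), not values restated in this description.

package main

import "fmt"

func main() {
	const (
		maxTxnLife               = 1000 // assumed MaxTxnLife
		deeperBlockHeaderHistory = 1    // assumed DeeperBlockHeaderHistory
		catchpointLookback       = 320  // CatchpointLookback introduced by this PR
	)
	latest := 10000 // example round X

	// Items 2 and 3: how many recent blocks / headers txtail retains.
	tailSize := maxTxnLife + deeperBlockHeaderHistory
	fmt.Printf("txtail retains rounds %d..%d (%d blocks)\n", latest-tailSize+1, latest, tailSize)

	// Item 4: two-stage catchpoint generation.
	fmt.Printf("data file written for round %d, catchpoint itself at round %d\n",
		latest-catchpointLookback, latest)
}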

Performance impact

  1. Regular nodes see roughly a 3x decrease in memory consumption under a high 6,000 TPS load.
  2. Archival nodes show nearly the same memory pressure around the catchpoint round as the master branch.
  3. More tests are in progress.

Test plan

  1. Unit tests plus a coverage check.
  2. Manual tests on private networks, BetaNet, TestNet, and MainNet: upgrades, catchup, fast catchup, incremental network update.
  3. Instrumented node runs for LookupAgreement/OnlineTotals and catchpoint verification on private networks, BetaNet, TestNet, and MainNet.
  4. Long-running BetaNet, TestNet, and MainNet node runs.

algorandskiy and others added 27 commits March 18, 2022 19:05
At the moment the new tracker has the same logic as account updates,
but has its own onlineacctbase round and its own committing logic.

Because of the trackers' initialization logic,
the new onlineacctbase and the old acctbase must be synchronized.
This is done at the end of onlineAccounts.produceCommittingTask.

Disclaimer: eventually most of this code will be removed,
but this PR allows independent work on removing the 320 rounds from account updates
and on implementing history storage in the online accounts tracker.
merge master into feature/320-rounds
Move the transaction tail storage into a tracker database table, txtail
* online accounts DB support
* tests
* Added LRU cache to track the most recent updates into onlineaccounts table
* Added online accounts tests for updates and expirations
* TODO: add another cache for lookups, since accounts can change
and we need answers for a specific round, not just the most recent write round
New table and data structure to track online totals and protocol version history
* Add CatchpointLookback=320 to agreement
* Add MaxAcctLookback=8 to options
* Updated tests
* Cache block headers in txtail
* Introduce DeeperBlockHeaderHistory consensus parameter
TestAcctOnlineRoundParamsCache was failing because
it randomly increments rewards over 640 rounds,
which caused an overflow in the reward calculation.
Fixed by limiting the number of rounds run in the test to 200
by changing the maxBalLookback consensus param.
Fixed t.lowWatermark value in loadFromDisk to match master as discovered by a test
Review thread on ledger/tracker.go (resolved).
Comment on lines +715 to +716
func (l *Ledger) BlockHdrCached(rnd basics.Round) (hdr bookkeeping.BlockHeader, err error) {
l.trackerMu.RLock()
Contributor:
I think this should just be BlockHdr, not BlockHdrCached. Just like with other lookup routines, if you ask for something with the rnd too early, it can fail. But we don't call them all lookupResourceCached and so on. Those fail if rnd is before the dbround; this one fails if asked for a round before the txtail length. But the caller probably thinks of them very similarly.

Contributor Author (@algorandskiy):
Sure, I tried this. See #3935 (comment).
TL;DR: replay deadlocks.

@jannotti (Contributor) left a review comment:

Block header cache questions

synchronousMode: db.SynchronousMode(cfg.LedgerSynchronousMode),
accountsRebuildSynchronousMode: db.SynchronousMode(cfg.AccountsRebuildSynchronousMode),
verifiedTxnCache: verify.MakeVerifiedTransactionCache(verifiedCacheSize),
cfg: cfg,
dbPathPrefix: dbPathPrefix,
}

l.headerCache.maxEntries = 10
Contributor:
Can we eliminate this cache? It seems to be a small heap that replicates a small portion of the block headers that we have to maintain for 1001 rounds anyway. The lookups in txtail can be constant time, so why bother with an LRU heap for 10 of them?

It seems like that's the only use of heapLRUCache, so we get to eliminate the runtime cost and the code.

I may be missing something, because I also don't understand the code for blockQueue.getBlockHdr. It seems to be doing a lot of work, but now the txtail offers a simple lookup, I think.

So I suppose this is actually a follow-up on why I was surprised that BlockHdrCached() was called that. It's because we already have BlockHdr(), but now my question is why BlockHdrCached() can't take the place of BlockHdr() so we can remove a lot of other code.

Contributor Author (@algorandskiy):
I had a plan to revisit this after the merge and eventually use only BlockHdr. This will be done as part of the locking light refactoring project.

Contributor:
State proofs also need a block header cache; the plan is to review this with @algoidan and coordinate efforts to make block headers available to their work without duplicate caches.

* Rename round -> rnd column in new table for consistency
* Other PR feedback
@jannotti (Contributor) left a review comment:

A few more questions before tomorrow's meeting.

Review threads on ledger/acctonline.go and ledger/accountdb.go (resolved).
roundHash = append(roundHash, crypto.Hash(data))
expectedRound--
}
// reverse the array ordering in-place so that it would be incremental order.
Contributor:
Why do we read them with ORDER BY rnd DESC and then reverse the results, rather than reading them in ascending order?

I think I see that it's so you can calculate the roundHashes. Are they explained somewhere? I'm really surprised we're hashing the msgpack form of the txtail record.
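
For illustration only, a hypothetical helper showing what the ascending-order read being asked about might look like; the table and column names (txtail, rnd, data) come from the surrounding discussion, the function itself is not part of the PR, and it assumes database/sql plus the repo's crypto package.

func loadTailHashesAscending(tx *sql.Tx, firstRound uint64) ([]crypto.Digest, error) {
	// Reading oldest-first removes the need to reverse the slice afterwards.
	rows, err := tx.Query("SELECT data FROM txtail WHERE rnd >= ? ORDER BY rnd ASC", firstRound)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var roundHash []crypto.Digest
	for rows.Next() {
		var data []byte
		if err := rows.Scan(&data); err != nil {
			return nil, err
		}
		roundHash = append(roundHash, crypto.Hash(data))
	}
	return roundHash, rows.Err()
}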

Contributor Author (@algorandskiy):
Yes, we could read in ascending order, but the idea is to go from the latest round back up to 1000 rounds and then check the expected size (the round number is checked outside).
I'd change it in a separate PR.

Comment on lines +68 to +72
roundTailHashes []crypto.Digest

// blockHeaderData contains the recent (MaxTxnLife + DeeperBlockHeaderHistory + len(deltas)) block header data.
// The oldest entry is lowestBlockHeaderRound = database round - (MaxTxnLife + DeeperBlockHeaderHistory) + 1
blockHeaderData map[basics.Round]bookkeeping.BlockHeader
Contributor:
I don't understand the hashes exactly yet, but it does seem like roundTailHashes and blockHeaderData ought to be managed the same way. They seem to hold the same amount of data, yet roundTailHashes is managed as a slice that keeps losing its front entries, while blockHeaderData is a map from which we delete the entries we no longer care about. Why not handle them the same way? For that matter, why not put them both in a struct and manage them together?

@jannotti (Contributor) commented Jul 6, 2022:
I wrote a little circular buffer called Recents. It does pretty well.

go test -run ^NOTHING -bench 'BenchmarkRecents|BenchmarkTable|BenchmarkSlice'
goos: darwin
goarch: amd64
pkg: github.com/algorand/go-algorand/util
cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
BenchmarkRecents-8   	574202229	         2.036 ns/op	       0 B/op	       0 allocs/op
BenchmarkTable-8     	17191695	        68.92 ns/op	       1 B/op	       0 allocs/op
BenchmarkSlice-8     	186941323	         6.276 ns/op	      31 B/op	       0 allocs/op
PASS
ok  	github.com/algorand/go-algorand/util	4.680s

These are sort of dumb microbenchmarks that grow the buffer up to 1000, and then start adding and removing one item at a time. The "op" being benchmarked is one add/drop cycle. I got the Add() and Drop() methods small enough to be inlined, so it's pretty much just shuffling some offsets for each operation.
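
For reference, a minimal sketch of such a ring buffer, assuming only what the comment describes: a type named Recents with Add and Drop methods that avoid per-operation allocation. It uses Go generics for brevity, which the actual code may not, and the details are illustrative rather than the reviewer's implementation.

package main

import "fmt"

// Recents is a growable ring buffer that keeps items in insertion order.
// Add appends at the logical end; Drop discards the oldest entry without
// shifting the rest, so both are O(1) amortized with no per-op allocation.
type Recents[T any] struct {
	buf   []T
	start int // index of the oldest live element
	size  int // number of live elements
}

func (r *Recents[T]) Add(item T) {
	if r.size == len(r.buf) {
		// Full (or empty): move the live elements to the front of a bigger slice.
		grown := make([]T, 2*len(r.buf)+1)
		for i := 0; i < r.size; i++ {
			grown[i] = r.buf[(r.start+i)%len(r.buf)]
		}
		r.buf = grown
		r.start = 0
	}
	r.buf[(r.start+r.size)%len(r.buf)] = item
	r.size++
}

func (r *Recents[T]) Drop() {
	if r.size == 0 {
		return
	}
	var zero T
	r.buf[r.start] = zero // release the slot's reference for the GC
	r.start = (r.start + 1) % len(r.buf)
	r.size--
}

func (r *Recents[T]) Oldest() (T, bool) {
	if r.size == 0 {
		var zero T
		return zero, false
	}
	return r.buf[r.start], true
}

func main() {
	var r Recents[int]
	for i := 0; i < 1000; i++ {
		r.Add(i) // grow up to the window size, as in the benchmark
	}
	r.Add(1000) // steady state: one Add ...
	r.Drop()    // ... and one Drop per benchmarked "op"
	if v, ok := r.Oldest(); ok {
		fmt.Println("oldest:", v) // 1
	}
}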

Comment on lines +150 to +151
list = append(list, txTailRound.TxnIDs[i])
lastValid[txTailRound.LastValid[i]] = list
Contributor:
Maybe not worth doing in this PR, but it seems quite surprising to me that we go through so much trouble around the size of these lists: guessing they will start at 256 and managing the slice capacities ourselves. How about just using the size of the list from the previous block and letting Go do the management beyond that? It seems like a better guess, and simpler code.
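
A tiny hypothetical sketch of that suggestion; prevTxnIDs and currTxnIDs are made-up names standing for the previous and current block's transaction IDs.

// Seed the capacity from the previous block's count instead of a fixed 256 guess,
// then let append grow the backing array as needed beyond the hint.
list := make([]transactions.Txid, 0, len(prevTxnIDs))
for _, txid := range currTxnIDs {
	list = append(list, txid)
}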

Contributor Author (@algorandskiy):
I also don't quite remember the rationale. Maybe it was part of the effort to run on a Raspberry Pi 2 or something.

Review thread on ledger/accountdb.go (resolved).
* Move record.StateProofID setting to before online/offline switch
* Reuse account resource separation logic
* Remove baseAcctDataMigrate
* Update account hash for updated accounts

Co-authored-by: chris erway <[email protected]>
Co-authored-by: Pavel Zbitskiy <[email protected]>
@winder (Contributor) left a review comment:
Per discussion: things seem complete enough that we should merge this into master so that the entire engineering team can start exercising the new code. Review can continue here as long as it's useful even after the merge.

* seenAddr in onlineAccountsAll
* createNormalizedOnlineBalanceIndex
* comments
@algorandskiy merged commit 90b1c05 into master on Jul 6, 2022
cce added a commit to cce/go-algorand that referenced this pull request Jul 7, 2022
@@ -417,7 +452,7 @@ func (tr *trackerRegistry) commitSyncer(deferredCommits chan *deferredCommitCont
}

// commitRound commits the given deferredCommitContext via the trackers.
func (tr *trackerRegistry) commitRound(dcc *deferredCommitContext) {
func (tr *trackerRegistry) commitRound(dcc *deferredCommitContext) error {
Contributor:
We made commitRound return an error, but we do not check that error when it is called from commitSyncer.
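
A minimal sketch of the missing check, assuming commitSyncer has access to a logger as tr.log; this is not the actual code.

// inside commitSyncer, after receiving dcc from the deferredCommits channel:
if err := tr.commitRound(dcc); err != nil {
	tr.log.Warnf("commitSyncer: commitRound failed: %v", err)
}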

@algorandskiy (Contributor Author) commented Jul 7, 2022 via email.

@@ -55,6 +55,7 @@ func main() {
mu.Unlock()
// prevent requests for block #2 to go through.
if strings.HasSuffix(request.URL.String(), "/block/2") {
response.Write([]byte("webProxy prevents block 2 from serving"))
Contributor:
Calling Write before WriteHeader means an HTTP 200 status will be sent.
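
A small self-contained illustration of the point, not the test's actual handler; the route and the chosen status code are placeholders.

package main

import "net/http"

func main() {
	http.HandleFunc("/block/2", func(w http.ResponseWriter, r *http.Request) {
		// Write() sends an implicit 200 header first, so a later WriteHeader call
		// would be ignored. To return a non-200 status, set it before writing:
		w.WriteHeader(http.StatusBadRequest) // placeholder status for illustration
		w.Write([]byte("webProxy prevents block 2 from serving"))
	})
	http.ListenAndServe(":8080", nil)
}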

algorandskiy pushed a commit that referenced this pull request Jul 11, 2022
In #4003 some pointer receivers and pointer arguments were changed to value receivers
and pass-by-value arguments, which could lead to a performance regression.
This changes them back.
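
A generic illustration of why that matters, with an invented type rather than anything from the PR: a value receiver copies the whole struct on every call, while a pointer receiver does not.

package main

import "fmt"

// big stands in for a large struct such as a commit context; illustrative only.
type big struct {
	payload [4096]byte
	n       int
}

// byValue copies the entire 4 KB struct on every call.
func (b big) byValue() int { return b.n }

// byPointer passes only a pointer, avoiding the copy.
func (b *big) byPointer() int { return b.n }

func main() {
	b := big{n: 42}
	fmt.Println(b.byValue(), b.byPointer())
}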
@algojohnlee deleted the feature/320-rounds branch on November 2, 2022.