Reduce frequency of polling unknown validators to avoid overwhelming the Beacon Node #5628

jimmygchen · 2024-04-23T06:02:04Z

Issue Addressed

Addresses #4388.

A large number of unknown validators on a VC is known to overwhelm the beacon node because the VC triggers a retrieval for each validator per slot. This PR reduces the frequency to one query per unknown validator each epoch.

Proposed Changes

Poll for all validator indices on startup (unchanged)
If any validator is unknown, register to poll again in 32 slots (1 epoch) instead of in the next slot.
Avoid polling in the first slot of an epoch.
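
The three rules above can be sketched as a small decision function. This is a minimal illustration and not the code in `duties_service.rs`: the names `should_poll` and `next_poll_slot` are hypothetical, slots are plain `u64` values, and `SLOTS_PER_EPOCH` is hard-coded to 32, where Lighthouse reads it from `E::slots_per_epoch()`. It also assumes that a validator with no recorded poll slot is always polled, which is how the unchanged startup poll is modelled here.

```rust
/// Slots per epoch, hard-coded for this sketch (assumption: mainnet preset).
const SLOTS_PER_EPOCH: u64 = 32;

/// Returns true if `slot` is the first slot of its epoch.
fn is_first_slot_of_epoch(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Decides whether an unknown validator should be queried at `current_slot`.
///
/// `next_poll` is `None` for a validator that has never been polled
/// (the startup case) and `Some(slot)` once a later poll is registered.
fn should_poll(current_slot: u64, next_poll: Option<u64>) -> bool {
    match next_poll {
        // Startup poll: unchanged, always performed.
        None => true,
        // Later polls: wait until the registered slot, and never poll
        // in the first slot of an epoch.
        Some(next) => !is_first_slot_of_epoch(current_slot) && current_slot >= next,
    }
}

/// Slot to register for the next poll when the validator is still unknown.
fn next_poll_slot(current_slot: u64) -> u64 {
    current_slot.saturating_add(SLOTS_PER_EPOCH)
}
```

Under these assumptions, a VC that starts in slot 64 (the first slot of an epoch) polls immediately, registers slot 96, skips slot 96 because it is again the first slot of an epoch, and polls in slot 97.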

For more details and rationale, see comment here.

pawanjay176

Nice, LGTM. Just a minor question

validator_client/src/duties_service.rs

jimmygchen · 2024-04-26T07:18:41Z

I'll do a bit of manual testing next week before we merge this!

jimmygchen · 2024-04-30T23:23:26Z

@chong-he found an issue during testing and it looks like the VC re-queries the BN during the first slot of the epoch, I'll look into this.

jimmygchen · 2024-05-06T05:32:58Z

After investigating the logs, I think it is working as intended, and the poll in the first slot of the epoch is skipped. Will wait for CK to confirm.

However, I think this could still happen because of the async nature of this function. If the BN returns its responses very late, so that iterating through all the validators takes close to 12 seconds, we could end up querying in a new slot, which could be the first slot of a new epoch.

This is because of the calculation outside the for loop:

lighthouse/validator_client/src/duties_service.rs

Lines 483 to 502 in ce914d1

    
    let current_slot_opt = duties_service.slot_clock.now();
    let next_poll_slot_opt = current_slot_opt.map(|slot| slot.saturating_add(E::slots_per_epoch()));
    let is_first_slot_of_epoch = if let Some(current_slot) = current_slot_opt {
        let current_epoch_first_slot = current_slot
            .epoch(E::slots_per_epoch())
            .start_slot(E::slots_per_epoch());
        current_slot == current_epoch_first_slot
    } else {
        false
    };
    for pubkey in all_pubkeys {
        // This is on its own line to avoid some weirdness with locks and if statements.
        let is_known = duties_service
            .validator_store
            .initialized_validators()
            .read()
            .get_index(&pubkey)
            .is_some();

For more accuracy, I'll move the calculation into the for loop, given that we await on each validator query and the slot status could be stale if we have a large list or a slow BN.
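
The effect of moving the calculation can be illustrated with a toy model. This is a sketch under stated assumptions, not the Lighthouse implementation: `MockSlotClock` and `queried_slots` are hypothetical names, slots are plain `u64` values, and every query is assumed to take one full slot, which exaggerates a slow BN.

```rust
use std::cell::Cell;

/// Slots per epoch, hard-coded for this sketch (assumption: mainnet preset).
const SLOTS_PER_EPOCH: u64 = 32;

/// Hypothetical stand-in for a slot clock.
struct MockSlotClock {
    slot: Cell<u64>,
}

impl MockSlotClock {
    fn new(slot: u64) -> Self {
        Self { slot: Cell::new(slot) }
    }

    /// Current slot; an `Option` to mirror a clock that may not be readable.
    fn now(&self) -> Option<u64> {
        Some(self.slot.get())
    }

    /// Models time passing while a query is awaited.
    fn advance(&self, slots: u64) {
        self.slot.set(self.slot.get() + slots);
    }
}

fn is_first_slot_of_epoch(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Returns the slots in which a query was sent for `validators` unknown validators.
/// With `recompute_per_validator == false` the first-slot check is done once,
/// before the loop; with `true` it is redone for every validator.
fn queried_slots(clock: &MockSlotClock, validators: usize, recompute_per_validator: bool) -> Vec<u64> {
    // Check computed once, outside the loop; it goes stale as the clock advances.
    let skip_at_loop_start = clock.now().map_or(false, is_first_slot_of_epoch);
    let mut queried = Vec::new();
    for _ in 0..validators {
        let skip = if recompute_per_validator {
            clock.now().map_or(false, is_first_slot_of_epoch)
        } else {
            skip_at_loop_start
        };
        if skip {
            continue;
        }
        if let Some(slot) = clock.now() {
            queried.push(slot);
        }
        // Assumption of this model: each awaited query takes one full slot.
        clock.advance(1);
    }
    queried
}
```

Starting at slot 30 with four unknown validators, the check computed before the loop lets queries go out in slots 30, 31, 32 and 33, including the first slot of the next epoch (32). Recomputing inside the loop stops after slots 30 and 31.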

chong-he · 2024-05-06T23:57:59Z

> @chong-he found an issue during testing and it looks like the VC re-queries the BN during the first slot of the epoch, I'll look into this.

Apologies! I was looking at the wrong slot time.

The VC does not re-query in the first slot of the next epoch (even though we started the VC in the first slot). It waits until the 12 s have elapsed and only then re-queries the BN about the status (i.e., it queries in the second slot of the epoch). The VC also only queries once per epoch, which is a great improvement.

All good upon testing, this PR is good to go.
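
When checking log timestamps against slot boundaries, as in the test above, a small helper can reduce the chance of reading the wrong slot. The following is a minimal sketch, assuming 12-second slots and 32 slots per epoch; `slot_at` and `position_in_epoch` are hypothetical names, and the genesis time is passed in by the caller rather than assumed for any particular network.

```rust
/// Slot duration in seconds (assumption: mainnet preset).
const SECONDS_PER_SLOT: u64 = 12;
/// Slots per epoch (assumption: mainnet preset).
const SLOTS_PER_EPOCH: u64 = 32;

/// Slot containing `unix_time`, or `None` if that time is before genesis.
fn slot_at(genesis_time: u64, unix_time: u64) -> Option<u64> {
    unix_time
        .checked_sub(genesis_time)
        .map(|elapsed| elapsed / SECONDS_PER_SLOT)
}

/// Returns `(epoch, index of the slot within that epoch)`.
/// An index of 0 means the first slot of the epoch.
fn position_in_epoch(slot: u64) -> (u64, u64) {
    (slot / SLOTS_PER_EPOCH, slot % SLOTS_PER_EPOCH)
}
```

A log line stamped 11 seconds into slot 32 still belongs to index 0 of epoch 1. One stamped a second later falls in slot 33, index 1, which is where the re-query is expected.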

michaelsproul

Looks good. Nice improvement

I think we should include this in 5.2

validator_client/src/duties_service.rs

Co-authored-by: Michael Sproul <[email protected]>

michaelsproul · 2024-05-21T04:41:43Z

sorry looks like cargo fmt is failing because I messed up the indentation in my suggestion

jimmygchen · 2024-05-21T06:49:22Z

My bad, should've checked myself :P thanks!

michaelsproul · 2024-05-22T00:23:07Z

@Mergifyio queue

mergify · 2024-05-22T00:23:13Z

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at 52e3112

jimmygchen added the val-client (Relates to the validator client binary) and ready-for-review (The code is ready for review) labels Apr 23, 2024

Reduce frequency of polling unknown validators.

ce914d1

jimmygchen force-pushed the reduce-polling-pending-validators branch from 097dafd to ce914d1 Compare April 23, 2024 07:24

pawanjay176 approved these changes Apr 23, 2024

validator_client/src/duties_service.rs (resolved)

jimmygchen added the waiting-on-author (The reviewer has suggested changes and awaits their implementation) label and removed the ready-for-review (The code is ready for review) label Apr 26, 2024

jimmygchen self-assigned this Apr 26, 2024

Move slot calculation into for loop.

f66aae6

jimmygchen added the ready-for-review (The code is ready for review) label and removed the waiting-on-author (The reviewer has suggested changes and awaits their implementation) label May 7, 2024

michaelsproul added the v5.2.0 Q2 2024 label May 21, 2024

michaelsproul approved these changes May 21, 2024

validator_client/src/duties_service.rs (outdated, resolved)

michaelsproul added the waiting-on-author (The reviewer has suggested changes and awaits their implementation) label and removed the ready-for-review (The code is ready for review) label May 21, 2024

Simplify logic.

437ef41

Co-authored-by: Michael Sproul <[email protected]>

Fix formatting

e782323

mergify bot added a commit that referenced this pull request May 22, 2024

Merge of #5628

be66819

mergify bot mentioned this pull request May 22, 2024

merge queue: embarking unstable (2a87016) and #5628 together #5820

Closed

5 tasks

mergify bot merged commit 52e3112 into sigp:unstable May 22, 2024
28 checks passed

chong-he mentioned this pull request May 26, 2024

Consider reducing the frequency of pending validator indices queries from validator client #4388

Closed

jimmygchen deleted the reduce-polling-pending-validators branch May 28, 2024 00:40

jimmygchen mentioned this pull request Aug 30, 2024

Increase priority for validator HTTP requests #6292

Merged
