Consider reducing the frequency of pending validator indices queries from validator client #4388

jimmygchen · 2023-06-09T13:05:36Z

Description

Currently the Validator Client (VC) polls the validator states for all inactive validators from the Beacon Node (for indices retrieval), once every slot (12 seconds on mainnet)

lighthouse/validator_client/src/duties_service.rs

Lines 279 to 292 in f167951

    
               loop { 
        
                   // Run this poll before the wait, this should hopefully download all the indices 
        
                   // before the block/attestation tasks need them. 
        
                   poll_validator_indices(&duties_service).await; 
        
                   if let Some(duration) = duties_service.slot_clock.duration_to_next_slot() { 
        
                       sleep(duration).await; 
        
                   } else { 
        
                       // Just sleep for one slot if we are unable to read the system clock, this gives 
        
                       // us an opportunity for the clock to eventually come good. 
        
                       sleep(duties_service.slot_clock.slot_duration()).await; 
        
                   } 
        
               } 
        
           },

This is mostly not an issue until the number of inactive validator validators reaches a large number ~1000, which is probably quite rare. However we've recently seen some performance issues when the endpoint beacon/states/{state_id}/validators/{validator_id} is called repeatedly in a short period of time. Here's a script created by @michaelsproul to spam this endpoint aggressively, and it turns out this could cause an OOM on the beacon node.

There has been some discussions on how to improve this, potentially queuing the requests, however they might take a while to implement. In the mean time, we can probably reduce the frequency of this query to once or twice per epoch to reduce the performance impact on the node, as validator activation only happens once every epoch, and it may not be necessary to query the indices so often.

The text was updated successfully, but these errors were encountered:

paulhauner · 2024-04-15T00:10:02Z

One way to reduce the frequency of polling would be to make an assumption that the VC only needs to know a validator index for an active validator.

Validators are added to the BeaconState via Deposit objects in a BeaconBlock. This happens on a per slot basis, so new validators can land in the BeaconState (i.e. be assigned an index) each slot. However, validators are activated on a per epoch basis (process_epoch > process_registry_updates).

So, it seems that it would be safe for us to:

Poll for all validator indices on startup.
When we've received a None response for a validator, poll each epoch.
- I think we should consider not polling in the first slot of the epoch. That slot is known to be a weak point for us. Validators are activated at least 1 + MAX_SEED_LOOKAHEAD == 5 epochs after they first appear in the BeaconState so there's no rush to discover the validator.

There's an edge-case here where a syncing BN might return None, then import several epochs of the chain and then return Some and that validator is already active. In per-slot polling we would resolve this within a slot, but with per-epoch processing it takes an epoch to resolve. I'm not sure how I feel about this edge-case, it's hard to weigh off the costs of optimising for weird sync cases vs optimising for the case of big operators running lots of undeposited validators.

jimmygchen · 2024-04-15T05:01:52Z

Thanks Paul!
As discussed the above edge case is quite unlikely and it's a much more common scenario for validators to get their infrastructure running before even making deposits. So the proposal above sounds like a good way to go 👍

dapplion · 2024-04-15T05:10:22Z

What's the UX benefit of discovering the deposit inclusion immediately on the next slot? You won't be activated until your deposit is finalized anyway, there's nothing useful the VC can do besides logging

paulhauner · 2024-04-15T05:25:40Z

What's the UX benefit of discovering the deposit inclusion immediately on the next slot?

That's my argument here: #4388 (comment)

I think per-slot discovery has some benefits in very rare edge cases (e.g., syncing several epochs of the canonical chain and then switching to it). However that's very low impact for the operator and insignificant for the chain as a whole, so I'd say we let the edge-case suffer.

chong-he · 2024-05-26T23:49:02Z

Closed by #5628

jimmygchen added the val-client Relates to the validator client binary label Jun 9, 2023

jimmygchen changed the title ~~Consider reducing the frequency of pending validator indices query~~ Consider reducing the frequency of pending validator indices queries Jun 9, 2023

jimmygchen changed the title ~~Consider reducing the frequency of pending validator indices queries~~ Consider reducing the frequency of pending validator indices queries from validator client Jun 9, 2023

jimmygchen added the optimization Something to make Lighthouse run more efficiently. label Sep 28, 2023

jimmygchen self-assigned this Apr 15, 2024

jimmygchen mentioned this issue Apr 23, 2024

Reduce frequency of polling unknown validators to avoid overwhelming the Beacon Node #5628

Merged

chong-he closed this as completed May 26, 2024

jimmygchen mentioned this issue Aug 30, 2024

Increase priority for validator HTTP requests #6292

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider reducing the frequency of pending validator indices queries from validator client #4388

Consider reducing the frequency of pending validator indices queries from validator client #4388

jimmygchen commented Jun 9, 2023 •

edited

Loading

paulhauner commented Apr 15, 2024 •

edited

Loading

jimmygchen commented Apr 15, 2024

dapplion commented Apr 15, 2024 •

edited

Loading

paulhauner commented Apr 15, 2024

chong-he commented May 26, 2024

Consider reducing the frequency of pending validator indices queries from validator client #4388

Consider reducing the frequency of pending validator indices queries from validator client #4388

Comments

jimmygchen commented Jun 9, 2023 • edited Loading

Description

paulhauner commented Apr 15, 2024 • edited Loading

jimmygchen commented Apr 15, 2024

dapplion commented Apr 15, 2024 • edited Loading

paulhauner commented Apr 15, 2024

chong-he commented May 26, 2024

jimmygchen commented Jun 9, 2023 •

edited

Loading

paulhauner commented Apr 15, 2024 •

edited

Loading

dapplion commented Apr 15, 2024 •

edited

Loading