Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(rln-relay): add isReady check #1989

Merged
merged 5 commits into from
Sep 6, 2023
Merged

chore(rln-relay): add isReady check #1989

merged 5 commits into from
Sep 6, 2023

Conversation

rymnc
Copy link
Contributor

@rymnc rymnc commented Sep 4, 2023

Description

Adds an isReady proc to waku_node, waku_rln_relay

Changes

  • WakuNode.isReady => if all protocols are ready, returns true
  • WakuRlnRelay.isReady => if group manager is ready, returns true

Issue

Addresses 25. #1906

@github-actions
Copy link

github-actions bot commented Sep 4, 2023

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:1989

@github-actions
Copy link

github-actions bot commented Sep 4, 2023

You can find the experimental image built from this PR at

quay.io/wakuorg/nwaku-pr:1989-experimental

@rymnc rymnc marked this pull request as ready for review September 4, 2023 17:01
@rymnc
Copy link
Contributor Author

rymnc commented Sep 4, 2023

cc: @alrevuelta i think this pr should be separate from the one that creates an api endpoint to reduce cognitive load - wdyt?

Copy link
Contributor

@alrevuelta alrevuelta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rymnc agree, we can have 2 PRs.

Not sure having a isReady variable is a good idea. I would have just a function isReady() that check the difference between the last processed block in the rln tree and the latest head block in the blockchain (perhaps allowing 1 or 2 blocks of difference).

As it is, you may set g.isReady = true but at any point in time you can lose sync and unless im missing something you wont set isReady back to false. Imagine for example that the eth node falls behind sync.

tldr: isReady() function but no stored isReady state. This state is calculated upon request. wdyt?

@rymnc
Copy link
Contributor Author

rymnc commented Sep 5, 2023

@rymnc agree, we can have 2 PRs.

Not sure having a isReady variable is a good idea. I would have just a function isReady() that check the difference between the last processed block in the rln tree and the latest head block in the blockchain (perhaps allowing 1 or 2 blocks of difference).

As it is, you may set g.isReady = true but at any point in time you can lose sync and unless im missing something you wont set isReady back to false. Imagine for example that the eth node falls behind sync.

tldr: isReady() function but no stored isReady state. This state is calculated upon request. wdyt?

I think that's overkill - when we are out of sync, we do set isReady to false (check the ethRpc.onDisconnect)

@alrevuelta
Copy link
Contributor

I think that's overkill - when we are out of sync, we do set isReady to false (check the ethRpc.onDisconnect)

yep but there are other ways you can get out of sync without ethRpc.onDisconnect. For example, if the eth node gets out of sync, then you will fall behind while staying connected. You assume that once you are in sync for the first time, you stay in sync forever (unless disconnected). What happens also if your node can't process all the membership insertions realtime and falls behind?

@rymnc
Copy link
Contributor Author

rymnc commented Sep 5, 2023

I think that's overkill - when we are out of sync, we do set isReady to false (check the ethRpc.onDisconnect)

yep but there are other ways you can get out of sync without ethRpc.onDisconnect. For example, if the eth node gets out of sync, then you will fall behind while staying connected. You assume that once you are in sync for the first time, you stay in sync forever (unless disconnected). What happens also if your node can't process all the membership insertions realtime and falls behind?

hmm, how would it be possible to detect if the eth node itself is out of sync without using multiple providers?

What happens also if your node can't process all the membership insertions realtime and falls behind?
should we set it to false then during inserts? and set it back to true right after?

@alrevuelta
Copy link
Contributor

hmm, how would it be possible to detect if the eth node itself is out of sync without using multiple providers?

afaik even if the eth node is out of sync, it knows that its out sync. meaning that it knows the latest head block known by the blockchain and the latest known block by the node. I think this is the rpc link

I guess you will also notice indirectly. If the eth node falls behind sync, it wont give you the latest blocks, so you will fall behind sync, seeing that the latestHead is way beyond your latest localRlnLatestProcessedBlock.

@rymnc
Copy link
Contributor Author

rymnc commented Sep 5, 2023

hmm, how would it be possible to detect if the eth node itself is out of sync without using multiple providers?

afaik even if the eth node is out of sync, it knows that its out sync. meaning that it knows the latest head block known by the blockchain and the latest known block by the node. I think this is the rpc link

I guess you will also notice indirectly. If the eth node falls behind sync, it wont give you the latest blocks, so you will fall behind sync, seeing that the latestHead is way beyond your latest localRlnLatestProcessedBlock.

if the eth node falls behind sync, would it still emit the newHead events? unsure about this behaviour - i do agree about calculating every time isReady is called though

@alrevuelta
Copy link
Contributor

alrevuelta commented Sep 5, 2023

if the eth node falls behind sync, would it still emit the newHead events? unsure about this behaviour - i do agree about calculating every time isReady is called though

have never tried empirically. I would bet that it stops emitting events, and you will fall behind sync without noticing.

@rymnc
Copy link
Contributor Author

rymnc commented Sep 5, 2023

if the eth node falls behind sync, would it still emit the newHead events? unsure about this behaviour - i do agree about calculating every time isReady is called though

have never tried empirically. I would bet that it stops emitting events, and you will fall behind sync without noticing.

yes so then there are 2 ways we will be out of sync -

  1. eth node out of sync
  2. we do not process insertions in time

we cannot account for 1 unless we use multiple providers
for 2, we can do this by having the isReady proc compute the difference between latestBlockSeen and latestBlockProcessed, as you said. wdyt?

@alrevuelta
Copy link
Contributor

we cannot account for 1 unless we use multiple providers

why? an eth node out of sync will know its out of sync. so with just one node you should know that its out of sync. or?

afaik we can address both 1 and 2 without multiple providers unless im missing something.

@rymnc
Copy link
Contributor Author

rymnc commented Sep 5, 2023

we cannot account for 1 unless we use multiple providers

why? an eth node out of sync will know its out of sync. so with just one node you should know that its out of sync. or?

afaik we can address both 1 and 2 without multiple providers unless im missing something.

maybe are you suggesting that we make the rpc call every time the isReady proc is called?

Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Super interesting idea!

In further PRs, I think we will need to add a /healthcheck endpoint as a valid REST path ( cc @NagyZoltanPeter )

@alrevuelta
Copy link
Contributor

maybe are you suggesting that we make the rpc call every time the isReady proc is called?

Yes exactly, perhaps this: https://docs.infura.io/networks/ethereum/json-rpc-methods/eth_syncing#returns

I'm fine with leaving a TODO for this (your point 1). But would like to address 2. here.

In further PRs, I think we will need to add a /healthcheck endpoint as a valid REST path ( cc @NagyZoltanPeter )

tracked here:
#1988

@NagyZoltanPeter
Copy link
Contributor

I like the idea of having /healthcheck!

Copy link
Contributor

@SionoiS SionoiS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rymnc
Copy link
Contributor Author

rymnc commented Sep 5, 2023

@alrevuelta I hope cbb266a covers the requirement (1 & 2)

Copy link
Contributor

@alrevuelta alrevuelta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! left a comment of a possible edge case, let me know if im missing something.

@@ -73,6 +73,8 @@ type
# in event of a reorg. we store 5 in the buffer. Maybe need to revisit this,
# because the average reorg depth is 1 to 2 blocks.
validRootBuffer*: Deque[MerkleNode]
# this variable tracks the last seen head
lastSeenBlockHead*: BlockNumber
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure i understand the need of this variable, and perhaps there is an edge case?

afaiu we sync in batches:

  • batch1: fromblock1 toblock1
  • batch2: fromblock2 toblock2
  • etc

And lastSeenBlockHead takes the values of toblock1, toblock2, etc?

So what happens when we just processed batch1. In this exact moment we latestProcessedBlock = lastSeenBlockHead so we may think we are in sync. But in reality we are in batch1 and batch2 is still left.

So the lastSeenBlockHead should be the last head block known by the blockchain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I'll just move it to the newHeadCallback, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 3c9d4be

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the lastSeenBlockHead should be the last head block known by the blockchain.

I think its safer to get this from eth_blockNumber endpoint rather than relying on the newHeadcallback. If the eth node fails to keep you updated with the latest head (but without disconnecting) then your lastSeenBlockHead will be outdated.

But if you use eth_blockNumber, since its req/rep, you ensure that your latest head is correct. If then node fails to reply, then its not ready.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 15bfde4

error "failed to get the syncing status", error = getCurrentExceptionMsg()
return false

method isReady*(g: OnchainGroupManager): Future[bool] {.async,gcsafe.} =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps its cleaner to run the checks for true all together?

if g.ethRpc.isSome() && g.lastSeenBlockHead !=0 && g.latestProcessedBlock >= g.lastSeenBlockHead && !await g.isSyncing():
  return true
return false

maybe just a personal prefernce, just a sugerence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did think of this, imo the way it is right now is slightly more readable

@@ -551,10 +553,12 @@ method isReady*(g: OnchainGroupManager): Future[bool] {.async,gcsafe.} =
if g.ethRpc.isNone():
return false

if g.lastSeenBlockHead == 0:
return false
let currentBlock = cast[BlockNumber](await g.ethRpc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess if this fails an exceptions is thrown? should we catch it and return false if so. guess if we dont catch it the node will crash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in b058c97

@alrevuelta alrevuelta self-requested a review September 6, 2023 08:34
Copy link
Contributor

@alrevuelta alrevuelta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@rymnc rymnc merged commit 5638bd0 into master Sep 6, 2023
14 checks passed
@rymnc rymnc deleted the rln-in-sync branch September 6, 2023 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants