RPC Service Locked Up #29902

Closed
buffalu opened this issue Jan 25, 2023 · 7 comments
Labels
community (Community contribution), stale ([bot only] Added to stale content; results in auto-close after a week.)

Comments

@buffalu
Contributor

buffalu commented Jan 25, 2023

Problem

  • The RPC service locked up on several v1.14.13-jito RPC servers around 11:17 PM CT on Tuesday, Jan 24.
  • The node continued to replay on time. Pubsub and geyser continued to work.
  • The likely trigger is batched getMultipleAccounts and getTokenAccountsByOwner requests.

Possibly a deadlock in the RPC server or in getMultipleAccounts (getMA)?
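
For readers unfamiliar with the term, "batched" here means a JSON-RPC 2.0 batch: a single HTTP body containing a JSON array of request objects. The sketch below builds such a payload as an illustration only. The pubkeys are placeholders, the `build_batch` helper is hypothetical, and the actual requests that hit the affected servers are not shown in this issue.

```python
import json

# SPL Token program ID (the same key appears in the CLI args in this issue).
TOKEN_PROGRAM_ID = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"


def build_batch(owner, accounts, copies):
    """Return a JSON-RPC 2.0 batch (a list of request objects).

    Hypothetical helper: each "copy" adds one getMultipleAccounts and one
    getTokenAccountsByOwner request, each with a unique id.
    """
    batch = []
    for i in range(copies):
        batch.append({
            "jsonrpc": "2.0",
            "id": 2 * i,
            "method": "getMultipleAccounts",
            "params": [accounts, {"encoding": "base64"}],
        })
        batch.append({
            "jsonrpc": "2.0",
            "id": 2 * i + 1,
            "method": "getTokenAccountsByOwner",
            "params": [owner, {"programId": TOKEN_PROGRAM_ID}, {"encoding": "base64"}],
        })
    return batch


# Placeholder pubkeys, not real accounts.
payload = json.dumps(
    build_batch("OWNER_PUBKEY_PLACEHOLDER", ["ACCOUNT_PUBKEY_PLACEHOLDER"], copies=3)
)
```

One POST of `payload` carries six requests, which is how a client can multiply its load while still counting as a single HTTP request against a rate limit.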

Relevant rpc cli args:

  • --rpc-threads 8
  • --account-index-exclude-key kinXdEcpDQeHPEuQnqmUgtYykqKGVFq6CeVX5iAHJq6
  • --account-index-exclude-key TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA

Proposed Solution

Determine the cause of the lockup and fix it.

buffalu added the community label Jan 25, 2023
@buffalu
Contributor Author

buffalu commented Jan 25, 2023

[Image: Screen Shot 2023-01-25 at 10 05 34 AM]

non-potato picture

@steviez
Contributor

steviez commented Jan 25, 2023

I'm not super dialed in on this one, but I think there has been some investigation and a theory about a potential culprit: #24644

> RPC service locked up on several v1.14.13-jito RPC servers

Obligatory question comes to mind as to whether the stock software shows the issue. But, given the issue we have + some reports I've heard from others, this is seemingly not an issue that jito commits introduce.

@buffalu
Contributor Author

buffalu commented Jan 25, 2023

@mschneider might find this interesting

@mschneider
Contributor

mschneider commented Jan 26, 2023

We observed some users of the mango RPC using massively batched getMultipleAccounts (gMA) and getProgramAccounts (gPA) requests to avoid rate limits. We ended up blocking any form of batching involving these requests, which dramatically improved performance for other users.
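
As a rough illustration of that kind of filter (not the actual mango implementation, which is not shown in this thread), a proxy in front of the RPC node could inspect each request body and reject JSON-RPC batches that contain the heavy methods. The function name and the blocked set below are assumptions made for this sketch.

```python
import json

# Assumed set of methods that are not allowed inside a batch.
BLOCKED_IN_BATCH = {"getMultipleAccounts", "getProgramAccounts"}


def should_reject(body: bytes) -> bool:
    """Return True if `body` is a JSON-RPC batch containing a blocked method.

    Single (non-batch) requests are always allowed through. Bodies that fail
    to parse are also passed through so the backend can return its own error.
    """
    try:
        parsed = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False
    if not isinstance(parsed, list):
        return False  # not a batch
    return any(
        isinstance(req, dict) and req.get("method") in BLOCKED_IN_BATCH
        for req in parsed
    )
```

A proxy would call `should_reject` on the raw POST body and answer with an error response instead of forwarding when it returns True.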

Curious: did websocket / pubsub connections that were opened before still transmit data?

@mschneider
Contributor

mschneider commented Jan 26, 2023

If you were to direct the request workload through a reverse proxy, you might be able to create a log of the RPC method combination that led to the issue.
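
A minimal sketch of what such logging could look like, assuming the proxy can hand each raw request body to a hook. The `method_combo` and `record` names are hypothetical, and the in-memory counter stands in for whatever log sink a real proxy would use.

```python
import json
from collections import Counter


def method_combo(body):
    """Summarize a JSON-RPC body as a sorted tuple of (method, count) pairs.

    Handles both single requests and batches. Returns an empty tuple when
    the body is not valid JSON.
    """
    try:
        parsed = json.loads(body)
    except ValueError:
        return ()
    requests = parsed if isinstance(parsed, list) else [parsed]
    counts = Counter(
        req["method"]
        for req in requests
        if isinstance(req, dict) and isinstance(req.get("method"), str)
    )
    return tuple(sorted(counts.items()))


# How often each method combination was seen (in-memory stand-in for a log).
combo_log = Counter()


def record(body):
    """Record the method combination of one request body and return it."""
    combo = method_combo(body)
    combo_log[combo] += 1
    return combo
```

Writing each combination out with a timestamp would let the combinations seen just before a lockup be compared against normal traffic.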

@buffalu
Contributor Author

buffalu commented Jan 28, 2023

> Curious: did websocket / pubsub connections that were opened before still transmit data?

Yes, and so did geyser. Everything kept running except RPC.

> If you were to direct request workload through a reverse proxy you might be able to create a log of the RPC method combination that led to the issue

The chart I sent is from our proxy, showing the RPC methods that caused it.

github-actions bot added the stale label Jan 29, 2024
github-actions bot closed this as not planned Feb 5, 2024