Fix: Use lock properly in filter_order_by_nonces to do it concurrently #38
Conversation
Benchmark results
- Date (UTC): 2024-07-13T16:06:03+00:00
- Commit: 465eadbd19a1cb2d62f8fe829e819e7db6e1a902
- Base SHA: db634fcb90a6b2fc484e19f02a59a5b74ffeefa2
- Significant changes: none
Thanks for this PR! Really appreciate it and good catch :)
I think there is a potential race condition in the nonce cache update logic. Currently, the code checks whether a value exists, fetches it if not, and then writes it back. In concurrent scenarios this can overwrite a more recent value: imagine two transactions from account X being processed in different threads.
Here's a scenario that illustrates the problem (a short sketch of the racy pattern follows the list):
- Thread A checks the cache for address X, doesn't find it
- Thread B checks the cache for address X, doesn't find it
- Thread A fetches nonce N1 from the network for address X
- Thread B fetches nonce N2 from the network for address X (N2 > N1 because it's more recent)
- Thread B writes N2 to the cache
- Thread A writes N1 to the cache, overwriting the more recent N2
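To make the race concrete, here is a small standalone sketch (not this repository's code; `fetch_onchain_nonce` is a hypothetical stand-in for the provider call) of the check-then-write pattern, where the lock is released between the existence check and the insert:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for the onchain lookup: pretend thread 2 observes a
// fresher nonce than thread 1.
fn fetch_onchain_nonce(observed_by_thread: u64) -> u64 {
    observed_by_thread
}

fn main() {
    let cache: Arc<RwLock<HashMap<&str, u64>>> = Arc::new(RwLock::new(HashMap::new()));

    let handles: Vec<_> = [1u64, 2]
        .into_iter()
        .map(|observed| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                // Step 1: check under a read lock, which is released right away.
                let missing = cache.read().unwrap().get("0xabc").is_none();
                if missing {
                    // Step 2: "fetch" outside the lock; both threads can reach this point.
                    let nonce = fetch_onchain_nonce(observed);
                    // Step 3: write it back; whichever thread writes last wins and may
                    // overwrite a fresher value.
                    cache.write().unwrap().insert("0xabc", nonce);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    // Depending on scheduling, the cached value may be 1 or 2.
    println!("cached nonce = {:?}", cache.read().unwrap().get("0xabc"));
}
```

Run it a few times and the surviving value may differ between runs, which is exactly the overwrite described above.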
What do you think about using a single write lock operation that checks and updates atomically? Something like this:
```rust
let mut nonce_cache = nonce_cache
    .write()
    .map_err(|e| eyre::eyre!("Failed to acquire write lock: {}", e))?;
if !nonce_cache.contains_key(&nonce.address) {
    let address = nonce.address;
    let onchain_nonce = self
        .eth_provider
        .get_transaction_count(address)
        .block_id(BlockId::Number(parent_block.into()))
        .await
        .wrap_err("Failed to fetch onchain tx count")?;
    nonce_cache.insert(nonce.address, onchain_nonce);
}
let res_onchain_nonce = *nonce_cache.get(&nonce.address).unwrap();
```
@bertmiller Spot-on analysis that there is an overwrite of the nonce value for the same address X. However, the nonce being fetched is the nonce of an address at the parent block, which should be the same regardless of the order in which the onchain requests complete.
Thanks!!!!!
Force-pushed from b5ff532 to b671c02
Force-pushed from b671c02 to 99fde4a
Thanks for your feedback. At the
It's just to make the code more robust: in most cases Rust lets us avoid `unwrap()`, which might fail if something changes in the future.
This way the code enforces correctness at compile time through its structure, rather than relying on runtime logic.
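For example, here is a minimal synchronous sketch (with a hypothetical `fetch_onchain_nonce` helper standing in for the provider call, not the PR's actual code) showing how the `HashMap::entry` API removes the need for a trailing `get(..).unwrap()`:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-in for the onchain lookup; the real code awaits
// eth_provider.get_transaction_count(..) instead.
fn fetch_onchain_nonce(_address: &str) -> u64 {
    42
}

// Returns the cached nonce for `address`, fetching and caching it on a miss.
fn cached_nonce(cache: &RwLock<HashMap<String, u64>>, address: &str) -> u64 {
    let mut cache = cache.write().expect("nonce cache lock poisoned");
    // `entry(..).or_insert_with(..)` either returns the existing value or
    // inserts the fetched one, so no `get(..).unwrap()` is needed afterwards.
    *cache
        .entry(address.to_string())
        .or_insert_with(|| fetch_onchain_nonce(address))
}

fn main() {
    let cache = RwLock::new(HashMap::new());
    assert_eq!(cached_nonce(&cache, "0xabc"), 42);
    // The second call hits the cache instead of fetching again.
    assert_eq!(cached_nonce(&cache, "0xabc"), 42);
}
```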
It makes sense.
Force-pushed from e6f0c9b to 465eadb
📝 Summary
The current use of a `Mutex` on `nonce_cache` in `filter_order_by_nonces` is not correct: it makes the function run sequentially, because every task has to wait for the previous one to finish (for both reads and writes) before doing its own work. I changed it to use an `RwLock` instead. Before the change it took 18 minutes to fetch a block with 2,400 orders; now it takes 18 seconds with 400 concurrent requests. The file is also linted.
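As a rough illustration of why the switch helps (a standalone sketch, not this repository's code), an `RwLock` lets concurrent readers proceed in parallel, whereas a `Mutex` would serialize them:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared nonce cache. With an RwLock, concurrent lookups can hold read
    // locks at the same time; a Mutex would make them wait in line.
    let cache: Arc<RwLock<HashMap<&str, u64>>> =
        Arc::new(RwLock::new(HashMap::from([("0xabc", 7)])));

    let readers: Vec<_> = (0..4)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                // Read guards do not block each other.
                cache.read().unwrap().get("0xabc").copied()
            })
        })
        .collect();

    for handle in readers {
        assert_eq!(handle.join().unwrap(), Some(7));
    }

    // Writes still take the lock exclusively.
    cache.write().unwrap().insert("0xdef", 11);
}
```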
💡 Motivation and Context
✅ I have completed the following steps:
- [x] Run `make lint`
- [x] Run `make test`
- [ ] Added tests (if applicable)