-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle some gas corner-cases #2537
Conversation
…bc-rs into 2487_gas_fixes
…nerror IbcEvent retrieved from submitting a client update
relayer/src/foreign_client.rs
Outdated
if let Some(ibc_events) = res.clone() { | ||
match ibc_events | ||
.into_iter() | ||
.find(|e| e.event_type() == IbcEventType::ChainError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document the assumption we're making.
.find(|e| e.event_type() == IbcEventType::ChainError) | |
// We make the assumption that we push a single `ChainError` variant | |
.find(|e| e.event_type() == IbcEventType::ChainError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, alternatively, we could have strong type guarantees on this. CC @soareschen @romac
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @soareschen, I think we should get rid of these ChainError
events altogether and figure out a better way to bubble up such errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked in #2565
relayer/src/foreign_client.rs
Outdated
self.dst_chain().id(), | ||
ev, | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many different execution branches and it's not clear what are the post-conditions of the refresh
method. Can we briefly describe that?
Let's maybe also put a comment to describe what is this match arm for. There is also an Ok(res)
and a Err(e
branches.
relayer/src/worker/client.rs
Outdated
@@ -14,6 +14,9 @@ use crate::{ | |||
|
|||
use super::WorkerCmd; | |||
|
|||
const TRUSTING_PERIOD_DIVIDER: u32 = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like for Tendermint clients we would refresh every ~3 hours if the trusting period is 18 days. This seems a bit too infrequest. We should retry more frequently.
Notes from discussing with Luca: It seems that there are 3 cases:
|
Co-authored-by: Adi Seredinschi <[email protected]>
…bc-rs into 2487_gas_fixes
…ent delay only if 'refresh()' fails
// Check that the gas_multiplier is greater than or equal to 1.0 | ||
if let Some(gas_multiplier) = config.gas_multiplier { | ||
if gas_multiplier < 1.0 { | ||
return Err(Diagnostic::Error(Error::invalid_gas_multiplier( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take care to also remove the Error
enum variant if it is no longer in use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh thank you, I forgot to remove it.
assert!(gas_multiplier >= 1.0); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather keep it here, just in case someone removes the checks in a refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assert!
was replaced so that the test, in client_refresh.rs
, using a gas_multiplier = 0.8
to trigger a failed refresh()
runs correctly. The checks are now in the new GasMultiplier
struct:
https://github.com/informalsystems/ibc-rs/blob/5c522c4d72659ceec7b81ee31ecc3dddf9460700/relayer/src/config/gas_multiplier.rs#L18-L27
I am not sure if there is a way to easily reproduce the failure while keeping a gas_multiplier >= 1.0
.
fn gas_multiplier_from_config(config: &ChainConfig) -> f64 { | ||
config.gas_multiplier.unwrap_or(DEFAULT_GAS_MULTIPLIER) | ||
pub fn gas_multiplier_from_config(config: &ChainConfig) -> f64 { | ||
config.gas_multiplier.unwrap_or_default().to_f64() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
relayer/src/config/types.rs
Outdated
@@ -258,6 +258,86 @@ pub mod memo { | |||
} | |||
} | |||
|
|||
pub use gas_multiplier::GasMultiplier; | |||
|
|||
pub mod gas_multiplier { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to create a separate module file, this is quite long already for an inline module.
@@ -459,6 +459,15 @@ define_error! { | |||
e.chain_id, e.default_gas, e.max_gas) | |||
}, | |||
|
|||
ConfigValidationGasMultiplierLow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many ConfigValidationThisAndThat
variants in this enum. Prefixing in Rust usually means we should refactor things in a more elegant and concise way.
I wonder if there is a ConfigValidationError
enum hiding here, which would make one variant of relayer::Error
and be also convertible to that with a From
impl. Maybe not for this PR, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open an issue for this
relayer/src/foreign_client.rs
Outdated
Ok(res) => { | ||
// If elapsed < refresh_window for the client, `try_refresh()` will | ||
// be successful with an empty vector. | ||
if let Some(ibc_events) = res.clone() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could avoid cloning the whole collection here: if it destructures to Some(ibc_events)
, you already own it, if it's None
, you don't need the res
variable to return Ok(None)
. Can also use convenience methods here like Option::map
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so I removed the clone()
in commit 47a49b9, but I am not sure I understand how Option::map
could be used here. Would the idea be to apply the .map()
to res
in order to avoid the Ok(None)
when res
is None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see if res.map(|ibc_events| { ... })
could be more elegant here.
… enum in 'relayer-cli/src/config.rs'
relayer/src/foreign_client.rs
Outdated
@@ -787,17 +787,17 @@ impl<DstChain: ChainHandle, SrcChain: ChainHandle> ForeignClient<DstChain, SrcCh | |||
Ok(res) => { | |||
// If elapsed < refresh_window for the client, `try_refresh()` will | |||
// be successful with an empty vector. | |||
if let Some(ibc_events) = res.clone() { | |||
if let Some(ibc_events) = &res { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is better, but I think you can just destructure res
since this scope owns it and it's not used anywhere else.
if let Some(ibc_events) = &res { | |
if let Some(ibc_events) = res { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the issue I am having is that res
is used at this line.
In order to have a more elegant refresh()
method, a part of it was extracted in a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can rewrite refresh
as follows to avoid such nesting:
pub fn refresh(&mut self) -> Result<Option<Vec<IbcEvent>>, ForeignClientError> {
match self.try_refresh() {
// If elapsed < refresh_window for the client, `try_refresh()` will
// be successful with an empty vector.
Ok(Some(ibc_events)) => Self::verify_chain_error(ibc_events, self.dst_chain().id()),
// Return the successful empty vector.
Ok(None) => Ok(None),
// Return the error
Err(e) => Err(e),
}
}
or by using the ?
operator on the result of try_refresh
:
pub fn refresh(&mut self) -> Result<Option<Vec<IbcEvent>>, ForeignClientError> {
// If elapsed < refresh_window for the client, `try_refresh()` will
// be successful with an empty vector.
match self.try_refresh()? {
Some(events) => Self::verify_chain_error(events, self.dst_chain().id()),
None => Ok(None),
}
}
or even like this, but I find the latter harder to read:
pub fn refresh(&mut self) -> Result<Option<Vec<IbcEvent>>, ForeignClientError> {
// If elapsed < refresh_window for the client, `try_refresh()` will
// be successful with an empty vector.
self.try_refresh().and_then(|opt_events| {
opt_events
.map(|events| Self::verify_chain_error(events, self.dst_chain().id()))
.transpose()
.map(Option::flatten)
})
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative is to have verify_chain_error
(renamed below to check_no_errors
) return a Result<(), _>
:
pub fn refresh(&mut self) -> Result<Option<Vec<IbcEvent>>, ForeignClientError> {
fn check_no_errors(
ibc_events: &[IbcEvent],
dst_chain_id: ChainId,
) -> Result<(), ForeignClientError> {
// The assumption is that only one IbcEventType::ChainError will be
// in the resulting Vec<IbcEvent> if an error occurred.
let chain_error = ibc_events
.iter()
.find(|&e| e.event_type() == IbcEventType::ChainError);
match chain_error {
None => Ok(()),
Some(ev) => Err(ForeignClientError::chain_error_event(
dst_chain_id,
ev.to_owned(),
)),
}
}
// If elapsed < refresh_window for the client, `try_refresh()` will
// be successful with an empty vector.
if let Some(events) = self.try_refresh()? {
check_no_errors(&events, self.dst_chain().id())?;
Ok(Some(events))
} else {
Ok(None)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I like the latter solution the best, as it seems a bit weird to me to have veriy_chain_errors
return an Option<Vec<_>>
where the Option
will always be Some
, ie. the signature does not really tell us what is going on, and could also have us think that the events that are returned are not the very same than the one which are passed in.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed solutions are great, and I agree that the last one is better, as the check_no_errors
makes it more understandable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Closes: #2487
Description
Error 1: "Too much gas wanted"
For this issue Hermes report has been changed from "success" to "submitted" to avoid confusion.
Error 2: "out of gas in location" during
CheckTx
A warning during the health-check is displayed if the
gas_multiplier
is lower than 1.1Error 3: "out of gas in location" during DeliverTx
The interval between client. In addition, the clientrefresh
calls has been changed to be the client'strusting_period/100
refresh
method has been updated in order to catchChainError
resulting events. The interval betweenrefresh()
calls which fail to update the client increases using the Fibonacci sequence.PR author checklist:
unclog
.docs/
).Reviewer checklist:
Files changed
in the GitHub PR explorer.