Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal Review: QUIC send back errors backport 1.14.12 #20

Closed
wants to merge 16 commits into from

Conversation

mschneider
Copy link

No description provided.

@mschneider mschneider marked this pull request as draft January 18, 2023 04:59
core/src/qos_service.rs Outdated Show resolved Hide resolved
core/src/banking_stage.rs Outdated Show resolved Hide resolved
@@ -4242,27 +4245,51 @@ impl Bank {
.filter_map(|(index, res)| match res {
// following are retryable errors
Err(TransactionError::AccountInUse) => {
bidirection_reply_service.send_message(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to trigger this from two places?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because these types of errors is not tracked in qos_service.rs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see, so they are exhaustive and non-overlapping?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally yes but still will confirm it while doing final review.


pub fn get_signature_from_packet(packet: &Packet) -> Option<Signature> {
// add instruction signature for message
match packet.data(1..65) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for a real PR, it would be better to make this a bit more robust, for testing lite rpc it's probably ok

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, noted

match task {
Ok(message) => {
let serialized_message = serialize(&message).unwrap();
let _ =send_stream.write(serialized_message.as_slice()).await;
Copy link
Author

@mschneider mschneider Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure this is ideal, do you know if there's any buffering happening inside send_stream.write?
It's not very efficient for the network, when you send a lot of messages with 68 byte length, better to buffer at least to a few 100 bytes.

please add some metrics here, so we get an idea of how many messages / s this processes

Copy link

@godmodegalactus godmodegalactus Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sending in batchs of size 16

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont think batching is important, actually think that's premature optimization, just important to measure how often it is called

Copy link

@godmodegalactus godmodegalactus Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some testing and to correctly handle the replies in reply handler I had to make the QuicReplyMessage of fixed length of 200 bytes. On the handle side I read the chunks and when the chunk is greater than 200 bytes I get to know that we got a reply.
This is different than the quic connection in the TPU because when client sends all the messages to the tpu server we close the connection so we know that all packets are transfered. On contrary we keep the send channel open on server side until it is closed by the client. So in this case client will never know when it will get the last message.

}
Err(e) => {
// recv error
warn!("got {} on quic bidirectional channel", e.to_string());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better use metrics, not sure how often this error can be triggered in production

@godmodegalactus
Copy link

@mschneider will add the metrics after testing the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants