-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix average write duration and extend server stats #52
Conversation
The average duration that is taken by a miss or hit is interesting when testing different cache types. This patch adds the average time taken by a cache miss and hit to the server stats. This measurements can be entirely done in the server and do not need any changes to the cache implementations. Furthermore the calculation of the average duration for cache writes is fixed (wrong factor used, that produced micros instead of millis).
Just a note: I played around with a moving average calculation for the durations. The measurements are the same as in this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I've wanted to add cache hit/miss timings for a while, that's really useful. I just have one question, otherwise this looks fine and I'll merge it.
cache_writes: u64::default(), | ||
cache_write_duration: Duration::new(0, 0), | ||
cache_read_hit_duration: Duration::new(0, 0), | ||
cache_read_miss_duration: Duration::new(0, 0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoided doing this because I was too lazy to spell all this out, and Duration
didn't have a Default
implementation. That'll be fixed in Rust 1.15, but for now this is a reasonable thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not think a second about Default
for Duration
until rustc
told me some facts and I wrote this bloaty method.
@@ -146,15 +145,41 @@ struct ServerStats { | |||
/// The count of errors writing to cache. | |||
pub cache_write_errors: u64, | |||
/// The number of successful cache writes. | |||
pub cache_writes: u32, | |||
pub cache_writes: u64, | |||
/// The total seconds spent writing cache entries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could change the description of these to just "The total time spent..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ups. I missed that and will update the PR.
stat.set_name(String::from($name)); | ||
if $num > 0 { | ||
let duration = $dur / $num as u32; | ||
stat.set_str(format!("{}.{:03} s", duration.as_secs(), duration.subsec_nanos() / 1000_000)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wish Duration
had an as_fractional_secs
method or something like that. I filed an issue on that. Also: oops on the conversion bug. I wish there was at least a constant in the stdlib we could use for it!
let duration = $dur / $num as u32; | ||
stat.set_str(format!("{}.{:03} s", duration.as_secs(), duration.subsec_nanos() / 1000_000)); | ||
} else { | ||
stat.set_str("0.000 s".to_owned()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there'd be any value in adding a Duration
case to the CacheStatistic
enum? We don't currently do anything interesting with it, although I'd like to add a way to have --show-stats
output a JSON representation of the stats for consumption by other programs. (Granted you're just duplicating what I previously implemented here...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Duration
in CacheStatistic
came to my mind when thinking about #12 in order to have a two column display with value and unit (like in the ccache
output):
max cache size 5.0 GB
some duration 433 ms
other duration 0.324 s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. We can fix that later, it would make fixing that nicer certainly.
@@ -146,15 +145,41 @@ struct ServerStats { | |||
/// The count of errors writing to cache. | |||
pub cache_write_errors: u64, | |||
/// The number of successful cache writes. | |||
pub cache_writes: u32, | |||
pub cache_writes: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you're changing this to u64
? You're just doing a conversion in the set_duration_stat
macro below anyway...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to u64
because all other counters are u64
and it looked odd. That's it.
I thought you just use u32
because of the division for the time calculation.
Shall I revert that one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say I have strong feelings either way. I was mostly just curious to make sure I wasn't missing something important about the patch.
I think the current behavior is fine for now. When we start persisting stats we might have to do something smarter. We're going to want to rework cache timing stats soon anyway, since we're going to implement support for fetching from multiple cache sources and other fanciness, so we'll probably want to store cache timing stats per-backend instance. |
Thanks for the patch! |
The average duration that is taken by a miss or hit is interesting
when testing different cache types. This patch adds the average time
taken by a cache miss and hit to the server stats. This measurements
can be entirely done in the server and do not need any changes to the
cache implementations.
Furthermore the calculation of the average duration for cache writes is
fixed (wrong factor used, that produced micros instead of millis).