Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix average write duration and extend server stats #52

Merged
merged 2 commits into from
Dec 19, 2016

Conversation

flxo
Copy link
Contributor

@flxo flxo commented Dec 18, 2016

The average duration that is taken by a miss or hit is interesting
when testing different cache types. This patch adds the average time
taken by a cache miss and hit to the server stats. This measurements
can be entirely done in the server and do not need any changes to the
cache implementations.
Furthermore the calculation of the average duration for cache writes is
fixed (wrong factor used, that produced micros instead of millis).

The average duration that is taken by a miss or hit is interesting
when testing different cache types. This patch adds the average time
taken by a cache miss and hit to the server stats. This measurements
can be entirely done in the server and do not need any changes to the
cache implementations.
Furthermore the calculation of the average duration for cache writes is
fixed (wrong factor used, that produced micros instead of millis).
@flxo
Copy link
Contributor Author

flxo commented Dec 19, 2016

Just a note: I played around with a moving average calculation for the durations. The measurements are the same as in this PR.
If the server is constantly running or the stats are persisted this might be worth to consider - I just did't want to PR something that changes the current behavior...

Copy link
Contributor

@luser luser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've wanted to add cache hit/miss timings for a while, that's really useful. I just have one question, otherwise this looks fine and I'll merge it.

cache_writes: u64::default(),
cache_write_duration: Duration::new(0, 0),
cache_read_hit_duration: Duration::new(0, 0),
cache_read_miss_duration: Duration::new(0, 0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I avoided doing this because I was too lazy to spell all this out, and Duration didn't have a Default implementation. That'll be fixed in Rust 1.15, but for now this is a reasonable thing to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not think a second about Default for Duration until rustc told me some facts and I wrote this bloaty method.

@@ -146,15 +145,41 @@ struct ServerStats {
/// The count of errors writing to cache.
pub cache_write_errors: u64,
/// The number of successful cache writes.
pub cache_writes: u32,
pub cache_writes: u64,
/// The total seconds spent writing cache entries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could change the description of these to just "The total time spent..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups. I missed that and will update the PR.

stat.set_name(String::from($name));
if $num > 0 {
let duration = $dur / $num as u32;
stat.set_str(format!("{}.{:03} s", duration.as_secs(), duration.subsec_nanos() / 1000_000));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really wish Duration had an as_fractional_secs method or something like that. I filed an issue on that. Also: oops on the conversion bug. I wish there was at least a constant in the stdlib we could use for it!

let duration = $dur / $num as u32;
stat.set_str(format!("{}.{:03} s", duration.as_secs(), duration.subsec_nanos() / 1000_000));
} else {
stat.set_str("0.000 s".to_owned());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there'd be any value in adding a Duration case to the CacheStatistic enum? We don't currently do anything interesting with it, although I'd like to add a way to have --show-stats output a JSON representation of the stats for consumption by other programs. (Granted you're just duplicating what I previously implemented here...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Duration in CacheStatistic came to my mind when thinking about #12 in order to have a two column display with value and unit (like in the ccache output):


max cache size            5.0 GB
some duration             433 ms
other duration          0.324 s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. We can fix that later, it would make fixing that nicer certainly.

@@ -146,15 +145,41 @@ struct ServerStats {
/// The count of errors writing to cache.
pub cache_write_errors: u64,
/// The number of successful cache writes.
pub cache_writes: u32,
pub cache_writes: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you're changing this to u64? You're just doing a conversion in the set_duration_stat macro below anyway...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to u64 because all other counters are u64 and it looked odd. That's it.
I thought you just use u32 because of the division for the time calculation.
Shall I revert that one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say I have strong feelings either way. I was mostly just curious to make sure I wasn't missing something important about the patch.

@luser
Copy link
Contributor

luser commented Dec 19, 2016

Just a note: I played around with a moving average calculation for the durations. The measurements are the same as in this PR.
If the server is constantly running or the stats are persisted this might be worth to consider - I just did't want to PR something that changes the current behavior...

I think the current behavior is fine for now. When we start persisting stats we might have to do something smarter. We're going to want to rework cache timing stats soon anyway, since we're going to implement support for fetching from multiple cache sources and other fanciness, so we'll probably want to store cache timing stats per-backend instance.

@luser luser merged commit f2f69e8 into mozilla:master Dec 19, 2016
@luser
Copy link
Contributor

luser commented Dec 19, 2016

Thanks for the patch!

@flxo flxo deleted the average branch December 19, 2016 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants