Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream metrics 1.5 #1010

Closed
wants to merge 24 commits into from
Closed

Conversation

mattrjacobs
Copy link
Contributor

Rebased #981 and #986 against master

Matt Jacobs added 19 commits December 4, 2015 15:38
… to RollingNumber/RollingPercentile as well)
* Added HystrixServoMetricsPublisherCommandTest as a concrete unit test that behavior is still correct
* They get calculated only on health count intervals
…el, to thread-level.

* Added global-level stream as a way to recover command-level streams
@mattrjacobs
Copy link
Contributor Author

(copied from comment in #981)

This refactoring addresses #950. It is a large change, so I want to give the community a chance to look it over/ask questions before I start in on the 1.5.0 RC series (this will be the first commit there, if satisfactory).

The main changes in this PR:

  • Instead of command executions writing directly to HystrixRollingNumber/HystrixRollingPercentile data structures, they now write to a Subject that models a stream of command execution events.
  • HystrixCommandMetrics now consumes this stream in 4 ways:
    • Cumulative counter of event types
    • Rolling counter of event types
    • Rolling distribution of command latencies
    • Health counts of a circuit
  • Sample-based stream of thread pool / queue utilization

This change has not hurt performance when used in Netflix production for ~24 hours. I will publish jmh results here as well. I'll also start linking more Github issues to this ticket as I build on this work.

There should be no visible change in application behavior as a result of this change (see caveats below). However, it enables future work to add new functionality. (like #333)

Caveats (known differences in behavior)

  • Maximum concurrency of command is always 0 (HystrixCommandMetrics#getRollingMaxConcurrentExecutions)
  • HdrHistogram is used for latency distributions. Percentile calculation is different from HystrixRollingPercentile.
  • On application startup, an object pool of Histograms is allocated. This is currently hardcoded to 1000 objects. It's an area I want to explore more deeply

Comments/questions are welcomed and encouraged.

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #276 FAILURE
Looks like there's a problem with this pull request

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #278 FAILURE
Looks like there's a problem with this pull request

@cloudbees-pull-request-builder

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #284 FAILURE
Looks like there's a problem with this pull request

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #285 FAILURE
Looks like there's a problem with this pull request

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #286 FAILURE
Looks like there's a problem with this pull request

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #287 SUCCESS
This pull request looks good

@mattrjacobs
Copy link
Contributor Author

Closed in favor of rebased #1029

@mattrjacobs mattrjacobs deleted the stream-metrics-1.5 branch December 29, 2015 07:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants