Support raw Prometheus metrics from models #1651

Closed

haykinson opened this issue Apr 1, 2020 · 8 comments

Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Currently in our model code we gather metrics using OpenCensus, then have to marshal that data into the format that the /metrics endpoint uses. It would be great if we could, instead, just use a Prometheus exporter for OpenCensus and publish those metrics in its native format, to be scraped as per the implementation #1507.
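
For illustration, a rough sketch of the kind of setup this would enable (not current Seldon functionality; the opencensus-ext-prometheus exporter API is shown as I understand it, and the namespace and port are made up):

```python
# Illustrative sketch: register the OpenCensus Prometheus stats exporter so
# recorded views are served natively on their own port, instead of being
# re-marshalled into the wrapper's metrics() format.
from opencensus.ext.prometheus import stats_exporter as prometheus
from opencensus.stats import stats as stats_module

exporter = prometheus.new_stats_exporter(
    prometheus.Options(namespace="my_model", port=9090)  # namespace/port are made up
)
stats_module.stats.view_manager.register_exporter(exporter)
# Prometheus would then scrape <pod-ip>:9090/metrics directly.
```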

@haykinson added the triage label on Apr 1, 2020
@RafalSkolasinski (Contributor)

Hi, I'm not sure I understand what you want to achieve. Do you want to bypass the logic in the wrapper that transforms the output of the metrics method and exposes it, and instead provide the content for /metrics on your own?

@haykinson (Author)

Correct.

@ukclivecox removed the triage label on Apr 2, 2020
@ryandawsonuk (Contributor)

So basically you'd like to add your own custom endpoint for metrics and configure prometheus to scrape it? And you're using the python wrapper?

@haykinson (Author)

Correct, we're using the python wrapper (with 1.0.2 right now). However, we're trying to capture measures (internally within our model) using OpenCensus. We then try to expose them using the metrics method, but Seldon assumes that each COUNTER metric is actually additive to the previous one that was captured (i.e. on each call it performs effectively old += new). Since we're keeping the count measure ourselves, we're in effect already doing the counter aggregation, and it feels like it would be easier for us to just expose the metrics via the Prometheus exporter for OpenCensus, rather than proxy the metric via Seldon. The Prometheus exporter runs a separate server on a separate port, I think, but the key aspect is that we don't let Seldon manipulate what we emit.

As an aside, maybe there's an easier way of handling this issue? My goal is to measure failures / latencies internal to my code, classified with tags etc. Given that the model class is instantiated at service startup, and that predict() can be called separately from metrics(), we would need to keep our own aggregations that are then returned by metrics(). The way I see it:

  1. predict() called. Increment counter from 0 to 1
  2. metrics() called. Return counter value of 1, reset to 0.
  3. predict() called. Increment counter from 0 to 1.
  4. predict() called. Increment counter from 1 to 2.
  5. metrics() called. Return counter value of 2, reset to 0.
  6. metrics() called. Return counter value of 0 or don't return the counter.

This is part of the reason we're using OpenCensus to return aggregates. But OpenCensus doesn't typically provide a way to reset a count, so the value we return just keeps increasing every time metrics() is called. My workaround right now is to keep a counter metric internal to my code, but expose it within metrics() as a GAUGE instead.
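
A rough sketch of that workaround (class, helper, and metric names are made up):

```python
class MyModel:
    def __init__(self):
        self._failure_count = 0  # monotonically increasing, never reset

    def predict(self, X, features_names=None):
        try:
            return self._do_predict(X)  # made-up internal helper
        except Exception:
            self._failure_count += 1
            raise

    def metrics(self):
        # Exposed as GAUGE so the wrapper reports the value as-is,
        # rather than adding it to a running total as it does for COUNTER.
        return [{"type": "GAUGE", "key": "my_failure_count", "value": self._failure_count}]
```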

If I misunderstand the way this works and you suggest a better way of handling this than exposing our aggregates to prometheus directly, please advise.

@RafalSkolasinski (Contributor) commented on Apr 6, 2020

@haykinson Would it address your use case if, for example, the metrics method could return a RAW_COUNTER type that is exposed in the same format as COUNTER but is not auto-incremented (it just uses the value provided)?
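
Hypothetically (RAW_COUNTER does not exist today; this is only a sketch of the proposal), that might look like:

```python
def metrics(self):
    # Proposed RAW_COUNTER: rendered as a Prometheus counter, but the value
    # is taken verbatim instead of being added to the previously exposed total.
    return [{"type": "RAW_COUNTER", "key": "my_failures_total", "value": self._failure_count}]
```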

@haykinson (Author)

Yeah, I suppose that could work, as could a GAUGE in that case. Even better would be to let the metrics endpoint be scraped by Prometheus from a custom port / path (e.g. 0.0.0.0:9090/metrics), so that it doesn't have to be served from the model class; that way histograms would be servable too, without depending on any Seldon wrapper translation that may happen.
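
For reference, a sketch of what side-stepping the wrapper could look like today with the plain prometheus_client package (the names and port are made up, and the extra port would still need to be exposed in the pod spec and added to the Prometheus scrape config):

```python
from prometheus_client import Histogram, start_http_server

# Served directly by prometheus_client on its own port, independent of the
# Seldon wrapper's /metrics translation.
PREDICT_LATENCY = Histogram(
    "my_model_predict_latency_seconds", "Latency of internal predict calls"
)

class MyModel:
    def __init__(self):
        start_http_server(9090)  # port is illustrative

    def predict(self, X, features_names=None):
        with PREDICT_LATENCY.time():
            return X  # stand-in for the real prediction logic
```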

@seldondev (Collaborator)

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale

@seldondev added the lifecycle/stale label on May 7, 2020
@axsaucedo changed the title from "Support raw Prometheus metrics from models" to "OSS-50: Support raw Prometheus metrics from models" on Apr 26, 2021
@axsaucedo changed the title back to "Support raw Prometheus metrics from models" on Apr 28, 2021
@ukclivecox (Contributor)

Will close this. Can you reopen if still a requirement?
