Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace regex to escape special chars with simple binary comprehension #121

Merged

Conversation

RoadRunnr
Copy link
Contributor

Running regex (especially three times in a row) is much more expensive
that a binary comprehension. This makes little difference for small
data sets, for anything beyond a few hundred items it becomes measurable.
Data sets beyond a few thousand items are completely unusable without
this change.

Running regex (especially three times in a row) is much more expensive
that a binary comprehension. This makes little difference for small
data sets, for anything beyond a few hundred items it becomes measurable.
Data sets beyond a few thousand items are completely unusable without
this change.
@deadtrickster
Copy link
Owner

Hi, thank you for the PRs. Do you have benchmark code at hand? It's ok if these are the follow ups for real world observations, no problem. I want to have something scraping-related in the benchmarks folder for some time already.

@deadtrickster
Copy link
Owner

I also wonder if it makes sense to do just file:write instead of constructing binary and then doing write. OTOH escaped labels/help strings could be just cached on first encounter.

@essen essen requested a review from gerhard April 26, 2021 20:10
@RoadRunnr
Copy link
Contributor Author

RoadRunnr commented Apr 27, 2021

Hi, thank you for the PRs. Do you have benchmark code at hand? It's ok if these are the follow ups for real world observations, no problem. I want to have something scraping-related in the benchmarks folder for some time already.

I don't have a real benchmark, just a few lines that create a configurable amount of metrics and then I try to read them. For this PR I read them through a cowboy handler, for #120 is use timer:tc, e.g.:

Callback = fun (Registry, Collector) -> {Time, _} = timer:tc(fun() -> prometheus_collector:collect_mf(Registry, Collector, fun({T, _}) -> io:format("T: ~p~n", [T]), ok; (_) -> ok end) end), io:format("~p, ~p, ~p~n", [Time, Registry, Collector]) end.
timer:tc(fun() -> prometheus_registry:collect(default, Callback) end). 

@deadtrickster
Copy link
Owner

Hey, maybe you can share those lines too? it's totally possible to timer:tc the format call.

@RoadRunnr
Copy link
Contributor Author

I have pushed the bits and pieces that I used into RoadRunnr/prom

@deadtrickster
Copy link
Owner

thanks, I will play a bit and this will be merged soon.

@RoadRunnr
Copy link
Contributor Author

I have two further speed ups in this branch https://github.com/RoadRunnr/prometheus.erl/commits/experimental/speedup.

For 200k histograms with 10 data points each, I get a ~60% speedup in the text formatting on top of the other improvements (from ~13 sec down to ~5 sec).

Shall I open a new PR with the combined changes or wait till you have merge this PR and #120 ?

@deadtrickster
Copy link
Owner

I think having "replace io_lib:format with faster helpers" commit in this MR makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants