-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace regex to escape special chars with simple binary comprehension #121
replace regex to escape special chars with simple binary comprehension #121
Conversation
Running regex (especially three times in a row) is much more expensive that a binary comprehension. This makes little difference for small data sets, for anything beyond a few hundred items it becomes measurable. Data sets beyond a few thousand items are completely unusable without this change.
Hi, thank you for the PRs. Do you have benchmark code at hand? It's ok if these are the follow ups for real world observations, no problem. I want to have something scraping-related in the benchmarks folder for some time already. |
I also wonder if it makes sense to do just file:write instead of constructing binary and then doing write. OTOH escaped labels/help strings could be just cached on first encounter. |
I don't have a real benchmark, just a few lines that create a configurable amount of metrics and then I try to read them. For this PR I read them through a cowboy handler, for #120 is use
|
Hey, maybe you can share those lines too? it's totally possible to timer:tc the format call. |
I have pushed the bits and pieces that I used into RoadRunnr/prom |
thanks, I will play a bit and this will be merged soon. |
I have two further speed ups in this branch https://github.com/RoadRunnr/prometheus.erl/commits/experimental/speedup. For 200k histograms with 10 data points each, I get a ~60% speedup in the text formatting on top of the other improvements (from ~13 sec down to ~5 sec). Shall I open a new PR with the combined changes or wait till you have merge this PR and #120 ? |
I think having "replace io_lib:format with faster helpers" commit in this MR makes sense. |
Running regex (especially three times in a row) is much more expensive
that a binary comprehension. This makes little difference for small
data sets, for anything beyond a few hundred items it becomes measurable.
Data sets beyond a few thousand items are completely unusable without
this change.