-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regex code optimization candidates #25043
Comments
@dnickless I believe you mentioned in the past you were interested in perf optimizations? |
Thanks, @danmosemsft, for pinging me. I don't have an awful lot of time these days so I can't promise anything but I will do my best. |
Although RTL is a global option only, it's used internally to implement lookbehind assertions. So it is a dynamic value. Updated text above. |
For the stringmatch case, I tried the diff below and it made perf_regex.match a few % slower:
Still, @tarekgh it's a shame that TextInfo.ToLower(char) allocates a string (albeit 1 character string) even when the output is the same character. |
@danmosemsft @stephentoub talked to me this morning regarding the exact same issue and he mentioned going to look at it. I can help if @stephentoub don't have cycles for that. |
I have a fix I'm perf testing. |
@stephentoub 's fix does not help ToLower(char). Perhaps stringmatch is as good as it can be right now. |
That's because TextInfo.ToLower(char) already wasn't allocating 😄 |
I meant in wall-clock. |
I've been investigating the EnsureStorage function. While it seems to be possible to rewrite it to use stackalloc spans I'm not sure if the benefit is going to be worth the effort. Altering stack and track (at minimum, possibly crawl as well) to be span requires all the associated functions (push, pop, peek and overloads for all of them) to change to take the span parameters which is going to add a lot of extra parameter passing into the Go method. It will also cause a public api change because the method is protected it will need to change from EnsureStorage() to EnsureStorage(ref Span stack, ref Span track). At the moment that would only save allocation of 32+16 bytes at the default minimum sizes. I'm not sure what the maximum sensible stack allocation size would be before switching back to heap allocation but i can't see it making a lot of difference unless you're memory constrained and running quickly through matching in loops, which doesn't seems sensible in the first place. |
@Wraith2, @ViktorHofer can show you his private branch with his spanification work which may be interesting to you |
That may be instructive, I'd like to see that. I'm not sure how it would be possible to maintain the contract in a functional state while moving to spans for the initial storage. |
This is my private branch with the ongoing Spanification work: ViktorHofer/corefx#1 I don't think that a lot of time is really spent in EnsureStorage as the runner with its buffer is cached across its Regex instance. Only the first time when the buffer is empty or when the buffer is too small the |
Your branch is good progress towards modernising the entire assembly. I don't think there is any specific element in the current code that can be optimized without having an unfortunate effect elsewhere. In particular the tight coupling of the internals of the interpreter/runner and the use of private fields and methods through reflection make it very difficult to work with. I'm not sure this should be marked as up for grabs anymore. |
@ViktorHofer can you please check if it is really appropriate for Hackathon? If yes, please unassign yourself & Dan + add comment about next steps. |
Indeed, this issue requires a lot of knowledge about the code base. Seems right to not mark it as a Hackathon candidate. I will still leave the up-for-grabs label as I would be super happy if community members find further optimization candidates. |
The following items are potential candidates for code optimization which could result in an increased performance / reduced allocation. Based on perfview data and manual code review.
Time
https://source.dot.net/#System.Text.RegularExpressions/System/Text/RegularExpressions/RegexCharClass.cs,787
comments from @danmosemsft: optimizing for common characters [A-Za-z] first? Bloom filter? It's doing a binary search over a short list
comments from myself: I think binary search shouldn't be an issue even for small classes. See this article: https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/
https://source.dot.net/#System.Text.RegularExpressions/System/Text/RegularExpressions/RegexInterpreter.cs,256
Inline optimization for short case-sensitive strings (default).
if (str[--c] != _culture.TextInfo.ToLower(runtext[--pos]))
check whether the characters are identical before doing ToLowerhttps://source.dot.net/#System.Text.RegularExpressions/System/Text/RegularExpressions/RegexInterpreter.cs,148
Store a boolean whether either of those opcodes appear in the pattern at all. Reduces costs to a Boolean check only.
https://source.dot.net/#System.Text.RegularExpressions/System/Text/RegularExpressions/RegexInterpreter.cs,249
The benchmark analyzed by perfview uses case sensitive comparison. I don't know why so much time is spent in this function when it only does two comparisons and an index operation.
https://source.dot.net/#System.Text.RegularExpressions/System/Text/RegularExpressions/RegexRunner.cs,345
Though one of the functions is called Stack it's allocating arrays on the Heap. We should investigate how often these array sizes are increased and think about using stackalloc with Span/ValueListBuilder here.
There are probably tons of other places where inline optimizations are applicable.
System.Text.RegularExpression.Inline.xlsx
Allocations
In regex-redux there are 3 huge sets of allocations that dominate the rest
Could we use spans or pooled buffers or otherwise avoid these huge temporary strings?
Help is appreciated. If someone wants to collaborate I can share my perfview zip. I will extend this list over time.
FYI @stephentoub @jkotas @joshfree @vancem
The text was updated successfully, but these errors were encountered: