Adding Regex.EnumerateMatches #67794
Note: This serves as a reminder that when your PR modifies a ref *.cs file and adds or modifies public APIs, please make sure the API implementation in the src *.cs file is documented with triple-slash comments, so the PR reviewers can sign off on that change.
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Issue Details
Adding an EnumerateMatches method that returns an enumerator for iterating over the matches in a passed-in span. The operation is amortized allocation-free.
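For readers who have not used the new API, here is a minimal sketch of how a caller might consume it. The class and method names (`EnumerateMatchesSketch`, `CountLowercaseWords`) are hypothetical and not part of this PR, and it assumes a runtime where `Regex.EnumerateMatches` and `ValueMatch` are available (.NET 7 or later). Because `ValueMatch` exposes only `Index` and `Length`, the text of a match is read back from the original span.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EnumerateMatchesSketch
{
    // Hypothetical helper, not part of the PR: counts the words in `text`
    // whose first character is a lowercase ASCII letter.
    public static int CountLowercaseWords(string text)
    {
        var regex = new Regex(@"\b\w+\b");
        ReadOnlySpan<char> span = text.AsSpan();
        int count = 0;

        // Each ValueMatch carries only an index and a length, so the
        // matched text is looked up in the input span.
        foreach (ValueMatch word in regex.EnumerateMatches(span))
        {
            char first = span[word.Index];
            if (first >= 'a' && first <= 'z')
            {
                count++;
            }
        }

        return count;
    }
}
```

Calling `EnumerateMatchesSketch.CountLowercaseWords("Lorem ipsum dolor Sit amet")` returns 3, since "ipsum", "dolor", and "amet" start with a lowercase letter.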
Here is a quick benchmark I wrote to see how this compares with the existing way of iterating over a MatchCollection using Regex.Matches:

    // The regex pattern used is "\b\w+\b" and the input is a five-paragraph lorem ipsum string.
    [Benchmark(Baseline = true)]
    public int MatchCollection()
    {
        int x = 0;
        for (int i = 0; i < 1000; i++)
        {
            // Regex.Matches allocates a MatchCollection and a Match object per match.
            foreach (Match match in regex.Matches(loremIpsum))
            {
                if (match.ValueSpan[0] >= 'a' && match.ValueSpan[0] <= 'z')
                    x++;
            }
        }
        return x;
    }

    [Benchmark]
    public int MatchEnumerator()
    {
        int x = 0;
        ReadOnlySpan<char> span = loremIpsum.AsSpan();
        for (int i = 0; i < 1000; i++)
        {
            // ValueMatch exposes only Index and Length, so the text is sliced from the input span.
            foreach (ValueMatch word in regex.EnumerateMatches(span))
            {
                if (span.Slice(word.Index, word.Length)[0] >= 'a' && span.Slice(word.Index, word.Length)[0] <= 'z')
                    x++;
            }
        }
        return x;
    }

And the results are:
@olsaarik, the current NonBacktracking code would benefit from knowing that indexes are needed but not captures. Will that still be the case after your upcoming fixes?
Fixes #65011
Fixes #23602
Adding an EnumerateMatches method that returns an enumerator for iterating over the matches in a passed-in span. The operation is amortized allocation-free.
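One rough way to observe the allocation difference without a benchmarking library is to compare bytes allocated on the current thread around each approach. The sketch below is only an illustration under stated assumptions: the names `AllocationSketch`, `MeasureAllocatedBytes`, `CountWithMatches`, and `CountWithEnumerateMatches` are hypothetical and not part of this PR, and it uses `GC.GetAllocatedBytesForCurrentThread`, whose readings depend on the runtime and on warm-up. Because the claim is amortized, the first call may still allocate.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AllocationSketch
{
    private static readonly Regex s_words = new Regex(@"\b\w+\b");

    // Returns the number of bytes allocated on the current thread while `action` runs.
    public static long MeasureAllocatedBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Existing approach: Regex.Matches builds a MatchCollection of Match objects.
    public static int CountWithMatches(string input)
    {
        int count = 0;
        foreach (Match match in s_words.Matches(input))
        {
            count++;
        }
        return count;
    }

    // New approach: EnumerateMatches yields ValueMatch structs over a span.
    public static int CountWithEnumerateMatches(string input)
    {
        int count = 0;
        foreach (ValueMatch match in s_words.EnumerateMatches(input.AsSpan()))
        {
            count++;
        }
        return count;
    }
}
```

A caller could run each counting method once as a warm-up and then pass it to `MeasureAllocatedBytes`, for example `AllocationSketch.MeasureAllocatedBytes(() => AllocationSketch.CountWithEnumerateMatches(text))`, to compare the two numbers on their own machine.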