Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex UpdateBumpalong logic should be applicable to lazy loops as well #62677

Closed
stephentoub opened this issue Dec 11, 2021 · 1 comment · Fixed by #63398
Closed

Regex UpdateBumpalong logic should be applicable to lazy loops as well #62677

stephentoub opened this issue Dec 11, 2021 · 1 comment · Fixed by #63398
Assignees
Milestone

Comments

@stephentoub
Copy link
Member

stephentoub commented Dec 11, 2021

The optimization we employ in RegexNode.FinalOptimize to insert an UpdateBumpalong node after a starting loop does so for unbounded {One/Notone/Set}loop{atomic} nodes, but not for {One/Notone/Set}lazy nodes.

case Oneloop when node.N == int.MaxValue:
case Oneloopatomic when node.N == int.MaxValue:
case Notoneloop when node.N == int.MaxValue:
case Notoneloopatomic when node.N == int.MaxValue:
case Setloop when node.N == int.MaxValue:
case Setloopatomic when node.N == int.MaxValue:

It should be applicable for lazy loops as well, as if the match ends up failing, we will have explored the same range, just effectively in the opposite direction. One thing that may need more thought is if we need to special-case not applying this for atomic lazy loops, which will always only check for the minimum bound.

(This code, and other code in RegexNode, can also be cleaned up nicely now that we have improved C# pattern matching support.)

@ghost
Copy link

ghost commented Dec 11, 2021

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

The optimization we employ in RegexNode.FinalOptimize to insert an UpdateBumpalong node after a starting loop does so for unbounded {One/Notone/Set}loop{atomic} nodes, but not for {One/Notone/Set}lazy nodes.

case Oneloop when node.N == int.MaxValue:
case Oneloopatomic when node.N == int.MaxValue:
case Notoneloop when node.N == int.MaxValue:
case Notoneloopatomic when node.N == int.MaxValue:
case Setloop when node.N == int.MaxValue:
case Setloopatomic when node.N == int.MaxValue:

It should be applicable for lazy loops as well, as if the match ends up failing, we will have explored the same range, just effectively in the opposite direction. One thing that may need more thought is if we need to special-case not applying this for atomic lazy loops, which will always only check for the minimum bound.

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions, tenet-performance

Milestone: 7.0.0

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Dec 11, 2021
@stephentoub stephentoub self-assigned this Jan 3, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 5, 2022
@stephentoub stephentoub removed the untriaged New issue has not been triaged by the area owner label Jan 6, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 15, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Feb 14, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant