Share RegexInterpreter.FindFirstChar logic with SymbolicRegexMatcher #60885

Closed
stephentoub opened this issue Oct 26, 2021 · 3 comments

Comments

@stephentoub
Member

RegexInterpreter.FindFirstChar employs multiple schemes for finding the next place a match might exist, and we're adding more. Separately, SymbolicRegexMatcher has its own use of Boyer-Moore and IndexOfAny for when it gets into a state that's guaranteed not to be part of a match and can zoom ahead to a location from which to start matching again. It should be straightforward to factor RegexInterpreter's FindFirstChar logic out into a helper that we can then use from both places. Doing so would remove the existing duplication, avoid duplicated effort moving forward, and ensure that SymbolicRegexMatcher benefits from both the existing RegexInterpreter optimizations and the new ones coming in the near future.
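
To illustrate the shape of the proposed refactoring, here is a minimal sketch in Python rather than C#. It is not the actual .NET code: `FindFirstCharHelper`, `find_next_candidate`, `scan`, and the `leading_prefix` / `leading_chars` fields are hypothetical names chosen for this example, and a plain substring search stands in for a Boyer-Moore scan. The idea it shows is a single helper that picks a skip-ahead strategy from what is known about the pattern, with a driver loop that either engine could call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class FindFirstCharHelper:
    """Hypothetical shared helper: chooses a strategy from what is known about the pattern."""
    leading_prefix: str = ""                # literal every match must start with, if known
    leading_chars: frozenset = frozenset()  # characters a match may start with, if known

    def find_next_candidate(self, text: str, start: int) -> Optional[int]:
        """Return the first index >= start where a match could begin, or None."""
        if self.leading_prefix:
            # Substring search stands in for a Boyer-Moore style scan.
            pos = text.find(self.leading_prefix, start)
            return pos if pos >= 0 else None
        if self.leading_chars:
            # Scan for any possible first character (an IndexOfAny analogue).
            for i in range(start, len(text)):
                if text[i] in self.leading_chars:
                    return i
            return None
        # Nothing known about the pattern: every remaining position is a candidate.
        return start if start <= len(text) else None


def scan(
    helper: FindFirstCharHelper,
    text: str,
    match_at: Callable[[str, int], Optional[int]],
) -> Optional[Tuple[int, int]]:
    """Driver loop: jump to a candidate, try the engine there, and repeat on failure.

    `match_at` is the engine-specific part (backtracking or automaton based);
    it returns the end index of a match starting at `pos`, or None.
    """
    pos = helper.find_next_candidate(text, 0)
    while pos is not None:
        end = match_at(text, pos)
        if end is not None:
            return (pos, end)
        pos = helper.find_next_candidate(text, pos + 1)
    return None
```

Under this arrangement, a new skip-ahead strategy would only need to be added to the helper for every engine that calls it to pick it up.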

stephentoub added this to the 7.0.0 milestone Oct 26, 2021
dotnet-issue-labeler bot added the "untriaged" label (New issue has not been triaged by the area owner) Oct 26, 2021
@ghost

ghost commented Oct 26, 2021

Tagging subscribers to this area: @eerhardt, @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions

Milestone: 7.0.0

@stephentoub
Member Author

cc: @veanes, @olsaarik

stephentoub self-assigned this Oct 28, 2021
ghost added the "in-pr" label (There is an active PR which will close this issue when it is merged) Oct 28, 2021
jeffschwMSFT removed the "untriaged" label (New issue has not been triaged by the area owner) Oct 28, 2021
ghost removed the "in-pr" label (There is an active PR which will close this issue when it is merged) Nov 2, 2021
@stephentoub
Member Author

Fixed by #61490

ghost locked as resolved and limited conversation to collaborators Dec 24, 2021