-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Optimization] Specialized quantification instruction #577
[Optimization] Specialized quantification instruction #577
Conversation
// which we then signalFailure if nil or currentPosition = next otherwise | ||
// This would have the benefit of potentially allowing us to not duplicate | ||
// code between the normal matching instructions and this loop here | ||
var next: Input.Index? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future work: Do we want to rework our Processor.Cycle() switch loop to do something like this for all of the matching instructions?
ie: A bunch of _doMatchThing functions that return Input.Index?
which we then signalFailure
if nil or currentPosition = next
otherwise
This would have the benefit of potentially allowing us to not duplicate code between the normal matching instructions and this switch here
/// the quantified cases | ||
/// | ||
/// Essentially we trade off implementation complexity for runtime speed by adding more true cases to this | ||
func shouldDoFastQuant(_ opts: MatchingOptions) -> Bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Future work: Should we allow ConsumeFunctions into .quantify?
// This would open up non-ascii custom character classes as well as the
// possibility of wrapping weirder cases into consume functions
// (allowing us to .quantify anything we want, but increasing our reliance on ConsumerInterface)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we still be limited to knowing that it only consumes a single character at a time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, in that case we'd need a separate runQuantify
for consumers that would emit save points the normal way
@swift-ci test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, but can you make sure reluctant quantifications is well tested in the new builtins?
/// the quantified cases | ||
/// | ||
/// Essentially we trade off implementation complexity for runtime speed by adding more true cases to this | ||
func shouldDoFastQuant(_ opts: MatchingOptions) -> Bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we still be limited to knowing that it only consumes a single character at a time?
guard let idx = next else { | ||
return true // matched zero times | ||
} | ||
if payload.quantKind != .possessive { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these work for reluctant? Possessive also never backtracks, so I wonder if (future work) we should consider it entirely separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reluctant quantification isn't allowed into .quantify
because it never loops inside it like the other two so it didn't make sense to add into .quantify
In the future we could have a specialized reluctant quantifier instruction that takes in a both the quantification and the match instruction after it, which would handle cases like .*?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some asserts then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I had some before but they got lost in the other changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future work: It should be possible to optimize the pattern of reluctant quantifier + anchor character in a similar style by having a payload of two int registers that store the respective payloads for the quantification and the anchor.
The reason why we can't do this peephole optimization now is that we're stuck with the tree representation so there isn't a good way of determining if the reluctant quantifier is followed by an anchor
@swift-ci test |
The idea is to have an optimized quantification instruction that fuses match + save + loop into one instruction
Processor.cycle()
this will hopefully help with the instruction layout work that is soon to come. Some instructions likesplitSaving
orcondBranchIfZeroElseDecrement
might become much less common if most quantifications end up just being onequantify
instructionThe plan: Compile the most common quantification cases like
.*
,\w*
,[a-z0-9]*
anda*
into this fused instructionResults:
LinesAll
,EmailLookahead
,EagerQuantWithTerminalWhole
) as expectedAnchoredNotFoundWhole
) due to being killed by ARC insignalFailure()
NumbersAll
) due to the increased overheadGenerally good improvements but if we could speed up
signalFailure()
some of these benchmarks would be much faster, there were many cases of the quantification part of the regex getting faster but being much slower insignalFailure()
due to a combination of ARC and having to callindex(before:)
Note: based on top of #547