Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize out unnecessary eclass bitmaps #596

Merged
merged 1 commit into from
Dec 6, 2024

Conversation

zherczeg
Copy link
Collaborator

@zherczeg zherczeg commented Dec 5, 2024

Large patch, contains code moves and reworks.

@@ -2276,12 +2288,11 @@ return TRUE;
/* This function consumes unary prefix operators. */

static BOOL
compile_class_unary(uint32_t options, uint32_t xoptions, BOOL negated,
compile_class_unary(eclass_context *context, BOOL negated,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be merged with compile_class_juxtaposition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's clear as it is. I don't mind either way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe later. Not in this patch.

Copy link
Member

@NWilson NWilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! That much easier than I'd feared it would be. Thank you.

src/pcre2_xclass.c Outdated Show resolved Hide resolved
src/pcre2_compile_class.c Outdated Show resolved Hide resolved
@@ -2143,6 +2173,7 @@ switch (meta)
if (lengthptr != NULL)
*lengthptr += code - (code_start + 1);
code = code_start + 1;
*context->eclass_start |= CLASS_ECLASS_HAS_BITMAP;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically you could add a check, for whether any bits in the bitmap are actually set. If the user writes [^\x00-\xFF] then it will generate an OP_NCLASS with no bits set, and the eclass won't need a bitmap.

If you wanted to add the check it would be "for ECL_NONE see if bitmap is all zero; for ECL_ANY see if the bitmap is all set"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I follow. The OP_ALLANY is handled above, and the OP_CLASS with all zero bits can be optimized. Nothing else can be optimized. Am I right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition to leave out the map is: if you emit ECL_ANY, does the pop_info have an all-ones bitmap? And if you emit an ECL_NONE, does the pop_info have an all-zeros bitmap?

One of the OP_NCLASS branches could hit that condition, so doesn't always need a bitmap.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ECL_ANY is set, and all bits are set, the compile_class_not_nested already converts the XCLASS to OP_ALLANY, and handled earlier. Hence the ECL_NONE and all bits are cleared are the remaining case. Then my question was correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, we agree! You've special-cased the OP_NCLASS case, which (out of OP_CLASS & OP_NCLASS) is the only one where it's possible to not set need_bitmap.


PCRE2_ASSERT(*code_start == OP_XCLASS);
*code_start = pop_info->op_single_type = ECL_XCLASS;

PCRE2_ASSERT(code - code_start >= 1 + LINK_SIZE + 1);
flags = code_start[1 + LINK_SIZE];

if ((flags & XCL_MAP) != 0)
if ((flags & XCL_MAP) != 0 ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it confusing that you write out the XCL_MAP flag, but then don't write out the bitmap (leaving it in cb->classbits instead), and then check & clear the flag immediately afterwards.

Is there not a cleaner way to do it? (I admit - it was ugly, the way I was rewriting the bytecode before.) If not, then it needs a comment a least.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced it with a BOOL* pointer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes me happy!

src/pcre2_compile_class.c Outdated Show resolved Hide resolved
@@ -2276,12 +2288,11 @@ return TRUE;
/* This function consumes unary prefix operators. */

static BOOL
compile_class_unary(uint32_t options, uint32_t xoptions, BOOL negated,
compile_class_unary(eclass_context *context, BOOL negated,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's clear as it is. I don't mind either way.

negated = !negated;

/* Because it's a non-empty class, there must be an operand at the start. */
if (!compile_class_binary_loose(options, xoptions, negated, &ptr, &code,
pop_info, errorcodeptr, cb, lengthptr))
(*pptr)++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (*pptr++) is placed confusingly under the comment now. But this skip has nothing to do with the operand, it's the META_CLASS we're skipping.

@@ -2292,15 +2303,13 @@ while (*ptr == META_ECLASS_NOT)
negated = !negated;
}

*pptr = ptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving up the pptr assignment breaks the pattern with the other functions. Although, it looks like it behaves the same.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an optimization. There is no need to cache the pptr.

PCRE2_UCHAR *previous;
BOOL allbitsone = TRUE;

context.eclass_start = *pptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storing this bit in the META code seems very dangerous. It carries the risk that different branches will be taken on the two passes through the code.

The flag you want should just be a BOOL on the eclass_context, so it's got a fresh state on each pass.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the whole point of the patch. Although I realized you inject the bitmap at the end, so caching this value is likely unnecessary.

src/pcre2_compile_class.c Show resolved Hide resolved
@zherczeg
Copy link
Collaborator Author

zherczeg commented Dec 5, 2024

I tried one of your suggestions. Original code:

  re> /[\pL&&[^\x00-\xFF]]/B,utf,alt_extended_class
------------------------------------------------------------------
        Bra
        [\p{L}]
        Ket
        End
------------------------------------------------------------------
data> a
 0: a

This should not happen. Interestingly, this patch fixes it. Not sure what is happening.

@zherczeg
Copy link
Collaborator Author

zherczeg commented Dec 5, 2024

I have resolved the some conversions, but feel free to reopen them if they are not resolved. There is one comment I don't understand. Let me know if other changes are needed.

@zherczeg zherczeg force-pushed the optimize_bitmap branch 3 times, most recently from 700e029 to a9a8a4b Compare December 5, 2024 15:47
@zherczeg
Copy link
Collaborator Author

zherczeg commented Dec 5, 2024

I hope I fixed everything. I noticed that it is not possible to tell the difference in the byte code dump that if an xclass has no bitset, or all bits in the bitset are zero. Maybe an (allzero) could be added after [] if a bitset is present, and everything is 0. I can do this in another patch.

@@ -2125,6 +2155,7 @@ switch (meta)
pop_info->length = 1;
*code_start = pop_info->op_single_type = ECL_ANY;
memset(pop_info->bits.classbits, 0xff, 32);
context->needs_bitmap = TRUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that this OP_ALLANY → ECL_ANY branch does not need to set needs_bitmap.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True!

@@ -1752,6 +1774,8 @@ whole class was negated and whether there were negative specials such as \S
(non-UCP) in the class. Then copy the 32-byte map into the code vector,
negating it if necessary. */

PCRE2_ASSERT(classbits == cb->classbits.classbits);
Copy link
Member

@NWilson NWilson Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could lose both of these assertions if you want. We only ever assign once to the classbits variable (it's const) so there's really not much to check.

There are many places throughout the function where we assume that classbits and classwords are the same.

Or make it a uint8_t *const classbits = ... instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I have added a comment about it as well.

@NWilson
Copy link
Member

NWilson commented Dec 5, 2024

I tried one of your suggestions. Original code:

  re> /[\pL&&[^\x00-\xFF]]/B,utf,alt_extended_class
------------------------------------------------------------------
        Bra
        [\p{L}]
        Ket
        End
------------------------------------------------------------------
data> a
 0: a

This should not happen. Interestingly, this patch fixes it. Not sure what is happening.

Ouch! I have diagnosed the cause of this, in the code that's currently on master.

There are two cases where we emit a bitmap: there's the branch (previously in pcre2_compile.c) which emits an actual OP_ECLASS, and there's the branch that converts a single ECL_XCLASS into an actual OP_XCLASS. Both of these were using the same check: need_map = !allbitszero;

For OP_ECLASS, that (was) correct - no bitmap meant the same as all-bits-zero. But for OP_XCLASS it was not true. So there was an existing bug in my code you fixed.

The new logic checks context.needs_map in both branches, which is good - because OP_ECLASS and OP_XCLASS now have the same semantics for when a map is not provided.

Great! I'm glad I understand the bug, and can verify that you've properly fixed it, not just hidden it.

Copy link
Member

@NWilson NWilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left two additional small comments. Thank you for working on this! You can merge whenever you're ready (with or without those small suggestions).

@zherczeg
Copy link
Collaborator Author

zherczeg commented Dec 6, 2024

I am happy that one less bug in the code. The test case was your suggestion, I just tried it.

@zherczeg zherczeg merged commit f95b881 into PCRE2Project:master Dec 6, 2024
21 checks passed
@zherczeg zherczeg deleted the optimize_bitmap branch December 6, 2024 04:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants