[BUG] regexp: inconsistent number of tokens produced by string split and word boundaries \b and \B
#11102
Labels: bug
Describe the bug
The number of tokens is different between cuDF and Python when using split with \b and \B word boundaries.
Steps/Code to reproduce bug
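A minimal sketch of the Python side of such a comparison, using only the standard `re` module. The sample string and the helper name `count_tokens` are illustrative assumptions, not the reporter's original snippet. The cuDF side is mentioned only in a comment, since it needs a GPU and the exact call is assumed here.

```python
import re

# Hypothetical sample input, chosen only for illustration.
TEXT = "hello world"


def count_tokens(pattern: str, text: str) -> int:
    """Return how many tokens Python's re.split produces for `pattern`.

    Requires Python 3.7+, where re.split also splits on empty matches
    such as the zero-width assertions \\b and \\B.
    """
    return len(re.split(pattern, text))


if __name__ == "__main__":
    for pat in (r"\b", r"\B"):
        tokens = re.split(pat, TEXT)
        print(f"{pat!r}: {len(tokens)} tokens -> {tokens}")
    # Assumption: the cuDF counterpart would be a regex-enabled string split
    # on a cudf.Series holding the same text; its token count per row is
    # what the report compares against the numbers printed above.
```

Running the same pattern and input through both libraries and comparing the token counts is enough to show whether they agree.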
Expected behavior
I expect the number of tokens to be the same.
Environment overview (please complete the following information)
Environment details
Additional context
None