Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Possible race condition in regular expression support for octal digits #4409

Closed
andygrove opened this issue Dec 21, 2021 · 3 comments · Fixed by #4735
Closed

[BUG] Possible race condition in regular expression support for octal digits #4409

andygrove opened this issue Dec 21, 2021 · 3 comments · Fixed by #4735
Assignees
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf P0 Must have for release

Comments

@andygrove
Copy link
Contributor

Describe the bug

I am seeing odd results with regular expressions containing octal digits. During testing I seem to be able to trigger a race condition where the first call to containsRe will produce false and subsequent calls will return true. Here is a test that reproduces the issue (to be added to RegularExpressionTranspilerSuite).

test("possible race condition") {
  for (_ <- 0 until 100) {
    println(gpuContains("\\101", Seq("A")).mkString(", "))
  }
}

For reference, here is the gpuContains method. I Added some println statements here.

private def gpuContains(cudfPattern: String, input: Seq[String]): Array[Boolean] = {
  println(cudfPattern)
  println(input.mkString(", "))
  val result = new Array[Boolean](input.length)
  withResource(ColumnVector.fromStrings(input: _*)) { cv =>
    withResource(cv.containsRe(cudfPattern)) { c =>
      withResource(c.copyToHost()) { hv =>
        result.indices.foreach(i => result(i) = hv.getBoolean(i))
      }
    }
  }
  result
}

Test output:

\101
A
false
\101
A
true
\101
A
true
...

If I remove the println statements from the gpuContains method then the result is always true.

Steps/Code to reproduce bug
See above.

Expected behavior
Result should be the same on every call.

Environment details (please complete the following information)
N/A

Additional context
N/A

@andygrove andygrove added bug Something isn't working ? - Needs Triage Need team to review and classify labels Dec 21, 2021
@andygrove andygrove added this to the Dec 13 - Jan 7 milestone Dec 21, 2021
@andygrove andygrove self-assigned this Dec 21, 2021
@sameerz sameerz added P0 Must have for release and removed ? - Needs Triage Need team to review and classify labels Dec 21, 2021
@andygrove andygrove added the cudf_dependency An issue or PR with this label depends on a new feature in cudf label Dec 21, 2021
@andygrove
Copy link
Contributor Author

Related cuDF issue: rapidsai/cudf#9946

@andygrove
Copy link
Contributor Author

For 22.02 we will fall back to CPU for regular expressions containing octal digits (this is already the case).

@andygrove
Copy link
Contributor Author

andygrove commented Jan 28, 2022

We should go ahead and re-enable support octal digits (and the related tests) now that rapidsai/cudf#9993 was merged into cuDF

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf P0 Must have for release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants