Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detect if a regex escape is needed in a pytest match statement #13705

Closed
nstarman opened this issue Oct 10, 2024 · 7 comments · Fixed by #14966
Closed

detect if a regex escape is needed in a pytest match statement #13705

nstarman opened this issue Oct 10, 2024 · 7 comments · Fixed by #14966
Labels
rule Implementing or modifying a lint rule

Comments

@nstarman
Copy link

nstarman commented Oct 10, 2024

Sometimes re.escape are needed in pytest for match=... arguments.
Then the string is modified and the regex escape is no longer needed.
It would be great if ruff could detect this and remove the unneeded re.escape.

M~WE:

import pytest, re

with pytest.raises(Exception, match=re.escape("regexnotneeded")):
    ...
@MichaReiser
Copy link
Member

@InSyncWithFoo was so kind to implement the rule and we looked through the ecosystem changes. After seeing the changes, I don't think we should add this rule to ruff. While it's true that the re.escape call is unnecessary, it does train people not to use re.escape and they might then forget using it in cases where they actually should.

It makes me wonder if we could create ar ule that warns about match parameters that would need escaping. Are there heuristics that would allow us to detect correctly escaped from incorrectly escaped strings?

@nstarman
Copy link
Author

Easier for escaped that don't need to be, much harder for not-escaped that should to be. Pytest matches can work with regex, so it would be very hard to figure out if a string that looks like a regex isn't and should be escaped.

@AlexWaygood
Copy link
Member

If you have a string passed to pytest.raises() that is:

  1. Not a raw string
  2. Not surrounded by re.escape() and
  3. contains a regex metacharacter

Then I think it's reasonable to say that the string should either be a raw string (if it's meant to be a regex) or should be surrounded by re.escape() (if it's not meant to be a regex). That might possibly be a useful rule to add? What do we think?

@InSyncWithFoo
Copy link
Contributor

Then I think it's reasonable to say that the string should either be a raw string (if it's meant to be a regex) or should be surrounded by re.escape() (if it's not meant to be a regex).

Sounds good to me. This also goes well with unraw-re-pattern.

@nstarman
Copy link
Author

And the reverse — if a string is surrounded by re.escape but doesn't have any regex in it?

@AlexWaygood
Copy link
Member

And the reverse — if a string is surrounded by re.escape but doesn't have any regex in it?

We already considered and rejected that check in #14746, as Micha stated above in #13705 (comment) I agree with his rationale.

@InSyncWithFoo
Copy link
Contributor

I suppose I can go ahead and implement the new rule then? A PR should be ready sometime later this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
4 participants