-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
filter: --query
broken with backtick quoting
#1415
Comments
If you want ideas for approaches here, let me know. |
@tsibley thanks. Function in question is augur.filter.include_exclude_rules.extract_variables. I'm looking to see if we can leverage Pandas internals to do this. So far I have this, but it produces false-positives such as import keyword
from pandas.core.computation.parsing import tokenize_string, BACKTICK_QUOTED_STRING
# Backtick-quoted string or 1 (NAME)
return set(tokval for toknum, tokval in tokenize_string(pandas_query)
if toknum == BACKTICK_QUOTED_STRING or (toknum == 1 and not keyword.iskeyword(tokval))) Do you have other ideas? |
Actually, false positives are manageable since the column names are known. I'll make a PR with what I drafted, but still open to suggestions! |
I'd likely preprocess the query as a string to replace backtick-quoted names with generated names, then proceed with parsing as-is now, then translate generated names in resulting set back to the originals. The preprocessing of backtick-quoted names can be done with a regex (it's a rather constrained problem to tackle) or with something more formal like a parsing expression grammar via pyparsing. |
@tsibley I've implemented your suggestion as #1417. There are now 3 PRs that address this bug:
The first two are mutually exclusive. I think the last one is good to have regardless. |
flagged on Slack
Description
Previously, a metadata column could be queried using backtick quoting¹ supported by pandas.DataFrame.query. It no longer works as of d0f36a1.
¹ e.g.
--query '(`region name` == "A")'
Affected versions
Augur 24.2.0, 24.2.1
Workaround
Use Augur <=24.1.0, or <=24.0.0 to avoid #1411. For users of Nextstrain's Docker runtime,
NEXTSTRAIN_DOCKER_IMAGE=nextstrain/base:build-20240122T233533Z
is the latest image that has Augur 24.0.0.The text was updated successfully, but these errors were encountered: