-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix escaped untracked files #440
base: main
Are you sure you want to change the base?
Conversation
Files that are escaped by git have to be unescaped so that the diff_reporter can open the file.
Related Issue: #439 |
the filename. | ||
""" | ||
if filename.startswith(chr(34)) and filename.endswith(chr(34)): | ||
filename = ast.literal_eval(filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I am freaked out by including literal_eval
in the code. Even if its safer than eval
I am not super familiar with c style escapes so I poked around a bit. Not finding much I ended up talking to copilot. After some chat it came up with this
def to_unescaped_filename(filename: str) -> str:
"""Try to unescape the given filename.
Some filenames given by git might be escaped with C-style escape sequences
and surrounded by double quotes.
"""
if not (filename.startswith('"') and filename.endswith('"')):
return filename
# Remove surrounding quotes
unquoted = filename[1:-1]
# Handle C-style escape sequences
result = []
i = 0
while i < len(unquoted):
if unquoted[i] == '\\' and i + 1 < len(unquoted):
# Handle common C escape sequences
next_char = unquoted[i + 1]
result.append({
'\\': '\\',
'"': '"',
"'": "'",
'n': '\n',
't': '\t',
'r': '\r',
'b': '\b',
'f': '\f',
}.get(next_char, next_char))
i += 2
else:
result.append(unquoted[i])
i += 1
return ''.join(result)
I checked out your branch and ran the tests. Any reason to think this would be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely understand the hesitation, only suggested it because the other solutions seemed to rely on certain encodings, and not sure how that behaves from system to system.
I tried to find the c-style encoding that the git man page talked about. The first thing I stumbled upon was
https://github.com/git/git/blob/a554262210b4a2ee6fa2d594e1f09f5830888c56/quote.c#L267
Not sure if that is the relevant code, but tbh I am not sure what exactly is happening here anyway.
But they look into a char array which looks somewhat similar to the dict you use:
https://github.com/git/git/blob/a554262210b4a2ee6fa2d594e1f09f5830888c56/quote.c#L222
Maybe adding the 'a'
and removing '
would make it an equal reversion, not really sure here.
Otherwise the function looks sensible and the tests pass. Also checked if I encounter any exceptions with the files that currently lead to problems (should be covered by the tests, but you never know) and all works fine.
Even if the function isn't perfect the current state is neither so it at least fixes a known subset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im happy to evolve this over time if other people find issues. How about putting this code into this pr, then I can merge it
Files that are escaped by git have to be unescaped so that the diff_reporter can open the file.