env::args differs from MSVC CRT in some cases on Windows #44650
This probably won't surprise you, but stdlib uses CommandLineToArgvW: rust/src/libstd/sys/windows/args.rs Line 29 in 1cdd689
That's fine. It wouldn't be too terrible to write a test case to match with the newer CRT, as long as the libs team was fine with it. This sounds like it's going to end with "you broke backwards compat."
Fundamentally this is a big @rust-lang/libs decision of which standard to follow for argument parsing. I'm not particularly concerned with which standard we follow as long as it is Microsoft's standard and we follow it accurately. Decide between:
I personally feel that we should follow the newer standard since that's what all Windows apps are using these days. I'm concerned about the possibility of this being a breaking change, but this seems to only affect some rare edge cases in quote handling.
I'm legitimately not sure what "all Windows apps are using these days."
Let's try cmd.exe:
Now let's try PowerShell:
I'm not actually sure if there even exists a way to get decent escaping in PS. From the way variables behave, it looks like putting two quotes in a row creates an escaped quote, but they don't get properly re-escaped when you pass them to a command line app:
Microsoft is making it really difficult to figure out what actually counts as "correct" command line parsing, when there doesn't even seem to be a "right" way that actually works with their stuff. Let's see what WSL does:
Bash still acts like you expect it, and apparently the WSL-to-Windows bridge generates backslash escapes when it tries to convert WSL's arguments list into a Windows GetCommandLineW string. Nuts that Bash on Ubuntu on Windows does a better job than PowerShell does (at least, it generates command strings that are capable of being unambiguously parsed). My next thought was to create a file name with quotes in it, and see what kind of parameters Explorer generates for me, but apparently Windows doesn't actually allow quotes in file names. And, finally, let's see what happens if one Rust program calls another one:
Apparently, everybody seems to prefer backslashes when generating CLIs for Windows, except for PowerShell which passes quotes unescaped, and cmd.exe which just passes the raw CLI unchanged. So the current approach already seems to have interoperability as good as we would need it to be, but changing it to mimic the current CRT version probably won't break anything, either.
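The backslash convention mentioned above can be sketched in Rust roughly as follows. This is an illustrative helper only: the name `quote_arg` is hypothetical, and it is not the code that `std::process::Command` actually uses. It assumes the receiving program follows the CRT-style rule that a run of backslashes is only special when it is directly followed by a double quote.

```rust
use std::iter::repeat;

/// Hypothetical helper: quote one argument so that a CRT-style parser
/// reads it back as a single argument with the same contents.
fn quote_arg(arg: &str) -> String {
    // Quotes are only needed for empty arguments or arguments with whitespace.
    let needs_quotes = arg.is_empty() || arg.contains(' ') || arg.contains('\t');
    let mut out = String::new();
    if needs_quotes {
        out.push('"');
    }
    // Number of consecutive backslashes seen just before the current character.
    let mut backslashes: usize = 0;
    for c in arg.chars() {
        if c == '\\' {
            backslashes += 1;
        } else {
            if c == '"' {
                // Double the pending backslashes and add one more so that
                // the quote itself is read as a literal character.
                out.extend(repeat('\\').take(backslashes + 1));
            }
            backslashes = 0;
        }
        out.push(c);
    }
    if needs_quotes {
        // Double any trailing backslashes so they do not escape the closing quote.
        out.extend(repeat('\\').take(backslashes));
        out.push('"');
    }
    out
}
```

For example, the argument `say "hi"` would be emitted as `"say \"hi\""`, and `C:\my dir\` as `"C:\my dir\\"`. Arguments without whitespace are passed through unquoted, apart from escaping any embedded quotes.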
I think it's safe to say that "all Windows apps" are using Microsoft's C runtime for parsing command-line arguments, if "all" means "the majority of software built since 2008". In PowerShell you can use backticks to escape quotes:
Or you can surround the argument with single quotes:
Or you can pass the entire line verbatim, without interpretation by the shell:
Or, if you're feeling particularly adventurous:
I'm not sure how you reached your last conclusion. It seems odd to use Rust's argument parsing to prove how most Windows programs handle arguments, no?
No I can't. Here's what happens when I try:
It's exactly the same as when I did it with two quotes in a row. It creates a string object with quotes in it (like we want), but then it turns around and passes it verbatim without re-escaping it.
It's not PowerShell's syntax that I'm questioning here. I can easily create a string object with quotes in it. What I want to know is what syntax PowerShell expects us to have. What syntax do they use when re-serializing the command line? Apparently, they don't.
Rust and Cargo communicate through command-line parameters, so we probably want to make sure that |
The behaviour I was hoping for, by the way, is something like one of these:
It's the same behaviour that you were getting when you demoed the Python interpreter's argument parsing: it's a single argument with spaces and quotes within it. It's "how do I escape a quote?"
Just to be absolutely clear about how Windows

The Windows kernel doesn't know at all about arguments. As far as it's concerned arguments are just a single string that it can pass on to a new process. No more, no less. It's not an array of strings. It's just one string. This string doesn't have to contain the

So what happens is relatively simple:
Step 1 in the above could be replaced by an application. In which case it's entirely up to the application what it sets

Of course you usually want to make an arguments string that will be correctly interpreted by other applications. So for compatibility reasons double quotes should be escaped if they are actually wanted in the parsed arguments. Or not if they're unwanted.

Some code for printing arguments

    // C++
    // Compile: `cl /EHsc /nologo argv.cpp`
    #include <iostream>
    int wmain(int argc, const wchar_t* wargv[])
    {
        for (int i = 0; i < argc; i++) {
            std::wcout << '`' << wargv[i] << '`' << std::endl;
        }
    }

    // C#
    // Compile: `csc /nologo argv.cs`
    class MainClass
    {
        static int Main(string[] args)
        {
            foreach (string arg in args) {
                System.Console.WriteLine("`" + arg + "`");
            }
            return 0;
        }
    }

    # Python
    import sys
    for arg in sys.argv:
        print(f"`{arg}`")

They all work the same, except for Rust which is special. In the following table I've trimmed the enclosing
In short the PowerShell user is constructing the

But I would add that the intricacies of PowerShell should have nothing to do with Rust std's handling of command line arguments. So long as the std behaves like other Windows applications it doesn't matter. If PowerShell makes a nicer way to construct a

So to sum up, as you say, Rust has two jobs:
Of course it's easy for Rust to be consistent with itself no matter what rules it uses. It just has to test that construction and parsing match up as expected. However, it's obviously beneficial to be consistent with other applications on the platform. All that said, there's nothing massively wrong with how Rust currently parses arguments. It's just that 12 years ago Microsoft tweaked their CRT's argument parsing rules slightly. Rust uses the old rules so, in some situations, Rust applications may exhibit surprising behaviours to users more accustomed to modern applications.
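To make the newer rules a bit more concrete, here is a rough Rust sketch of how the arguments after the program name could be split. It is based on my reading of the rules Microsoft documents for C/C++ programs, works on `&str` rather than UTF-16, and is not the implementation in std. The function name `parse_args` is made up for illustration.

```rust
use std::iter::repeat;

/// Illustrative only: split the part of a command line that follows the
/// program name, using the post-2008 CRT-style rules.
fn parse_args(rest: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut cur = String::new();
    let mut in_quotes = false;
    // True once the current argument has begun, so that `""` yields an empty argument.
    let mut started = false;
    let mut chars = rest.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // Outside quotes, a space or tab ends the current argument.
            ' ' | '\t' if !in_quotes => {
                if started {
                    args.push(std::mem::take(&mut cur));
                    started = false;
                }
            }
            // Backslashes are only special when the run is followed by a quote.
            '\\' => {
                started = true;
                let mut count = 1;
                while chars.peek() == Some(&'\\') {
                    chars.next();
                    count += 1;
                }
                if chars.peek() == Some(&'"') {
                    // 2n backslashes give n backslashes; an odd one escapes the quote.
                    cur.extend(repeat('\\').take(count / 2));
                    if count % 2 == 1 {
                        chars.next();
                        cur.push('"');
                    }
                } else {
                    cur.extend(repeat('\\').take(count));
                }
            }
            // Inside quotes, `""` is one literal quote; a lone quote closes the string.
            '"' if in_quotes => {
                if chars.peek() == Some(&'"') {
                    chars.next();
                    cur.push('"');
                } else {
                    in_quotes = false;
                }
            }
            // Outside quotes, a quote opens a quoted string.
            '"' => {
                in_quotes = true;
                started = true;
            }
            // Everything else is taken literally.
            _ => {
                started = true;
                cur.push(c);
            }
        }
    }
    if started {
        args.push(cur);
    }
    args
}
```

With this sketch, `"she said ""hi"""` comes back as the single argument `she said "hi"`, which is the case where the pair-of-quotes rule matters. A test that feeds constructed command lines through a parser like this is one way to check that construction and parsing match up.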
Yeah, I agree with everything you said there. I'm fine with changing to the new CRT behaviour, as long as we don't break anything. And it looks like we won't.
PowerShell is not a good example, because the rules are still screwed up: PowerShell/PowerShell#1995. It's very uncomfortable to call apps from PowerShell. I sometimes ended up generating a bat file and then running that from PowerShell.
@stej I'd highly recommend installing Powershell 7 if you can. The current stable version is 7.0.1 which has slightly better command line handling, and the upcoming 7.1 release should improve it further. Otherwise, using |
…-ou-se Update Windows Argument Parsing Fixes rust-lang#44650 The Windows command line is passed to applications [as a single string](https://docs.microsoft.com/en-us/archive/blogs/larryosterman/the-windows-command-line-is-just-a-string) which the application then parses to get a list of arguments. The standard rules (as used by C/C++) for parsing the command line have slightly changed over the years, most recently in 2008 which added new escaping rules. This PR implements the new rules as [described on MSDN](https://docs.microsoft.com/en-us/cpp/cpp/main-function-command-line-args?view=msvc-160#parsing-c-command-line-arguments) and [further detailed here](https://daviddeley.com/autohotkey/parameters/parameters.htm#WIN). It has been tested against the behaviour of C++ by calling a C++ program that outputs its raw command line and the contents of `argv`. See [my repo](https://github.com/ChrisDenton/winarg/tree/std) if anyone wants to reproduce my work. For an overview of how this PR changes argument parsing behavior and why we feel it is warranted see rust-lang#87580 (comment). For some examples see: rust-lang#87580 (comment)
For a command line of the form `"a/"b.exe`, where `a/b.exe` does indeed exist, `std::env::args()` produces different results than `argc` and `argv` in a C++ program. Specifically, `CommandLineToArgvW` is returning troublesome results. It looks like the CRT and `CommandLineToArgvW` disagree.

- `"a/"b.exe`
- [`a/b.exe`]
- [`a/b.exe`]
- `"a/"b.exe`
- [`a/`, `b.exe`]
- `std::env::args()`: [`a/`, `b.exe`]

Obviously, this interpretation makes no sense.

`env::current_exe` does not seem to be affected.

A Rust program which demonstrates the mismatch:

A C++ program which demonstrates the mismatch:
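As a rough model of the disagreement reported here (not the original demonstration programs), the two ways of reading the program name can be sketched in Rust. Both function names are hypothetical. The CRT-style rule reflects my understanding that quotes simply toggle and only whitespace outside quotes ends the name. The legacy rule is an assumption reconstructed from the results in this report: a leading quote makes the name end at the next quote.

```rust
/// Assumed CRT-style rule: every quote toggles quoting, and the name ends
/// at the first space or tab outside quotes. Returns (name, remainder).
fn program_name_crt(cmd: &str) -> (String, &str) {
    let mut name = String::new();
    let mut in_quotes = false;
    for (i, c) in cmd.char_indices() {
        match c {
            '"' => in_quotes = !in_quotes,
            // Space and tab are one byte each, so `i + 1` is a valid boundary.
            ' ' | '\t' if !in_quotes => return (name, &cmd[i + 1..]),
            _ => name.push(c),
        }
    }
    (name, "")
}

/// Assumed legacy rule: if the line starts with a quote, the name ends at
/// the next quote no matter what follows; otherwise it ends at whitespace.
fn program_name_legacy(cmd: &str) -> (String, &str) {
    if let Some(quoted) = cmd.strip_prefix('"') {
        match quoted.find('"') {
            Some(end) => (quoted[..end].to_string(), &quoted[end + 1..]),
            None => (quoted.to_string(), ""),
        }
    } else {
        match cmd.find(|c: char| c == ' ' || c == '\t') {
            Some(end) => (cmd[..end].to_string(), &cmd[end + 1..]),
            None => (cmd.to_string(), ""),
        }
    }
}
```

Under these assumptions, `"a/"b.exe` yields the name `a/b.exe` with the CRT-style rule, while the legacy rule yields `a/` and leaves `b.exe` to be parsed as a separate argument, which matches the two results listed above.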