Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parseCmdLine doesn't handle quoting correctly (and also prevents passing empty arguments) #14343

Open
timotheecour opened this issue May 14, 2020 · 4 comments · Fixed by #18086
Labels
Medium Priority OS (stdlib) Related to OS modules in standard library

Comments

@timotheecour
Copy link
Member

timotheecour commented May 14, 2020

parseCmdLine doesn't handle quoting correctly. IMO current behavior, even if "works as designed according to docs" is not uesful and should be replaced by a more intuitive behavior.

Example 1

import os, unittest

proc main()=
  # let a = "foo '--nimcache:hi world" # ok
  let a = "foo --nimcache:'hi world'" # BUG
  # let a = "foo --nimcache:\"hi world\"" # ditto
  let s = parseCmdLine(a)
  check s == @["foo", "--nimcache:hi world"]
main()

Current Output

fails

Expected Output

works

Example 2

this bug prevents passing empty arguments, eg --batch:'' or --nimcache:'' (and is the reason why --threads: on "works" by accident but shouldn't)

see # BUG: with initOptParser, `--batch:'' all` interprets `all` as the argument of --batch`
(refs https://github.com/nim-lang/Nim/pull/14823/files#diff-0bc1750c146a1dfef2e41e4aa555789f310d35a44d7ee97abd35ef3fd128389bR5530)

Example 3

root cause of #18077

Possible Solution

We don't need to reproduce everything the shell does (which is complicated) but at least we should handle quoting correctly. If there is valid use case to keep existing behavior, I suggest we add a new proc, but I doubt there's a valid use case.

Additional Information

@Araq
Copy link
Member

Araq commented May 14, 2020

Since there is nothing "intuitive" about the typical shell's behaviour, there is also nothing to change here. IMO. Feel free to use your own parseCmdLine instead with different quirks. The real problem here is that the underlying char** argv should always be kept and never concated to a single string. Everything else simply doesn't work on Unix reliably as the parsing is the shell's domain.

@timotheecour
Copy link
Member Author

timotheecour commented May 14, 2020

you made the same argument before see #9951 (comment)

The quoting rules are shell specific and there is no way to do this properly. You need to use the API differently, cmdLineRest is inherently unreliable.
It is unfixable on Unix.

And yet, AFAIK I fixed cmdLineRest in #10291

Likewise in this case.

parseCmdLine(quoteShellCommand(args)) == args should always be true (IMO that's possible and shouldn't be too hard) or at very least should be true as much as possible. Maybe there's some esoteric shell where this won't work (please show examples), but I'm talking about the shells everyone uses (eg bash on osx) [2].

eg this fails currently:

when true: # D20200513T231430:here gitissue
  import std/[os,unittest]
  let args = @["foo", "baz bam", "--char:'"]
  check parseCmdLine(quoteShellCommand(args)) == args # fails: @["foo", "baz bam", "--char:", "\'", ""]

The real problem here is that the underlying char** argv should always be kept and never concated to a single string

This is of no help for the myriad of API's that consume string instead of seq[string] and that somehow must be converted to a seq (eg in testament, input commands are fed to execCmdEx2 after parseCmdLine). Feeding commands to a shell is not the only outcome; oftentimes programs need to manipulate these commands (eg check whether some argument is there etc), so a reliable parseCmdLine is needed.

The reality is that taking input commands as string instead of seq is often what's most convenient and it's in a ton of programs (nim or not), and nim compiler+tools makes extensive use of this (and rightfully so, it's practical). eg: important_packages takes input cmd, which is then fed to parseCmdLine; changing all these commands to seq[string] wouldn't make sense for convenience reason.


[1] one reason I didn't add args: seq[string] optional param in #14211 is because {poEvalCommand} is passed by default in execCmdEx and simply relying on args.len == 0 to override poEvalCommand would be incorrect.

Once this bug is fixed, execCmdEx2 and a few others become entirely redundant and callers will be able to call parseCmdLine + execCmdEx ` directly

[2] as mentioned in top post, I'm not talking about any kind of shell specific command execution/interpretation such as as transforming `${foo} or backtick interpretation `foo` or evaluation $(echo 12) ; only the quoting rules.

@Araq
Copy link
Member

Araq commented May 14, 2020

The reality is that taking input commands as string instead of seq is often what's most convenient and it's in a ton of programs (nim or not), and nim compiler+tools makes extensive use of this (and rightfully so, it's practical)

Oh I completely agree with that.

Likewise in this case.

Ok, fair enough, give it a try.

@ghost ghost added the OS (stdlib) Related to OS modules in standard library label Oct 19, 2020
@timotheecour timotheecour changed the title parseCmdLine doesn't handle quoting correctly parseCmdLine doesn't handle quoting correctly (and also prevents passing empty arguments) May 25, 2021
@timotheecour timotheecour reopened this May 26, 2021
@timotheecour
Copy link
Member Author

inadvertantly closed because of PR msg containing in future work we should still fix #14343, for other reasons

timotheecour added a commit to timotheecour/Nim that referenced this issue Jul 31, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 8, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 9, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 11, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 11, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 11, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 11, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 12, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 15, 2021
timotheecour added a commit to timotheecour/Nim that referenced this issue Aug 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Medium Priority OS (stdlib) Related to OS modules in standard library
Projects
None yet
2 participants