docopt-ng fails to parse usage string that worked with docopt #33

Closed
h4l opened this issue Aug 18, 2022 · 4 comments · Fixed by #36

@h4l

h4l commented Aug 18, 2022

I've got an old CLI program which uses OG docopt. I'm giving it a bit of minor TLC to refresh the tooling, and I tried switching to docopt-ng, but -ng fails to parse my usage string for some reason:

vscode@46fada5e45f4 /w/rnginline ((bef5200c…)) [127]> poetry run ipython
Python 3.10.5 (main, Jun  6 2022, 12:05:50) [GCC 9.5.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import docopt

In [2]: docopt.__version__
Out[2]: '0.8.1'

In [3]: from rnginline import cmdline

In [4]: docopt.docopt(cmdline.__doc__, argv=['rnginline', '--no-libxml2-compat', '/some/file'])
An exception has occurred, use %tb to see the full traceback.

DocoptExit: Warning: found unmatched (duplicate?) arguments [Option(None, '--no-libxml2-compat', 0, True)]
usage: rnginline [options] <rng-src> [<rng-output>]
       rnginline [options] --stdin [<rng-output>]

/home/vscode/.cache/pypoetry/virtualenvs/rnginline--qKLlanv-py3.10/lib/python3.10/site-packages/IPython/core/interactiveshell.py:3406: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

In [5]:                                                                                                                                                                                                            
Do you really want to exit ([y]/n)? y
vscode@46fada5e45f4 /w/rnginline ((bef5200c…))> poetry run pip uninstall docopt-ng
Found existing installation: docopt-ng 0.8.1
Uninstalling docopt-ng-0.8.1:
  Would remove:
    /home/vscode/.cache/pypoetry/virtualenvs/rnginline--qKLlanv-py3.10/lib/python3.10/site-packages/docopt/*
    /home/vscode/.cache/pypoetry/virtualenvs/rnginline--qKLlanv-py3.10/lib/python3.10/site-packages/docopt_ng-0.8.1.dist-info/*
Proceed (Y/n)? y
  Successfully uninstalled docopt-ng-0.8.1
vscode@46fada5e45f4 /w/rnginline ((bef5200c…))> poetry run pip install docopt
Collecting docopt
  Using cached docopt-0.6.2-py2.py3-none-any.whl
Installing collected packages: docopt
Successfully installed docopt-0.6.2

vscode@46fada5e45f4 /w/rnginline ((bef5200c…))> poetry run ipython
Python 3.10.5 (main, Jun  6 2022, 12:05:50) [GCC 9.5.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import docopt

In [2]: docopt.__version__
Out[2]: '0.6.2'

In [3]: from rnginline import cmdline

In [4]: docopt.docopt(cmdline.__doc__, argv=['rnginline', '--no-libxml2-compat', '/some/file'])
Out[4]: 
{'-': None,
 '--base-uri': None,
 '--default-base-uri': None,
 '--help': False,
 '--no-libxml2-compat': True,
 '--stdin': False,
 '--traceback': False,
 '--version': False,
 '<rng-output>': '/some/file',
 '<rng-src>': 'rnginline'}

This is the usage string: https://github.com/h4l/rnginline/blob/b1d1c8cda2a17d46627309950f2442021749c07e/rnginline/cmdline.py#L14

I really appreciate your efforts in keeping docopt going. It's a great library.

@h4l

h4l commented Aug 21, 2022

I just looked into what's causing this. The original docopt repo contains some changes to the parser in the master branch that were never released. They aim to fix the problem of needing a blank line after the usage: section before the options: section starts: docopt/docopt#102

The original docopt author keleshev drafted a fix for the issue, but doesn't seem happy with it:

This is now in master, however I'm still not 100% sure: it seems like a good idea, but it will brake a lot of code.

I'm actually very unhappy with this regexp. It makes even harder to describe docopt formally (using some grammar). Also harder to explain. I'm still sure that "sections" is a good feature, I'm just not sure how to implement them. I need to think more about it.

The regex he introduced to parse sections is the reason my usage string no longer works. It ends a section at the first blank line. My options: section looks like this:

Options:
    <rng-src>
        Filesystem path or a URL of the .rng file to inline.

    <rng-output>
        Filesystem path to write the inlined schema to. If not provided, or is
        -, stdout is written to.

    --traceback
        Print the Python traceback on errors.
[...]

So parse_section('options:', ...) thinks this is my entire options section:

Options:
    <rng-src>
        Filesystem path or a URL of the .rng file to inline.

And so it doesn't pick up any of my options.
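
To illustrate the truncation, here is a minimal sketch of blank-line-terminated section matching. The function name parse_section_sketch and the regex are approximations written for this example, assuming a section is the heading line plus every following line that starts with a space or tab; they are not necessarily docopt-ng's exact code.

```python
import re


def parse_section_sketch(name, source):
    """Approximate blank-line-terminated section matching (hypothetical helper)."""
    # A section is the line containing `name`, plus every following line
    # that starts with a space or tab. An empty line has no leading
    # whitespace, so it ends the section.
    pattern = re.compile(
        r"^([^\n]*" + re.escape(name) + r"[^\n]*\n?(?:[ \t].*?(?:\n|$))*)",
        re.IGNORECASE | re.MULTILINE,
    )
    return [s.strip() for s in pattern.findall(source)]


DOC = (
    "usage: prog [options] <src>\n"
    "\n"
    "Options:\n"
    "    <src>\n"
    "        File to read.\n"
    "\n"
    "    --traceback\n"
    "        Print the Python traceback on errors.\n"
)

sections = parse_section_sketch("options:", DOC)
# Only the text up to the first blank line is captured, so the
# --traceback description never reaches the options parser.
print(sections)
```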

A workaround is to explicitly insert trailing spaces into the blank lines, using a Unicode escape sequence so that auto-formatters don't strip the trailing whitespace. This prevents the regex from seeing an empty line, so it matches the whole options section.

Options:
    <rng-src>
        Filesystem path or a URL of the .rng file to inline.
\u0020
    <rng-output>
        Filesystem path to write the inlined schema to. If not provided, or is
        -, stdout is written to.
\u0020
    --traceback
        Print the Python traceback on errors.
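
A small sketch of why the workaround helps, under the same assumption as before: a section continues for as long as each line starts with a space or tab. parse_section_sketch is a hypothetical stand-in for the real parser, not docopt-ng's exact code.

```python
import re


def parse_section_sketch(name, source):
    """Approximate blank-line-terminated section matching (hypothetical helper)."""
    pattern = re.compile(
        r"^([^\n]*" + re.escape(name) + r"[^\n]*\n?(?:[ \t].*?(?:\n|$))*)",
        re.IGNORECASE | re.MULTILINE,
    )
    return [s.strip() for s in pattern.findall(source)]


# "\u0020" is a single space, so the separator lines are no longer empty
# and the section is not cut short.
DOC_WITH_WORKAROUND = (
    "usage: prog [options] <src>\n"
    "\n"
    "Options:\n"
    "    <src>\n"
    "        File to read.\n"
    "\u0020\n"
    "    --traceback\n"
    "        Print the Python traceback on errors.\n"
)

section = parse_section_sketch("options:", DOC_WITH_WORKAROUND)[0]
print(section)
```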

I haven't thought about this in enough detail to be confident in suggesting a solution, but something comes to mind. The problem that drove this change was that the usage: section needed a blank line to mark its end. I think docopt only needs to extract the usage section to work (to print a usage summary, and to parse the usage expression). So rather than also changing the rules for parsing options, the usage: section could end either on a blank line or at the apparent start of another section, e.g. a line matching ^[\S]+(?:\s+[\S]+)*: (so foo:, foo bar baz:, etc.).
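
A rough sketch of that idea, to make it concrete. The helper name extract_usage_sketch is hypothetical, and anchoring the heading pattern to the end of the line is an extra assumption added here so that ordinary text containing a colon is less likely to be mistaken for a heading.

```python
import re

# Heading pattern from the suggestion above, e.g. "foo:" or "foo bar baz:".
# The trailing \s*$ is an assumption added for this sketch.
SECTION_HEADING = re.compile(r"^[\S]+(?:\s+[\S]+)*:\s*$")


def extract_usage_sketch(doc):
    """Return the usage section, ending at a blank line or the next heading."""
    lines = doc.splitlines()
    # Locate the first line that opens the usage section.
    start = next(
        (i for i, line in enumerate(lines)
         if line.lower().lstrip().startswith("usage:")),
        None,
    )
    if start is None:
        raise ValueError("no usage: section found")
    collected = [lines[start]]
    for line in lines[start + 1:]:
        # Indented continuation lines never match the heading pattern,
        # because it requires a non-space first character.
        if not line.strip() or SECTION_HEADING.match(line):
            break
        collected.append(line)
    return "\n".join(collected)


# No blank line between the usage and options sections.
DOC = (
    "usage: prog [options] <src>\n"
    "       prog [options] --stdin\n"
    "options:\n"
    "    --stdin   Read from stdin.\n"
)
print(extract_usage_sketch(DOC))
```

With this rule the options parsing would not need to change at all; only the end of the usage section is detected differently.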

@h4l

h4l commented Aug 21, 2022

The add and branch cmds in this repo's examples/git program are affected by this:

git add
(.venv) vscode@c4c2b63bd410 /w/d/e/git (master) [0|1]> python git.py add --help
global arguments:
{'--bare': False,
 '--exec-path': None,
 '--git-dir': None,
 '--help': False,
 '--html-path': False,
 '--no-pager': False,
 '--no-replace-objects': False,
 '--paginate': False,
 '--version': False,
 '--work-tree': None,
 '-c': None,
 '<args>': ['--help'],
 '<command>': 'add'}
command arguments:
usage: git add [options] [--] [<filepattern>...]

    -h, --help
    -n, --dry-run        dry run
    -v, --verbose        be verbose

    -i, --interactive    interactive picking
    -p, --patch          select hunks interactively
    -e, --edit           edit current diff and apply
    -f, --force          allow adding otherwise ignored files
    -u, --update         update tracked files
    -N, --intent-to-add  record only the fact that the path will be added later
    -A, --all            add all, noticing removal of tracked files
    --refresh            don't add, only refresh the index
    --ignore-errors      just skip files which cannot be added because of errors
    --ignore-missing     check if - even missing - files are ignored in dry run
(.venv) vscode@c4c2b63bd410 /w/d/e/git (master)> python git.py add -i
global arguments:
{'--bare': False,
 '--exec-path': None,
 '--git-dir': None,
 '--help': False,
 '--html-path': False,
 '--no-pager': False,
 '--no-replace-objects': False,
 '--paginate': False,
 '--version': False,
 '--work-tree': None,
 '-c': None,
 '<args>': ['-i'],
 '<command>': 'add'}
command arguments:
Warning: found unmatched (duplicate?) arguments [Option('-i', None, 0, True)]
usage: git add [options] [--] [<filepattern>...]
git branch
(.venv) vscode@c4c2b63bd410 /w/d/e/git (master) [0|1]> python git.py branch --help
global arguments:
{'--bare': False,
 '--exec-path': None,
 '--git-dir': None,
 '--help': False,
 '--html-path': False,
 '--no-pager': False,
 '--no-replace-objects': False,
 '--paginate': False,
 '--version': False,
 '--work-tree': None,
 '-c': None,
 '<args>': ['--help'],
 '<command>': 'branch'}
command arguments:
usage: git branch [options] [-r | -a] [--merged=<commit> | --no-merged=<commit>]
       git branch [options] [-l] [-f] <branchname> [<start-point>]
       git branch [options] [-r] (-d | -D) <branchname>
       git branch [options] (-m | -M) [<oldbranch>] <newbranch>

Generic options
    -h, --help
    -v, --verbose         show hash and subject, give twice for upstream branch
    -t, --track           set up tracking mode (see git-pull(1))
    --set-upstream        change upstream info
    --color=<when>        use colored output
    -r                    act on remote-tracking branches
    --contains=<commit>   print only branches that contain the commit
    --abbrev=<n>          use <n> digits to display SHA-1s

Specific git-branch actions:
    -a                    list both remote-tracking and local branches
    -d                    delete fully merged branch
    -D                    delete branch (even if not merged)
    -m                    move/rename a branch and its reflog
    -M                    move/rename a branch, even if target exists
    -l                    create the branch's reflog
    -f, --force           force creation (when already exists)
    --no-merged=<commit>  print only not merged branches
    --merged=<commit>     print only merged branches
(.venv) vscode@c4c2b63bd410 /w/d/e/git (master)> python git.py branch --force foo
global arguments:
{'--bare': False,
 '--exec-path': None,
 '--git-dir': None,
 '--help': False,
 '--html-path': False,
 '--no-pager': False,
 '--no-replace-objects': False,
 '--paginate': False,
 '--version': False,
 '--work-tree': None,
 '-c': None,
 '<args>': ['--force', 'foo'],
 '<command>': 'branch'}
command arguments:
Warning: found unmatched (duplicate?) arguments [Option(None, '--force', 0, True)]
usage: git branch [options] [-r | -a] [--merged=<commit> | --no-merged=<commit>]
       git branch [options] [-l] [-f] <branchname> [<start-point>]
       git branch [options] [-r] (-d | -D) <branchname>
       git branch [options] (-m | -M) [<oldbranch>] <newbranch>

h4l added a commit to h4l/docopt-ng that referenced this issue Aug 27, 2022
This commit uses parse_docstring_sections() and parse_options() to
parse docstrings accepted by docopt 0.6.2, while retaining docopt-ng's
improvements to supported syntax.

Currently, docopt-ng parses option-defaults using a strategy that was in
docopt's master branch, but considered unstable by the author, and was
not released in docopt. It looks for option descriptions in an
"options:" section, which is ended on the first blank line. This has
the side-effect that options defined in a man-page style — with blank
lines in-between — are not found. Neither are options outside an
options: section (docopt allows options to follow the usage with no
section heading).

parse_docstring_sections() is used to separate the usage section from
the rest of the docstring. The text before the usage is ignored. The
usage body (without its header) is parsed for the argument pattern and
the usage header with its body is used to print the usage summary help.
The text following the usage is parsed for options descriptions, using
parse_options(), which supports the option description syntax of both
docopt and the current docopt-ng.

Note that docopt 0.6.2 recognises option descriptions in the text
prior to the usage section, but this change does not, as it seems like
an unintended side-effect of the previous parser's implementation, and
seems unlikely to be used in practice.

The testcases have two cases added for docopt 0.6.2 compatibility.

This fixes jazzband#33
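
The approach in the commit message can be sketched roughly as follows. This is an illustration only: split_docstring_sketch, the Sections tuple, and the option-line filter are hypothetical and are not the code from the commit. The sketch also assumes the usage section ends at the first blank line.

```python
from typing import NamedTuple


class Sections(NamedTuple):
    before_usage: str
    usage_header: str
    usage_body: str
    after_usage: str


def split_docstring_sketch(doc):
    """Split a docstring around its usage section (hypothetical helper)."""
    lines = doc.splitlines(keepends=True)
    start = next(
        (i for i, line in enumerate(lines)
         if line.lower().lstrip().startswith("usage:")),
        None,
    )
    if start is None:
        raise ValueError("no usage: section found")
    # Assumption for this sketch: the usage section ends at the first blank line.
    end = start + 1
    while end < len(lines) and lines[end].strip():
        end += 1
    header_end = lines[start].lower().index("usage:") + len("usage:")
    return Sections(
        before_usage="".join(lines[:start]),
        usage_header=lines[start][:header_end],
        usage_body=lines[start][header_end:] + "".join(lines[start + 1:end]),
        after_usage="".join(lines[end:]),
    )


DOC = (
    "Inline a schema.\n"
    "\n"
    "usage: prog [options] <src>\n"
    "\n"
    "options:\n"
    "    --traceback\n"
    "        Print the Python traceback on errors.\n"
    "\n"
    "    --stdin\n"
    "        Read from stdin.\n"
)

sections = split_docstring_sketch(DOC)
# Scan all text after the usage for option lines, so blank lines between
# man-page style entries no longer hide the later options.
option_lines = [
    line.strip()
    for line in sections.after_usage.splitlines()
    if line.lstrip().startswith("-")
]
print(option_lines)
```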
@NickCrews

Hey @h4l, thanks for the great bug report AND investigation AND fix. Sorry I didn't see this earlier. At first read, that solution seems great. I'll look into this more sometime in the next week!

@h4l

h4l commented Sep 6, 2022

Thanks @NickCrews, glad to hear it, and no problem at all. Thanks for the review; I should follow up later today.

h4l added a commit to h4l/docopt-ng that referenced this issue Sep 8, 2022
NickCrews pushed a commit that referenced this issue Sep 8, 2022