update nimgrep documentation #17415

Merged
merged 2 commits into from
Mar 23, 2021
68 changes: 43 additions & 25 deletions doc/nimgrep.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,20 @@
nimgrep User's manual
=========================

.. default-role:: literal
Member

why not code? it renders differently for things like backslash IIRC; code is the closest to verbatim rendering (and to markdown)

.. default-role:: code is what I've used in #17028 (and manual since #17259)

Contributor Author

Because code is intended for code in programming languages. It expects highlighting. This is an actually implemented feature in rst2html.py:

.. role:: nim(code)
   :language: nim

.. default-role:: nim

Some code: `when PrintRopeCacheStats: echo "rope cache stats: "`.

Another example: `from ic / ic import rodViewer`.

image

It uses pygmentize to highlight code. And pygmentize knows Nim!

I think one day we will implement that too.

...And here in nimgrep.rst all quotes are used either for option names or some commands, which are not exactly code (may be viewed as Bash code though or any other shell without specifics). All those cases match literal role perfectly.

Member

ok! maybe keep this comment open as it has useful information (or link to it from #17340)


:Author: Andreas Rumpf
:Version: 0.9
:Version: 1.6.0
Member

@timotheecour timotheecour Mar 19, 2021

that version is specific to nimgrep; maybe bump it to 0.10.0 instead

@timotheecour nimgrep is 1.6.0 though

Member

@timotheecour timotheecour Mar 19, 2021

ok (a bit confusing since matches next nim version but i guess that's a coincidence)
btw @a-mr is there a way to insert comments to rst such that they won't render in rendered html? that would be useful for situations like this (and many others), eg:

.. hidden_comment::
  the version is independent from nim version

EDIT: found it:

before

..
  some comment that will not show

after

Contributor Author

yes, equal versions is a pure coincidence.


.. contents::

Nimgrep is a command line tool for search&replace tasks. It can search for
Nimgrep is a command line tool for search and replace tasks. It can search for
regex or peg patterns and can search whole directories at once. User
confirmation for every single replace operation can be requested.

Nimgrep has particularly good support for Nim's
eccentric *style insensitivity*. Apart from that it is a generic text
manipulation tool.
eccentric *style insensitivity* (see option `-y` below).
Apart from that it is a generic text manipulation tool.


Installation
Expand All @@ -22,29 +25,44 @@ Compile nimgrep with the command::

nim c -d:release tools/nimgrep.nim

And copy the executable somewhere in your ``$PATH``.
And copy the executable somewhere in your `$PATH`.


Command line switches
=====================

Usage:
nimgrep [options] [pattern] [replacement] (file/directory)*
Options:
--find, -f find the pattern (default)
--replace, -r replace the pattern
--peg pattern is a peg
--re pattern is a regular expression (default); extended
syntax for the regular expression is always turned on
--recursive process directories recursively
--confirm confirm each occurrence/replacement; there is a chance
to abort any time without touching the file
--stdin read pattern from stdin (to avoid the shell's confusing
quoting rules)
--word, -w the match should have word boundaries (buggy for pegs!)
--ignoreCase, -i be case insensitive
--ignoreStyle, -y be style insensitive
--ext:EX1|EX2|... only search the files with the given extension(s)
--verbose be verbose: list every processed file
--help, -h shows this help
--version, -v shows the version
.. include:: nimgrep_cmdline.txt

Examples
========

All examples below use default PCRE Regex patterns:

+ To search recursively in Nim files using style-insensitive identifiers::

--recursive --ext:'nim|nims' --ignoreStyle
# short: -r --ext:'nim|nims' -y

.. Note:: we used `'` quotes to avoid special treatment of `|` symbol
Member

@timotheecour timotheecour Mar 19, 2021

pre-existing but IMO we should allow , and undocument |:

  • anything that requires escaping just adds complexity
  • , is more standard in cmdline utilities, | is almost never used for such purpose for this very reason (inside regex is fine though)

for shells like Bash

+ To exclude version control directories (Git, Mercurial=hg, Subversion=svn)
from the search::

--excludeDir:'^\.git$' --excludeDir:'^\.hg$' --excludeDir:'^\.svn$'
# short: --ed:'^\.git$' --ed:'^\.hg$' --ed:'^\.svn$'

+ To search only in paths containing the `tests` sub-directory recursively::

--recursive --includeDir:'(^|/)tests($|/)'
# short: -r --id:'(^|/)tests($|/)'

.. Attention:: note the subtle difference between `--excludeDir` and
`--includeDir`: the former is applied to relative directory entries
and the latter is applied to the whole paths
Member

pre-existing but what's the rationale for that?
IMO ripgrep has a saner inclusion/exclusion syntax

Contributor Author

speed and simplicity of implementation.

--excludeDir is easy to implement as sub-directory name exclusion: just don't step into any matching sub-directory.

For --includeDir, a path can be thrown away only after making sure that none of the sub-directories in it match.

E.g. if we do --includeDir:subdir2 here

subdir1/subdir2/subdir3/file

we can't stop when subdir1 is not matched or when subdir3 is not matched.
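
The asymmetry explained above can be sketched in a few lines. This is only an illustration in Python, not nimgrep's actual Nim implementation; `TREE`, `walk`, `exclude_dir` and `include_dir` are hypothetical names, and an in-memory tree stands in for the file system. Exclusion is tested against a single directory entry and prunes the descent, while inclusion is tested against the whole directory path, so the walk has to keep descending through non-matching directories.

```python
import re

# A toy directory tree: nested dicts are directories, None marks a file.
TREE = {
    "subdir1": {
        "subdir2": {"subdir3": {"file": None}},
        ".git": {"config": None},
    },
    "other": {"readme": None},
}

def walk(tree, exclude_dir=None, include_dir=None, prefix=""):
    """Yield file paths, mimicking the two filters described above."""
    for name, child in sorted(tree.items()):
        path = prefix + name
        if child is None:  # a file
            # include_dir is matched against the whole directory path
            dir_path = prefix.rstrip("/")
            if include_dir is None or re.search(include_dir, dir_path):
                yield path
        else:  # a directory
            # exclude_dir is matched against the entry name only, so a
            # matching sub-directory is simply never entered
            if exclude_dir is not None and re.search(exclude_dir, name):
                continue
            # no early stop for include_dir: a deeper component may match
            yield from walk(child, exclude_dir, include_dir, path + "/")

found = list(walk(TREE, include_dir="subdir2"))
pruned = list(walk(TREE, exclude_dir=r"^\.git$"))
print(found)   # ['subdir1/subdir2/subdir3/file']
print(pruned)  # ['other/readme', 'subdir1/subdir2/subdir3/file']
```

With `include_dir="subdir2"` the walk still enters `subdir1` (no match yet) and `subdir3` (the path already matched), which is the reason given above for why the check cannot stop early.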


+ Nimgrep can search multi-line, e.g. to find files containing `import`
and then `strutils` use::

'import(.|\n)*?strutils'
Member

  • can you make this more precise via \b:
Suggested change
'import(.|\n)*?strutils'
'import(.|\n)*?\bstrutils\b'
  • doesn't that have false positives, eg:
import foo
# see also strutils

?

Contributor Author

  • I want to keep this example simple but still usable
  • yes, it will match even if this "strutils" comment is situated somewhere in the end of file. But Regex is not intended for good semantic checking, right? I mean using it from cmdline supposes quick-and-dirty patterns and hence a number of false positives. (I've heard that it's possible to parse pretty complex grammars by Regex in theory -- I'm not a CS guy though)

Member

ok, but at least mention this in the comment, eg:

+ Nimgrep can search multi-line, e.g. to find files containing `import`
  and then `strutils` use this (this can have false positives)::
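
The false positive discussed in this thread is easy to reproduce. The sketch below uses Python's `re` module as a stand-in for the PCRE engine (an assumption; nimgrep itself is not invoked), and the sample strings are made up for illustration. It shows that `\b` only helps when `strutils` is part of a longer identifier, not when it appears in a later comment.

```python
import re

# Pattern from the documentation example and the \b variant from review.
loose = re.compile(r"import(.|\n)*?strutils")
strict = re.compile(r"import(.|\n)*?\bstrutils\b")

# Hypothetical file contents used only for this demonstration.
samples = {
    "real_import": "import os,\n  strutils\n",
    "comment_only": "import foo\n# see also strutils\n",
    "longer_name": "import foo\n# see mystrutils2\n",
}

# For every sample record (loose matches?, strict matches?).
results = {
    name: (bool(loose.search(text)), bool(strict.search(text)))
    for name, text in samples.items()
}
for name, (l, s) in results.items():
    print(f"{name}: loose={l} strict={s}")
```

Both patterns match `comment_only`, so the stricter pattern narrows the match but does not make it a semantic check.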


114 changes: 114 additions & 0 deletions doc/nimgrep_cmdline.txt
@@ -0,0 +1,114 @@

Usage:

* To search::
Member

@timotheecour timotheecour Mar 19, 2021

I understand you use :: to make rst2html happy, but this now makes nimgrep -h show :: and other artifacts introduced in this PR (eg .. Note::)

I completely agree with de-duplicating the help message, but we should address this.

rST's include directive provides a parameter code for including files that should be parsed as source code which may be highlighted. This is missing in Nim

Contributor

I have a pr I've been working on for nimgrep. I deleted the old rst file and wrote this to keep it dry::

proc staticParseVerAndUsageFromDoc: (string, string) {.compileTime.} =
  ## Reads in this source file & parses out the title, version, author,
  ## copyright and cli help info from module's doc.
  ## There are many options to `nimgrep` and the old docs diverged over
  ## the years.
  template slice(offset: int, mark: string, endMark = "\n"): untyped =
    let off0 = source.find(mark, offset) + mark.len
    doAssert off0 > -1 + mark.len, "(" & mark.escape & " .. " & endMark.escape & ")"
    let off1 = source.find(endMark, off0) - 1
    doAssert off1 > -1, "(" & mark.escape & " .. " & endMark.escape & ")"
    off0..off1
  let source = staticRead currentSourcePath
  const doc0Mark = "\n\n## ="
  let doc0 = source.find(doc0Mark)
  doAssert doc0 > -1
  let title = slice(doc0 + doc0Mark.len, "=\n## ")
  let author = slice(title.b, "## :Author: ")
  let copyright = slice(author.b, "## :Copyright: ")
  let version = slice(copyright.b, "## :Version: ")
  var usage = slice(version.b, "\n## Usage\n## =", "\n## Design Notes\n## =")
  usage.a -= "\n## Usage\n## =".len
  result[0] = source[version]
  result[1] = "\n" & source[title] &
              "\n\n  Version " & source[version] &
              "\n  (c) " & source[copyright] & " " & source[author] &
              "\n" & source[usage]
  result[1] = result[1].multiReplace([("\n## ", "\n  "), ("\n##", "\n"),
                                      ("::\n", ":\n")])

const (Version, Usage) = staticParseVerAndUsageFromDoc()

Contributor

That of course was after I reformatted and added more docs, such as a "Design Notes" section that was used as the boundary of the CLI usage sections.

Contributor Author

@quantimnot do you mean you put the whole nimgrep.rst inside nimgrep.nim as a doc comment?

Contributor

Yes. It was really outdated compared to the source. So I deleted it and added lots of docs to the source.

I was exploring other changes to nimgrep. my nimgrep wip branch

Contributor Author

@a-mr a-mr Mar 19, 2021

This approach makes sense to avoid documentation getting out of sync but I'm not sure that doc comments are appropriate for writing User's Manual.

The 2 problems with it:

  1. file nimgrep.nim gets larger, though we should probably move some code to separate file(s) anyway
  2. we may want to have internal (developer) documentation that can be generated by nim doc --docInternal. And this approach mixes Manual and Developer docs together.

Maybe it makes sense to introduce a next level of doc comments ### specifically for writing User's Manuals, Guides, etc.

@timotheecour wdyt?

Member

There is nothing wrong with having nimgrep's documentation in the nimgrep.rst file. Having this in the source code instead is not acceptable. Yes, yes, maybe it makes it slightly harder to change nimgrep's command line interface as the documentation then also needs to be updated. But that's a good thing -- some users trust nimgrep's command line interface not to change all the time...

Contributor Author

@Araq

So what do you think about this PR? Say clearly please:

  • yes, it's good to put cmdline options into nimgrep_cmdline.txt for both nimgrep.nim and nimgrep.rst
  • no, it's better to have the options described separately even at the cost of duplication

Member

@timotheecour timotheecour Mar 20, 2021

I can't speak for araq but as far as I'm concerned, the way you did it by introducing nimgrep_cmdline.txt (included by both nimgrep.nim and nimgrep.rst) is indeed the way to go (and in fact we do the exact same with basicopt.txt); there's no need for duplication, which is always bad. And yes, it also allows having the documentation in a separate nimgrep.rst + nimgrep_cmdline.txt file.

The only thing is it needs a (hopefully simple) post-processing so that it renders well in both cmdline output and rst docs, but that can be done in a followup PR if needed. The post-processing can be reused in all similar instances and doesn't have to be specific to nimgrep, in fact we can reuse same post-processing for basicopt.txt so that nimc.rst looks better.

Member

I agree with @timotheecour
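
One possible shape for the post-processing mentioned in this thread is sketched below. It is written in Python purely for illustration; the function name `rst_to_cmdline` and the choice of which RST constructs to strip are assumptions, not what any follow-up PR actually does. It rewrites admonition directives such as `.. Note::`, drops the trailing `::` literal-block marker, and removes the backticks around option names.

```python
import re

def rst_to_cmdline(text: str) -> str:
    """Strip a few RST constructs so the text reads well in a terminal."""
    # `.. Note:: body` -> `Note: body` (likewise DANGER, Attention, ...)
    text = re.sub(r"^([ \t]*)\.\. (\w+)::", r"\1\2:", text, flags=re.MULTILINE)
    # literal-block marker `::` at the end of a line -> `:`
    text = re.sub(r"::[ \t]*$", ":", text, flags=re.MULTILINE)
    # backticks of interpreted text around option names -> plain text
    # (assumes the help text never needs a literal backtick)
    return text.replace("`", "")

sample = (
    "* To search::\n"
    "    nimgrep [options] PATTERN [(FILE/DIRECTORY)*/-]\n"
    ".. DANGER:: `--replace` mode **DOES NOT** ask confirmation\n"
)
plain = rst_to_cmdline(sample)
print(plain)
```

Because the rules are purely textual, the same function could in principle be applied to any other shared help file, which is the reuse suggested above for basicopt.txt.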

nimgrep [options] PATTERN [(FILE/DIRECTORY)*/-]
* To replace::
nimgrep [options] PATTERN --replace REPLACEMENT (FILE/DIRECTORY)*/-
* To list file names::
nimgrep [options] --filenames [PATTERN] [(FILE/DIRECTORY)*]

Positional arguments, from left to right:
1) PATTERN is either Regex (default) or Peg if `--peg` is specified.
Araq marked this conversation as resolved.
PATTERN and REPLACEMENT should be skipped when `--stdin` is specified.
2) REPLACEMENT supports `$1`, `$#` notations for captured groups in PATTERN.

.. DANGER:: `--replace` mode **DOES NOT** ask confirmation
unless `--confirm` is specified!

3) Final arguments are a list of paths (FILE/DIRECTORY) or a standalone
minus `-` or not specified (empty):

* empty, current directory `.` is assumed (not with `--replace`)

.. Note:: so when no FILE/DIRECTORY/`-` is specified nimgrep
does **not** read the pipe, but searches files in the current
dir instead!
* `-`, read buffer once from stdin: pipe or terminal input;
in `--replace` mode the result is directed to stdout;
it's not compatible with `--stdin`, `--filenames`, or `--confirm`


For any given DIRECTORY nimgrep searches only its immediate files without
traversing sub-directories unless `--recursive` is specified.

In replacement mode we require all 3 positional arguments to avoid accidentally damaging files.

Options:
* Mode of operation:
--find, -f find the PATTERN (default)
--replace, -! replace the PATTERN with REPLACEMENT, rewriting the files
--confirm confirm each occurrence/replacement; there is a chance
to abort any time without touching the file
--filenames just list filenames. Provide a PATTERN to find it in
the filenames (not in the contents of a file) or run
with empty pattern to just list all files::
nimgrep --filenames # In current dir
nimgrep --filenames "" DIRECTORY
# Note empty pattern "", lists all files in DIRECTORY

* Interpret patterns:
--peg PATTERN and PAT are Peg
--re PATTERN and PAT are regular expressions (default)
--rex, -x use the "extended" syntax for the regular expression
so that whitespace is not significant
--word, -w matches should have word boundaries (buggy for pegs!)
--ignoreCase, -i be case insensitive in PATTERN and PAT
--ignoreStyle, -y be style insensitive in PATTERN and PAT
.. Note:: PATTERN and patterns PAT (see below in other options) are all either
Regex or Peg simultaneously and options `--rex`, `--word`, `--ignoreCase`,
and `--ignoreStyle` are applied to all of them.

* File system walk:
--recursive, -r process directories recursively
--follow follow all symlinks when processing recursively
--ext:EX1|EX2|... only search the files with the given extension(s),
empty one ("--ext") means files with missing extension
--noExt:EX1|... exclude files having given extension(s), use empty one to
skip files with no extension (like some binary files are)
--includeFile:PAT search only files whose names contain pattern PAT
--excludeFile:PAT skip files whose names contain pattern PAT
--includeDir:PAT search only files with their whole directory path
containing PAT
--excludeDir:PAT skip directories whose name (not path)
contains pattern PAT
--if,--ef,--id,--ed abbreviations of the 4 options above
--sortTime, -s[:asc|desc]
order files by the last modification time (default: off):
ascending (recent files go last) or descending

* Filter file content:
--match:PAT select files containing a (not displayed) match of PAT
--noMatch:PAT select files not containing any match of PAT
--bin:on|off|only process binary files? (detected by \0 in first 1K bytes)
(default: on - binary and text files treated the same way)
--text, -t process only text files, the same as `--bin:off`

* Represent results:
--nocolor output will be given without any colors
--color[:on] force color even if output is redirected (default: auto)
--colorTheme:THEME select color THEME from `simple` (default),
`bnw` (black and white), `ack`, or `gnu` (GNU grep)
--count only print counts of matches for files that matched
--context:N, -c:N print N lines of leading context before every match and
N lines of trailing context after it (default N: 0)
--afterContext:N, -a:N
print N lines of trailing context after every match
--beforeContext:N, -b:N
print N lines of leading context before every match
--group, -g group matches by file
--newLine, -l display every matching line starting from a new line
--cols[:N] limit max displayed columns/width of output lines from
files by N characters, cropping overflows (default: off)
--cols:auto, -% calculate columns from terminal width for every line
--onlyAscii, -@ display only printable ASCII Latin characters 0x20-0x7E
substitutions: 0 -> ^@, 1 -> ^A, ... 0x1F -> ^_,
0x7F -> '7F, ..., 0xFF -> 'FF

* Miscellaneous:
--threads:N, -j:N speed up search by N additional workers (default: 0, off)
--stdin read PATTERN from stdin (to avoid the shell's confusing
quoting rules) and, if `--replace` given, REPLACEMENT
--verbose be verbose: list every processed file
--help, -h shows this help
--version, -v shows the version
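
The substitution table given for `--onlyAscii` above can be read as a small mapping. The following Python sketch follows only the rules stated in the help text (printable 0x20-0x7E kept, control bytes shown in caret notation, 0x7F-0xFF shown as an apostrophe plus two hex digits); the function name `only_ascii` is hypothetical, and how nimgrep treats line terminators is not covered here.

```python
def only_ascii(data: bytes) -> str:
    """Render bytes using the substitutions listed for --onlyAscii."""
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            out.append(chr(b))               # printable ASCII is kept
        elif b < 0x20:
            out.append("^" + chr(b + 0x40))  # 0 -> ^@, 1 -> ^A, 0x1F -> ^_
        else:
            out.append("'%02X" % b)          # 0x7F -> '7F, 0xFF -> 'FF
    return "".join(out)

shown = only_ascii(b"a\x00\x01\x1f\x7f\xff")
print(shown)  # a^@^A^_'7F'FF
```
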
101 changes: 1 addition & 100 deletions tools/nimgrep.nim
Expand Up @@ -16,106 +16,7 @@ const
Version & """

(c) 2012-2020 Andreas Rumpf

Usage:
* To search:
nimgrep [options] PATTERN [(FILE/DIRECTORY)*/-]
* To replace:
nimgrep [options] PATTERN --replace REPLACEMENT (FILE/DIRECTORY)*/-
* To list file names:
nimgrep [options] --filenames [PATTERN] [(FILE/DIRECTORY)*]

Positional arguments, from left to right:
* PATERN is either Regex (default) or Peg if --peg is specified.
PATTERN and REPLACEMENT should be skipped when --stdin is specified.
* REPLACEMENT supports $1, $# notations for captured groups in PATTERN.
Note: --replace mode DOES NOT ask confirmation unless --confirm is specified!
* Final arguments are a list of paths (FILE/DIRECTORY) or a standalone
minus '-' (pipe) or not specified (empty). Note for the empty case: when
no FILE/DIRECTORY/- is specified nimgrep DOES NOT read the pipe, but
searches files in the current dir instead!
- read buffer once from stdin: pipe or terminal input;
in --replace mode the result is directed to stdout;
it's not compatible with --stdin, --filenames, --confirm
(empty) current directory '.' is assumed (not with --replace)
For any given DIRECTORY nimgrep searches only its immediate files without
traversing sub-directories unless --recursive is specified.
In replacement mode all 3 positional arguments are required to avoid damaging.

Options:
* Mode of operation:
--find, -f find the PATTERN (default)
--replace, -! replace the PATTERN to REPLACEMENT, rewriting the files
--confirm confirm each occurrence/replacement; there is a chance
to abort any time without touching the file
--filenames just list filenames. Provide a PATTERN to find it in
the filenames (not in the contents of a file) or run
with empty pattern to just list all files:
nimgrep --filenames # In current directory
nimgrep --filenames "" DIRECTORY # Note empty pattern ""

* Interprete patterns:
--peg PATTERN and PAT are Peg
--re PATTERN and PAT are regular expressions (default)
--rex, -x use the "extended" syntax for the regular expression
so that whitespace is not significant
--word, -w matches should have word boundaries (buggy for pegs!)
--ignoreCase, -i be case insensitive in PATTERN and PAT
--ignoreStyle, -y be style insensitive in PATTERN and PAT
NOTE: PATERN and patterns PAT (see below in other options) are all either
Regex or Peg simultaneously and options --rex, --word, --ignoreCase,
--ignoreStyle are applied to all of them.

* File system walk:
--recursive, -r process directories recursively
--follow follow all symlinks when processing recursively
--ext:EX1|EX2|... only search the files with the given extension(s),
empty one ("--ext") means files with missing extension
--noExt:EX1|... exclude files having given extension(s), use empty one to
skip files with no extension (like some binary files are)
--includeFile:PAT search only files whose names contain pattern PAT
--excludeFile:PAT skip files whose names contain pattern PAT
--includeDir:PAT search only files with whole directory path containing PAT
--excludeDir:PAT skip directories whose name (not path) contain pattern PAT
--if,--ef,--id,--ed abbreviations of 4 options above
--sortTime order files by the last modification time (default: off):
-s[:asc|desc] ascending (recent files go last) or descending

* Filter file content:
--match:PAT select files containing a (not displayed) match of PAT
--noMatch:PAT select files not containing any match of PAT
--bin:on|off|only process binary files? (detected by \0 in first 1K bytes)
(default: on - binary and text files treated the same way)
--text, -t process only text files, the same as --bin:off

* Represent results:
--nocolor output will be given without any colors
--color[:on] force color even if output is redirected (default: auto)
--colorTheme:THEME select color THEME from 'simple' (default),
'bnw' (black and white) ,'ack', or 'gnu' (GNU grep)
--count only print counts of matches for files that matched
--context:N, -c:N print N lines of leading context before every match and
N lines of trailing context after it (default N: 0)
--afterContext:N,
-a:N print N lines of trailing context after every match
--beforeContext:N,
-b:N print N lines of leading context before every match
--group, -g group matches by file
--newLine, -l display every matching line starting from a new line
--cols[:N] limit max displayed columns/width of output lines from
files by N characters, cropping overflows (default: off)
--cols:auto, -% calculate columns from terminal width for every line
--onlyAscii, -@ display only printable ASCII Latin characters 0x20-0x7E
substitutions: 0 -> ^@, 1 -> ^A, ... 0x1F -> ^_,
0x7F -> '7F, ..., 0xFF -> 'FF
* Miscellaneous:
--threads:N, -j:N speed up search by N additional workers (default: 0, off)
--stdin read PATTERN from stdin (to avoid the shell's confusing
quoting rules) and, if --replace given, REPLACEMENT
--verbose be verbose: list every processed file
--help, -h shows this help
--version, -v shows the version
"""
""" & slurp "../doc/nimgrep_cmdline.txt"

# Limitations / ideas / TODO:
# * No unicode support with --cols