doc: clarify automatic encoding detection
Fixes #1103
BurntSushi committed Jan 26, 2019
1 parent afb89bc commit 6d5dba8
Showing 3 changed files with 11 additions and 3 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -27,6 +27,8 @@ Bug fixes:
 `**` is now accepted as valid syntax anywhere in a glob.
 * [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
 Fix corner cases involving the `--crlf` flag.
+* [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
+Clarify what `--encoding auto` does.
 * [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
 `--files-with-matches` and `--files-without-match` work with one file.
 * [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
3 changes: 2 additions & 1 deletion GUIDE.md
@@ -609,7 +609,8 @@ topic, but we can try to summarize its relevancy to ripgrep:
 the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
 a special exception, UTF-16 is prevalent in Windows environments

-In light of the above, here is how ripgrep behaves:
+In light of the above, here is how ripgrep behaves when `--encoding auto` is
+given, which is the default:

 * All input is assumed to be ASCII compatible (which means every byte that
 corresponds to an ASCII codepoint actually is an ASCII codepoint). This
9 changes: 7 additions & 2 deletions src/app.rs
@@ -982,10 +982,15 @@ fn flag_encoding(args: &mut Vec<RGArg>) {
     const LONG: &str = long!("\
 Specify the text encoding that ripgrep will use on all files searched. The
 default value is 'auto', which will cause ripgrep to do a best effort automatic
-detection of encoding on a per-file basis. Other supported values can be found
-in the list of labels here:
+detection of encoding on a per-file basis. Automatic detection in this case
+only applies to files that begin with a UTF-8 or UTF-16 byte-order mark (BOM).
+No other automatic detection is performed.
+Other supported values can be found in the list of labels here:
 https://encoding.spec.whatwg.org/#concept-encoding-get
+For more details on encoding and how ripgrep deals with it, see GUIDE.md.
 This flag can be disabled with --no-encoding.
 ");
     let arg = RGArg::flag("encoding", "ENCODING").short("E")
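
The updated help text says that `--encoding auto` only looks for a UTF-8 or UTF-16 byte-order mark at the start of each file. As a rough illustration of what BOM-based detection means, here is a minimal sketch in Rust. It is not ripgrep's actual implementation: the `BomEncoding` enum and the `sniff_bom` function are hypothetical names introduced only for this example, and the sketch assumes the caller has already read the first few bytes of the file.

```rust
/// Encodings that a byte-order mark can identify in this sketch.
/// (Hypothetical type, not part of ripgrep.)
#[derive(Debug, PartialEq, Eq)]
enum BomEncoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Inspect the leading bytes of a file and report which BOM, if any, is
/// present. Returns `None` when the input does not start with a known BOM,
/// in which case no automatic detection takes place.
/// (Hypothetical helper, not part of ripgrep.)
fn sniff_bom(prefix: &[u8]) -> Option<BomEncoding> {
    if prefix.starts_with(&[0xEF, 0xBB, 0xBF]) {
        // U+FEFF encoded as UTF-8.
        Some(BomEncoding::Utf8)
    } else if prefix.starts_with(&[0xFF, 0xFE]) {
        // U+FEFF encoded as UTF-16, little endian.
        Some(BomEncoding::Utf16Le)
    } else if prefix.starts_with(&[0xFE, 0xFF]) {
        // U+FEFF encoded as UTF-16, big endian.
        Some(BomEncoding::Utf16Be)
    } else {
        None
    }
}

fn main() {
    // A UTF-8 BOM followed by ASCII text.
    println!("{:?}", sniff_bom(b"\xEF\xBB\xBFhello")); // Some(Utf8)
    // A UTF-16LE BOM followed by the code unit for 'h'.
    println!("{:?}", sniff_bom(&[0xFF, 0xFE, b'h', 0x00])); // Some(Utf16Le)
    // Plain ASCII has no BOM, so nothing is detected.
    println!("{:?}", sniff_bom(b"hello")); // None
}
```

A file without a BOM yields `None` here, which mirrors the documented behavior that no other automatic detection is performed; in that case the user would need to pass an explicit `--encoding` value for non-ASCII-compatible input.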
