diff --git a/CHANGELOG.md b/CHANGELOG.md
index 44f4ae79a..5c7e61f04 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -27,6 +27,8 @@ Bug fixes:
   `**` is now accepted as valid syntax anywhere in a glob.
 * [BUG #1095](https://github.com/BurntSushi/ripgrep/issues/1095):
   Fix corner cases involving the `--crlf` flag.
+* [BUG #1103](https://github.com/BurntSushi/ripgrep/issues/1103):
+  Clarify what `--encoding auto` does.
 * [BUG #1106](https://github.com/BurntSushi/ripgrep/issues/1106):
   `--files-with-matches` and `--files-without-match` work with one file.
 * [BUG #1093](https://github.com/BurntSushi/ripgrep/pull/1093):
diff --git a/GUIDE.md b/GUIDE.md
index 8523b6a5a..39ccb52d2 100644
--- a/GUIDE.md
+++ b/GUIDE.md
@@ -609,7 +609,8 @@ topic, but we can try to summarize its relevancy to ripgrep:
   the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
   a special exception, UTF-16 is prevalent in Windows environments
 
-In light of the above, here is how ripgrep behaves:
+In light of the above, here is how ripgrep behaves when `--encoding auto` is
+given, which is the default:
 
 * All input is assumed to be ASCII compatible (which means every byte that
   corresponds to an ASCII codepoint actually is an ASCII codepoint). This
diff --git a/src/app.rs b/src/app.rs
index 59fd6f234..b4c81a7ce 100644
--- a/src/app.rs
+++ b/src/app.rs
@@ -982,10 +982,15 @@ fn flag_encoding(args: &mut Vec<RGArg>) {
     const LONG: &str = long!("\
 Specify the text encoding that ripgrep will use on all files searched. The
 default value is 'auto', which will cause ripgrep to do a best effort automatic
-detection of encoding on a per-file basis. Other supported values can be found
-in the list of labels here:
+detection of encoding on a per-file basis. Automatic detection in this case
+only applies to files that begin with a UTF-8 or UTF-16 byte-order mark (BOM).
+No other automatic detection is performed.
+
+Other supported values can be found in the list of labels here:
 https://encoding.spec.whatwg.org/#concept-encoding-get
 
+For more details on encoding and how ripgrep deals with it, see GUIDE.md.
+
 This flag can be disabled with --no-encoding.
 ");
     let arg = RGArg::flag("encoding", "ENCODING").short("E")
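
The help text added in `src/app.rs` says that `--encoding auto` only looks for a UTF-8 or UTF-16 byte-order mark at the start of a file. As a rough illustration of what BOM-only detection means, here is a minimal sketch in Rust. The `Bom` enum and the `sniff_bom` function are hypothetical names made up for this example; they are not taken from ripgrep's source, and ripgrep's actual implementation may differ.

```rust
/// Hypothetical result of BOM sniffing (illustrative only, not a ripgrep type).
#[derive(Debug, PartialEq, Eq)]
enum Bom {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Look at the first bytes of a file and report which BOM, if any, is present.
///
/// Returns `None` when the input does not start with a UTF-8 or UTF-16 BOM.
/// In that case this sketch performs no further guessing, mirroring the
/// "no other automatic detection" wording in the help text.
fn sniff_bom(bytes: &[u8]) -> Option<Bom> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        // UTF-8 BOM: EF BB BF
        Some(Bom::Utf8)
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        // UTF-16 little-endian BOM: FF FE
        Some(Bom::Utf16Le)
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        // UTF-16 big-endian BOM: FE FF
        Some(Bom::Utf16Be)
    } else {
        None
    }
}
```

Under this sketch, a file with no BOM yields `None`, and a caller would fall back to treating the input as ASCII-compatible bytes, as GUIDE.md describes. Passing an explicit label via `--encoding` would bypass the check entirely.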