Similarities and distance Calculator among Vectors.
There are several algorithms for calculating the similarity of two vectors; however, no existing command handles them.
scv standardizes the interface for calculating the similarities and distances among vectors.
scv [OPTIONS] <VECTORS...>
OPTIONS
    -a, --algorithm <ALGORITHM>    specifies the calculating algorithm. This option is mandatory.
                                   The value of this option accepts several values separated with comma.
                                   Available values are: simpson, jaccard, dice, cosine, pearson,
                                   euclidean, manhattan, chebyshev, and levenshtein.
    -f, --format <FORMAT>          specifies the resultant format. Default is default.
                                   Available values are: default, json, and xml.
    -t, --input-type <TYPE>        specifies the type of VECTORS. Default is file.
                                   If TYPE is separated with comma, each type shows
                                   the corresponding VECTORS.
                                   Available values are: byte_file, term_file, string, and json.
    -h, --help                     prints this message.
VECTORS
    the source of vectors for calculation.
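For readers unfamiliar with the distance names accepted by -a, the following is a minimal sketch of the textbook definitions of euclidean, manhattan, and chebyshev. It is not taken from the scv source: the function names are hypothetical, and it works on plain numeric slices, since how scv builds vectors from its inputs is not described here.

```go
package main

import (
	"fmt"
	"math"
)

// absDiffs returns |a[i]-b[i]| for every index.
// Assumption: both vectors have the same length; it panics otherwise.
func absDiffs(a, b []float64) []float64 {
	if len(a) != len(b) {
		panic("vectors must have the same length")
	}
	d := make([]float64, len(a))
	for i := range a {
		d[i] = math.Abs(a[i] - b[i])
	}
	return d
}

// euclidean is the square root of the sum of squared differences.
func euclidean(a, b []float64) float64 {
	sum := 0.0
	for _, d := range absDiffs(a, b) {
		sum += d * d
	}
	return math.Sqrt(sum)
}

// manhattan is the sum of absolute differences.
func manhattan(a, b []float64) float64 {
	sum := 0.0
	for _, d := range absDiffs(a, b) {
		sum += d
	}
	return sum
}

// chebyshev is the largest absolute difference.
func chebyshev(a, b []float64) float64 {
	largest := 0.0
	for _, d := range absDiffs(a, b) {
		if d > largest {
			largest = d
		}
	}
	return largest
}

func main() {
	a := []float64{1, 2, 3}
	b := []float64{4, 6, 3}
	fmt.Printf("euclidean = %.4f\n", euclidean(a, b))
	fmt.Printf("manhattan = %.4f\n", manhattan(a, b))
	fmt.Printf("chebyshev = %.4f\n", chebyshev(a, b))
}
```

For the two sample vectors the differences are 3, 4, and 0, so the sketch prints 5.0000, 7.0000, and 4.0000.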
$ scv -t string -a simpson distance similarity
simpson(distance, similarity) = 0.5000
$ scv -t string -a jaccard,dice distance similarity
jaccard(distance, similarity) = 0.3333
dice(distance, similarity) = 0.5000
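The values above can be reproduced by hand. The sketch below assumes that the string input type treats each string as a set of distinct characters; this assumption is consistent with the sample output but is not confirmed by the scv source, and the function names are hypothetical.

```go
package main

import "fmt"

// runeSet collects the distinct characters of s.
func runeSet(s string) map[rune]struct{} {
	set := make(map[rune]struct{})
	for _, r := range s {
		set[r] = struct{}{}
	}
	return set
}

// intersectionSize counts the characters present in both sets.
func intersectionSize(a, b map[rune]struct{}) int {
	n := 0
	for r := range a {
		if _, ok := b[r]; ok {
			n++
		}
	}
	return n
}

// simpson is the overlap coefficient: |A ∩ B| / min(|A|, |B|).
func simpson(x, y string) float64 {
	a, b := runeSet(x), runeSet(y)
	smaller := len(a)
	if len(b) < smaller {
		smaller = len(b)
	}
	if smaller == 0 {
		return 0
	}
	return float64(intersectionSize(a, b)) / float64(smaller)
}

// jaccard is |A ∩ B| / |A ∪ B|.
func jaccard(x, y string) float64 {
	a, b := runeSet(x), runeSet(y)
	inter := intersectionSize(a, b)
	union := len(a) + len(b) - inter
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

// dice is 2|A ∩ B| / (|A| + |B|).
func dice(x, y string) float64 {
	a, b := runeSet(x), runeSet(y)
	total := len(a) + len(b)
	if total == 0 {
		return 0
	}
	return 2 * float64(intersectionSize(a, b)) / float64(total)
}

func main() {
	x, y := "distance", "similarity"
	fmt.Printf("simpson(%s, %s) = %.4f\n", x, y, simpson(x, y))
	fmt.Printf("jaccard(%s, %s) = %.4f\n", x, y, jaccard(x, y))
	fmt.Printf("dice(%s, %s) = %.4f\n", x, y, dice(x, y))
}
```

Both strings have 8 distinct characters and share 4 of them (a, i, s, t), which gives 4/8 for simpson, 4/12 for jaccard, and 8/16 for dice.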
docker run -it ghcr.io/tamada/scv:latest gives some strings for comparing
If scv reads files, the -v option of docker run should be specified so that the container can access them.
docker run -v $PWD:/home/scv -it ghcr.io/tamada/scv:latest -f json testdata/*.json
1.0.0, latest
Simply type the following commands.
brew tap tamada/brew
brew install scv
go get github.com/tamada/scv
git clone https://github.com/tamada/scv
cd scv
make
- Haruaki Tamada (tamada)
This image is obtained from iconscount.com.