A Java library that implements several algorithms that calculate similarity between strings.

rrice/java-string-similarity

License: MIT

java-string-similarity is a library that calculates a normalized distance or similarity score between two strings. A score of 0.0 means that the two strings are absolutely dissimilar, and 1.0 means that they are absolutely similar (or equal). Anything in between indicates how similar the two strings are.
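To make the 0.0 to 1.0 range concrete, here is a minimal, standalone sketch of one common way to normalize a score: divide an edit distance by the length of the longer string and subtract the result from 1. The class `NormalizedScoreSketch` and its methods are hypothetical names used for illustration only. They are not part of this library's API, and the library's own strategies may normalize differently.

```java
// Hypothetical illustration, not part of java-string-similarity.
final class NormalizedScoreSketch {

    // Classic Levenshtein edit distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Maps the distance onto [0.0, 1.0]: 1.0 for equal strings,
    // 0.0 when every character position has to change.
    static double score(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0; // two empty strings are treated as equal
        }
        return 1.0 - (double) levenshtein(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(score("kitten", "kitten"));
        System.out.println(score("kitten", "sitting"));
        System.out.println(score("abc", "xyz"));
    }
}
```

With this normalization, equal strings score 1.0 and strings with no characters in common at any position score 0.0, which matches the range described above.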

Example

In this simple example, we want to calculate a similarity score between the words McDonalds and MacMahons, using the Jaro-Winkler distance algorithm.

SimilarityStrategy strategy = new JaroWinklerStrategy();
String target = "McDonalds";
String source = "MacMahons";
StringSimilarityService service = new StringSimilarityServiceImpl(strategy);
double score = service.score(source, target); // Score is 0.90
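For readers curious about what a Jaro-Winkler score measures, the following is a standalone sketch of the textbook algorithm. It assumes the commonly used prefix scale of 0.1 and a maximum prefix length of 4. The class `JaroWinklerSketch` is a hypothetical name for illustration; it is not the library's `JaroWinklerStrategy`, whose implementation details and exact scores may differ.

```java
// Hypothetical illustration of textbook Jaro-Winkler, independent of the library.
final class JaroWinklerSketch {

    // Jaro similarity: based on matching characters within a sliding window
    // and on how many of those matches appear in a different order.
    static double jaro(String a, String b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Walk both strings' matched characters in order and count mismatches;
        // each transposition contributes two out-of-order characters.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) {
                continue;
            }
            while (!bMatched[k]) {
                k++;
            }
            if (a.charAt(i) != b.charAt(k)) {
                outOfOrder++;
            }
            k++;
        }
        double m = matches;
        double t = outOfOrder / 2.0;
        return (m / a.length() + m / b.length() + (m - t) / m) / 3.0;
    }

    // Winkler adjustment: boosts the Jaro score for a shared prefix
    // (assumed scale 0.1, prefix capped at 4 characters).
    static double similarity(String a, String b) {
        double jaro = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    public static void main(String[] args) {
        System.out.println(similarity("MARTHA", "MARHTA")); // approximately 0.961
    }
}
```

The prefix boost is why Jaro-Winkler tends to favor strings that start the same way, which often suits names and short identifiers.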

Algorithms

Installation

This project currently uses Maven for build management. You can compile, test and install the component to your local Maven repository by running:

mvn install

Then, you can add this component to your project by adding a dependency:

<dependency>
    <groupId>net.ricecode</groupId>
    <artifactId>string-similarity</artifactId>
    <version>1.0.0</version>
</dependency>

TODO
