Merge pull request #200 from Neeraj-Ghodla/feat/add-support-for-fuzzy…
…-search

feat: add support for fuzzy matching in full-text search
guyroyse authored Oct 30, 2023
2 parents 9872e6d + ac9af69 commit 26b9bbc
Showing 4 changed files with 45 additions and 7 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -847,7 +847,7 @@ I'm not going to include all the examples again. Just go check out the section o

#### Full-Text Search

If you've defined a field with a type of `text` in your schema, you can store text in it and perform full-text searches against it. Full-text search is different from how a `string` is searched. With full-text search, you can look for words, partial words, and exact phrases within a body of text.
If you've defined a field with a type of `text` in your schema, you can store text in it and perform full-text searches against it. Full-text search is different from how a `string` is searched. With full-text search, you can look for words, partial words, fuzzy matches, and exact phrases within a body of text.

Full-text search is optimized for human-readable text and it's pretty clever. It understands that certain words (like *a*, *an*, or *the*) are common and ignores them. It understands how words relate to each other and so if you search for *give*, it matches *gives*, *given*, *giving*, and *gave* too. It ignores punctuation and whitespace.

@@ -859,6 +859,9 @@ let albums
// finds all albums where the title contains the word 'butterfly'
albums = await albumRepository.search().where('title').match('butterfly').return.all()

// uses fuzzy matching to find all albums where the title contains a word within a Levenshtein distance of 3 from 'buterfly'
albums = await albumRepository.search().where('title').match('buterfly', { fuzzyMatching: true, levenshteinDistance: 3 }).return.all()

// finds all albums where the title contains the words 'beautiful' and 'children'
albums = await albumRepository.search().where('title').match('beautiful children').return.all()

@@ -873,11 +876,12 @@ If you want to search for a part of a word. To do it, just tack a `*` on the beg
albums = await albumRepository.search().where('title').match('*right*').return.all()
```
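
If Levenshtein distance is new to you, it's the number of single-character insertions, deletions, or substitutions needed to turn one word into another. RediSearch works this out for you when you fuzzy match. The `levenshtein` function below is only an illustrative sketch (it is not part of Redis OM or RediSearch) showing why *buterfly* is within reach of *butterfly*:

```typescript
// Illustrative only: a hypothetical helper, not part of Redis OM or RediSearch.
function levenshtein(a: string, b: string): number {
  // previous[j] is the distance between the prefix of `a` handled so far
  // and the first j characters of `b`. It starts as the empty-prefix row.
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j)

  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i] // turning i characters into '' takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const substitutionCost = a[i - 1] === b[j - 1] ? 0 : 1
      current[j] = Math.min(
        previous[j] + 1,                   // delete a character from `a`
        current[j - 1] + 1,                // insert a character from `b`
        previous[j - 1] + substitutionCost // substitute (or keep) a character
      )
    }
    previous = current
  }

  return previous[b.length]
}

console.log(levenshtein('buterfly', 'butterfly')) // 1 (one missing 't')
```

A distance of 1 would already be enough to find *butterfly* from *buterfly*; larger distances cast a wider net and can pull in less related words.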

Do not combine partial-word searches with exact matches. Partial-word searches and exact matches are not compatible in RediSearch. If you try to exactly match a partial-word search, you'll get an error.
Do not combine partial-word searches or fuzzy matches with exact matches. Neither is compatible with exact matching in RediSearch. If you try to exactly match a partial-word search or fuzzy match a partial-word search, you'll get an error.

```javascript
// THIS WILL ERROR
// THESE WILL ERROR
albums = await albumRepository.search().where('title').matchExact('beautiful sto*').return.all()
albums = await albumRepository.search().where('title').match('*buterfly', { fuzzyMatching: true, levenshteinDistance: 3 }).return.all()
```
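
If you'd rather catch the fuzzy-plus-wildcard combination before the query ever reaches RediSearch, a small guard in your own code is one option. This is a sketch under assumptions: `assertFuzzyCompatible` is a hypothetical helper, not something Redis OM provides, and it only looks for a leading or trailing `*`.

```typescript
// Hypothetical helper, not part of Redis OM: fail fast in application code
// instead of waiting for RediSearch to reject the query.
function assertFuzzyCompatible(value: string, fuzzyMatching: boolean): void {
  // A partial-word search is marked by a `*` at the start or end of the value.
  const hasWildcard = value.startsWith('*') || value.endsWith('*')
  if (fuzzyMatching && hasWildcard) {
    throw new Error(`Cannot fuzzy match the partial-word search '${value}'`)
  }
}

assertFuzzyCompatible('buterfly', true) // fine: fuzzy match on a whole word
assertFuzzyCompatible('*right*', false) // fine: partial-word search without fuzzy matching
```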

As always, there are several alternatives to make this a bit more fluent and, of course, negation is available:
8 changes: 6 additions & 2 deletions lib/search/where-field.ts
@@ -52,16 +52,20 @@ export interface WhereField extends Where {
/**
* Adds a full-text search comparison to the query.
* @param value The word or phrase sought.
* @param options.fuzzyMatching Whether to use fuzzy matching to find the sought word or phrase. Defaults to `false`.
* @param options.levenshteinDistance The Levenshtein distance to use for fuzzy matching. Supported values are `1`, `2`, and `3`. Defaults to `1`.
* @returns The {@link Search} that was called to create this {@link WhereField}.
*/
match(value: string | number | boolean): Search
match(value: string | number | boolean, options?: { fuzzyMatching?: boolean; levenshteinDistance?: 1 | 2 | 3 }): Search

/**
* Adds a full-text search comparison to the query.
* @param value The word or phrase sought.
* @param options.fuzzyMatching Whether to use fuzzy matching to find the sought word or phrase. Defaults to `false`.
* @param options.levenshteinDistance The Levenshtein distance to use for fuzzy matching. Supported values are `1`, `2`, and `3`. Defaults to `1`.
* @returns The {@link Search} that was called to create this {@link WhereField}.
*/
matches(value: string | number | boolean): Search
matches(value: string | number | boolean, options?: { fuzzyMatching?: boolean; levenshteinDistance?: 1 | 2 | 3 }): Search

/**
* Adds a full-text search comparison to the query that matches an exact word or phrase.
22 changes: 20 additions & 2 deletions lib/search/where-text.ts
@@ -6,9 +6,19 @@ import { SemanticSearchError } from "../error"
export class WhereText extends WhereField {
private value!: string
private exactValue = false
private fuzzyMatching!: boolean
private levenshteinDistance!: number

match(value: string | number | boolean): Search {
match(
value: string | number | boolean,
options: { fuzzyMatching?: boolean; levenshteinDistance?: 1 | 2 | 3 } = {
fuzzyMatching: false,
levenshteinDistance: 1,
}
): Search {
this.value = value.toString()
this.fuzzyMatching = options.fuzzyMatching ?? false
this.levenshteinDistance = options.levenshteinDistance ?? 1
return this.search
}

@@ -17,7 +27,13 @@ export class WhereText extends WhereField {
return this.search
}

matches(value: string | number | boolean): Search { return this.match(value) }
matches(
value: string | number | boolean,
options: { fuzzyMatching?: boolean; levenshteinDistance?: 1 | 2 | 3 } = {
fuzzyMatching: false,
levenshteinDistance: 1,
}
): Search { return this.match(value, options) }
matchExactly(value: string | number | boolean): Search { return this.matchExact(value) }
matchesExactly(value: string | number | boolean): Search { return this.matchExact(value) }

@@ -40,6 +56,8 @@ export class WhereText extends WhereField {

if (this.exactValue) {
return this.buildQuery(`"${escapedValue}"`)
} else if (this.fuzzyMatching) {
return this.buildQuery(`${"%".repeat(this.levenshteinDistance)}${escapedValue}${"%".repeat(this.levenshteinDistance)}`)
} else {
return this.buildQuery(`'${escapedValue}'`)
}
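
To see what these three branches produce, here is a minimal standalone sketch that mirrors their order. The names `buildTextFragment` and `TextMatchOptions` are hypothetical, and the escaping and `@field:` wrapping that `WhereText` does elsewhere are left out, so treat it as an illustration rather than the library's implementation.

```typescript
// Hypothetical standalone sketch of the branch order in the hunk above.
// Escaping and the surrounding `(@field:...)` wrapper are intentionally omitted.
type TextMatchOptions = {
  exact?: boolean
  fuzzyMatching?: boolean
  levenshteinDistance?: 1 | 2 | 3
}

function buildTextFragment(escapedValue: string, options: TextMatchOptions = {}): string {
  const { exact = false, fuzzyMatching = false, levenshteinDistance = 1 } = options

  // Exact matching is checked first, so it wins if both flags are set.
  if (exact) return `"${escapedValue}"`

  if (fuzzyMatching) {
    // One `%` on each side per unit of Levenshtein distance.
    const marks = '%'.repeat(levenshteinDistance)
    return `${marks}${escapedValue}${marks}`
  }

  return `'${escapedValue}'`
}

console.log(buildTextFragment('buterfly', { fuzzyMatching: true, levenshteinDistance: 3 }))
// %%%buterfly%%%
```

With the default distance of `1`, the same call yields `%buterfly%`, which is the shape the unit tests in this commit expect.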
12 changes: 12 additions & 0 deletions spec/unit/search/search-by-text.spec.ts
@@ -19,32 +19,38 @@ describe("Search", () => {
const A_NEGATED_TEXT_QUERY = `(-@someText:'${A_STRING}')`
const AN_EXACT_TEXT_QUERY = `(@someText:"${A_STRING}")`
const A_NEGATED_EXACT_TEXT_QUERY = `(-@someText:"${A_STRING}")`
const A_TEXT_QUERY_WITH_FUZZY_MATCHING = `(@someText:%${A_STRING}%)`

const A_NUMBER_TEXT_QUERY = `(@someText:'${A_NUMBER}')`
const A_NEGATED_NUMBER_TEXT_QUERY = `(-@someText:'${A_NUMBER}')`
const A_NUMBER_EXACT_TEXT_QUERY = `(@someText:"${A_NUMBER}")`
const A_NEGATED_NUMBER_EXACT_TEXT_QUERY = `(-@someText:"${A_NUMBER}")`
const A_NUMBER_TEXT_QUERY_WITH_FUZZY_MATCHING = `(@someText:%${A_NUMBER}%)`

const A_BOOLEAN_TEXT_QUERY = `(@someText:'true')`
const A_NEGATED_BOOLEAN_TEXT_QUERY = `(-@someText:'true')`
const A_BOOLEAN_EXACT_TEXT_QUERY = `(@someText:"true")`
const A_NEGATED_BOOLEAN_EXACT_TEXT_QUERY = `(-@someText:"true")`
const A_BOOLEAN_TEXT_QUERY_WITH_FUZZY_MATCHING = `(@someText:%true%)`

type StringChecker = (search: Search) => void
const expectToBeTextQuery: StringChecker = search => expect(search.query).toBe(A_TEXT_QUERY)
const expectToBeNegatedTextQuery: StringChecker = search => expect(search.query).toBe(A_NEGATED_TEXT_QUERY)
const expectToBeExactTextQuery: StringChecker = search => expect(search.query).toBe(AN_EXACT_TEXT_QUERY)
const expectToBeNegatedExactTextQuery: StringChecker = search => expect(search.query).toBe(A_NEGATED_EXACT_TEXT_QUERY)
const expectToBeTextQueryWithFuzzyMatching: StringChecker = search => expect(search.query).toBe(A_TEXT_QUERY_WITH_FUZZY_MATCHING)

const expectToBeNumberTextQuery: StringChecker = search => expect(search.query).toBe(A_NUMBER_TEXT_QUERY)
const expectToBeNegatedNumberTextQuery: StringChecker = search => expect(search.query).toBe(A_NEGATED_NUMBER_TEXT_QUERY)
const expectToBeNumberExactTextQuery: StringChecker = search => expect(search.query).toBe(A_NUMBER_EXACT_TEXT_QUERY)
const expectToBeNegatedNumberExactTextQuery: StringChecker = search => expect(search.query).toBe(A_NEGATED_NUMBER_EXACT_TEXT_QUERY)
const expectToBeNumberTextQueryWithFuzzyMatching: StringChecker = search => expect(search.query).toBe(A_NUMBER_TEXT_QUERY_WITH_FUZZY_MATCHING)

const expectToBeBooleanTextQuery: StringChecker = search => expect(search.query).toBe(A_BOOLEAN_TEXT_QUERY)
const expectToBeNegatedBooleanTextQuery: StringChecker = search => expect(search.query).toBe(A_NEGATED_BOOLEAN_TEXT_QUERY)
const expectToBeBooleanExactTextQuery: StringChecker = search => expect(search.query).toBe(A_BOOLEAN_EXACT_TEXT_QUERY)
const expectToBeNegatedBooleanExactTextQuery: StringChecker = search => expect(search.query).toBe(A_NEGATED_BOOLEAN_EXACT_TEXT_QUERY)
const expectToBeBooleanTextQueryWithFuzzyMatching: StringChecker = search => expect(search.query).toBe(A_BOOLEAN_TEXT_QUERY_WITH_FUZZY_MATCHING)

beforeAll(() => {
client = new Client()
@@ -57,8 +63,10 @@ describe("Search", () => {

describe("when generating for a query with a string", () => {
it("generates a query with .match", () => expectToBeTextQuery(where.match(A_STRING)))
it("generates a query with .match with fuzzyMatching enabled", () => expectToBeTextQueryWithFuzzyMatching(where.match(A_STRING, { fuzzyMatching: true })))
it("generates a query with .not.match", () => expectToBeNegatedTextQuery(where.not.match(A_STRING)))
it("generates a query with .matches", () => expectToBeTextQuery(where.matches(A_STRING)))
it("generates a query with .matches with fuzzyMatching enabled", () => expectToBeTextQueryWithFuzzyMatching(where.matches(A_STRING, { fuzzyMatching: true })))
it("generates a query with .does.match", () => expectToBeTextQuery(where.does.match(A_STRING)))
it("generates a query with .does.not.match", () => expectToBeNegatedTextQuery(where.does.not.match(A_STRING)))

@@ -77,8 +85,10 @@ describe("Search", () => {

describe("when generating a query with a number as a string", () => {
it("generates a query with .match", () => expectToBeNumberTextQuery(where.match(A_NUMBER)))
it("generates a query with .match with fuzzyMatching enabled", () => expectToBeNumberTextQueryWithFuzzyMatching(where.match(A_NUMBER, { fuzzyMatching: true })))
it("generates a query with .not.match", () => expectToBeNegatedNumberTextQuery(where.not.match(A_NUMBER)))
it("generates a query with .matches", () => expectToBeNumberTextQuery(where.matches(A_NUMBER)))
it("generates a query with .matches with fuzzyMatching enabled", () => expectToBeNumberTextQueryWithFuzzyMatching(where.matches(A_NUMBER, { fuzzyMatching: true })))
it("generates a query with .does.match", () => expectToBeNumberTextQuery(where.does.match(A_NUMBER)))
it("generates a query with .does.not.match", () => expectToBeNegatedNumberTextQuery(where.does.not.match(A_NUMBER)))

@@ -97,8 +107,10 @@ describe("Search", () => {

describe("when generating a query with a boolean as a string", () => {
it("generates a query with .match", () => expectToBeBooleanTextQuery(where.match(true)))
it("generates a query with .match with fuzzyMatching enabled", () => expectToBeBooleanTextQueryWithFuzzyMatching(where.match(true, { fuzzyMatching: true })))
it("generates a query with .not.match", () => expectToBeNegatedBooleanTextQuery(where.not.match(true)))
it("generates a query with .matches", () => expectToBeBooleanTextQuery(where.matches(true)))
it("generates a query with .matches with fuzzyMatching enabled", () => expectToBeBooleanTextQueryWithFuzzyMatching(where.matches(true, { fuzzyMatching: true })))
it("generates a query with .does.match", () => expectToBeBooleanTextQuery(where.does.match(true)))
it("generates a query with .does.not.match", () => expectToBeNegatedBooleanTextQuery(where.does.not.match(true)))
