-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Backport 2.x] phone-search
analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input
#17017
[Backport 2.x] phone-search
analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input
#17017
Conversation
phone-search
analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input - 2.x cherry-pickphone-search
analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input
why is the DCO failing here? the sign-off is present in the commit-msg 🤔 |
Yeah, I can clearly see it, I think we could ignore it for now |
…ension & unformatted input (opensearch-project#16993) * `phone-search` analyzer: don't emit int'l prefix this was an oversight in the initial implementation: if the tokenizer emits the international calling prefix in the search analyzer then all documents with the same international calling prefix will match. e.g. when searching for `+1-555-123-4567` not only documents with this number would match but also any other document with a `1` token (i.e. any other number with this prefix). thus the search functionality is currently broken for this analyzer, making it useless. the test coverage has now been extended to cover these and other use-cases. Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com> * `phone-search` analyzer: don't emit extension & unformatted input if these tokens are emitted it meant that phone numbers with other international dialling prefixes still matched. e.g. searching for `+1 1234` would also match a number stored as `+2 1234`, which was wrong. the tokens still need to be emited for the `phone` analyzer, e.g. when the user only enters the extension / local number it should still match, the same is with the other ngrams: these are needed for search-as-you-type style queries where the user input needs to match against partial phone numbers. Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com> * `phone-search` analyzer: don't emit sip/tel prefix in line with the previous two commits, this is something else the search analyzer shouldn't emit since otherwise searching for any number with such a prefix will match _any_ document with the same prefix. Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com> --------- Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>
9f8c241
to
cb2c36b
Compare
i think i found it: since i cherry-picked the merge commit it had the weird noreply.github.com email address as the author so it didn't match the sign-off. since i anyway had to push again (spotless...) i fixed the author as well, so now it should be good |
❕ Gradle check result for cb2c36b: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## 2.x #17017 +/- ##
============================================
- Coverage 71.91% 71.79% -0.12%
+ Complexity 65577 65520 -57
============================================
Files 5318 5318
Lines 305777 305779 +2
Branches 44581 44583 +2
============================================
- Hits 219893 219530 -363
- Misses 67515 67943 +428
+ Partials 18369 18306 -63 ☔ View full report in Codecov by Sentry. |
Description
cherry-pick of #16993 with #17016 squashed into it.
CC @reta
Related Issues
n/a
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.