add support for scored named queries #11626

Dharin-shah
Contributor

@Dharin-shah Dharin-shah commented Dec 17, 2023

Description

OpenSearch already supports naming queries: each hit in the response lists the names of the queries it matched. However, one use case in hybrid search, which combines a text query with a dense-vector query, is to determine the individual score contributed by each query type. At GetYourGuide, this is very useful for further analysis and for building offline models that generate better weights for the ranking score. This PR therefore adds a feature that lets the client request the score for each matched query.
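
As a rough illustration of the use case described above, the following Python sketch builds a hybrid-style request with two named clauses and reads per-query scores back from a hit. It rests on several assumptions that this description does not state: that the feature is switched on by a request parameter called `include_named_queries_score`, and that `matched_queries` then comes back as a map from query name to score rather than a list of names. The field names (`title`, `title_vector`), the k-NN clause shape, the query names, and the helper functions are all hypothetical.

```python
from typing import Any, Dict, List

# Assumed name of the request parameter that enables per-query scores;
# it is not taken from the PR description.
SCORE_PARAM = "include_named_queries_score"


def build_hybrid_query(text: str, vector: List[float]) -> Dict[str, Any]:
    """Build a bool query whose clauses are labelled with `_name`."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Lexical clause; `title` is a hypothetical field.
                    {"match": {"title": {"query": text, "_name": "text_match"}}},
                    # Vector clause in k-NN plugin style; the field name and
                    # the placement of `_name` here are assumptions.
                    {
                        "knn": {
                            "title_vector": {
                                "vector": vector,
                                "k": 10,
                                "_name": "vector_match",
                            }
                        }
                    },
                ]
            }
        }
    }


def named_scores(hit: Dict[str, Any]) -> Dict[str, float]:
    """Return {query_name: score} for one hit.

    If `matched_queries` is a plain list of names (no scores requested),
    there is nothing to extract and an empty dict is returned.
    """
    matched = hit.get("matched_queries", {})
    if isinstance(matched, dict):
        return {name: float(score) for name, score in matched.items()}
    return {}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-query scores with offline-learned weights (missing weight = 0)."""
    return sum(weights.get(name, 0.0) * score for name, score in scores.items())
```

A client would send the body from `build_hybrid_query` with `SCORE_PARAM` set to true, collect `named_scores` for each hit, and feed those values into whatever offline model produces the weights passed to `weighted_score`.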

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>

❌ Gradle check result for 16b7c19: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


❌ Gradle check result for 3aef42e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


github-actions bot commented Dec 17, 2023

Compatibility status:

Checks if related components are compatible with change 9920c36

Incompatible components

Incompatible components: [https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/performance-analyzer.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/security.git]

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
@Dharin-shah Dharin-shah requested a review from reta February 2, 2024 19:01
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>

github-actions bot commented Feb 3, 2024

❕ Gradle check result for ffdf852: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing
      1 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@Dharin-shah
Contributor Author

The flaky test failures are already tracked in #5329 and #9191.

@msfroh, could you take a look and merge if all looks good? Thanks 🙏

Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>

github-actions bot commented Feb 5, 2024

❌ Gradle check result for 2937d8e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>

github-actions bot commented Feb 6, 2024

❕ Gradle check result for 9920c36: UNSTABLE

  • TEST FAILURES:
      3 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Collaborator

@msfroh msfroh left a comment

Sorry for the delay in getting back on this -- it looks good. Thanks a lot @Dharin-shah!

@msfroh msfroh merged commit 52b27f4 into opensearch-project:main Feb 6, 2024
31 of 33 checks passed
@msfroh msfroh added the backport 2.x Backport to 2.x branch label Feb 6, 2024
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-11626-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 52b27f47bca5b3ab52cab237542f32c307d203b4
# Push it to GitHub
git push --set-upstream origin backport/backport-11626-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-11626-to-2.x.

@reta
Collaborator

reta commented Feb 6, 2024

@Dharin-shah could you please backport to 2.x manually?

Dharin-shah added a commit to Dharin-shah/OpenSearch that referenced this pull request Feb 6, 2024
Opensearch already support labelling the queries, that returns as a list in the returned results, of which query it
matched. However one of the use case while doing hybrid search with query text and dense vector is to determine
individual scores for each query type. This is very useful in further analysis and building offline model to generate
better weights for ranking score. Hence adding this feature that sends the client to add the score for each matched
query.

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
(cherry picked from commit 52b27f4)
andrross pushed a commit to andrross/OpenSearch that referenced this pull request Feb 22, 2024
Opensearch already support labelling the queries, that returns as a list in the returned results, of which query it
matched. However one of the use case while doing hybrid search with query text and dense vector is to determine
individual scores for each query type. This is very useful in further analysis and building offline model to generate
better weights for ranking score. Hence adding this feature that sends the client to add the score for each matched
query.

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
(cherry picked from commit 52b27f4)
Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross
Member

@reta @Dharin-shah I've opened the backport in #12427

reta added a commit that referenced this pull request Feb 23, 2024
* Add support for scored named queries (#11626)

Opensearch already support labelling the queries, that returns as a list in the returned results, of which query it
matched. However one of the use case while doing hybrid search with query text and dense vector is to determine
individual scores for each query type. This is very useful in further analysis and building offline model to generate
better weights for ranking score. Hence adding this feature that sends the client to add the score for each matched
query.

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
(cherry picked from commit 52b27f4)
Signed-off-by: Andrew Ross <andrross@amazon.com>

* Update version checks

Signed-off-by: Andrew Ross <andrross@amazon.com>

* Update test version guard

Co-authored-by: Andriy Redko <drreta@gmail.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Co-authored-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Co-authored-by: Andriy Redko <drreta@gmail.com>
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
Opensearch already support labelling the queries, that returns as a list in the returned results, of which query it
matched. However one of the use case while doing hybrid search with query text and dense vector is to determine 
individual scores for each query type. This is very useful in further analysis and building offline model to generate 
better weights for ranking score. Hence adding this feature that sends the client to add the score for each matched 
query.

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
Opensearch already support labelling the queries, that returns as a list in the returned results, of which query it
matched. However one of the use case while doing hybrid search with query text and dense vector is to determine 
individual scores for each query type. This is very useful in further analysis and building offline model to generate 
better weights for ranking score. Hence adding this feature that sends the client to add the score for each matched 
query.

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Opensearch already support labelling the queries, that returns as a list in the returned results, of which query it
matched. However one of the use case while doing hybrid search with query text and dense vector is to determine
individual scores for each query type. This is very useful in further analysis and building offline model to generate
better weights for ranking score. Hence adding this feature that sends the client to add the score for each matched
query.

---------

Signed-off-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Dharin Shah <Dharin-shah@users.noreply.github.com>
Co-authored-by: Dharin Shah <8616130+Dharin-shah@users.noreply.github.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels: backport 2.x (Backport to 2.x branch), backport-failed

4 participants