Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi queries #307

Merged
merged 10 commits into from
Feb 9, 2023
Merged

Multi queries #307

merged 10 commits into from
Feb 9, 2023

Conversation

pandu-k
Copy link
Collaborator

@pandu-k pandu-k commented Feb 7, 2023

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    Feature

  • What is the current behavior? (You can also link to an open issue here)
    No way for weighted multi query searches

  • What is the new behavior (if this is a feature change)?
    Weighted multi query searches.

  • Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
    No

  • Have unit tests been run against this PR? (Has there also been any additional testing?)
    Yes

  • Related Python client/documentation etc changes (link commit/PR here)
    Client: Multi query support py-marqo#54
    API tests: API test for multi query marqo-api-tests#9
    Documentation: https://gh-previews.marqo.pages.dev/0.0.13/API-Reference/search/#query-q

  • Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes/features)
  • Docs have been added / updated (for bug fixes / features)

@@ -45,7 +45,7 @@ def vectorise(model_name: str, content: Union[str, List[str]], model_properties:
try:
vectorised = available_models[model_cache_key].encode(content, normalize=normalize_embeddings, **kwargs)
except UnidentifiedImageError as e:
raise VectoriseError from e
raise VectoriseError(str(e)) from e
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds the UnidentifiedImageError message to e

raise errors.InvalidArgError("Syntax error, could not parse filter string") from e
raise e
raise errors.BackendCommunicationError(f"Error communicating with Marqo-OS backend:\n{response}")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We pass through the Marqo-os error as fallback, if not yet handled

@pandu-k pandu-k temporarily deployed to marqo-test-suite February 7, 2023 12:14 — with GitHub Actions Inactive
@pandu-k pandu-k marked this pull request as ready for review February 7, 2023 12:51
@pandu-k pandu-k requested review from jn2clark and vicilliar February 7, 2023 12:51
@@ -6,7 +6,9 @@
import os
import urllib
from tqdm import tqdm
from src.marqo.s2_inference.configs import ModelCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mentioned this bug to li, i think he might have made the same change. looks good.

ordered_queries = None

if isinstance(query, str):
to_be_vectorised = [query, ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this just be to_be_vectorised = [query]?

ordered_queries = list(query.items())
if index_info.index_settings[NsField.index_defaults][NsField.treat_urls_and_pointers_as_images]:
text_queries = [k for k, _ in ordered_queries if _is_image(k)]
image_queries = [k for k, _ in ordered_queries if not _is_image(k)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could these be extracted in 1 loop to make it faster? like

for k, _ in ordered_queries:
   if _is_image(k):
      text_queries.append(k)
   else:
      image_queries.append(k)

@pandu-k pandu-k merged commit d6584bf into mainline Feb 9, 2023
@pandu-k pandu-k deleted the multi_queries branch February 9, 2023 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants