Fix NPE when LeafReader return null VectorValues #13162
The change overall is good. We shouldn't accidentally throw NPEs and we should protect users from querying with incorrect vector types.
Could you add a CHANGES.txt entry? It belongs under Bug Fixes for Lucene 9.11.
// The field does not exist or does not index vectors
FloatVectorValues floatVectorValues = getFloatVectorValues(fi.name);
if (floatVectorValues == null) {
  FloatVectorValues.checkField(this, field);
The leaf reader here shouldn't throw. Especially since the companion method that accepts a KnnCollector doesn't.
// The field does not exist or does not index vectors
ByteVectorValues byteVectorValues = getByteVectorValues(fi.name);
if (byteVectorValues == null) {
  ByteVectorValues.checkField(this, field);
The leaf reader here shouldn't throw. Especially since the companion method that accepts a KnnCollector doesn't.
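To illustrate the behavior requested in the two comments above, here is a minimal sketch of a leaf-level lookup that treats a null result as "nothing to search" rather than throwing. It deliberately uses hypothetical stand-in types (`LeafSearchSketch`, `getFloatVectors`, `candidateCount`) instead of Lucene classes, so it only mirrors the shape of the null guard and is not the code from this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not a Lucene class.
class LeafSearchSketch {
  // Field name -> indexed float vectors. A missing key models a null VectorValues.
  private final Map<String, float[][]> vectorsByField = new HashMap<>();

  void index(String field, float[][] vectors) {
    vectorsByField.put(field, vectors);
  }

  // Returns null when the field does not exist or does not index float vectors.
  float[][] getFloatVectors(String field) {
    return vectorsByField.get(field);
  }

  // Number of candidates a search for the top k would consider.
  // A null lookup yields 0 instead of an exception or an NPE.
  int candidateCount(String field, int k) {
    float[][] values = getFloatVectors(field);
    if (values == null) {
      return 0; // nothing to search in this leaf
    }
    return Math.min(k, values.length);
  }

  public static void main(String[] args) {
    LeafSearchSketch leaf = new LeafSearchSketch();
    leaf.index("field", new float[][] {{0, 1}, {1, 2}, {0, 0}});
    System.out.println(leaf.candidateCount("field", 10));
    System.out.println(leaf.candidateCount("missing", 10));
  }
}
```

The guard lives in the leaf-level method itself, so every caller gets the same non-throwing behavior without repeating the null check.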
if (vectorValues == null) {
  return DoubleValues.EMPTY;
}
Why do you throw in the byte similarity source, but not here? We need to be consistent. I think throwing here is acceptable as well (via FloatVectorValues.checkField).
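As a rough sketch of the consistency being asked for, the check below throws only when the field exists with a different vector encoding, and stays silent when the field has no vectors at all. The class, enum, and exception type here are assumptions made for illustration; they are hypothetical stand-ins and do not reproduce the actual `checkField` implementation in Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not a Lucene class.
class EncodingCheckSketch {
  enum Encoding { BYTE, FLOAT32 }

  // Field name -> encoding the field was indexed with.
  private final Map<String, Encoding> encodingByField = new HashMap<>();

  void addField(String field, Encoding encoding) {
    encodingByField.put(field, encoding);
  }

  // Intended to run after a typed lookup returned null.
  // Throws only for an encoding mismatch; a field without vectors is not an error.
  // The exception type is an assumption of this sketch.
  void checkField(String field, Encoding expected) {
    Encoding actual = encodingByField.get(field);
    if (actual != null && actual != expected) {
      throw new IllegalStateException(
          "Unexpected vector encoding (" + actual + ") for field " + field
              + " (expected=" + expected + ")");
    }
  }

  public static void main(String[] args) {
    EncodingCheckSketch fields = new EncodingCheckSketch();
    fields.addField("field", Encoding.BYTE);
    fields.checkField("missing", Encoding.FLOAT32); // no vectors: passes silently
    try {
      fields.checkField("field", Encoding.FLOAT32); // mismatch: throws
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Applying the same check in both the byte and float code paths would give callers one predictable outcome for a mismatched query type.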
Nice catch. I will fix it.
🎉 🎉 🎉 Thanks for the contribution! 🎉 🎉 🎉
I will merge and backport in a little while
`LeafReader#getXXXVectorValues` may return a null value.

**Reproduction**:

```
public class TestKnnByteVectorQuery extends BaseKnnVectorQueryTestCase {
  public void testVectorEncodingMismatch() throws IOException {
    try (Directory indexStore =
            getIndexStore("field", new float[] {0, 1}, new float[] {1, 2}, new float[] {0, 0});
        IndexReader reader = DirectoryReader.open(indexStore)) {
      AbstractKnnVectorQuery query = new KnnFloatVectorQuery("field", new float[] {0, 1}, 10);
      IndexSearcher searcher = newSearcher(reader);
      searcher.search(query, 10);
    }
  }
}
```

**Output**:

```
java.lang.NullPointerException: Cannot invoke "org.apache.lucene.index.FloatVectorValues.size()" because the return value of "org.apache.lucene.index.LeafReader.getFloatVectorValues(String)" is null
```
Related to: #13162 Since this is unreleased, no changelog entry is necessary.
FWIW: This commit seems to have duplicated These should probably be refactored to eliminate the duplication?