Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing bug to prevent NullPointerException while doing PUT mappings #2582

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

oaganesh
Copy link

@oaganesh oaganesh commented Mar 5, 2025

Description

Following code prevents the state of a NullPointerException from occurring while doing PUT mappings. The expected behavior is to block the update by telling the user that we cannot update the mapping.

Related Issues

Resolves #2556

Check List

  • [ ✔️ ] New functionality includes testing.
  • [ ✔️ ] New functionality has been documented.
  • [ ✔️ ] API changes companion pull request created.
  • [ ✔️ ] Commits are signed per the DCO using --signoff.
  • [ ✔️ ] Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Input:

Create index with below mappings:

curl --request PUT \
  --url http://localhost:9200/my-knn-index-1/ \
  --header 'Content-Type: application/json' \
  --data '{
	"settings": {
		"index": {
			"number_of_shards": 1,
			"number_of_replicas": 0,
			"refresh_interval": "1s",
			"knn": true
		}
	},
	"mappings": {
		"properties": {
			"my_vector2": {
				"type": "knn_vector",
				"dimension": 4,
				"method": {
					"engine": "faiss",
					"name": "hnsw"
				}
			}
		}
	}
}'

Try doing put mappings like this:

curl --request PUT \
  --url http://localhost:9200/my-knn-index-1/_mappings \
  --header 'Content-Type: application/json' \
  --data '{
	"properties": {
		"my_vector2": {
			"type": "knn_vector",
			"dimension": 4
		}
	}
}'

Changes

Previous Error Output:

{
	"error": {
		"root_cause": [
			{
				"type": "null_pointer_exception",
				"reason": "Cannot invoke \"org.opensearch.knn.index.engine.KNNMethodContext.getMethodComponentContext()\" because \"m\" is null"
			}
		],
		"type": "null_pointer_exception",
		"reason": "Cannot invoke \"org.opensearch.knn.index.engine.KNNMethodContext.getMethodComponentContext()\" because \"m\" is null"
	},
	"status": 500
}

New Error Output:

{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Mapping update for knn_vector fields is not supported. Cannot update mapping without the original method configuration."
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "Mapping update for knn_vector fields is not supported. Cannot update mapping without the original method configuration."
    },
    "status": 400
}

oaganesh added 2 commits March 5, 2025 10:28
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
@navneet1v
Copy link
Collaborator

The error message could be better. So the way to think about error message should be, when a user sees it he/she should able to understand what caused this error. With this error message I am not seeing that. adding @jmazanec15 , @shatejas to see how we can improve this message

Comment on lines 164 to 172
}), m -> {
if (m == null) {
throw new IllegalArgumentException(
"Mapping update for knn_vector fields is not supported. "
+ "Cannot update mapping without the original method configuration."
);
}
return m.getMethodComponentContext().getName();
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets also have a UT and IT for this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to follow something like this instead of throwing: https://github.com/opensearch-project/OpenSearch/blob/73882054afcdb74244c07c5be1f54a629ffd0bc2/server/src/main/java/org/opensearch/index/mapper/TextParams.java#L143C13-L143C45. This is the conflictSerializer which is used to build the exception method, so no need to throw.

If I were to redo this field mapper completely, I wouldve added the toString method to KnnMethodContext so we could just follow the default: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/mapper/ParametrizedFieldMapper.java#L199. But, thats out of scope for this change.

@markwu-sde
Copy link
Contributor

markwu-sde commented Mar 5, 2025

The error message could be better. So the way to think about error message should be, when a user sees it he/she should able to understand what caused this error. With this error message I am not seeing that. adding @jmazanec15 , @shatejas to see how we can improve this message

It seems as though when merging the mappings together the expectation is that for knn the method is either both present or absent. However I think when defining the index it can default to FAISS without specifying an engine while the underlying document content remains without a method context.

The scenario listed above would be a failure case where the current value has a method context but the new value doesn't. Interestingly enough the same can happen if a index was inserted without a method context but the mapping was updated with a method context.

With that in mind I think it would beneficial to tell the user that they need to maintain behavior of including/excluding the method when updating. Something like Cannot add or remove method key for existing KNN field [%s] could work better here?

oaganesh added 3 commits March 5, 2025 13:57
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
@jmazanec15
Copy link
Member

@markwu-sde The failure happens in the parameters conflict serializer. This is used to actually build the error message. So we are getting an NPE when building the error message. So I dont think we are allowing users to update any of these parameters, its more so how we are failing.

oaganesh added 2 commits March 6, 2025 18:08
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
Signed-off-by: Arun Ganesh <oaganesh@amazon.com>
@oaganesh oaganesh requested review from navneet1v and markwu-sde March 7, 2025 21:56
).setSerializer(
// Main serializer - handles null values properly
(b, f, v) -> {
if (v != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Is there a way for this to just serialize the null value as opposed to skipping? Like, b.field(f, null) ort something

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I dont think we'll hit this because https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/mapper/ParametrizedFieldMapper.java#L195, acceptsNull is false by default, but not sure. Were you able to repro a failure case for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] NPE while doing put mappings
4 participants