GET api does not return _ignored by default #107750

Open
salvatore-campagna opened this issue Apr 23, 2024 · 8 comments
Assignees
Labels
>bug priority:normal A label for assessing bug priority to be used by ES engineers :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@salvatore-campagna
Contributor

salvatore-campagna commented Apr 23, 2024

Elasticsearch Version

8.5.0 and above

Installed Plugins

No response

Java Version

bundled

OS Version

All

Problem Description

I discovered this issue while working on #101373. Up to and including version 8.4.3, the GET API returned the _ignored field by default. From 8.5.0 onward, _ignored is no longer returned by default; users must request it explicitly by adding _ignored to stored_fields.

Steps to Reproduce

I used the following test to reproduce the issue:

# Check Elasticsearch version
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200?pretty"

# Create a mapping suitable to store ignored fields
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X PUT "https://localhost:9200/test-index?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "age":    { "type": "integer", "ignore_malformed": true },  
      "email":  { "type": "keyword", "ignore_above": 128  }, 
      "name":   { "type": "keyword", "ignore_above": 10  }     
    }
  }
}'

# Check the mapping is ok
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200/test-index/_mapping?pretty"

# Index a document with an ignored value (`age` is expected to be numeric)
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X PUT "https://localhost:9200/test-index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
  "age": "unknown",
  "email": "bob@gmail.com",
  "name": "bob"
}'

# Verify if `_ignored` is returned
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200/test-index/_doc/1?pretty"

Testing this with version 8.4.3 and 8.5.0 reveals a difference in behaviour which was never reported as a breaking change for 8.5.0. Results are as follows for 8.4.3:

{
  "_index" : "test-index",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "_ignored" : [
    "age"
  ],
  "found" : true,
  "_source" : {
    "age" : "unknown",
    "email" : "bob@gmail.com",
    "name" : "bob"
  }
}

The results for 8.5.0 are as follows:

{
  "_index" : "test-index",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "age" : "unknown",
    "email" : "bob@gmail.com",
    "name" : "bob"
  }
}
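
The difference between the two responses above can also be checked mechanically. The following is a minimal Python sketch, not part of the original report: it hard-codes the two response bodies shown above and compares their top-level keys. The helper name `missing_keys` is hypothetical and chosen only for illustration.

```python
import json

# Response bodies copied from the 8.4.3 and 8.5.0 outputs shown above.
RESPONSE_8_4_3 = json.loads("""
{
  "_index": "test-index", "_id": "1", "_version": 1,
  "_seq_no": 0, "_primary_term": 1,
  "_ignored": ["age"],
  "found": true,
  "_source": {"age": "unknown", "email": "bob@gmail.com", "name": "bob"}
}
""")

RESPONSE_8_5_0 = json.loads("""
{
  "_index": "test-index", "_id": "1", "_version": 1,
  "_seq_no": 0, "_primary_term": 1,
  "found": true,
  "_source": {"age": "unknown", "email": "bob@gmail.com", "name": "bob"}
}
""")


def missing_keys(old, new):
    """Return the top-level keys present in `old` but absent from `new`, sorted."""
    return sorted(set(old) - set(new))


print(missing_keys(RESPONSE_8_4_3, RESPONSE_8_5_0))  # prints ['_ignored']
```

A check like this could be pointed at live responses from two clusters instead of the hard-coded bodies, which would make the regression easy to spot in an upgrade test.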

Logs (if relevant)

No response

@salvatore-campagna salvatore-campagna added >bug :Search/Search Search-related issues that do not fall into other categories labels Apr 23, 2024
@salvatore-campagna salvatore-campagna self-assigned this Apr 23, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Apr 23, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna javanna added the priority:normal A label for assessing bug priority to be used by ES engineers label Jun 17, 2024
@javanna javanna added :Search Foundations/Search Catch all for Search Foundations and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 17, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 17, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine removed the Team:Search Meta label for search team label Jul 17, 2024
@drempapis drempapis self-assigned this Nov 13, 2024
@drempapis
Contributor

drempapis commented Nov 14, 2024

The _ignored behavior changed in v8.5+ with PR #89778: _ignored must now be listed in stored_fields to be retrieved. In the provided example, explicitly set stored_fields=_ignored:

if (leafStoredFieldLoader.storedFields().isEmpty() == false) {

GET /test-index/_doc/1?_source=true&stored_fields=_ignored

We get the response:

{
    "_index": "test-index",
    "_id": "1",
    "_version": 1,
    "_seq_no": 0,
    "_primary_term": 1,
    "_ignored": [
        "age"
    ],
    "found": true,
    "_source": {
        "age": "unknown",
        "email": "bob@gmail.com",
        "name": "bob"
    }
}

I think the GET API / stored_fields documentation should be updated to reflect that change.
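
For client code that needs _ignored on 8.5.0 and later, the workaround above amounts to adding stored_fields=_ignored to the GET request. Below is a minimal Python sketch of building such a URL with the standard library only. The function name `build_get_url` and its parameters are hypothetical, introduced for illustration; the base URL, index name, and document id are the ones used in the reproduction steps.

```python
from urllib.parse import quote, urlencode


def build_get_url(base_url, index, doc_id, stored_fields=None, source=True):
    """Build a GET document URL, optionally requesting stored fields.

    `stored_fields` is a list of field names (for example ["_ignored"]);
    when it is empty or None, the parameter is left out of the URL.
    """
    path = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_doc/{quote(doc_id, safe='')}"
    params = {"_source": "true" if source else "false"}
    if stored_fields:
        # Multiple stored fields are sent as a comma-separated list.
        params["stored_fields"] = ",".join(stored_fields)
    # Keep commas unescaped so the list stays readable in the URL.
    return f"{path}?{urlencode(params, safe=',')}"


print(build_get_url("https://localhost:9200", "test-index", "1", ["_ignored"]))
# prints https://localhost:9200/test-index/_doc/1?_source=true&stored_fields=_ignored
```

The resulting URL can be passed to any HTTP client together with the same credentials and CA certificate used in the curl commands from the reproduction steps.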

@drempapis
Contributor

@javanna, is my assumption correct? Is this something that should be documented, or was it an unintentionally introduced bug?

@javanna
Member

javanna commented Nov 14, 2024

Thanks @drempapis, this helps a lot. I asked in the related PR that introduced the change whether it was on purpose and what the suggested way forward is.

@drempapis
Contributor

Hey @nik9000, can you please provide feedback to the question #89778 (comment) ?

@nik9000
Member

nik9000 commented Dec 11, 2024

Hey @nik9000, can you please provide feedback to the question #89778 (comment) ?

Hey! Sorry I didn't notice the ping. I'm subscribed to too much of ES and then I filter by subject line. I really should filter down what I receive so I get actual pings.... OK!

It looks like I did comment in the original PR: #89778 (comment). The short version is "no, I didn't intend to make this change". I really just wanted weird internal details of synthetic_source to stop leaking into the response. If synthetic_source needed, say, message to be loaded from stored fields to rebuild the _source, then the message field would also be loaded into the stored_fields.

Anyway! I think this is "just" a bug and we should default to returning _ignored unless it's been explicitly rejected.

@nik9000
Member

nik9000 commented Dec 11, 2024

It's probably not a bug I'm going to work on any time soon though. I'm a long way from synthetic_source these days. But I could have a look eventually.


5 participants