Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.14] Avoid attempting to load the same empty field twice in fetch phase (#107551) #107580

Merged
merged 1 commit into from
Apr 18, 2024

Conversation

javanna
Copy link
Member

@javanna javanna commented Apr 17, 2024

Backports the following commits to 8.14:

…lastic#107551)

During the fetch phase, there's a number of stored fields that are requested explicitly or loaded by default. That information is included in `StoredFieldsSpec` that each fetch sub phase exposes.

We attempt to provide stored fields that are already loaded to the fields lookup that scripts as well as value fetchers use to load field values (via `SearchLookup`). This is done in `PreloadedFieldLookupProvider.` The current logic makes available values for fields that have been found, so that scripts or value fetchers that request them don't load them again ad-hoc. What happens though for stored fields that don't have a value for a specific doc, is that they are treated like any other field that was not requested, and loaded again, although they will not be found, which causes overhead.

This change makes available to `PreloadedFieldLookupProvider` the list of required stored fields, so that it can better distinguish between fields that we already attempted to load (although we may not have found a value for them) and those that need to be loaded ad-hoc (for instance because a script is requesting them for the first time).

This is an existing issue, that has become evident as we moved fetching of metadata fields to `FetchFieldsPhase`, that relies on value fetchers, and hence on `SearchLookup`. We end up attempting to load default metadata fields (`_ignored` and `_routing`) twice when they are not present in a document, which makes us call `LeafReader#storedFields` additional times for the same document providing a `SingleFieldVisitor` that will never find a value.

Another existing issue that this PR fixes is for the `FetchFieldsPhase` to extend the `StoredFieldsSpec` that it exposes to include the metadata fields that the phase is now responsible for loading. That results in `_ignored` being included in the output of the debug stored fields section when profiling is enabled. The fact that it was previously missing is an existing bug (it was missing in `StoredFieldLoader#fieldsToLoad`).

Yet another existing issues that this PR fixes is that `_id` has been until now always loaded on demand when requested via fetch fields or script. That is because it is not part of the preloaded stored fields that the fetch phase passes over to the `PreloadedFieldLookupProvider`. That causes overhead as the field has already been loaded, and should not be loaded once again when explicitly requested.
@javanna javanna added :Search/Search Search-related issues that do not fall into other categories >bug backport Team:Search Meta label for search team labels Apr 17, 2024
Copy link
Contributor

Documentation preview:

@javanna javanna merged commit 77a23e5 into elastic:8.14 Apr 18, 2024
15 checks passed
@javanna javanna deleted the backport/8.14/pr-107551 branch April 18, 2024 07:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport >bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.14.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants