Avoid attempting to load the same empty field twice in fetch phase #107551
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 107551 | ||
summary: Avoid attempting to load the same empty field twice in fetch phase | ||
area: Search | ||
type: bug | ||
issues: [] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,12 +9,16 @@ | |
package org.elasticsearch.search.fetch; | ||
|
||
import org.apache.lucene.index.LeafReaderContext; | ||
+ import org.apache.lucene.util.SetOnce; | ||
+ import org.elasticsearch.index.mapper.IdFieldMapper; | ||
import org.elasticsearch.search.lookup.FieldLookup; | ||
import org.elasticsearch.search.lookup.LeafFieldLookupProvider; | ||
|
||
import java.io.IOException; | ||
+ import java.util.Collections; | ||
import java.util.List; | ||
import java.util.Map; | ||
+ import java.util.Set; | ||
import java.util.function.Supplier; | ||
|
||
/** | ||
|
@@ -26,15 +30,22 @@ | |
*/ | ||
class PreloadedFieldLookupProvider implements LeafFieldLookupProvider { | ||
|
||
- Map<String, List<Object>> storedFields; | ||
- LeafFieldLookupProvider backUpLoader; | ||
- Supplier<LeafFieldLookupProvider> loaderSupplier; | ||
+ private final SetOnce<Set<String>> preloadedStoredFieldNames = new SetOnce<>(); | ||
+ private Map<String, List<Object>> preloadedStoredFieldValues; | ||
+ private String id; | ||
+ private LeafFieldLookupProvider backUpLoader; | ||
+ private Supplier<LeafFieldLookupProvider> loaderSupplier; | ||
I took the chance to make these private and add package-private setter/getter methods when needed. I find that it clarifies who does what and when.
❤️
Just a question... do I understand it correctly that Because later on I see we do
which looks like "if the name is there and the field was preloaded then just get the preloaded values"... I see that one comes from
If they were the same we would not need a separate set. preloadedStoredFieldNames includes all the fields that we know we will attempt to load for all documents. Those include fields that don't have a value, while preloadedStoredFieldValues contains only those fields that were found in the current doc. The overhead was caused by trying to load
Ok, thank you. |
||
|
||
@Override | ||
public void populateFieldLookup(FieldLookup fieldLookup, int doc) throws IOException { | ||
String field = fieldLookup.fieldType().name(); | ||
- if (storedFields.containsKey(field)) { | ||
- fieldLookup.setValues(storedFields.get(field)); | ||
|
||
+ if (field.equals(IdFieldMapper.NAME)) { | ||
This is because the id is not part of the ordinary loaded stored fields, hence it needs to be provided and handled separately. |
||
+ fieldLookup.setValues(Collections.singletonList(id)); | ||
+ return; | ||
+ } | ||
+ if (preloadedStoredFieldNames.get().contains(field)) { | ||
+ fieldLookup.setValues(preloadedStoredFieldValues.get(field)); | ||
return; | ||
} | ||
// stored field not preloaded, go and get it directly | ||
|
@@ -44,8 +55,26 @@ public void populateFieldLookup(FieldLookup fieldLookup, int doc) throws IOExcep | |
backUpLoader.populateFieldLookup(fieldLookup, doc); | ||
} | ||
|
||
+ void setPreloadedStoredFieldNames(Set<String> preloadedStoredFieldNames) { | ||
+ this.preloadedStoredFieldNames.set(preloadedStoredFieldNames); | ||
+ } | ||
|
||
+ void setPreloadedStoredFieldValues(String id, Map<String, List<Object>> preloadedStoredFieldValues) { | ||
+ assert preloadedStoredFieldNames.get().containsAll(preloadedStoredFieldValues.keySet()) | ||
+ : "Provided stored field that was not expected to be preloaded? " | ||
+ + preloadedStoredFieldValues.keySet() | ||
+ + " - " | ||
+ + preloadedStoredFieldNames; | ||
+ this.preloadedStoredFieldValues = preloadedStoredFieldValues; | ||
+ this.id = id; | ||
+ } | ||
|
||
void setNextReader(LeafReaderContext ctx) { | ||
backUpLoader = null; | ||
loaderSupplier = () -> LeafFieldLookupProvider.fromStoredFields().apply(ctx); | ||
} | ||
|
||
+ LeafFieldLookupProvider getBackUpLoader() { | ||
+ return backUpLoader; | ||
+ } | ||
} |
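The distinction discussed in the review thread above (field names expected for every document versus values actually found in the current one) can be illustrated with a small, self-contained sketch. This is not the Elasticsearch class: PreloadedLookupSketch, CountingFallbackLoader, and lookup are hypothetical names, the fallback is a plain in-memory map that counts how often it is called, and a runtime check stands in for the assert used in the diff.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for the provider in this PR; not the Elasticsearch class. */
public class PreloadedLookupSketch {

    /** Hypothetical fallback that counts how often a field is loaded directly. */
    static final class CountingFallbackLoader {
        private final Map<String, List<Object>> storedOnDisk;
        int loads = 0;

        CountingFallbackLoader(Map<String, List<Object>> storedOnDisk) {
            this.storedOnDisk = storedOnDisk;
        }

        List<Object> load(String field) {
            loads++;
            return storedOnDisk.get(field); // null when the document has no such field
        }
    }

    // Every field we already attempted to load for all documents, with or without a value.
    private final Set<String> preloadedNames;
    private final CountingFallbackLoader fallback;
    // Only the fields that were actually found in the current document.
    private Map<String, List<Object>> preloadedValues = Map.of();
    private String id;

    PreloadedLookupSketch(Set<String> preloadedNames, CountingFallbackLoader fallback) {
        this.preloadedNames = preloadedNames;
        this.fallback = fallback;
    }

    void setPreloadedValues(String id, Map<String, List<Object>> values) {
        // Assumption: a runtime check replaces the assert from the diff so it always runs.
        if (preloadedNames.containsAll(values.keySet()) == false) {
            throw new IllegalArgumentException("unexpected preloaded field: " + values.keySet());
        }
        this.id = id;
        this.preloadedValues = values;
    }

    List<Object> lookup(String field) {
        if ("_id".equals(field)) {
            return Collections.singletonList(id); // the id is provided separately
        }
        if (preloadedNames.contains(field)) {
            // Already attempted: an absent value means the field is empty, so do not load again.
            return preloadedValues.get(field);
        }
        return fallback.load(field); // not preloaded, go and get it directly
    }

    public static void main(String[] args) {
        CountingFallbackLoader fallback = new CountingFallbackLoader(Map.of("other", List.<Object>of("x")));
        PreloadedLookupSketch provider = new PreloadedLookupSketch(Set.of("title", "tags"), fallback);
        provider.setPreloadedValues("doc-1", Map.of("title", List.<Object>of("hello")));

        System.out.println(provider.lookup("title")); // preloaded and present
        System.out.println(provider.lookup("tags"));  // preloaded but empty for this document
        System.out.println(fallback.loads);           // fallback not consulted so far
        System.out.println(provider.lookup("other")); // never preloaded, loaded directly
        System.out.println(fallback.loads);
    }
}
```

Keying the check on the set of names rather than on the keys of the value map is what lets the sketch skip the fallback for a preloaded field that happens to be empty in the current document.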
This is a consequence of exposing the correct stored fields spec in FetchFieldsPhase, which takes stored metadata fields into account. _ignored will be removed again once it's no longer stored. The problem is that the field should have been there since its introduction, but it was never added to StoredFieldLoader#fieldsToLoad, which is where the other three fields come from. Note that StoredFieldsPhase did not include in its stored fields spec the default metadata fields that it always requested.

For posterity, why is _type not there if fetch fields phase requests it by default? Because it is not mapped in recent clusters, and it is only part of the stored fields spec, hence loaded, when it is mapped, which is the case only in very old archive indices.

👍