-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add synthetic_source
support to aggregate_metric_double
fields
#88909
Conversation
Hi @csoulios, I've created a changelog YAML for you. |
Pinging @elastic/es-search (Team:Search) |
public Leaf leaf(LeafReader reader, int[] docIdsInLeaf) throws IOException { | ||
Map<Metric, SortedNumericDocValues> metricDocValues = new EnumMap<>(Metric.class); | ||
for (Metric m : metrics) { | ||
SortedNumericDocValues dv = dv(reader, m); | ||
if (dv != null) { | ||
metricDocValues.put(m, dv); | ||
} | ||
} | ||
|
||
if (metricDocValues.isEmpty()) { | ||
return SourceLoader.SyntheticFieldLoader.NOTHING_LEAF; | ||
} | ||
|
||
return new AggregateMetricSyntheticFieldLoader.ImmediateLeaf(metricDocValues); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not find an easy way to leverage the NumericSyntheticFieldLoader
class, so I had to rewrite the same implementation so that it uses one SortedNumericDocValues
instance per metric.
I skipped the singleton optimization because having mutliple docs in leaf requires an array of values. This combined with multiple metrics per doc (a Map<Metric, SortedNumericDocValues>
will make the code extremely complex. I can do it if we think this optimization is worth the complexity
NumericDocValues single = reader.getNumericDocValues(fieldName); | ||
if (single != null) { | ||
return DocValues.singleton(single); | ||
} | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all metric subfields are stored using the SortedNumericDocValuesField
, all doc values will be stored as SortedNumericDocValues
. So, I am not sure reader.getNumericDocValues(fieldName)
will ever return non-null values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try removing it! I believe if you store single values it'll flip to NumericDocValues
anyway.
If you do have to keep loading from both of there it's probably worth making this method static
and public
in NumericFieldMapper
or something like that.
return; | ||
} | ||
b.startObject(simpleName); | ||
for (Map.Entry<Metric, SortedNumericDocValues> entry : metricDocValues.entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metricDocValues
is an EnumMap
instance. This means that the order that sub fields will be returned in the source, will be the order in the Metric
enum. Order ismin
,max
, sum
, value_count
. It is not alphabetical, but it will always be consistent. Does it make sense to change this so subfields are retuned in an alphabetical order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cool. Consistency is lovely for reading these docs. Alphabetical just gets the consistency in an easy way. No need to do it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few questions. I think this works fine though! I'm just poking around the edges, suggesting some maybe silly ideas. Either way, I think it's right.
@@ -292,6 +296,7 @@ private NumberFieldMapper.NumberFieldType delegateFieldType(Metric metric) { | |||
|
|||
/** | |||
* Return a delegate field type for the default metric sub-field | |||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does your formatter do this? I think it's more standard, but I don't tend to do it myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I undertand the question well, this has nothing to do with formatting the field in the output.
This has to do with the following:
- Querying the field: When running a
term
or arange
query, the query is delegated to the subfield metric that has been marked asdefault_metric
in the field mapping. - When retrieving the field using the fields api, the value of the
default_metric
is returned (this will change when Support aggregate_double_metric fields in the Field API #88534 is merged)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, no, my question was about the extra line inserted into the javadoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry! I wasn't clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it was me messing with IntelliJ formatting. Reverted it now.
NumericDocValues single = reader.getNumericDocValues(fieldName); | ||
if (single != null) { | ||
return DocValues.singleton(single); | ||
} | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try removing it! I believe if you store single values it'll flip to NumericDocValues
anyway.
If you do have to keep loading from both of there it's probably worth making this method static
and public
in NumericFieldMapper
or something like that.
return; | ||
} | ||
b.startObject(simpleName); | ||
for (Map.Entry<Metric, SortedNumericDocValues> entry : metricDocValues.entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cool. Consistency is lovely for reading these docs. Alphabetical just gets the consistency in an easy way. No need to do it here.
...in/java/org/elasticsearch/xpack/aggregatemetric/mapper/AggregateDoubleMetricFieldMapper.java
Outdated
Show resolved
Hide resolved
@elasticmachine run elasticsearch-ci/packaging-tests-unix-sample |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a tiny thing about switching types on the tracked map. Otherwise LGTM.
|
||
@Override | ||
public boolean empty() { | ||
return metricDocValues.isEmpty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is never empty, right? You could maybe just return false
here. And In the ctor you could assert metricDocValues.empty == false
// it is enough to check for the first docValue. However, in the future | ||
// we may relax the requirement of all metrics existing. In this case | ||
// we should check the doc value for each metric separately | ||
dvHasValue = new EnumMap<>(Metric.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use an EnumSet
instead? And maybe clear
it instead of recreate it?
dvHasValue.put(e.getKey(), leafHasValues); | ||
} | ||
|
||
return hasValues; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe dvHasValue.isEmpty() == false
is enough here?
This PR implements
synthetic_source
support to theaggregate_metric_double
field typeRelates to #86603