[ML] add deployed native models to inference_stats in trained model stats response #88187
Conversation
Hi @benwtrent, I've created a changelog YAML for you.
public InferenceStats getOverallInferenceStats() {
    return new InferenceStats(
        0L,
        // Sum the inference counts reported by each node, skipping nodes that have not reported one.
        nodeStats.stream().filter(n -> n.getInferenceCount().isPresent()).mapToLong(n -> n.getInferenceCount().get()).sum(),
inference_count is the only duplicate field between these two stats objects. But it seems weird to remove it from the assignment stats, and having just a single duplicate field doesn't seem like a big deal to me.
@dimitris-athanasiou @davidkyle, what say you?
It actually isn't a duplicate, as it is the sum of inferences across all nodes. I think that's useful.
It is a duplicate, but I don't think that's a problem. AssignmentStats collects various counts across all nodes:

Line 467 in 68c51f3

long totalInferenceCount = nodeStats.stream()
    .filter(n -> n.getInferenceCount().isPresent()).mapToLong(n -> n.getInferenceCount().get()).sum();
Pinging @elastic/ml-core (Team:ML)
@elasticmachine update branch
LGTM
Btw, I think a unit test that checks we sum the node inference counts and failures might be useful to add.
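A minimal sketch of what such a test could check, using a hypothetical NodeStats record and sumPresent helper rather than the actual Elasticsearch classes (the real test would build AssignmentStats instances and assert on getOverallInferenceStats()):

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Self-contained sketch; NodeStats is a hypothetical stand-in, not the real Elasticsearch class.
public class OverallInferenceStatsTestSketch {

    record NodeStats(Optional<Long> inferenceCount, Optional<Long> errorCount) {}

    // Sums a per-node counter, skipping nodes that have not reported a value yet.
    static long sumPresent(List<NodeStats> nodes, Function<NodeStats, Optional<Long>> getter) {
        return nodes.stream().map(getter).filter(Optional::isPresent).mapToLong(Optional::get).sum();
    }

    public static void main(String[] args) {
        List<NodeStats> nodes = List.of(
            new NodeStats(Optional.of(5L), Optional.of(1L)),
            new NodeStats(Optional.empty(), Optional.empty()), // e.g. a node where the model is still starting
            new NodeStats(Optional.of(7L), Optional.of(2L))
        );
        if (sumPresent(nodes, NodeStats::inferenceCount) != 12L) throw new AssertionError("inference count sum");
        if (sumPresent(nodes, NodeStats::errorCount) != 3L) throw new AssertionError("failure count sum");
        System.out.println("node count sums OK");
    }
}
```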
LGTM
This fixes the discrepancy where DFA models have inference stats but NLP models do not?
Correct @davidkyle
This adds a valid inference_stats section for deployed native models. inference_stats is effectively a subset of deployment_stats: it is a high-level view of the overall stats of the model, while deployment_stats contains more detailed information around the types of errors seen, throughput, etc.
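For illustration, roughly what the two sections look like together in a GET _ml/trained_models/&lt;model_id&gt;/_stats response after this change. This is a hedged sketch, not the exact response: the model id is made up, values are illustrative, and the precise field set may differ by version. It is meant only to show the subset relationship, with inference_stats as a compact rollup and deployment_stats carrying the per-node detail:

```json
{
  "trained_model_stats": [
    {
      "model_id": "my-ner-model",
      "inference_stats": {
        "inference_count": 12,
        "failure_count": 0,
        "cache_miss_count": 3,
        "timestamp": 1657728000000
      },
      "deployment_stats": {
        "model_id": "my-ner-model",
        "state": "started",
        "nodes": [
          {
            "inference_count": 12,
            "error_count": 0,
            "average_inference_time_ms": 42.0
          }
        ]
      }
    }
  ]
}
```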