Handle a default/request pipeline and a final pipeline with minimal additional overhead #93329
Conversation
It used to make sense for this to live in the ingest service, because we avoided allocating an ingest document if the pipeline was empty. Now we already have the document regardless, so this can just live in IngestDocument anyway. In any case, this would be a rare and unusual thing to have happen at all. I don't want to drop the logic completely, but I'm also not worried about the performance implications of where it lives.
Pinging @elastic/es-data-management (Team:Data Management)
// shortcut if the pipeline is empty
if (pipeline.getProcessors().isEmpty()) {
    handler.accept(this, null);
    return;
}
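For context, this is roughly how that shortcut sits at the top of `IngestDocument.executePipeline` -- a sketch in which only the quoted lines above come from the diff, and the rest of the method shape is assumed:

```java
// Sketch for context -- only the empty-pipeline shortcut is quoted from the
// diff; the surrounding method shape is an assumption for illustration.
public void executePipeline(Pipeline pipeline, BiConsumer<IngestDocument, Exception> handler) {
    // shortcut if the pipeline is empty
    if (pipeline.getProcessors().isEmpty()) {
        handler.accept(this, null);
        return;
    }
    // otherwise hand the document to the pipeline, which reports the resulting
    // document (null if it was dropped) and any failure back via the handler
    pipeline.execute(this, handler);
}
```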
Is this going to potentially confuse people who are looking at ingest metrics? I would think it would be a pretty rare case -- how much does this optimization save us? Is it worth the potential future "why are my metrics wrong?" support tickets?
What would be wrong about the metrics?
Also note that this is the same logic as before; it's just in a slightly different place. (See c2fbb08 for the deets.)
Hmm I don't know what would be wrong about the metrics -- I thought I had traced where this would impact them last week, but now I have no idea what I was seeing.
No worries, it's certainly a valid question to ask and we are indeed in a maze of twisty passages all alike. 😄
Looks good to me!
onFinished.onResponse(null);
// update the index request's source and (potentially) cache the timestamp for TSDB
updateIndexRequestSource(indexRequest, ingestDocument);
cacheRawTimestamp(indexRequest, ingestDocument);
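For readers without the file open, a rough sketch of what those two helpers might do; the names come from the diff, but these bodies are assumptions rather than the actual implementation:

```java
// Assumed helper shapes -- illustrative only, not quoted from the PR.
private static void updateIndexRequestSource(IndexRequest indexRequest, IngestDocument ingestDocument) {
    // write the (possibly mutated) ingest source back onto the index request
    indexRequest.source(ingestDocument.getSource(), indexRequest.getContentType());
}

private static void cacheRawTimestamp(IndexRequest indexRequest, IngestDocument ingestDocument) {
    // for time series (TSDB) indices, stash the @timestamp value on the
    // request so it doesn't need to be re-parsed from the source later
    Object timestamp = ingestDocument.getSource().get("@timestamp");
    if (timestamp != null) {
        indexRequest.setRawTimestamp(timestamp);
    }
}
```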
👍 nice work
Closes #81244
Closes #92843
Closes #93118
Tightens up the document handling aspects of `executePipelines` and its callees. `innerExecute` becomes trivial and nearly drops out (renamed to `executePipeline`, where it remains just to adapt handler shapes).

At a high level, the execution goes from the old call sequence to the new one; the before and after listings from the original description are not reproduced here.
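In their place, here is a hedged sketch of the "after" shape the description implies: `executePipelines` walks the default/request pipeline(s) and then the final pipeline by recursing through an iterator, while `executePipeline` only adapts handler shapes. The signatures and control flow below are assumptions for illustration, not the PR's actual code:

```java
// Illustrative sketch -- the method names come from the PR description, but
// the signatures and bodies here are assumed.
private void executePipelines(
    Iterator<Pipeline> pipelines, // default/request pipeline(s) first, final pipeline last
    IndexRequest indexRequest,
    IngestDocument ingestDocument,
    ActionListener<Void> onFinished
) {
    if (pipelines.hasNext() == false) {
        // no pipelines left: write the document back to the request and finish
        updateIndexRequestSource(indexRequest, ingestDocument);
        cacheRawTimestamp(indexRequest, ingestDocument);
        onFinished.onResponse(null);
        return;
    }
    // executePipeline (née innerExecute) just adapts the handler shape
    ingestDocument.executePipeline(pipelines.next(), (result, e) -> {
        if (e != null) {
            onFinished.onFailure(e);
        } else if (result == null) {
            onFinished.onResponse(null); // the document was dropped
        } else {
            executePipelines(pipelines, indexRequest, result, onFinished);
        }
    });
}
```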
The difference in the flame graph is pretty clear (the before and after flame graphs are likewise not reproduced here).
And the performance is much better, as one would expect: the total time spent in any ingest code for the nightly security benchmark drops from 4,994,128 ms to 3,568,490 ms -- a decrease of 29%.
This is the direct follow-up to #93213, but I've been working up to it for a while -- #93119 and #93120 added the tests that this PR now makes pass, while #92203, #92308, and #92455 laid some of the groundwork for the eventual document listener cleanup.