metricbeat/module/mongodb/replstatus: Update getOpTimestamp in replstatus to fix sort and temp files generation issue #37688
Conversation
@ritalwar Is there any unit test file present that needs to be updated as per the code changes?
Also, please make sure that the unit tests are updated accordingly.
Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
I was initially doubtful when I saw pipeline aggregation being used instead of FindOne in MongoDB. However, I now understand the benefits of using it, such as executing it in a single pipeline, which is particularly useful for large oplogs. The changes look good.
Pipeline aggregation is available in MongoDB starting from version 4.2. For more information, please refer to this.
As per the MongoDB integration documentation, the compatibility is guaranteed for versions >= 2.8. It appears that @aliabbas-elastic has tested it with v7.x, which could be the reason why we didn't face any issues.
I suggest adding:
- A new method for v4.1 and older versions too:
// kind of like this?
firstOp := bson.D{{"ts", bson.D{{"$min", 1}}}}
lastOp := bson.D{{"ts", bson.D{{"$max", 1}}}}
// followed by FindOne? (see the hedged sketch at the end of this comment)
- Keep this pipeline aggregation implementation as it is, but limit it to v4.2 and above.
- @aliabbas-elastic After the change can you please test with any v3.x?
cc: @ritalwar @aliabbas-elastic
Also, one thing missing in the description of natural order: it is not the order in which events arrived in MongoDB, but the order in which documents were written to disk. Usually the disk order is similar to the insertion order, except when documents move internally because of document growth due to update operations. https://www.mongodb.com/docs/v2.2/reference/glossary/#term-natural-order It is also an expensive operation, so it is a good thing we moved away from it.
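For illustration only, here is one possible reading of the FindOne-based fallback suggested above, written against the official MongoDB Go driver. The helper name and the choice of sorting on the ts field (rather than any other mechanism) are assumptions for this sketch, not part of this PR:

```go
package replstatus

import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/bson/primitive"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// getOpTimestampLegacy is a hypothetical fallback for MongoDB versions older
// than 4.2: it issues two FindOne calls, one sorted ascending and one sorted
// descending on ts, instead of a single aggregation.
func getOpTimestampLegacy(ctx context.Context, client *mongo.Client) (first, last primitive.Timestamp, err error) {
	oplog := client.Database("local").Collection("oplog.rs")

	var doc struct {
		TS primitive.Timestamp `bson:"ts"`
	}

	// Oldest entry: ascending sort on ts, take the first document.
	firstOpts := options.FindOne().SetSort(bson.D{{Key: "ts", Value: 1}})
	if err = oplog.FindOne(ctx, bson.D{}, firstOpts).Decode(&doc); err != nil {
		return first, last, err
	}
	first = doc.TS

	// Newest entry: descending sort on ts, take the first document.
	lastOpts := options.FindOne().SetSort(bson.D{{Key: "ts", Value: -1}})
	if err = oplog.FindOne(ctx, bson.D{}, lastOpts).Decode(&doc); err != nil {
		return first, last, err
	}
	last = doc.TS

	return first, last, nil
}
```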
Sure
Thanks @shmsr. Updated in the description.
Also, the integration documentation needs an update, since the beats code and version support were modified a while ago.
Yes, I see https://www.mongodb.com/docs/v3.6/reference/method/db.collection.aggregate/#db.collection.aggregate with v3.6. LGTM then. Let's update the documentation right after the next version of beats is released.
Proposed commit message
Refactor the getOpTimestamp method to use $min and $max aggregations for calculating the first and last timestamps, addressing the sorting issue and reducing the generation of temporary files in MongoDB.

Details:
In the MongoDB oplog, which records operations, each entry has a timestamp stored in a field called ts (refer to this). This timestamp uses the BSON timestamp format. To find the first and last timestamps in the entire oplog efficiently, a MongoDB aggregation pipeline is used. The pipeline groups all documents into a single group ("_id": 1), and the $min and $max operators are then applied to the ts field within this consolidated group. This method, executed directly on the MongoDB server, is both efficient and flexible. MongoDB always compares the time part of the timestamp (time_t) before the operation-order part (ordinal), refer here. This consistency ensures that timestamps are interpreted correctly regardless of platform.

In contrast to the earlier implementation, which used the $natural operator for sorting, the adoption of $min and $max eliminates the excessive memory consumption associated with sorting. Removing the natural ordering, which was unsuitable for large datasets, also addresses errors related to memory limits and the automatic creation of temporary files, a behavior introduced in MongoDB 6.0 for operations exceeding 100 MB of RAM usage (as per this doc).
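To make the description above concrete, here is a minimal sketch of how such an aggregation can be expressed with the official MongoDB Go driver. It follows the pipeline described in this PR ($group with "_id": 1 plus $min/$max on ts), but the function, struct, and field names are illustrative and not necessarily the exact metricbeat code:

```go
package replstatus

import (
	"context"
	"fmt"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/bson/primitive"
	"go.mongodb.org/mongo-driver/mongo"
)

// oplogTimestamps holds the oldest and newest timestamps found in the oplog.
// Field names here are illustrative.
type oplogTimestamps struct {
	First primitive.Timestamp `bson:"first"`
	Last  primitive.Timestamp `bson:"last"`
}

// getOpTimestamps computes the first and last oplog timestamps in a single
// server-side aggregation, avoiding any client-side or in-memory sort.
func getOpTimestamps(ctx context.Context, client *mongo.Client) (oplogTimestamps, error) {
	var result oplogTimestamps
	oplog := client.Database("local").Collection("oplog.rs")

	// Group every document into one bucket ("_id": 1) and compute
	// $min/$max of the ts field on the server.
	pipeline := mongo.Pipeline{
		{{Key: "$group", Value: bson.D{
			{Key: "_id", Value: 1},
			{Key: "first", Value: bson.D{{Key: "$min", Value: "$ts"}}},
			{Key: "last", Value: bson.D{{Key: "$max", Value: "$ts"}}},
		}}},
	}

	cursor, err := oplog.Aggregate(ctx, pipeline)
	if err != nil {
		return result, fmt.Errorf("aggregating oplog.rs: %w", err)
	}
	defer cursor.Close(ctx)

	if cursor.Next(ctx) {
		if err := cursor.Decode(&result); err != nil {
			return result, fmt.Errorf("decoding aggregation result: %w", err)
		}
	}
	return result, cursor.Err()
}
```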
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Related issues