MemoryTracker*: Use std::shared_ptr to manage MemoryTracker. #5765

Merged
merged 3 commits into from
Sep 5, 2022

Conversation

JinheLin
Contributor

@JinheLin JinheLin commented Sep 1, 2022

What problem does this PR solve?

Issue Number: ref #5689

What is changed and how it works?

  1. The computing layer uses MemoryTracker to track the memory usage of each MPPTask. It also uses MemoryTracker to estimate data size, and picks an execution strategy (multi-threaded or single-threaded) based on that estimate.
  2. When reading data, the storage read threads set current_memory_tracker to the MemoryTracker object of the corresponding MPPTask, which avoids inaccurate data size estimates.
  3. When the MPPTask exits, the corresponding MemoryTracker is destroyed. The storage read threads may still use this object afterwards, which crashes the process.
  4. For safety, read threads should hold a shared_ptr to the MemoryTracker instead of a raw pointer.
  5. This PR creates all MemoryTracker objects through MemoryTracker::create, which returns a std::shared_ptr<MemoryTracker>.
  6. In DeltaMerge::read, SegmentReadTaskPool calls current_memory_tracker->shared_from_this() to get the shared pointer.
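
The lifetime problem and the fix described in the list above can be sketched roughly as follows. This is a minimal illustration, not the actual TiFlash code: the MemoryTracker class here is a simplified stand-in, and readAfterTaskExit and the deferred read_task are hypothetical names that model a read thread which still runs after the MPPTask has exited.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>

// Simplified stand-in for MemoryTracker (assumption: the real class has far more state).
class MemoryTracker : public std::enable_shared_from_this<MemoryTracker>
{
public:
    // The only way to construct a tracker, so every instance is owned by a shared_ptr.
    static std::shared_ptr<MemoryTracker> create()
    {
        return std::shared_ptr<MemoryTracker>(new MemoryTracker());
    }

    void alloc(int64_t size) { amount.fetch_add(size, std::memory_order_relaxed); }
    int64_t get() const { return amount.load(std::memory_order_relaxed); }

private:
    MemoryTracker() = default; // private: callers must go through create()
    std::atomic<int64_t> amount{0};
};

// Stand-in for the thread-local pointer named in the description.
thread_local MemoryTracker * current_memory_tracker = nullptr;

// Hypothetical helper: models a read that runs after the owning task is gone.
int64_t readAfterTaskExit()
{
    std::function<int64_t()> read_task;
    {
        auto task_tracker = MemoryTracker::create(); // owned by the "MPPTask"
        current_memory_tracker = task_tracker.get();
        assert(current_memory_tracker != nullptr);

        // The reader takes shared ownership instead of copying the raw pointer.
        auto shared = current_memory_tracker->shared_from_this();
        read_task = [shared] {
            shared->alloc(4096);
            return shared->get();
        };
        current_memory_tracker = nullptr;
    } // the task exits here and releases its own reference

    // The tracker is still alive because read_task holds a reference to it.
    return read_task();
}
```

Had read_task captured the raw pointer instead, the call after the inner scope would touch a destroyed object, which is the crash described in item 3.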

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot
Member

ti-chi-bot commented Sep 1, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • JaySon-Huang
  • bestwoody

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 1, 2022
@JinheLin
Contributor Author

JinheLin commented Sep 1, 2022

/run-all-tests

@JinheLin
Contributor Author

JinheLin commented Sep 1, 2022

/run-all-tests

@sre-bot
Collaborator

sre-bot commented Sep 1, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Common/MemoryTracker.h                               18                 6    66.67%          15                 6    60.00%          27                 9    66.67%           2                 0   100.00%
Common/tests/gtest_dynamic_thread_pool.cpp          102                44    56.86%          19                 0   100.00%         155                 3    98.06%          30                26    13.33%
Common/tests/gtest_memtracker.cpp                    58                21    63.79%           3                 0   100.00%          72                 0   100.00%          12                12     0.00%
Interpreters/ProcessList.cpp                        144                97    32.64%          11                 6    45.45%         199               102    48.74%          98                73    25.51%
Interpreters/ProcessList.h                           25                10    60.00%          19                 7    63.16%          70                27    61.43%           6                 3    50.00%
Storages/BackgroundProcessingPool.cpp                94                27    71.28%          10                 1    90.00%         179                26    85.47%          50                14    72.00%
Storages/DeltaMerge/SegmentReadTaskPool.cpp          97                89     8.25%          21                17    19.05%         185               173     6.49%          56                54     3.57%
Storages/DeltaMerge/SegmentReadTaskPool.h            39                11    71.79%          16                 7    56.25%          63                23    63.49%          10                 5    50.00%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               577               305    47.14%         114                44    61.40%         950               363    61.79%         264               187    29.17%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18540      8319             55.13%    214455  85808        59.99%

full coverage report (for internal network access only)

@JinheLin JinheLin marked this pull request as ready for review September 1, 2022 09:42
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 1, 2022
@@ -139,7 +139,7 @@ class SegmentReadTaskPool : private boost::noncopyable
  , log(&Poco::Logger::get("SegmentReadTaskPool"))
  , unordered_input_stream_ref_count(0)
  , exception_happened(false)
- , mem_tracker(current_memory_tracker)
+ , mem_tracker(current_memory_tracker == nullptr ? nullptr : current_memory_tracker->shared_from_this())
Contributor

When will segReadTaskPool be destructed? Is it shared by multiple MPPTasks?

Contributor Author

SegmentReadTaskPool is destructed after the input streams returned by DeltaMergeStore::read are destructed.

One SegmentReadTaskPool object is created for each call to DeltaMergeStore::read. It is not shared by multiple MPPTasks.
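
For readers less familiar with shared_from_this, here is a small sketch of why the constructor guards against nullptr and why every tracker has to be created through a shared_ptr. Tracker, acquire, and isManagedBySharedPtr are hypothetical names used only for illustration. Since C++17, calling shared_from_this() on an object that is not owned by a shared_ptr throws std::bad_weak_ptr; before C++17 the behavior was undefined.

```cpp
#include <cassert>
#include <memory>

// Minimal stand-in type; only the enable_shared_from_this base matters here.
struct Tracker : std::enable_shared_from_this<Tracker>
{
};

// Mirrors the guarded expression in the diff: no current tracker means no shared pointer.
std::shared_ptr<Tracker> acquire(Tracker * current)
{
    return current == nullptr ? nullptr : current->shared_from_this();
}

// Reports whether the object is owned by a shared_ptr.
// Relies on the C++17 guarantee that shared_from_this() throws std::bad_weak_ptr otherwise.
bool isManagedBySharedPtr(Tracker * tracker)
{
    assert(tracker != nullptr);
    try
    {
        tracker->shared_from_this();
        return true;
    }
    catch (const std::bad_weak_ptr &)
    {
        return false;
    }
}
```

A stack-allocated tracker, like the old `MemoryTracker memory_tracker;` declarations, would fail this check, which is presumably why the PR moves every creation site to MemoryTracker::create.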

- MemoryTracker memory_tracker;
- memory_tracker.setMetric(CurrentMetrics::MemoryTrackingInBackgroundProcessingPool);
- current_memory_tracker = &memory_tracker;
+ auto memory_tracker = MemoryTracker::create();
Contributor

@bestwoody bestwoody Sep 2, 2022

Is this test code? This memTracker never calls setNext to attach to a root memTracker.

Contributor

This does not seem to be test code.

Contributor Author

These are background threads that handle background tasks such as compaction and delta merge.

What is the purpose of calling setNext to attach a root MemoryTracker?

Contributor

The root memtracker is total_mem_tracker in ProcessList.
Without calling setNext to attach it to the root memtracker, this memory usage is not visible at the TiFlash process level.
The config max_memory_usage_for_all_queries will not apply to it.

Contributor

These are background threads that handle background tasks such as compaction and delta merge.

What is the purpose of calling setNext to attach a root MemoryTracker?

ref: https://pingcap.feishu.cn/wiki/wikcnvA5vz6Mfk9VocSssEqPp3b

Contributor Author

@JinheLin JinheLin Sep 2, 2022

No. It only runs background tasks, such as delta compaction, delta merge, segment split, and segment merge.

Contributor Author

Since a segment is not very large (about 1GB on average), it doesn't consume a lot of memory.

Contributor

The root memtracker is total_mem_tracker in ProcessList.
Without calling setNext to attach it to the root memtracker, this memory usage is not visible at the TiFlash process level.
The config max_memory_usage_for_all_queries will not apply to it.

The memory used here is not caused by queries, so I think this tracker should not be a child of ProcessList::total_mem_tracker (limited by max_memory_usage_for_all_queries).
But we do need another root MemoryTracker for all background tasks, and maybe a new config item to control its memory usage.

Contributor

@bestwoody bestwoody Sep 5, 2022

If there are two root memTrackers, neither of them can track all the memory of the TiFlash process. In that case, we can't track and throttle the process memory. I think we can use a special user-level memory tracker to track the background storage memory usage and expose a setting to users, so that it can be tracked and throttled.

Contributor

@JaySon-Huang JaySon-Huang Sep 5, 2022

  1. I agree that a root memory tracker that tracks all memory of a TiFlash instance can avoid OOM in some cases. But it is weird that a config named max_memory_usage_for_all_queries limits the memory usage of non-query tasks. Maybe we need to refactor the config ;)
  2. Some background tasks have lower priority than queries, so they can be delayed or canceled if the total memory usage reaches the high-water mark. We need another bg_tasks_memory_tracker besides queries_memory_tracker.
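
To make the shape of this proposal concrete, here is a rough sketch of one process-level tracker with separate children for queries and background tasks. It is only an illustration of the idea discussed in this thread, not the TiFlash implementation: the Tracker class, its limit handling, the Hierarchy struct, and makeHierarchy are all hypothetical, and the real MemoryTracker setNext semantics may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>

// Simplified stand-in for a chained memory tracker (assumption, not the real class).
class Tracker
{
public:
    explicit Tracker(int64_t limit_ = 0) : limit(limit_) {}

    // Attach a parent; intended to be called once during setup, before concurrent use.
    void setNext(std::shared_ptr<Tracker> parent) { next = std::move(parent); }

    // Account for an allocation here and in every ancestor; undo on failure.
    void alloc(int64_t size)
    {
        assert(size >= 0);
        int64_t will_be = amount.fetch_add(size) + size;
        if (limit > 0 && will_be > limit)
        {
            amount.fetch_sub(size);
            throw std::runtime_error("memory limit exceeded");
        }
        if (next)
        {
            try
            {
                next->alloc(size);
            }
            catch (...)
            {
                amount.fetch_sub(size); // roll back the local accounting
                throw;
            }
        }
    }

    void free(int64_t size)
    {
        amount.fetch_sub(size);
        if (next)
            next->free(size);
    }

    int64_t get() const { return amount.load(); }

private:
    std::atomic<int64_t> amount{0};
    int64_t limit; // 0 means unlimited
    std::shared_ptr<Tracker> next;
};

// Hypothetical layout: one process root, with queries and background tasks as children.
struct Hierarchy
{
    std::shared_ptr<Tracker> process;
    std::shared_ptr<Tracker> queries;
    std::shared_ptr<Tracker> background;
};

Hierarchy makeHierarchy(int64_t process_limit, int64_t queries_limit)
{
    Hierarchy h;
    h.process = std::make_shared<Tracker>(process_limit);
    h.queries = std::make_shared<Tracker>(queries_limit);
    h.background = std::make_shared<Tracker>(); // no limit of its own in this sketch
    h.queries->setNext(h.process);
    h.background->setNext(h.process);
    return h;
}
```

With this layout the process root sees both kinds of usage, while the query limit applies only to the queries child. A background task that would push the process over its limit gets an exception, which the caller could treat as a signal to delay or cancel the task.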

Contributor

@bestwoody bestwoody left a comment

lgtm

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 2, 2022
@JinheLin
Contributor Author

JinheLin commented Sep 2, 2022

@windtalker PTAL

Contributor

@JaySon-Huang JaySon-Huang left a comment

LGTM

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 5, 2022
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 5, 2022
@JinheLin
Contributor Author

JinheLin commented Sep 5, 2022

/merge

@ti-chi-bot
Member

@JinheLin: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 6fdcf6c

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 5, 2022
@sre-bot
Collaborator

sre-bot commented Sep 5, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Common/MemoryTracker.h                               18                 6    66.67%          15                 6    60.00%          27                 9    66.67%           2                 0   100.00%
Common/tests/gtest_dynamic_thread_pool.cpp          102                44    56.86%          19                 0   100.00%         155                 3    98.06%          30                26    13.33%
Common/tests/gtest_memtracker.cpp                    58                21    63.79%           3                 0   100.00%          72                 0   100.00%          12                12     0.00%
Interpreters/ProcessList.cpp                        144                97    32.64%          11                 6    45.45%         199               102    48.74%          98                73    25.51%
Interpreters/ProcessList.h                           25                10    60.00%          19                 7    63.16%          70                27    61.43%           6                 3    50.00%
Storages/BackgroundProcessingPool.cpp                94                26    72.34%          10                 1    90.00%         179                25    86.03%          50                13    74.00%
Storages/DeltaMerge/SegmentReadTaskPool.cpp          97                89     8.25%          21                17    19.05%         185               173     6.49%          56                54     3.57%
Storages/DeltaMerge/SegmentReadTaskPool.h            39                11    71.79%          16                 7    56.25%          63                23    63.49%          10                 5    50.00%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               577               304    47.31%         114                44    61.40%         950               362    61.89%         264               186    29.55%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18629      8331             55.28%    215418  85822        60.16%

full coverage report (for internal network access only)

@ti-chi-bot ti-chi-bot merged commit d32bf10 into pingcap:master Sep 5, 2022
@JinheLin JinheLin deleted the shared-mem-tracker branch September 19, 2022 06:23
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
5 participants