[fix](move-memtable) fix move memtable core when use multi table load #35458
Conversation
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
clang-tidy review says "All clean, LGTM! 👍"
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
TPC-H: Total hot run time: 42153 ms
TPC-DS: Total hot run time: 168886 ms
TeamCity be ut coverage result:
ClickBench: Total hot run time: 30.6 s
clang-tidy review says "All clean, LGTM! 👍"
run buildall
clang-tidy review says "All clean, LGTM! 👍"
run buildall
clang-tidy review says "All clean, LGTM! 👍"
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 40034 ms
TPC-DS: Total hot run time: 169100 ms
ClickBench: Total hot run time: 30.58 s
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 42156 ms
LGTM
PR approved by at least one committer and no changes requested.
run buildall
clang-tidy review says "All clean, LGTM! 👍"
LGTM
TeamCity be ut coverage result:
TPC-H: Total hot run time: 41673 ms
TPC-DS: Total hot run time: 173022 ms
ClickBench: Total hot run time: 30.77 s
…#35458)

## Proposed changes

Move-memtable coredump when using multi-table load:

```
0x51f000c73860 is located 3040 bytes inside of 3456-byte region [0x51f000c72c80,0x51f000c73a00)
freed by thread T4867 (FragmentMgrThre) here:
    #0 0x558f6ad7f43d in operator delete(void*) (/mnt/hdd01/STRESS_ENV/be/lib/doris_be+0x22eec43d) (BuildId: b46f73d1f76dfcd6)
    #1 0x558f6e6cea2c in std::__new_allocator<doris::PTabletID>::deallocate(doris::PTabletID*, unsigned long) /mnt/disk2/xujianxu/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/new_allocator.h:168:2
    #2 0x558f6e6ce9e7 in std::allocator<doris::PTabletID>::deallocate(doris::PTabletID*, unsigned long) /mnt/disk2/xujianxu/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/allocator.h:210:25
    #3 0x558f6e6ce9e7 in std::allocator_traits<std::allocator<doris::PTabletID>>::deallocate(std::allocator<doris::PTabletID>&, doris::PTabletID*, unsigned long) /mnt/disk2/xujianxu/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/alloc_traits.h:516:13
    #4 0x558f6e6ce9e7 in std::_Vector_base<doris::PTabletID, std::allocator<doris::PTabletID>>::_M_deallocate(doris::PTabletID*, unsigned long) /mnt/disk2/xujianxu/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/stl_vector.h:387:4
    #5 0x558f6e6d0780 in void std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>::_M_range_insert<__gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>>(__gnu_cxx::__normal_iterator<doris::PTabletID*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>, __gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>, __gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>, std::forward_iterator_tag) /mnt/disk2/xujianxu/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/vector.tcc:832:3
    #6 0x558f6e6c54c5 in __gnu_cxx::__normal_iterator<doris::PTabletID*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>> std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>::insert<__gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>, void>(__gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>, __gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>, __gnu_cxx::__normal_iterator<doris::PTabletID const*, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>>>) /mnt/disk2/xujianxu/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/stl_vector.h:1483:4
    #7 0x558f9b4b214f in doris::LoadStreamMap::save_tablets_to_commit(long, std::vector<doris::PTabletID, std::allocator<doris::PTabletID>> const&) /mnt/disk2/xujianxu/doris/be/src/vec/sink/load_stream_map_pool.cpp:90:13
    #8 0x558f9b7258dd in doris::vectorized::VTabletWriterV2::_calc_tablets_to_commit() /mnt/disk2/xujianxu/doris/be/src/vec/sink/writer/vtablet_writer_v2.cpp:650:27
    #9 0x558f9b7229f1 in doris::vectorized::VTabletWriterV2::close(doris::Status) /mnt/disk2/xujianxu/doris/be/src/vec/sink/writer/vtablet_writer_v2.cpp:547:9
```

Multiple sinks for different table loads share the same load id, which causes confusion in the shared data structures used across the sinks.
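The trace above shows one fragment appending into a vector inside `LoadStreamMap::save_tablets_to_commit()` while the backing storage had already been freed by another fragment thread. The following is a minimal, self-contained sketch of that failure mode and one possible mitigation (serializing the append with a mutex); the class and function names are hypothetical, and this is not the actual Doris code nor necessarily the fix taken in this PR.

```cpp
// Sketch only: two sinks that resolve the same shared state via one load id
// can race in insert(); a reallocation in one thread frees the old storage
// while the other thread is still writing, which ASan reports as a
// heap-use-after-free. Guarding the shared vector is one way to make the
// concurrent appends safe.
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

struct TabletID { int64_t partition_id; int64_t tablet_id; };

class SharedCommitState {
public:
    // Append tablets for one sink; without the lock, concurrent calls from
    // sinks that share the same load id can reallocate the vector under
    // each other and touch freed memory.
    void save_tablets_to_commit(int64_t node_id, const std::vector<TabletID>& tablets) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto& dst = _tablets_to_commit[node_id];
        dst.insert(dst.end(), tablets.begin(), tablets.end());
    }

private:
    std::mutex _mutex;
    std::map<int64_t, std::vector<TabletID>> _tablets_to_commit;
};

int main() {
    SharedCommitState state; // shared because both sinks resolved the same load id
    std::thread sink_a([&] { state.save_tablets_to_commit(1, {{10, 100}, {10, 101}}); });
    std::thread sink_b([&] { state.save_tablets_to_commit(1, {{20, 200}}); });
    sink_a.join();
    sink_b.join();
    return 0;
}
```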
We hit OOM when using single-stream multi-table load. There is a memory leak; the heap profile (attached in the original PR) shows that the stream load context is not released under some exception conditions, e.g. when the plan fails because obtaining the read lock times out under high concurrency. This was introduced by #35458. The effect of the fix, shown in the figure attached to the PR, is that the load now runs stably with a small amount of memory.
…#38824) pick (#38255) We hit OOM when using single-stream multi-table load. There is a memory leak; the heap profile (attached in the original PR) shows that the stream load context is not released under some exception conditions, e.g. when the plan fails because obtaining the read lock times out under high concurrency. This was introduced by #35458. The effect of the fix, shown in the figure attached to the PR, is that the load now runs stably with a small amount of memory.
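As a hedged illustration of the leak described above (not the actual Doris code, and not necessarily how the follow-up PRs fixed it), the sketch below shows how a context that is registered before planning and released only on the success path leaks when planning fails, and how a scope guard releases it on every exit path. All names here (`StreamLoadContext`, `ContextRegistry`, `ScopeGuard`, `plan_multi_table_load`) are hypothetical.

```cpp
#include <functional>
#include <memory>
#include <utility>

struct StreamLoadContext {};

class ContextRegistry {
public:
    void put(const std::shared_ptr<StreamLoadContext>& ctx) { (void)ctx; /* register ctx */ }
    void erase(const std::shared_ptr<StreamLoadContext>& ctx) { (void)ctx; /* release ctx */ }
};

// Minimal scope guard: runs the stored callback when it goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {}
    ~ScopeGuard() { if (_fn) _fn(); }
private:
    std::function<void()> _fn;
};

bool plan_multi_table_load() { return false; } // pretend planning timed out on the read lock

int main() {
    ContextRegistry registry;
    auto ctx = std::make_shared<StreamLoadContext>();
    registry.put(ctx);
    // Runs on every exit path, so the context is released even when planning fails.
    ScopeGuard release_ctx([&] { registry.erase(ctx); });
    if (!plan_multi_table_load()) {
        return 1; // early error return; the guard still releases the context
    }
    return 0;
}
```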
Proposed changes
Move-memtable coredump when using multi-table load:
Multiple sinks for different table loads share the same load id, which causes confusion in the shared data structures used across the sinks.
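Purely as an illustrative sketch of one way to avoid that sharing (hypothetical names, not necessarily the approach taken in this PR): key the per-sink state by both the load id and the table id, so sinks of a multi-table load that reuse one load id resolve to distinct entries instead of one shared object.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct SinkState { std::vector<int64_t> tablets_to_commit; };

// Composite key: the same load id with different table ids maps to
// different state objects, so sinks no longer share one vector.
using SinkKey = std::pair<int64_t /*load_id*/, int64_t /*table_id*/>;

class SinkStatePool {
public:
    std::shared_ptr<SinkState> get_or_create(int64_t load_id, int64_t table_id) {
        auto& slot = _states[{load_id, table_id}];
        if (!slot) slot = std::make_shared<SinkState>();
        return slot;
    }
private:
    std::map<SinkKey, std::shared_ptr<SinkState>> _states;
};

int main() {
    SinkStatePool pool;
    auto a = pool.get_or_create(/*load_id=*/42, /*table_id=*/1);
    auto b = pool.get_or_create(/*load_id=*/42, /*table_id=*/2);
    // Distinct objects: appending in one sink cannot free memory the other is using.
    a->tablets_to_commit.push_back(100);
    b->tablets_to_commit.push_back(200);
    return (a.get() != b.get()) ? 0 : 1; // 0: each sink has its own state
}
```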
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...