[cherry-pick](branch-2.1) Pick "[Enhancement](group commit)Optimize be select for group commit #35558" #37830

Merged
merged 2 commits into from
Jul 24, 2024

Conversation

Yukang-Lian
Collaborator

Proposed changes

Issue Number: close #xxx

Pick #35558

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@Yukang-Lian
Collaborator Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.53% (9262/25354)
Line Coverage: 28.05% (75665/269798)
Region Coverage: 26.87% (38903/144773)
Branch Coverage: 23.60% (19747/83680)
Coverage Report: http://coverage.selectdb-in.cc/coverage/f63973e256e537fe8ae6da0bc09ba743428e9590_f63973e256e537fe8ae6da0bc09ba743428e9590/report/index.html

@Yukang-Lian
Collaborator Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.52% (9261/25357)
Line Coverage: 28.04% (75662/269874)
Region Coverage: 26.86% (38901/144806)
Branch Coverage: 23.60% (19755/83704)
Coverage Report: http://coverage.selectdb-in.cc/coverage/3dad8a00e862eb2d45103f4cd70173bd3a87bc7f_3dad8a00e862eb2d45103f4cd70173bd3a87bc7f/report/index.html

…e#35558)

1. Stream load and INSERT INTO requests that are batched and sent to the
master FE should use the same BE selection strategy (previously, INSERT
INTO reused the first selected BE, while stream load used round robin).
A first map, <table id, be id>, records a fixed BE for each table. The
first time a table is imported, a BE is selected at random and the table
id and be id are recorded in the map, where they stay unless the
overload rule below replaces them. All subsequent imports into this
table go to the BE recorded for its table id, so batching is
concentrated on a single BE as far as possible.
To avoid excessive load on a single BE, a variable similar to a bvar
window monitors the total data volume sent to a specific BE for a
specific table during the batch interval (default 10 seconds). A second
map, <be id, window variable>, tracks this. If the window variable of
the BE assigned to a new import is below a certain value (e.g., 1G), the
import is still sent to that BE according to map1. If the window
variable exceeds this value, the import is sent to the BE with the
smallest window variable value, and map1 is updated. If every BE exceeds
this value, the BE with the smallest value is still chosen. This
relieves excessive pressure on a single BE.

2. A stream load that is batched and sent directly to a BE is batched on
that BE, which commits the transaction at the end of the import. At this
point, the BE sends a request to the FE, which records the size of this
import and adds it to the window variable.

3. Stream load and INSERT INTO requests sent to an observer FE follow
the logic in 1 through an RPC: the observer FE passes the table id to
the master FE and obtains the selected be id.
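
The selection rule in points 1 and 2 can be sketched as follows. This is
only an illustration in Python, not the code from this PR (the Doris FE
is written in Java). The names `SlidingWindowCounter`,
`GroupCommitBackendSelector`, `select_backend`, and `record_load` are
hypothetical, the window is keyed by BE only, and the 1 GiB threshold
and 10 second interval are taken from the example values in the text.

```python
import random
import time
from collections import deque


class SlidingWindowCounter:
    """Sum of the bytes recorded during the last `interval_s` seconds.

    Hypothetical stand-in for the "bvar window"-like variable.
    """

    def __init__(self, interval_s=10.0, clock=time.monotonic):
        self._interval_s = interval_s
        self._clock = clock
        self._events = deque()  # (timestamp, nbytes), oldest first
        self._total = 0

    def add(self, nbytes):
        self._events.append((self._clock(), nbytes))
        self._total += nbytes

    def value(self):
        # Drop entries that have fallen out of the window.
        cutoff = self._clock() - self._interval_s
        while self._events and self._events[0][0] <= cutoff:
            _, nbytes = self._events.popleft()
            self._total -= nbytes
        return self._total


class GroupCommitBackendSelector:
    """Sticky table -> BE mapping with a per-BE load window (assumed design)."""

    def __init__(self, backend_ids, threshold_bytes=1 << 30,
                 interval_s=10.0, clock=time.monotonic, rng=None):
        self._threshold = threshold_bytes
        self._table_to_be = {}  # map1: table id -> be id
        self._be_window = {     # map2: be id -> window variable
            be: SlidingWindowCounter(interval_s, clock) for be in backend_ids
        }
        self._rng = rng or random.Random()

    def select_backend(self, table_id):
        be_id = self._table_to_be.get(table_id)
        if be_id is None:
            # First import of this table: pick a BE at random.
            be_id = self._rng.choice(list(self._be_window))
        elif self._be_window[be_id].value() >= self._threshold:
            # Assigned BE is overloaded: move to the least-loaded BE.
            # If every BE is over the threshold, this is still the minimum.
            be_id = min(self._be_window,
                        key=lambda be: self._be_window[be].value())
        self._table_to_be[table_id] = be_id
        return be_id

    def record_load(self, be_id, nbytes):
        # Called when an import commits, with the size of that import.
        self._be_window[be_id].add(nbytes)


selector = GroupCommitBackendSelector(backend_ids=[10001, 10002, 10003])
be = selector.select_backend(table_id=42)   # random on first use
selector.record_load(be, 512 * 1024 * 1024)  # 512 MiB committed on that BE
assert selector.select_backend(table_id=42) == be  # still below 1 GiB
```

In this sketch an observer FE (point 3) would not hold the maps itself;
it would forward the table id to the master FE and use the be id that
`select_backend` returns there.
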
@Yukang-Lian
Collaborator Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.45% (9236/25341)
Line Coverage: 27.99% (75494/269759)
Region Coverage: 26.81% (38812/144793)
Branch Coverage: 23.55% (19708/83682)
Coverage Report: http://coverage.selectdb-in.cc/coverage/92200a538731921a614a987b7f568920e915f0ba_92200a538731921a614a987b7f568920e915f0ba/report/index.html

@Yukang-Lian
Collaborator Author

run feut

@dataroaring dataroaring merged commit 792bd7c into apache:branch-2.1 Jul 24, 2024
19 of 21 checks passed
yiguolei pushed a commit that referenced this pull request Aug 23, 2024
@yiguolei yiguolei mentioned this pull request Sep 5, 2024
3 tasks