[enhancement](memtable) make memtable memusage more accurate #40912

Merged: 14 commits, Sep 19, 2024

yiguolei
Contributor

@yiguolei yiguolei commented Sep 18, 2024

Proposed changes

  1. Add a memtype to each memtable and keep a vector of weak pointers to the memtables in the memtable writer, so that memory usage can be broken down by type by traversing the vector.
  2. Use scoped memory tracking to compute the memory usage of a memtable.
  3. CHECK that the tracker's consumption is 0 after a memtable flush succeeds.
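
As a rough illustration of item 1, the sketch below shows a writer that holds `std::weak_ptr`s to its memtables and sums memory per type by walking the vector. All names here (`MemType` values, `TypedMemTableSketch`, `MemTableWriterSketch`) are hypothetical and are not taken from the Doris code base; it only shows the general pattern under those assumptions.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical memtable states; the real enum in Doris may differ.
enum class MemType { ACTIVE, WRITE_FINISHED, FLUSH };

// Hypothetical, minimal memtable: a state tag plus a byte count.
struct TypedMemTableSketch {
    MemType type = MemType::ACTIVE;
    int64_t bytes = 0;
};

class MemTableWriterSketch {
public:
    // The writer only observes memtables; ownership stays with the caller.
    void track(const std::shared_ptr<TypedMemTableSketch>& mt) { _tables.push_back(mt); }

    // Sum the bytes of all live memtables in the given state.
    // Entries whose memtable has already been destroyed are pruned.
    int64_t mem_usage(MemType type) {
        int64_t total = 0;
        auto it = _tables.begin();
        while (it != _tables.end()) {
            if (auto mt = it->lock()) {
                if (mt->type == type) total += mt->bytes;
                ++it;
            } else {
                it = _tables.erase(it);
            }
        }
        return total;
    }

private:
    std::vector<std::weak_ptr<TypedMemTableSketch>> _tables;
};
```

Using weak pointers means the writer never extends a memtable's lifetime: once the flush path drops the last `shared_ptr`, the entry simply stops contributing to any total.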

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@yiguolei
Contributor Author

run buildall

@yiguolei
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41837 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2710510b6b7fa4a02cc09514865c43a3aaeb7bd8, data reload: false

------ Round 1 ----------------------------------
q1	18000	7514	7381	7381
q2	2541	166	161	161
q3	11211	1171	1236	1171
q4	10425	754	742	742
q5	7799	3081	3096	3081
q6	240	153	152	152
q7	1029	633	608	608
q8	9595	2047	2070	2047
q9	6887	6360	6388	6360
q10	7004	2305	2295	2295
q11	428	247	247	247
q12	410	216	221	216
q13	17766	2974	2977	2974
q14	237	212	215	212
q15	583	524	526	524
q16	657	624	613	613
q17	980	824	775	775
q18	7250	6788	6765	6765
q19	1409	1041	1015	1015
q20	594	282	284	282
q21	3994	3249	3210	3210
q22	1129	1009	1006	1006
Total cold run time: 110168 ms
Total hot run time: 41837 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7335	7216	7211	7211
q2	336	230	230	230
q3	2886	2757	2780	2757
q4	1908	1737	1706	1706
q5	5394	5414	5388	5388
q6	229	145	141	141
q7	2100	1697	1727	1697
q8	3160	3331	3333	3331
q9	8345	8388	8445	8388
q10	3377	3339	3336	3336
q11	564	461	458	458
q12	795	614	583	583
q13	5977	3001	2977	2977
q14	279	258	276	258
q15	555	522	516	516
q16	693	680	680	680
q17	1760	1547	1535	1535
q18	7842	7235	7438	7235
q19	1651	1577	1489	1489
q20	2047	1786	1808	1786
q21	5426	5093	5189	5093
q22	1130	1030	1001	1001
Total cold run time: 63789 ms
Total hot run time: 57796 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.34% (9585/25669)
Line Coverage: 28.71% (79216/275870)
Region Coverage: 28.21% (41030/145466)
Branch Coverage: 24.82% (20903/84224)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2710510b6b7fa4a02cc09514865c43a3aaeb7bd8_2710510b6b7fa4a02cc09514865c43a3aaeb7bd8/report/index.html

@doris-robot

TPC-DS: Total hot run time: 194582 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2710510b6b7fa4a02cc09514865c43a3aaeb7bd8, data reload: false

query1	942	385	381	381
query2	6537	2055	2037	2037
query3	6706	213	223	213
query4	34387	23373	23518	23373
query5	4347	476	467	467
query6	275	172	168	168
query7	4619	301	304	301
query8	320	225	223	223
query9	9538	2625	2608	2608
query10	480	281	290	281
query11	18116	15204	15281	15204
query12	151	100	96	96
query13	1630	411	398	398
query14	10510	7351	7432	7351
query15	328	172	176	172
query16	8060	447	454	447
query17	1722	567	545	545
query18	2132	292	300	292
query19	349	139	164	139
query20	116	104	105	104
query21	214	101	101	101
query22	4630	4467	4157	4157
query23	34731	33847	34237	33847
query24	11270	2855	2950	2855
query25	623	373	418	373
query26	1199	152	160	152
query27	2587	281	280	280
query28	7936	2451	2427	2427
query29	836	404	410	404
query30	331	164	153	153
query31	994	765	795	765
query32	88	62	53	53
query33	770	297	284	284
query34	970	491	512	491
query35	846	738	700	700
query36	1086	931	920	920
query37	161	87	92	87
query38	4001	4028	3919	3919
query39	1445	1458	1407	1407
query40	208	95	93	93
query41	48	48	46	46
query42	120	97	98	97
query43	514	483	487	483
query44	1248	809	774	774
query45	193	166	161	161
query46	1137	753	746	746
query47	1925	1787	1870	1787
query48	441	356	362	356
query49	1128	402	386	386
query50	819	401	400	400
query51	6886	7013	6977	6977
query52	96	86	86	86
query53	256	180	176	176
query54	1091	461	456	456
query55	77	78	73	73
query56	281	257	253	253
query57	1199	1054	1063	1054
query58	235	263	230	230
query59	3089	2905	2872	2872
query60	291	256	282	256
query61	101	102	133	102
query62	864	651	694	651
query63	217	186	184	184
query64	4074	636	646	636
query65	3279	3175	3171	3171
query66	850	299	308	299
query67	16251	15688	15814	15688
query68	3102	675	558	558
query69	449	299	287	287
query70	1097	1137	1126	1126
query71	328	266	267	266
query72	6040	4129	4239	4129
query73	750	323	331	323
query74	9386	9010	8938	8938
query75	3372	2687	2654	2654
query76	2111	888	929	888
query77	412	296	309	296
query78	10027	9376	9246	9246
query79	950	883	876	876
query80	689	594	598	594
query81	511	257	254	254
query82	248	237	237	237
query83	172	168	172	168
query84	247	113	109	109
query85	811	434	413	413
query86	309	320	319	319
query87	4392	4441	4401	4401
query88	5122	4081	4073	4073
query89	368	366	364	364
query90	1882	326	327	326
query91	177	205	182	182
query92	83	76	77	76
query93	914	904	903	903
query94	789	374	481	374
query95	449	457	417	417
query96	490	481	481	481
query97	3126	3205	3131	3131
query98	237	231	230	230
query99	1440	1321	1275	1275
Total cold run time: 293381 ms
Total hot run time: 194582 ms

@doris-robot

ClickBench: Total hot run time: 32.78 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2710510b6b7fa4a02cc09514865c43a3aaeb7bd8, data reload: false

query1	0.05	0.05	0.05
query2	0.06	0.02	0.02
query3	0.22	0.06	0.06
query4	1.65	0.10	0.09
query5	0.53	0.52	0.49
query6	1.13	0.73	0.73
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.56	0.52	0.50
query10	0.57	0.57	0.54
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.58	0.59
query14	3.12	3.12	2.96
query15	0.89	0.82	0.82
query16	0.38	0.40	0.39
query17	1.03	1.03	1.01
query18	0.21	0.20	0.20
query19	1.93	1.81	1.97
query20	0.00	0.01	0.01
query21	15.35	0.60	0.58
query22	2.31	2.95	2.36
query23	17.38	0.88	0.91
query24	2.93	0.32	1.77
query25	0.16	0.21	0.14
query26	0.30	0.15	0.13
query27	0.04	0.04	0.04
query28	10.91	1.08	1.06
query29	12.56	3.25	3.26
query30	0.26	0.06	0.06
query31	2.87	0.38	0.37
query32	3.29	0.46	0.46
query33	3.00	3.02	3.03
query34	16.81	4.41	4.37
query35	4.46	4.43	4.43
query36	0.68	0.48	0.50
query37	0.09	0.06	0.06
query38	0.05	0.03	0.04
query39	0.03	0.02	0.02
query40	0.16	0.12	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 107.09 s
Total hot run time: 32.78 s

@yiguolei
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.32% (9582/25674)
Line Coverage: 28.71% (79200/275862)
Region Coverage: 28.20% (41027/145506)
Branch Coverage: 24.81% (20906/84248)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ae5d261b7c02ede3b86e5b471447875e5d393075_ae5d261b7c02ede3b86e5b471447875e5d393075/report/index.html

@@ -217,7 +215,6 @@ Status MemTable::insert(const vectorized::Block* input_block,
auto input_size = size_t(input_block->bytes() * num_rows / input_block->rows() *
config::memtable_insert_memory_ratio);
_mem_usage += input_size;

We can now remove `_mem_usage` and `config::memtable_insert_memory_ratio`.
Maybe remove `g_memtable_input_block_allocated_size` as well.
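
For context, the estimate that the hunk above adds to `_mem_usage` can be restated as a standalone function. This is only a sketch: `estimate_input_size` and its parameter names are hypothetical, and the ratio used in the usage note is an arbitrary value, not the configured default.

```cpp
#include <cstddef>

// Hypothetical restatement of the expression in the hunk:
//   size_t(bytes * num_rows / rows * ratio)
// The division is integer division, as in the original expression.
// Assumes block_rows > 0.
inline std::size_t estimate_input_size(std::size_t block_bytes, std::size_t block_rows,
                                       std::size_t num_rows, double ratio) {
    return static_cast<std::size_t>(block_bytes * num_rows / block_rows * ratio);
}
```

For example, inserting 10 of the 100 rows of a 1000-byte block with a ratio of 1.5 yields an estimate of 150 bytes. Because this is a proportional guess rather than a measurement, it becomes redundant once the memtable's allocations are charged to a tracker directly.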

@doris-robot

TPC-H: Total hot run time: 41650 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ae5d261b7c02ede3b86e5b471447875e5d393075, data reload: false

------ Round 1 ----------------------------------
q1	18236	7533	7345	7345
q2	2698	171	170	170
q3	11885	1166	1215	1166
q4	11200	770	768	768
q5	9314	3119	3087	3087
q6	234	148	147	147
q7	1002	618	608	608
q8	9412	2041	2058	2041
q9	6925	6450	6433	6433
q10	7014	2247	2290	2247
q11	436	249	238	238
q12	412	219	221	219
q13	17780	2975	2991	2975
q14	239	204	212	204
q15	582	526	505	505
q16	663	597	620	597
q17	985	788	812	788
q18	7280	6597	6767	6597
q19	1403	1084	1027	1027
q20	595	303	287	287
q21	4020	3289	3193	3193
q22	1100	1008	1013	1008
Total cold run time: 113415 ms
Total hot run time: 41650 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7258	7180	7195	7180
q2	325	239	227	227
q3	2905	2762	2784	2762
q4	1934	1637	1730	1637
q5	5399	5405	5404	5404
q6	226	141	143	141
q7	2105	1702	1716	1702
q8	3195	3326	3330	3326
q9	8449	8441	8422	8422
q10	3381	3329	3342	3329
q11	584	478	470	470
q12	760	570	594	570
q13	5617	3002	2992	2992
q14	308	261	268	261
q15	563	513	515	513
q16	706	678	666	666
q17	1769	1585	1528	1528
q18	7570	7388	7316	7316
q19	1683	1463	1556	1463
q20	2045	1829	1862	1829
q21	5173	5123	5087	5087
q22	1107	993	1010	993
Total cold run time: 63062 ms
Total hot run time: 57818 ms

@doris-robot

TPC-DS: Total hot run time: 194289 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ae5d261b7c02ede3b86e5b471447875e5d393075, data reload: false

query1	943	385	368	368
query2	6546	2094	2027	2027
query3	6694	214	220	214
query4	34213	23268	23627	23268
query5	4404	470	484	470
query6	278	173	167	167
query7	4609	298	301	298
query8	281	222	223	222
query9	9694	2694	2674	2674
query10	496	284	288	284
query11	18429	15247	15229	15229
query12	152	101	99	99
query13	1627	406	398	398
query14	10909	7018	7422	7018
query15	289	173	172	172
query16	7929	460	453	453
query17	1799	568	564	564
query18	2115	304	299	299
query19	350	145	149	145
query20	117	108	110	108
query21	219	112	105	105
query22	4755	4302	4038	4038
query23	34793	34384	34146	34146
query24	11242	2897	2861	2861
query25	653	403	423	403
query26	1268	160	159	159
query27	2808	281	296	281
query28	8303	2492	2464	2464
query29	853	426	422	422
query30	316	154	157	154
query31	1000	849	791	791
query32	91	52	55	52
query33	774	278	289	278
query34	996	480	489	480
query35	848	738	718	718
query36	1092	940	953	940
query37	154	84	88	84
query38	3938	3865	3893	3865
query39	1487	1431	1401	1401
query40	279	95	93	93
query41	49	46	46	46
query42	111	96	95	95
query43	512	485	476	476
query44	1267	807	781	781
query45	193	165	165	165
query46	1147	750	762	750
query47	1956	1826	1834	1826
query48	458	378	352	352
query49	1124	404	411	404
query50	834	396	396	396
query51	7077	6938	6811	6811
query52	97	86	85	85
query53	249	188	177	177
query54	1236	455	465	455
query55	79	76	79	76
query56	278	265	259	259
query57	1225	1083	1107	1083
query58	243	241	248	241
query59	3301	3166	3097	3097
query60	296	260	264	260
query61	102	99	104	99
query62	849	672	664	664
query63	226	188	183	183
query64	5236	648	627	627
query65	3260	3170	3175	3170
query66	1424	297	294	294
query67	16012	15750	15500	15500
query68	3079	546	567	546
query69	440	281	293	281
query70	1193	1074	1138	1074
query71	325	266	263	263
query72	5906	3930	3992	3930
query73	765	326	331	326
query74	9407	9007	9014	9007
query75	3382	2671	2640	2640
query76	2153	987	982	982
query77	448	297	285	285
query78	9927	9525	10166	9525
query79	958	885	868	868
query80	673	578	561	561
query81	501	256	259	256
query82	254	233	244	233
query83	234	166	165	165
query84	238	113	107	107
query85	743	365	370	365
query86	310	319	309	309
query87	4332	4239	4359	4239
query88	4851	4085	4037	4037
query89	373	370	362	362
query90	1973	314	313	313
query91	163	237	163	163
query92	78	71	78	71
query93	920	910	906	906
query94	739	367	382	367
query95	450	408	412	408
query96	488	494	488	488
query97	3140	3100	3159	3100
query98	218	225	220	220
query99	1427	1308	1308	1308
Total cold run time: 296672 ms
Total hot run time: 194289 ms

@doris-robot

ClickBench: Total hot run time: 31.99 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ae5d261b7c02ede3b86e5b471447875e5d393075, data reload: false

query1	0.04	0.05	0.04
query2	0.07	0.03	0.03
query3	0.22	0.07	0.06
query4	1.65	0.10	0.09
query5	0.53	0.50	0.50
query6	1.13	0.73	0.72
query7	0.01	0.01	0.01
query8	0.04	0.03	0.03
query9	0.55	0.49	0.49
query10	0.54	0.58	0.54
query11	0.15	0.10	0.10
query12	0.14	0.11	0.11
query13	0.60	0.59	0.59
query14	2.97	3.01	2.94
query15	0.88	0.83	0.82
query16	0.38	0.37	0.38
query17	1.07	1.05	0.99
query18	0.21	0.21	0.20
query19	1.96	1.92	2.01
query20	0.02	0.01	0.01
query21	15.36	0.58	0.56
query22	2.36	3.05	1.58
query23	17.29	0.87	0.70
query24	2.82	1.14	0.58
query25	0.28	0.17	0.06
query26	0.37	0.14	0.13
query27	0.04	0.04	0.03
query28	11.14	1.10	1.05
query29	12.52	3.29	3.28
query30	0.27	0.05	0.05
query31	2.88	0.38	0.38
query32	3.28	0.45	0.45
query33	2.97	3.03	3.07
query34	16.79	4.39	4.39
query35	4.40	4.38	4.44
query36	0.66	0.48	0.49
query37	0.08	0.06	0.06
query38	0.04	0.03	0.04
query39	0.04	0.02	0.02
query40	0.15	0.12	0.14
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 107.04 s
Total hot run time: 31.99 s

@yiguolei
Contributor Author

run buildall

@kaijchen kaijchen left a comment

LGTM

PR approved by anyone and no changes requested.

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.32% (9583/25676)
Line Coverage: 28.72% (79222/275852)
Region Coverage: 28.19% (41019/145517)
Branch Coverage: 24.81% (20905/84248)
Coverage Report: http://coverage.selectdb-in.cc/coverage/23d3585eba7db35d2288dbfdecac4e3586eab0e5_23d3585eba7db35d2288dbfdecac4e3586eab0e5/report/index.html

@doris-robot

TPC-H: Total hot run time: 41488 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 23d3585eba7db35d2288dbfdecac4e3586eab0e5, data reload: false

------ Round 1 ----------------------------------
q1	17609	7781	7207	7207
q2	2040	167	153	153
q3	10795	1034	1145	1034
q4	10210	799	753	753
q5	7729	3038	3039	3038
q6	236	150	147	147
q7	1005	627	600	600
q8	9442	2044	2041	2041
q9	6855	6375	6435	6375
q10	7499	2243	2298	2243
q11	442	253	253	253
q12	416	211	215	211
q13	18110	3022	3041	3022
q14	248	225	230	225
q15	583	510	543	510
q16	673	611	615	611
q17	993	813	913	813
q18	7382	6729	6740	6729
q19	1399	974	1021	974
q20	583	285	293	285
q21	3951	3252	3325	3252
q22	1106	1012	1026	1012
Total cold run time: 109306 ms
Total hot run time: 41488 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7269	7240	7254	7240
q2	330	232	239	232
q3	3179	3162	3105	3105
q4	2137	1938	1892	1892
q5	5801	5792	5883	5792
q6	241	152	150	150
q7	2360	1923	1898	1898
q8	3334	3431	3408	3408
q9	8873	9035	8707	8707
q10	3530	3425	3476	3425
q11	580	477	484	477
q12	833	603	598	598
q13	17353	3154	3205	3154
q14	295	278	270	270
q15	566	516	528	516
q16	744	669	685	669
q17	1784	1593	1587	1587
q18	8244	7872	7846	7846
q19	1738	1612	1617	1612
q20	2126	1889	1849	1849
q21	5439	5438	5409	5409
q22	1120	1098	1050	1050
Total cold run time: 77876 ms
Total hot run time: 60886 ms

@yiguolei yiguolei merged commit eda303f into apache:master Sep 19, 2024
23 of 28 checks passed
yiguolei added a commit that referenced this pull request Sep 19, 2024
## Proposed changes

1. Add a memtype to each memtable and keep a vector of weak pointers to
the memtables in the memtable writer, so that memory usage can be broken
down by type by traversing the vector.
2. Use scoped memory tracking to compute the memory usage of a memtable.
3. CHECK that the tracker's consumption is 0 after a memtable flush succeeds.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
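
Items 2 and 3 of the commit message above can be pictured with a small RAII sketch: while a scope object is alive, allocations are charged to one tracker, and after a successful flush that tracker is expected to read zero. `TrackerSketch`, `ScopedTrackerSketch`, `on_alloc` and `on_free` are hypothetical stand-ins, not the Doris tracker or allocator hooks.

```cpp
#include <cstdint>

// Hypothetical per-memtable tracker: a running byte count.
class TrackerSketch {
public:
    void consume(int64_t bytes) { _consumption += bytes; }
    void release(int64_t bytes) { _consumption -= bytes; }
    int64_t consumption() const { return _consumption; }

private:
    int64_t _consumption = 0;
};

// Tracker that the current thread charges allocations to (assumption:
// attribution is per thread).
thread_local TrackerSketch* t_current_tracker = nullptr;

// RAII guard: installs a tracker for the current scope and restores the
// previous one on exit.
class ScopedTrackerSketch {
public:
    explicit ScopedTrackerSketch(TrackerSketch* tracker) : _prev(t_current_tracker) {
        t_current_tracker = tracker;
    }
    ~ScopedTrackerSketch() { t_current_tracker = _prev; }
    ScopedTrackerSketch(const ScopedTrackerSketch&) = delete;
    ScopedTrackerSketch& operator=(const ScopedTrackerSketch&) = delete;

private:
    TrackerSketch* _prev;
};

// Stand-ins for allocator hooks that report to whichever tracker is installed.
inline void on_alloc(int64_t bytes) {
    if (t_current_tracker != nullptr) t_current_tracker->consume(bytes);
}
inline void on_free(int64_t bytes) {
    if (t_current_tracker != nullptr) t_current_tracker->release(bytes);
}
```

With this shape, a check such as `consumption() == 0` after flush only holds if every free happens inside a scope bound to the same tracker as the matching allocation, which is why the scope has to wrap both insert and release paths.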
yiguolei pushed a commit that referenced this pull request Sep 20, 2024
## Proposed changes

#40912 changed the meaning of `write_mem` in the memtable memory limiter.
This PR is a follow-up that changes the active memtable flush policy
accordingly.

It also changes:
1. The number of memtable writers selected in one flush.
2. The order in which memtable writers are selected: they are now chosen by size.
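
One way to picture a size-ordered selection is sketched below. The commit message does not say which direction the ordering takes or how the count is bounded, so largest-first ordering and a byte target are assumptions here, and `WriterSketch` and `pick_writers_to_flush` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical view of a memtable writer: an id and its active memtable size.
struct WriterSketch {
    int id;
    int64_t active_mem;
};

// Assumption: pick the largest writers first and stop once the selected
// writers cover target_bytes. Takes the vector by value so the caller's
// order is untouched.
inline std::vector<int> pick_writers_to_flush(std::vector<WriterSketch> writers,
                                              int64_t target_bytes) {
    std::sort(writers.begin(), writers.end(),
              [](const WriterSketch& a, const WriterSketch& b) {
                  return a.active_mem > b.active_mem;
              });
    std::vector<int> picked;
    int64_t covered = 0;
    for (const auto& w : writers) {
        if (covered >= target_bytes) break;
        picked.push_back(w.id);
        covered += w.active_mem;
    }
    return picked;
}
```

Under these assumptions, fewer and larger memtables are flushed per round than with an arbitrary order, which tends to avoid producing many small segments.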
yiguolei pushed a commit that referenced this pull request Sep 20, 2024
kaijchen added a commit to kaijchen/doris that referenced this pull request Sep 24, 2024
yiguolei pushed a commit that referenced this pull request Sep 25, 2024
## Proposed changes

Previously, `mem_usage = write_mem + flush_mem`, because `active_mem` was
included in `write_mem`.

After #40912, `write_mem` became `queue_mem`, which no longer includes
`active_mem`, so the old sum undercounts.
This PR fixes the problem by setting `mem_usage = active_mem +
queue_mem + flush_mem`.
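
The accounting change described above can be checked with a few lines of arithmetic. The struct and function names below are hypothetical; only the three component names and the two sums come from the commit message.

```cpp
#include <cstdint>

// Hypothetical snapshot of the three memtable memory components.
struct LimiterMemSketch {
    int64_t active_mem = 0;  // memtables still accepting writes
    int64_t queue_mem = 0;   // memtables waiting to be flushed
    int64_t flush_mem = 0;   // memtables currently being flushed
};

// Old formula applied after write_mem came to mean queue_mem only:
// active_mem is silently left out.
inline int64_t mem_usage_stale(const LimiterMemSketch& m) {
    return m.queue_mem + m.flush_mem;
}

// Formula from the follow-up fix.
inline int64_t mem_usage_fixed(const LimiterMemSketch& m) {
    return m.active_mem + m.queue_mem + m.flush_mem;
}
```

With 300 bytes active, 200 queued and 100 flushing, the stale sum reports 300 while the fixed sum reports 600; the difference is exactly `active_mem`.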
yiguolei added a commit to yiguolei/incubator-doris that referenced this pull request Oct 19, 2024
kaijchen added a commit to kaijchen/doris that referenced this pull request Oct 21, 2024
kaijchen added a commit to kaijchen/doris that referenced this pull request Oct 21, 2024
kaijchen added a commit to kaijchen/doris that referenced this pull request Nov 18, 2024
kaijchen pushed a commit to kaijchen/doris that referenced this pull request Nov 18, 2024
kaijchen added a commit to kaijchen/doris that referenced this pull request Nov 18, 2024
kaijchen added a commit to kaijchen/doris that referenced this pull request Nov 18, 2024
dataroaring pushed a commit that referenced this pull request Nov 26, 2024
…44304)

### What problem does this PR solve?

The bvar `g_memtable_input_block_allocated_size` is no longer needed after
#40912.
Calling `MutableBlock::allocated_bytes()` in `MemTable::insert()` carries a
performance penalty, so it should be removed.
github-actions bot pushed a commit that referenced this pull request Nov 26, 2024
yiguolei added a commit that referenced this pull request Jan 15, 2025
…#46997)

### What problem does this PR solve?

Related PR: #40912

Problem Summary:

Do not reset `_arena` in `MemTable::to_block()`, because it is still used in
`~MemTable()` when releasing the aggregate-state places.

This fixes the following use-after-free:

Use:

==3628099==ERROR: AddressSanitizer: heap-use-after-free on address
0x52100381be60 at pc 0x5648f30893f8 bp 0x7f8842433310 sp 0x7f8842433308
READ of size 8 at 0x52100381be60 thread T4767 (wg_flush_broker)
#0 0x5648f30893f7 in
phmap::priv::raw_hash_set<phmap::priv::FlatHashSetPolicy<unsigned long>,
phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>,
std::allocator<unsigned long>>::destroy_slots()
doris/thirdparty/installed/include/parallel_hashmap/phmap.h:1992:14
#1 0x5648f30936f6 in
phmap::priv::raw_hash_set<phmap::priv::FlatHashSetPolicy<unsigned long>,
phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>,
std::allocator<unsigned long>>::~raw_hash_set()
doris/thirdparty/installed/include/parallel_hashmap/phmap.h:1236:23
#2 0x5648f3089276 in phmap::flat_hash_set<unsigned long,
phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>,
std::allocator<unsigned long>>::~flat_hash_set()
doris/thirdparty/installed/include/parallel_hashmap/phmap.h:4577:7
#3 0x5648f308922a in doris::BitmapValue::~BitmapValue()
doris/be/src/util/bitmap_value.h:824:7
#4 0x56490d319fa6 in
doris::vectorized::AggregateFunctionBitmapData<doris::vectorized::AggregateFunctionBitmapUnionOp>::~AggregateFunctionBitmapData()
doris/be/src/vec/aggregate_functions/aggregate_function_bitmap.h:127:8
#5 0x56490d49636a in
doris::vectorized::IAggregateFunctionDataHelper<doris::vectorized::AggregateFunctionBitmapData<doris::vectorized::AggregateFunctionBitmapUnionOp>,
doris::vectorized::AggregateFunctionBitmapOp<doris::vectorized::AggregateFunctionBitmapUnionOp>>::destroy(char*)
const doris/be/src/vec/aggregate_functions/aggregate_function.h:563:92
#6 0x5648f68376e9 in doris::MemTable::~MemTable()
doris/be/src/olap/memtable.cpp:159:27
Free:

0x52100381be60 is located 352 bytes inside of 4096-byte region
[0x52100381bd00,0x52100381cd00)
freed by thread T4767 (wg_flush_broker) here:
#0 0x5648f2f3ee46 in free (doris/output/be/lib/doris_be+0x57418e46)
(BuildId: 298b9c91a1ec8fe0)
#1 0x5648f3080dfc in DefaultMemoryAllocator::free(void*)
doris/be/src/vec/common/allocator.h:108:41
#2 0x5648f3080b3f in Allocator<false, false, false,
DefaultMemoryAllocator>::free(void*, unsigned long)
doris/be/src/vec/common/allocator.h:323:13
#3 0x5648f30b6dee in doris::vectorized::Arena::Chunk::~Chunk()
doris/be/src/vec/common/arena.h:77:31
#4 0x5648f30b6d1f in doris::vectorized::Arena::~Arena()
doris/be/src/vec/common/arena.h:151:16
#5 0x5648f30b695a in
std::default_delete<doris::vectorized::Arena>::operator()(doris::vectorized::Arena*)
const
env/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/unique_ptr.h:99:2
#6 0x5648f30b67c8 in std::__uniq_ptr_impl<doris::vectorized::Arena,
std::default_delete<doris::vectorized::Arena>>::reset(doris::vectorized::Arena*)
env/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/unique_ptr.h:211:4
#7 0x5648f30b5d8c in std::unique_ptr<doris::vectorized::Arena,
std::default_delete<doris::vectorized::Arena>>::reset(doris::vectorized::Arena*)
env/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/unique_ptr.h:509:7
#8 0x5648f684253b in
doris::MemTable::_to_block(std::unique_ptr<doris::vectorized::Block,
std::default_delete<doris::vectorized::Block>>*)
doris/be/src/olap/memtable.cpp:522:12
#9 0x5648f6842ac5 in
doris::MemTable::to_block(std::unique_ptr<doris::vectorized::Block,
std::default_delete<doris::vectorized::Block>>*)
doris/be/src/olap/memtable.cpp:528:5
#10 0x5648f6907a72 in
doris::FlushToken::_do_flush_memtable(doris::MemTable*, int, long*)
doris/be/src/olap/memtable_flush_executor.cpp:144:9
#11 0x5648f690932c in
doris::FlushToken::_flush_memtable(std::shared_ptr<doris::MemTable>,
int, long) doris/be/src/olap/memtable_flush_executor.cpp:183:16
#12 0x5648f6915d18 in doris::MemtableFlushTask::run()
doris/be/src/olap/memtable_flush_executor.cpp:60:20
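
The lifetime rule behind this trace can be sketched in isolation: if the destructor still dereferences pointers into arena-owned memory, the arena must outlive `to_block()`. The classes below are hypothetical stand-ins (`ArenaSketch` hands out `int` slots instead of aggregate states) and only illustrate the ordering, not the actual fix.

```cpp
#include <memory>
#include <vector>

// Hypothetical arena: owns the storage that "agg places" point into.
struct ArenaSketch {
    std::vector<std::unique_ptr<int>> slots;
    int* alloc(int value) {
        slots.push_back(std::make_unique<int>(value));
        return slots.back().get();
    }
};

class AggMemTableSketch {
public:
    explicit AggMemTableSketch(long* destroyed_sum)
            : _arena(std::make_unique<ArenaSketch>()), _destroyed_sum(destroyed_sum) {}

    ~AggMemTableSketch() {
        // Releasing the places still reads arena-owned memory, so the arena
        // must be alive here. The destructor body runs before members are
        // destroyed, which guarantees that as long as nobody reset it earlier.
        for (int* place : _agg_places) *_destroyed_sum += *place;
    }

    void insert(int value) { _agg_places.push_back(_arena->alloc(value)); }

    long to_block() const {
        long sum = 0;
        for (const int* place : _agg_places) sum += *place;
        // Deliberately no _arena.reset() here: doing so would leave
        // _agg_places dangling for the destructor.
        return sum;
    }

private:
    std::unique_ptr<ArenaSketch> _arena;
    std::vector<int*> _agg_places;
    long* _destroyed_sum;
};
```

In the trace above, the free happens in `_to_block()` through `unique_ptr::reset` and the read happens later in `~MemTable()`; the sketch avoids that by letting the arena die only with its owner.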
github-actions bot pushed a commit that referenced this pull request Jan 15, 2025
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
Labels: approved, dev/3.0.3-merged, reviewed
5 participants