Add query replayer #10897
Conversation
@duanmeng looks good % minors. Thanks!
@duanmeng thanks for the update!
@duanmeng thanks for the update % minors
velox/core/QueryConfig.h
Outdated
@@ -357,6 +357,14 @@ class QueryConfig {
  /// Empty string if only want to trace the query metadata.
  static constexpr const char* kQueryTraceNodeIds = "query_trace_node_ids";

  /// The max trace bytes limit, if it is zero, then tracing is disabled.
These changes are nice. Can they be extracted into a separate PR? Can we include these in the PR description?
Do you mean to add trace size and task limit in a follow-up PR?
> Can we include these in the PR description?
I've updated these in the PR description.
velox/core/QueryConfig.h
Outdated
@@ -689,6 +697,16 @@ class QueryConfig {
    return get<std::string>(kQueryTraceNodeIds, "");
  }

  uint64_t queryTraceMaxBytes() const {
    // The default query trace bytes, 0 by default.
This comment is redundant. It simply repeats the code on the next line. Let's remove.
Got it, fixed.
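For readers following along: the diff hunks above are clipped. A plausible completion of the new config entry, mirroring the existing `kQueryTraceNodeIds` getter pattern, is sketched below; the constant name, the getter body, and the stubbed `get<T>()` are assumptions inferred from the visible lines, not the exact code in this PR.

```cpp
// Sketch only: a plausible completion of the truncated QueryConfig diff above.
#include <cstdint>
#include <string>
#include <unordered_map>

class QueryConfigSketch {
 public:
  /// The max trace bytes limit. Tracing is disabled if zero.
  static constexpr const char* kQueryTraceMaxBytes = "query_trace_max_bytes";

  uint64_t queryTraceMaxBytes() const {
    // Returns 0 (tracing disabled) unless the config is explicitly set.
    return get<uint64_t>(kQueryTraceMaxBytes, 0);
  }

 private:
  // Minimal stand-in for QueryConfig's typed config lookup.
  template <typename T>
  T get(const char* key, T defaultValue) const {
    auto it = values_.find(key);
    return it == values_.end() ? defaultValue
                               : static_cast<T>(std::stoull(it->second));
  }

  std::unordered_map<std::string, std::string> values_;
};
```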
velox/docs/configs.rst
Outdated
* - query_trace_max_bytes
  - integer
  - 0
  - The max trace bytes limit, if it is zero, then tracing is disabled.
nit: The max trace bytes limit. Tracing is disabled if zero.
Fixed
issues. It helps prevent interference from the test noises in a production
environment (such as storage, network etc) by allowing replay of a part of the
query plan and data set in an isolated environment such as a local machine.
This is much more efficient for query performance debugging as we don't have to
This is useful for debugging query performance ..
Fixed
- When the query starts, the task records the metadata including query plan fragment,
  query configuration, and connector properties.
- During the query running, each traced operator records the input vectors and saves
running -> execution
Do we store each vector in a separate file? Or do we store all vectors in the same file?
We store all the vectors in a single file per operator per driver; for example, if the degree of parallelism of the traced operator is 3 (3 drivers), there will be 3 files.
  query configuration, and connector properties.
- During the query running, each traced operator records the input vectors and saves
  in the specified storage location.
- The metadata are serialized using json format and operator data inputs are serialized
I assume not all connectors support this. What happens if connector is not serializable?
Does this also apply to TableScan operator? I assume no, but I can't find any discussion about TableScan here.
Yes, only the Hive connector is supported at present. We plan to support TableScan by tracing only its input splits (and will update the document accordingly), and to support the other operators as well. The operator support plan is listed at #9668 (comment).
- Apply the recorded query configuration and connector properties to replay the query/task
  with the same input and configuration setup as in production.

**NOTE**: the presto serialization might lose the input vector encoding such as lazy vector
The Presto serialization doesn't always preserve vector encoding (lazy vectors are loaded, nested dictionaries are flattened). Hence, replay may differ from the original run.
Updated
The tracing framework consists of three components:

1. **Query Trace Writer**: metadata writer and the data writer.
metadata and data writer
metadata and data reader
updated
- Plan fragment of the task (also known as a plan node tree). It can be serialized
  as a JSON object, which is already supported in Velox.

**QueryDataWriter** records the input vectors from the target operator, which are
Does this apply to TableScan operator?
No, only input splits are traced for TableScan operator.
It is used as the utility to replay the input data as a source operator in the target
operator replay.

**NOTE**: `QueryDataWriter` serializes and flushes the input vectors in batches,
Can we use this tool to replay crashes? Will the tool be able to read "partial traces" produced before the crash? CC: @xiaoxmeng
Yes, I believe so. We record the plan node tree during task creation and record the input vectors in `Operator::addInput`, so we can use the partial input data to replay the crashed query.
Yeah, then we need to make sure the replay doesn't depend on the summary file.
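To illustrate the idea discussed in this thread, here is a conceptual sketch with stand-in names (not the actual Velox code): because each input batch is persisted inside `addInput` before the operator processes it, a crash mid-query still leaves a usable partial trace.

```cpp
// Conceptual sketch only, not the Velox implementation. Stand-in types show
// why partial traces survive a crash: every batch is written out before the
// wrapped operator touches it.
#include <memory>
#include <utility>
#include <vector>

struct Batch {};  // stand-in for a RowVector of input data

struct TraceDataWriter {
  std::vector<Batch> persisted;  // stand-in for the per-driver trace file
  void write(const Batch& batch) { persisted.push_back(batch); }
};

struct Operator {
  virtual ~Operator() = default;
  virtual void addInput(Batch input) = 0;
};

struct TracedOperator : Operator {
  std::shared_ptr<TraceDataWriter> writer;
  std::shared_ptr<Operator> delegate;

  void addInput(Batch input) override {
    writer->write(input);                  // persist first ...
    delegate->addInput(std::move(input));  // ... then process
  }
};
```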
.. code-block:: c++

   query_trace_tool --root $root_dir --summary --pretty
Is replay a separate tool?
I still think that showing plan as JSON is not very useful. Can we show it in a more user-friendly format?
> Is replay a separate tool?

No, I forgot to update the name; it should be `query_replayer`.
> I still think that showing plan as JSON is not very useful. Can we show it in a more user-friendly format?

The plan is recorded during task creation, so there is no stats information in it. It is a bonus that allows users to get a preliminary understanding of the traced data before replaying the query.
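(For reference, with the rename above the documented summary command would presumably read `query_replayer --root $root_dir --summary --pretty`; the flags are taken from the snippet under review.)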
Should we remove the plan JSON display and only list the traced task IDs? @mbasmanova @xiaoxmeng
It would be nice to allow for showing the plan, just not using JSON. Why can't we display it similar to printPlanWithStats, just without the stats?
Got it, we need to use a human-friendly way to show the plan instead of JSON, similar to the way used in `printPlanWithStats` although without the stats. Do I understand correctly? @mbasmanova
Yup.
Fixed by using `queryPlan->toString(true, true)`. Now the plan output is human-friendly, as follows :)
-- HashJoin[5][INNER c0=u0, filter: lt(ROW["c0"],135)] -> c0:BIGINT, c1:SMALLINT, c2:TINYINT
-- Project[1][expressions: (c0:BIGINT, ROW["c0"]), (c1:SMALLINT, ROW["c1"]), (c2:TINYINT, ROW["c2"])] -> c0:BIGINT, c1:SMALLINT, c2:TINYINT
-- Values[0][1 rows in 1 vectors] -> c0:BIGINT, c1:SMALLINT, c2:TINYINT, c3:VARCHAR
-- Project[4][expressions: (u0:BIGINT, ROW["c0"]), (u1:SMALLINT, ROW["c1"]), (u2:TINYINT, ROW["a0"])] -> u0:BIGINT, u1:SMALLINT, u2:TINYINT
-- Aggregation[3][SINGLE [c0, c1] a0 := min(ROW["c2"])] -> c0:BIGINT, c1:SMALLINT, a0:TINYINT
-- Values[2][1 rows in 1 vectors] -> c0:BIGINT, c1:SMALLINT, c2:TINYINT, c3:VARCHAR
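A minimal sketch of how such output can be produced, assuming a deserialized plan node is already in hand (the helper function name is illustrative; `toString(true, true)` is the call mentioned above):

```cpp
// Minimal sketch: render a plan node tree in the human-readable form shown
// above, without stats.
#include <iostream>
#include "velox/core/PlanNode.h"

void printTracedPlan(const facebook::velox::core::PlanNodePtr& queryPlan) {
  // toString(detailed, recursive): detailed node info, whole tree, no stats.
  std::cout << queryPlan->toString(true, true) << std::endl;
}
```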
@mbasmanova Thanks for your review, could you help take another look?
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@xiaoxmeng merged this pull request in cc46d81.
Conbench analyzed the 1 benchmark run on this commit. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
Summary: Velox can record the query metadata (query plan and configs) during task creation and the input vectors of the traced operator, see facebookincubator#10774 and facebookincubator#10815. This PR adds a query replayer that can be used to replay a query locally using the metadata and input vectors from the production environment. At present it supports showing the summary of a query; replay support for more traced operators will be added in the future. This PR also adds two query configs, `query_trace_max_bytes` and `query_trace_task_reg_exp`, to constrain the recorded input data size and the traced tasks respectively, to ensure the stability of the cluster in production. Part of facebookincubator#9668 Pull Request resolved: facebookincubator#10897 Reviewed By: tanjialiang Differential Revision: D62336733 Pulled By: xiaoxmeng fbshipit-source-id: d196738dfa92c29fe5de67a944f652a328903814
Velox can record the query metadata (query plan and configs) during task
creation and the input vectors of the traced operator, see #10774 and #10815.
This PR adds a query replayer that can be used to replay a query locally
using the metadata and input vectors from the production environment.
At present it supports showing the summary of a query; replay support for
more traced operators will be added in the future.
This PR also adds two query configs, `query_trace_max_bytes` and
`query_trace_task_reg_exp`, to constrain the recorded input data size and
the traced tasks respectively, to ensure the stability of the cluster
in production.
Part of #9668
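A hedged sketch of how the two new configs might be set when building a query: the key strings come from this PR, but the values are made-up examples and the plain map is only a stand-in for however the configs reach the query context.

```cpp
// Illustrative only: config keys are from this PR; values and the bare map
// are placeholders.
#include <cstdint>
#include <string>
#include <unordered_map>

std::unordered_map<std::string, std::string> makeTraceConfig() {
  return {
      // Cap recorded input data at 100MB; 0 (the default) disables tracing.
      {"query_trace_max_bytes", std::to_string(uint64_t{100} << 20)},
      // Only tasks whose IDs match this regex are traced (example pattern).
      {"query_trace_task_reg_exp", ".*20240906.*"},
  };
}
```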