[NOT A BUG] Why comet does not convert the HashAggregate expression to native in my query? #503

SemyonSinchenko · 2024-06-03T14:31:08Z

Describe the bug

I'm running a query that does the following:

Read parquet files
Generate a lot of case-when columns
Run groupBy + agg on top of those columns (sum, min, max, mean)

I have the following logical plan (I manually truncated some parts):

== Parsed Logical Plan ==
'Aggregate ['customer_id], ['customer_id, sum('DC_food-and-household_7d_flag) AS DC_food-and-household_7d_count#18, ..., ..., ... 2057 more fields]
+- Project [customer_id AS customer_id#5422, CASE WHEN (((true AND (t_minus#5L <= cast(7 as bigint))) AND (card_type#1 = DC)) AND (trx_type#2 = food-and-household)) THEN 1 ELSE 0 END AS DC_food-and-household_7d_flag#14, ..., ... 1225 more fields]
   +- Relation [customer_id#0L,card_type#1,trx_type#2,channel#3,trx_amnt#4,t_minus#5L,part_col#6] parquet

It is converted to the following Comet plan:

== Physical Plan ==
HashAggregate(keys=[customer_id#10003], functions=[sum(DC_food-and-household_7d_flag#14), ..., ... 2057 more fields])
+- Exchange hashpartitioning(customer_id#10003, 11), ENSURE_REQUIREMENTS, [plan_id=35]
   +- ColumnarToRow
      +- CometHashAggregate [DC_food-and-household_7d_flag#14, DC_food-and-household_7d_or_none#15, ..., ... 2056 more fields]
         +- CometProject [DC_food-and-household_7d_flag#14, DC_food-and-household_7d_or_none#15, ..., ... 1224 more fields], [CASE WHEN (((t_minus#5L <= 7) AND (card_type#1 = DC)) AND (trx_type#2 = food-and-household)) THEN 1 ELSE 0 END AS DC_food-and-household_7d_flag#14, ..., ... 1224 more fields]
            +- CometScan parquet [...] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:...], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<...>

Steps to reproduce

I'm running my own benchmark:

Generation of the dataset (link to github): generator --prefix test_data_tiny;
PySpark code (link to github);
Entry point (link to github)

Expected behavior

I expected to see a fully native plan, but for some reason the last HashAggregate is running in Spark. It looks to me like it is even running in Spark's interpreted mode (I guess because I request too many aggregations and the generated code exceeds the size limit for whole-stage codegen, but I'm not 100% sure).

I checked the Comet documentation, and it looks like case-when expressions and sum/min/max/mean aggregate expressions are supported. HashAggregate is supported too. Exchange should be supported as well, because I turned on Comet shuffle (--conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager, --conf spark.comet.exec.shuffle.enabled=true, --conf spark.comet.exec.shuffle.mode=native).

Why does the partial aggregation run in Comet while the final one does not, with a ColumnarToRow in between?

Additional context

I'm ready to provide any additional information or to run any debug query.

Thanks in advance!

viirya · 2024-06-03T15:48:21Z

Could you run a simple query to verify if Comet shuffle can be triggered?

viirya · 2024-06-03T15:51:23Z

Oh, could you disable spark.sql.adaptive.coalescePartitions.enabled and retry? Comet shuffle is disabled when spark.sql.adaptive.coalescePartitions.enabled is enabled. Even though you disabled AQE, this config is still enabled by default (i.e., it won't be set to false even if you disable AQE).
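For readers reproducing this, here is a minimal sketch that collects the settings quoted in this thread and renders them as `spark-submit` arguments. The helper name `to_submit_args` and the dict name are made up for illustration, and whether the last setting is still needed depends on the Comet version in use.

```python
# Settings quoted in this thread. The coalescePartitions workaround was described
# as a temporary limitation, so treat it as version-dependent (assumption).
COMET_SHUFFLE_CONF = {
    "spark.shuffle.manager": "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager",
    "spark.comet.exec.shuffle.enabled": "true",
    "spark.comet.exec.shuffle.mode": "native",
    # Stays enabled by default even when AQE itself is turned off.
    "spark.sql.adaptive.coalescePartitions.enabled": "false",
}


def to_submit_args(conf):
    """Render a dict of settings as spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(COMET_SHUFFLE_CONF)))
```

The same key/value pairs can be passed to a session builder instead of the command line; the dict form just keeps them in one place.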

SemyonSinchenko · 2024-06-03T16:33:20Z

Wow! With spark.sql.adaptive.coalescePartitions.enabled disabled, it works! May I open a PR to update the documentation? It looks like I need to update this page (source).

viirya · 2024-06-03T16:37:00Z

Yes, you can open a PR to update the documentation. It should only be a temporary limitation, though, and we are working on removing it. We can update the documentation again once the limitation is removed.

andygrove · 2024-06-04T04:08:48Z

You may also want to set spark.comet.explainFallback.enabled=true so that you can see the reasons why parts of your query are not native (this would show up as logging in the driver log).
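Alongside the fallback log, a rough way to see which operators stayed in Spark is to scan the explain output for operator names without the `Comet` prefix. This is only a heuristic sketch: the function name `non_comet_operators` is hypothetical, and the plan text is an abbreviated copy of the plan from this issue.

```python
import re


def non_comet_operators(plan_text):
    """Return operator names in a physical plan that do not start with 'Comet'."""
    ops = []
    for line in plan_text.splitlines():
        # Skip the tree-drawing prefix ("+- ", ":- ", "|", spaces) before the operator name.
        match = re.match(r"^[\s:|+\-]*([A-Za-z]\w*)", line)
        if match and not match.group(1).startswith("Comet"):
            ops.append(match.group(1))
    return ops


# Abbreviated from the physical plan in this issue; "[...]" marks elided fields.
PLAN = """\
HashAggregate(keys=[customer_id#10003], functions=[...])
+- Exchange hashpartitioning(customer_id#10003, 11), ENSURE_REQUIREMENTS, [plan_id=35]
   +- ColumnarToRow
      +- CometHashAggregate [...]
         +- CometProject [...]
            +- CometScan parquet [...]
"""

if __name__ == "__main__":
    print(non_comet_operators(PLAN))  # ['HashAggregate', 'Exchange', 'ColumnarToRow']
```

This only shows where the plan left Comet, not why; the reasons come from the fallback logging.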

I wonder if we should default this to true.

## Which issue does this PR close?

Closes #503
Closes #191

## Rationale for this change

1. Provide a way to build Comet from the source on an isolated environments with an access to github.com
2. Update documentation in part, related to compatibility of Spark AQE and Comet Shuffle

## What changes are included in this PR?

- Update tuning section about the compatibility of Shuffle and Spark AQE
- Add `release-nogit` for building on an isolated environments
- Update docs in the section about an installation process

Changes to be committed:
modified: Makefile
modified: docs/source/user-guide/installation.md
modified: docs/source/user-guide/tuning.md

## How are these changes tested?

I run both `make release` and `make release-nogit`. The first one created properties file in `common/target/classes` but the second did not. The flag `-Dmaven.gitcommitid.skip=true` is described in [this comment](git-commit-id/git-commit-id-maven-plugin#392 (comment)).

SemyonSinchenko added the bug Something isn't working label Jun 3, 2024

SemyonSinchenko mentioned this issue Jun 4, 2024

docs: changes in documentation #512

Merged

kazuyukitanimura closed this as completed in #512 Jun 5, 2024
