[VL] Add 3 configs of spill #5088

Merged
merged 4 commits into from
Apr 2, 2024

FelixYBW
Contributor

This PR adds three spill configs that are currently missing in Gluten:

  val COLUMNAR_VELOX_MAX_SPILL_RUN_ROWS =
    buildConf("spark.gluten.sql.columnar.backend.velox.MaxSpillRunRows")
      .internal()
      .doc("The maximum row size of a single spill run")
      .bytesConf(ByteUnit.BYTE)
      .createWithDefaultString("12M")

  val COLUMNAR_VELOX_MAX_SPILL_BYTES =
    buildConf("spark.gluten.sql.columnar.backend.velox.MaxSpillBytes")
      .internal()
      .doc("The maximum file size of a query")
      .bytesConf(ByteUnit.BYTE)
      .createWithDefaultString("100G")

  val COLUMNAR_VELOX_MAX_SPILL_WRITE_BUFFER_SIZE =
    buildConf("spark.gluten.sql.columnar.backend.velox.spillWriteBufferSize")
      .internal()
      .doc("The maximum write buffer size")
      .bytesConf(ByteUnit.BYTE)
      .createWithDefaultString("4M")
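
As a rough illustration of what these default strings resolve to, the sketch below mimics binary-suffix size parsing in plain Scala. The helper `parseBytes` is hypothetical and only approximates how a `bytesConf(ByteUnit.BYTE)` entry reads values such as "12M"; it is not Gluten or Spark code. One thing it makes visible: because `MaxSpillRunRows` is declared with `bytesConf`, its "12M" default resolves to 12 * 1024 * 1024, which would then presumably be treated as a row count.

```scala
object SpillDefaults {
  // Binary multipliers for size suffixes (assumption: 1K = 1024).
  private val units: Map[Char, Long] = Map(
    'K' -> (1L << 10), 'M' -> (1L << 20), 'G' -> (1L << 30), 'T' -> (1L << 40))

  // Hypothetical helper: parse "12M", "100G", or a plain number into a Long.
  def parseBytes(s: String): Long = {
    val t = s.trim.toUpperCase
    require(t.nonEmpty, "empty size string")
    units.get(t.last) match {
      case Some(mult) => Math.multiplyExact(t.init.toLong, mult) // fails loudly on overflow
      case None       => t.toLong                                // no suffix: plain bytes
    }
  }

  // Keys and defaults exactly as written in the PR description.
  val defaults: Seq[(String, String)] = Seq(
    "spark.gluten.sql.columnar.backend.velox.MaxSpillRunRows" -> "12M",
    "spark.gluten.sql.columnar.backend.velox.MaxSpillBytes" -> "100G",
    "spark.gluten.sql.columnar.backend.velox.spillWriteBufferSize" -> "4M"
  )

  def main(args: Array[String]): Unit =
    defaults.foreach { case (key, value) =>
      println(s"$key = $value (${parseBytes(value)})")
    }
}
```

Under these assumptions the three defaults come out as 12582912, 107374182400, and 4194304 respectively.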

@FelixYBW FelixYBW requested a review from zhztheplayer March 22, 2024 22:15

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title to the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Run Gluten Clickhouse CI

@FelixYBW
Contributor Author

To work around the result mismatch issue in facebookincubator/velox#9219, you can set MaxSpillRunRows and maxSpillFileSize to very large values.
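
A minimal sketch of that workaround, written as plain Scala that renders `--conf` flags for `spark-submit`. The `MaxSpillRunRows` key is the one from this PR's description (the review further down asks for a lowercase `m`, so the merged name may differ). The full key for `maxSpillFileSize` and both "1T" values are assumptions chosen only for illustration, so check them against the Gluten version in use.

```scala
object SpillWorkaround {
  private val prefix = "spark.gluten.sql.columnar.backend.velox."

  // Illustrative "very large" values (assumptions, not recommendations).
  val overrides: Seq[(String, String)] = Seq(
    (prefix + "MaxSpillRunRows") -> "1T",  // key as written in this PR
    (prefix + "maxSpillFileSize") -> "1T"  // full key assumed from the short name in the comment
  )

  // Render the overrides as spark-submit arguments: --conf key=value ...
  def toSubmitArgs(confs: Seq[(String, String)]): Seq[String] =
    confs.flatMap { case (k, v) => Seq("--conf", s"$k=$v") }

  def main(args: Array[String]): Unit =
    println(toSubmitArgs(overrides).mkString(" "))
}
```

The same key/value pairs could instead be set on a `SparkConf` before the session is created; the rendering step here just keeps the sketch free of a Spark dependency.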

Run Gluten Clickhouse CI

github-actions bot commented Apr 2, 2024

Run Gluten Clickhouse CI

Comment on lines 1274 to 1286
  val COLUMNAR_VELOX_MAX_SPILL_RUN_ROWS =
    buildConf("spark.gluten.sql.columnar.backend.velox.MaxSpillRunRows")
      .internal()
      .doc("The maximum row size of a single spill run")
      .bytesConf(ByteUnit.BYTE)
      .createWithDefaultString("12M")

  val COLUMNAR_VELOX_MAX_SPILL_BYTES =
    buildConf("spark.gluten.sql.columnar.backend.velox.MaxSpillBytes")
      .internal()
      .doc("The maximum file size of a query")
      .bytesConf(ByteUnit.BYTE)
      .createWithDefaultString("100G")
Member

nit:

MaxSpillRunRows -> maxSpillRunRows
MaxSpillBytes -> maxSpillBytes

Member

@zhztheplayer zhztheplayer left a comment

Except for a nit. Thanks!

github-actions bot commented Apr 2, 2024

Run Gluten Clickhouse CI

@FelixYBW FelixYBW merged commit 8060cea into apache:main Apr 2, 2024
32 checks passed

2 participants