Why does Spark GPU Plugin only support 1 gpu per executor? #5367
-
Hello @abellina
Replies: 8 comments
-
The RAPIDS Accelerator only supports a single GPU per executor because that was a limitation of RAPIDS cuDF, which is the foundation of the Accelerator. I think libcudf may support multiple GPUs, but if so it's been a very recent addition.

It would be interesting to know why you want to run multiple GPUs per executor. Typically it's not recommended to run too many concurrent tasks per Spark executor, so we've actually received requests for the opposite of this, i.e. multiple executors per GPU.

Also note that we have no plans for a single task leveraging multiple GPUs with the Accelerator, because libcudf does not support this and has no plans to support it in the future.

If you have a use case that requires this feature it would be good to document it here; otherwise I would recommend running multiple executors to leverage multiple GPUs, just as many effective Spark installations run multiple executors per node rather than one large executor per node.
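As a rough sketch of the recommended layout, assuming a node with 8 GPUs and the RAPIDS Accelerator plugin on the classpath, a configuration like the following runs one executor per GPU rather than one large executor per node (the property names are real Spark/RAPIDS settings; the values are illustrative, not tuned recommendations):

```properties
# Hypothetical spark-defaults.conf fragment: 8 executors on an 8-GPU node,
# each executor pinned to exactly one GPU. Values are illustrative.
spark.plugins=com.nvidia.spark.SQLPlugin
spark.executor.instances=8
spark.executor.cores=4
spark.executor.resource.gpu.amount=1
# Fractional task amount lets several tasks share the executor's one GPU.
spark.task.resource.gpu.amount=0.25
spark.rapids.sql.concurrentGpuTasks=2
```

Depending on the cluster manager you may also need a GPU discovery script (`spark.executor.resource.gpu.discoveryScript`) so Spark can assign a distinct GPU to each executor.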
-
@jlowe we really should add this to the FAQ. We have gotten this question, or variants of it, multiple times now.
-
@jlowe
-
Excellent point, I'll post a PR later today.
We're not planning on it, at least for now. Apache Spark does not allow scheduling a fraction of a GPU to an executor.
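To illustrate the constraint: the executor-level GPU amount must be a whole number, while the task-level amount may be fractional so that multiple tasks within one executor share its GPU. A sketch (values are illustrative):

```properties
# Valid: each executor owns a whole GPU; four tasks share it.
spark.executor.resource.gpu.amount=1
spark.task.resource.gpu.amount=0.25

# Not supported: Spark rejects a fractional GPU amount at the executor
# level, so two executors cannot each claim half of the same GPU.
# spark.executor.resource.gpu.amount=0.5
```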
-
OK, thank you very much for your helpful reply. Have a nice day.
-
Hello @jlowe
-
I suggest checking out this section of the FAQ, which touches on why the GPU operates on columnar data vs. the row-at-a-time processing done by the CPU.
Yes, it could. However, Apache Spark currently only uses columnar data when fetching data from columnar formats like Parquet and ORC. After the columnar data is fetched it is transformed back into row format, because the rest of Spark processes rows. Apache Spark could support columnar data through the entire query on the CPU, as is done via the RAPIDS Accelerator when running on the GPU, but this would be a significant change to the code base. The closest thing I've seen to Apache Spark processing columnar data throughout on the CPU is the announcement of the Photon engine by Databricks, but there aren't a lot of details.
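To make the row vs. columnar distinction concrete, here is a minimal plain-Python sketch (the data and function names are made up for illustration). A row engine walks heterogeneous records and plucks out one field at a time; a columnar engine scans a single homogeneous array, which is the access pattern GPUs (and SIMD-capable CPUs) exploit:

```python
# Row layout: one tuple per record, fields of different types interleaved.
rows = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)]

# Columnar layout: one homogeneous array per field.
columns = {
    "id":    [1, 2, 3],
    "name":  ["a", "b", "c"],
    "price": [10.0, 20.0, 30.0],
}

def sum_price_rows(rows):
    # Row-at-a-time: touch every record, extract the field each time.
    return sum(r[2] for r in rows)

def sum_price_columnar(columns):
    # Columnar: scan one contiguous array of a single type.
    return sum(columns["price"])

print(sum_price_rows(rows))       # 60.0
print(sum_price_columnar(columns))  # 60.0
```

Both produce the same answer; the difference is purely in memory layout and access pattern, which is what makes columnar processing a good fit for the GPU.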
-
Thank you for your reply!