Why does Spark GPU Plugin only support 1 gpu per executor? #5367
-
Hello @abellina
Replies: 8 comments
-
The RAPIDS Accelerator only supports a single GPU per executor because that was a limitation of RAPIDS cuDF, which is the foundation of the Accelerator. I think libcudf may support multiple GPUs, but if so it's been a very recent addition.

It would be interesting to know why you want to run multiple GPUs per executor. Typically it's not recommended to run too many concurrent tasks per Spark executor, so we've actually received requests for the opposite of this, i.e. multiple executors per GPU.

Also note that we have no plans for a single task leveraging multiple GPUs with the Accelerator, because libcudf does not support this and has no plans to support it in the future.

If you have a use case that requires this feature it would be good to document it here; otherwise I would recommend running multiple executors to leverage multiple GPUs, just as many effective Spark installations run multiple executors per node rather than one large executor per node.
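As a rough sketch of the recommended layout, assuming a node with 8 GPUs and the RAPIDS Accelerator plugin on the classpath, a configuration like the following runs one executor per GPU rather than one large executor per node (the property names are real Spark/RAPIDS settings; the values are illustrative, not tuned recommendations):

```properties
# Hypothetical spark-defaults.conf fragment: 8 executors on an 8-GPU node,
# each executor pinned to exactly one GPU. Values are illustrative.
spark.plugins=com.nvidia.spark.SQLPlugin
spark.executor.instances=8
spark.executor.cores=4
spark.executor.resource.gpu.amount=1
# Fractional task amount lets several tasks share the executor's one GPU.
spark.task.resource.gpu.amount=0.25
spark.rapids.sql.concurrentGpuTasks=2
```

Depending on the cluster manager you may also need a GPU discovery script (`spark.executor.resource.gpu.discoveryScript`) so Spark can assign a distinct GPU to each executor.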
-
@jlowe we really should add this to the FAQ. We have gotten this question, or variants of it, multiple times now.
-
@jlowe
-
Excellent point, I'll post a PR later today.
We're not planning on it, at least for now. Apache Spark does not allow scheduling a fraction of a GPU to an executor.
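To illustrate the constraint: the executor-level GPU amount must be a whole number, while the task-level amount may be fractional so that multiple tasks within one executor share its GPU. A sketch (values are illustrative):

```properties
# Valid: each executor owns a whole GPU; four tasks share it.
spark.executor.resource.gpu.amount=1
spark.task.resource.gpu.amount=0.25

# Not supported: Spark rejects a fractional GPU amount at the executor
# level, so two executors cannot each claim half of the same GPU.
# spark.executor.resource.gpu.amount=0.5
```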
-
OK, thank you very much for your helpful reply. Have a nice day.
-
Hello @jlowe
-
I suggest checking out this section of the FAQ, which touches on why the GPU operates on columnar data vs. the row-at-a-time processing done by the CPU.
Yes, it could. However, Apache Spark currently only uses columnar data when fetching data from columnar formats like Parquet and ORC. After the columnar data is fetched it is transformed back into row format, because the rest of Spark processes rows. Apache Spark could support columnar data through the entire query on the CPU, as is done via the RAPIDS Accelerator when running on the GPU, but this would be a significant change to the code base. The closest thing I've seen to Apache Spark processing columnar data throughout on the CPU is the announcement of the Photon engine by Databricks, but there aren't a lot of details.
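To make the row vs. columnar distinction concrete, here is a minimal plain-Python sketch (the data and function names are made up for illustration). A row engine walks heterogeneous records and plucks out one field at a time; a columnar engine scans a single homogeneous array, which is the access pattern GPUs (and SIMD-capable CPUs) exploit:

```python
# Row layout: one tuple per record, fields of different types interleaved.
rows = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)]

# Columnar layout: one homogeneous array per field.
columns = {
    "id":    [1, 2, 3],
    "name":  ["a", "b", "c"],
    "price": [10.0, 20.0, 30.0],
}

def sum_price_rows(rows):
    # Row-at-a-time: touch every record, extract the field each time.
    return sum(r[2] for r in rows)

def sum_price_columnar(columns):
    # Columnar: scan one contiguous array of a single type.
    return sum(columns["price"])

print(sum_price_rows(rows))       # 60.0
print(sum_price_columnar(columns))  # 60.0
```

Both produce the same answer; the difference is purely in memory layout and access pattern, which is what makes columnar processing a good fit for the GPU.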
-
Thank you for your reply!