-
Hey! I've been reading your guide about tuning Spark-RAPIDS. I am currently getting errors running a job due to what I assume is poor tuning. The error is:
I am running on one node with a Tesla P100 (16 GB), 117 GB of RAM, and 14 CPUs. These are the parameters I used, with an explanation of how I chose each one:
I am still somewhat confused about which configurations I should try so the system does not run out of memory, or better yet, gets the best possible runtime. Can anyone clarify how I should be tuning in this situation? In particular, I am having trouble seeing how these settings interact. For context, my input is about 650 GB of strings and my output is about 300 GB of two-column tables. Thanks!
Replies: 2 comments
-
We try to explain this in this section: https://nvidia.github.io/spark-rapids/docs/tuning-guide.html#number-of-tasks-per-executor. It sounds like we need to clarify it more.
This (spark.task.resource.gpu.amount) controls how many tasks can run per executor on the CPU side. Our plugin only supports 1 GPU per executor, so this is the reciprocal of how many tasks you want per executor, which is controlled by the spark.executor.cores configuration. So in your case you have 5 cores per executor, which means you can have 5 tasks per executor if spark.task.resource.gpu.amount is set to 1/5. This is purely how Spark does its scheduling, even without the RAPIDS plugin enabled. If you change either the executor cores or spark.task.resource.gpu.amount, it can affect the number of tasks per executor allowed to run from Spark's scheduling point of view.
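As a concrete sketch of that relationship (the values here are illustrative for the 5-cores-per-executor case discussed above, not a recommendation for your specific node):

```shell
# Illustrative only: 5 CPU cores per executor, so each task claims 1/5 (0.2)
# of the single GPU. Spark's scheduler will then allow up to 5 concurrent
# tasks per executor: executor.cores / task gpu amount = 5 / (5 * 0.2) = 5.
spark-submit \
  --conf spark.executor.cores=5 \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=0.2 \
  ...
```

If you instead set spark.task.resource.gpu.amount=1, only one task could be scheduled on the executor at a time, even with 5 cores available.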
spark.rapids.sql.concurrentGpuTasks is purely a RAPIDS plugin configuration that controls how many of those tasks can run GPU operations at a time. If you are running out of memory, you either need to decrease the size of data each task processes at once (https://nvidia.github.io/spark-rapids/docs/tuning-guide.html#columnar-batch-size) or decrease spark.rapids.sql.concurrentGpuTasks. You may have seen this in the third paragraph of https://nvidia.github.io/spark-rapids/docs/tuning-guide.html#number-of-concurrent-tasks-per-gpu. Can you tell us what type of operation is happening when it runs out of memory?
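As a hedged starting point for a 16 GB GPU that is hitting out-of-memory errors (the exact values are illustrative and should be tuned against your own job, per the tuning guide linked above):

```shell
# Illustrative OOM mitigation: fewer tasks touching the GPU at once,
# and smaller columnar batches per task.
spark-submit \
  --conf spark.rapids.sql.concurrentGpuTasks=1 \
  --conf spark.rapids.sql.batchSizeBytes=536870912 \
  ...
```

Lowering concurrentGpuTasks trades some GPU utilization for headroom; shrinking the batch size reduces the peak memory each task needs, which matters with large string columns like yours.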
-
Please reopen the issue if you still have any questions.