
Scala Map class not found when executing the benchmark on Spark 3.5.0 with Scala 2.13 #9714

Closed
alexvk opened this issue Nov 14, 2023 · 2 comments
Labels: bug

alexvk commented Nov 14, 2023

Describe the bug
Downloaded the latest spark-3.5.0-bin-hadoop3-scala2.13.tgz and built rapids-4-spark for Scala 2.13 according to https://github.com/NVIDIA/spark-rapids/tree/branch-23.12/scala2.13 (./build/buildall --profile=350 --scala213). Deployed rapids-4-spark_2.13-23.12.0-SNAPSHOT-cuda11.jar.

Steps/Code to reproduce bug
Running one of the benchmarks (./spark-submit-template power_run_gpu.template nds_power.py parquet_sf3k ./query_streams/query_0.sql time_gpu.csv) from https://github.com/NVIDIA/spark-rapids-benchmarks/tree/dev/nds
and getting a java.lang.NoSuchMethodError:

====== Run query96 ======
Traceback (most recent call last):
File "/home/volzok/Src/spark-rapids-benchmarks/nds/nds_power.py", line 376, in
run_query_stream(args.input_prefix,
File "/home/volzok/Src/spark-rapids-benchmarks/nds/nds_power.py", line 257, in run_query_stream
summary = q_report.report_on(run_one_query,spark_session,
File "/home/volzok/Src/spark-rapids-benchmarks/nds/PysparkBenchReport.py", line 79, in report_on
listener.register()
File "/home/volzok/Src/spark-rapids-benchmarks/nds/python_listener/PythonListener.py", line 27, in register
self.uuid = manager.register(self)
File "/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in call
File "/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 179, in deco
File "/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:com.nvidia.spark.rapids.listener.Manager.register.
: java.lang.NoSuchMethodError: scala.collection.immutable.Map$.apply(Lscala/collection/Seq;)Lscala/collection/GenMap;
at com.nvidia.spark.rapids.listener.Manager$.<init>(Manager.scala:8)
at com.nvidia.spark.rapids.listener.Manager$.<clinit>(Manager.scala)
at com.nvidia.spark.rapids.listener.Manager.register(Manager.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
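
For context, the descriptor in the error pins the problem down: scala.collection.GenMap exists only in the Scala 2.12 collection hierarchy, so any class compiled against 2.12 that constructs a Map literal fails to link on a 2.13 runtime. A minimal sketch of the failing pattern (the real Manager.scala source is not shown in this issue, so the body below is an assumption):

// Hypothetical reconstruction of Manager.scala. Under Scala 2.12 the Map()
// call compiles to a call site with the descriptor
//   scala.collection.immutable.Map$.apply(Lscala/collection/Seq;)Lscala/collection/GenMap;
// Scala 2.13 dropped the Gen* traits, so no method with that descriptor
// exists and the JVM throws NoSuchMethodError while initializing the object.
object Manager {
  private var listeners: Map[String, AnyRef] = Map() // the offending call site

  def register(listener: AnyRef): String = {
    val uuid = java.util.UUID.randomUUID().toString
    listeners = listeners + (uuid -> listener)
    uuid
  }
}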

Expected behavior
Class found, normal termination, and the same result as for a CPU run.

Environment details (please complete the following information)
Spark local execution; here is power_run_gpu.template:

source base.template
export CONCURRENT_GPU_TASKS=${CONCURRENT_GPU_TASKS:-2}
export SHUFFLE_PARTITIONS=${SHUFFLE_PARTITIONS:-200}
export SCALA_HOME=/home/volzok/.sdkman/candidates/scala/current
export SCALA_LIBRARY_PATH=$SCALA_HOME/lib
CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-library.jar"
CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-compiler.jar"
CLASSPATH+=":$SCALA_LIBRARY_PATH/jline.jar"
export CLASSPATH

export SPARK_CONF=("--master" "${SPARK_MASTER}"
"--deploy-mode" "client"
"--conf" "spark.driver.maxResultSize=2GB"
"--conf" "spark.driver.memory=${DRIVER_MEMORY}"
"--conf" "spark.executor.cores=${EXECUTOR_CORES}"
"--conf" "spark.executor.instances=${NUM_EXECUTORS}"
"--conf" "spark.executor.memory=${EXECUTOR_MEMORY}"
"--conf" "spark.sql.shuffle.partitions=${SHUFFLE_PARTITIONS}"
"--conf" "spark.sql.files.maxPartitionBytes=2gb"
"--conf" "spark.sql.adaptive.enabled=true"
"--conf" "spark.executor.resource.gpu.amount=1"
"--conf" "spark.executor.resource.gpu.discoveryScript=./getGpusResources.sh"
"--conf" "spark.plugins=com.nvidia.spark.SQLPlugin"
"--conf" "spark.rapids.memory.host.spillStorageSize=32G"
"--conf" "spark.rapids.memory.pinnedPool.size=8g"
"--conf" "spark.rapids.sql.concurrentGpuTasks=${CONCURRENT_GPU_TASKS}"
"--conf" "spark.sql.legacy.charVarcharAsString=true"
"--files" "$SPARK_HOME/examples/src/main/scripts/getGpusResources.sh"
"--jars" "$SPARK_RAPIDS_PLUGIN_JAR,$NDS_LISTENER_JAR,$SCALA_LIBRARY_PATH/jline-3.21.0.jar,$SCALA_LIBRARY_PATH/scala-compiler.jar,$SCALA_LIBRARY_PATH/scala-library.jar”)

$ env|fgrep SPARK
SPARK_RAPIDS_PLUGIN_JAR=/home/volzok/bin/rapids-4-spark_2.13-23.12.0-SNAPSHOT-cuda11.jar
SPARK_HOME=/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13
SPARK_MASTER=local
$ env|fgrep JAVA
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

Additional context
Running on an A100 DGX Station with CUDA 12.0.

alexvk added the "? - Needs Triage" and "bug" labels Nov 14, 2023
tgravescs (Collaborator) commented

Definitely looks like a Scala version mismatch issue somewhere.

CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-library.jar"
CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-compiler.jar"
CLASSPATH+=":$SCALA_LIBRARY_PATH/jline.jar"
export CLASSPATH

Was there a reason you added this? That shouldn't be needed.

Note you are running in local mode, so you should not be specifying GPU resources ("--conf" "spark.executor.resource.gpu.amount=1"). See https://docs.nvidia.com/spark-rapids/user-guide/latest/getting-started/on-premise.html#local-mode.

Also, please remove all other jars (NDS_LISTENER_JAR and anything other than SPARK_RAPIDS_PLUGIN_JAR) from --jars just to see if that gets rid of the error. Just try starting spark-shell and running something basic. Is your NDS listener jar built for Scala 2.13?
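
A quick way to double-check that last point (a sketch, assuming the stock spark-3.5.0-bin-hadoop3-scala2.13 distribution): start spark-shell with only SPARK_RAPIDS_PLUGIN_JAR on --jars and print the Scala version the REPL is actually running, so any _2.12 artifact on the classpath can be ruled in or out:

// In spark-shell; the versions in the comments are what the stock 2.13
// distribution is expected to report, not output captured from this issue.
scala.util.Properties.versionNumberString  // expected "2.13.8"
spark.version                              // expected "3.5.0"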

mattahrens removed the "? - Needs Triage" label Nov 22, 2023
alexvk (Author) commented Nov 28, 2023

Yes, the problem was nds-benchmark-listener-1.0-SNAPSHOT.jar: it has an explicit Scala 2.12 dependency. Removing it from the classpath fixes the problem.
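
For anyone hitting the same wall, the longer-term fix is to rebuild the listener against Scala 2.13 rather than dropping it. A sketch of what that looks like in sbt (the actual listener build tool and the version numbers here are assumptions, shown for illustration):

// build.sbt (illustrative): pin or cross-build the Scala binary version so
// the listener links against the same collections API as the Spark distro.
ThisBuild / scalaVersion       := "2.13.8"
ThisBuild / crossScalaVersions := Seq("2.12.18", "2.13.8")

libraryDependencies ++= Seq(
  // %% appends the _2.12 / _2.13 suffix automatically, which is exactly the
  // guard rail that prevents this class of NoSuchMethodError.
  "org.apache.spark" %% "spark-sql" % "3.5.0" % Provided
)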
