This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Unable to submit files from local systems to pyspark #603

Open
ravi-ramadoss opened this issue Jan 13, 2018 · 2 comments

Comments

@ravi-ramadoss

I am trying to test a local Spark script.
Whenever I try to submit a file from my local Mac system to the minikube cluster, I get the error below.

$SPARK_HOME/bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://192.168.99.100:8443 \
  --kubernetes-namespace spark \
  --conf spark.executor.instances=1 \
  --conf spark.executor.memory=512m \
  --conf spark.driver.memory=512m \
  --conf spark.app.name=spark-pi \
  --conf spark.executor.cores=0.2 \
  --conf spark.driver.cores=0.2 \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
  --jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
  --py-files schools.py \
  schools.py

I see the following error in the dashboard for the driver pod:

MountVolume.SetUp failed for volume "spark-init-properties" : configmaps "spark-pi-1515860438062-init-config" not found

Image: kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0
Environment variables

SPARK_DRIVER_MEMORY: 896m
SPARK_DRIVER_CLASS: org.apache.spark.deploy.PythonRunner
SPARK_DRIVER_ARGS: 
SPARK_MOUNTED_FILES_DIR: /var/spark-data/spark-files
PYSPARK_PRIMARY: /var/spark-data/spark-files/schools.py
PYSPARK_FILES: /var/spark-data/spark-files/schools.py
SPARK_DRIVER_JAVA_OPTS: -Dspark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 -Dspark.executor.memory=512m -Dspark.kubernetes.initcontainer.executor.configmapkey=download-submitted-files -Dspark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 -Dspark.app.name=spark-pi -Dspark.submit.deployMode=cluster -Dspark.executor.cores=0.2 -Dspark.kubernetes.driver.pod.name=spark-pi-1515860438062-driver -Dspark.master=k8s://https://192.168.99.100:8443 -Dspark.driver.memory=512m -Dspark.kubernetes.namespace=spark -Dspark.kubernetes.executor.podNamePrefix=spark-pi-1515860438062 -Dspark.files=/var/spark-data/spark-files/schools.py,/var/spark-data/spark-files/schools.py -Dspark.kubernetes.initcontainer.executor.configmapname=spark-pi-1515860438062-init-config -Dspark.app.id=spark-79017a9be721488e8be810480838412a -Dspark.executor.instances=1 -Dspark.driver.cores=0.2

Commands: -
Args: -
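
A quick way to confirm what the volume mount is complaining about (a minimal sketch, assuming the spark namespace from the submit command and the driver pod name shown in SPARK_DRIVER_JAVA_OPTS):

kubectl get configmaps -n spark
kubectl describe pod spark-pi-1515860438062-driver -n spark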

@ifilonenko
Member

You need the init container and the resource staging server (RSS) defined as part of the conf. Look at the usage docs to see an example.
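
For reference, a minimal sketch of what that looks like (assuming the same v2.2.0-kubernetes-0.5.0 images as above; the staging server URI depends on how its service is exposed in your cluster):

# deploy the resource staging server shipped with the distribution
kubectl create -f conf/kubernetes-resource-staging-server.yaml

# then add these two flags to the spark-submit invocation
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0
--conf spark.kubernetes.resourceStagingServer.uri=http://<node-ip>:31000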

@ravi-ramadoss
Author

I followed the steps on https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes.html.

I am sure I am missing something. Is there a walkthrough or example for this?

kubectl create -f conf/kubernetes-resource-staging-server.yaml
$SPARK_HOME/bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://192.168.99.100:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=5 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.resourceStagingServer.uri=http://192.168.99.100:31000 \
  --py-files pi.py \
  pi.py

I still get the same error:

MountVolume.SetUp failed for volume "spark-init-properties" : configmaps "spark-pi-1516935374044-init-config" not found
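
A couple of checks that may narrow this down (a minimal sketch; the default namespace and the driver pod name are assumptions based on the command and error above):

# Is the resource staging server running and exposed on the URI passed to spark-submit?
kubectl get pods,svc -n default
minikube service list

# Was the init-config configmap created, and what do the driver pod events say?
kubectl get configmaps -n default
kubectl describe pod spark-pi-1516935374044-driver -n default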
