Speed up Oozie Spark example #22

Open
pregazzoni opened this issue Aug 12, 2016 · 3 comments

Comments

@pregazzoni
Collaborator

For the Oozie Spark job to run on YARN, spark-assembly.jar must be on the job path. Right now we get the jar from the cluster (via WebHDFS) and then put it (again via WebHDFS) into the $jobDir/lib directory. This takes several minutes.

Another way would be to have the jar in the Oozie shared lib directory by default.

As the oozie user, you can run:

# Copy spark-assembly jar to Oozie shared lib directory
hdfs dfs -put /usr/iop/current/spark-client/lib/spark-assembly.jar /user/oozie/share/lib/lib_20160805191701/spark/.

# Set oozie environment
source /usr/iop/current/oozie-client/bin/oozie-env.sh
export OOZIE_URL=http://<replace with oozie node>:11000/oozie

# Update shared lib
oozie admin -sharelibupdate

Once this is done, there is no need to put the jar under $jobDir/lib, as it will be picked up automatically from the Oozie shared lib.
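
The `lib_20160805191701` directory in the put command above is specific to one cluster, so the path will likely differ elsewhere. As a rough sketch, the newest shared lib directory could be looked up instead of hardcoded. The `latest_sharelib_dir` helper below is hypothetical (not part of the steps above) and assumes the directories are named `lib_<timestamp>` with fixed-width timestamps:

```shell
#!/bin/sh
# Hypothetical helper, not part of the original steps: reads HDFS paths on
# stdin (one per line) and prints the newest lib_<timestamp> directory.
latest_sharelib_dir() {
  # Assumes fixed-width numeric timestamps, so a byte-wise sort orders them by time.
  grep -E '/lib_[0-9]+$' | LC_ALL=C sort | tail -n 1
}

# Assumed usage on a cluster node, run as the oozie user:
#   SHARELIB=$(hdfs dfs -ls /user/oozie/share/lib | awk '{print $NF}' | latest_sharelib_dir)
#   hdfs dfs -put /usr/iop/current/spark-client/lib/spark-assembly.jar "$SHARELIB/spark/"
```

After `oozie admin -sharelibupdate`, running `oozie admin -shareliblist spark` should list the jar, if the installed Oozie version supports that option.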

@snowch
Collaborator

snowch commented Aug 17, 2016

This looks good, Pierre. Would these steps go into a new task, called something like Setup, that the user would run just once with gradle?

Will it also work on basic clusters?

@pregazzoni
Collaborator Author

@snowch I need to look into this more closely, as I believe you would need to become the oozie user to do this (so you need root). The same is true for basic clusters.

I am also inquiring whether this could become the default, so that the jar is in the shared lib to start with.

@snowch
Collaborator

snowch commented Aug 18, 2016

Ah, cool. Thanks @pregazzoni


2 participants