From 6ea1381992232a3b5b99caf6a7b1bf05cceab648 Mon Sep 17 00:00:00 2001
From: Manuel Lagang
Date: Wed, 18 May 2016 21:54:34 +0000
Subject: [PATCH] Use `hadoop classpath` to set HADOOP_CLASSPATH even if HADOOP_CLASSPATH is set

See GitHub issue https://github.com/Factual/drake/issues/213

It's possible for HADOOP_CLASSPATH to be set to something that doesn't
include necessary core hadoop libraries. This isn't an issue for normal
hadoop commands, since hadoop will include the core hadoop libraries
along with HADOOP_CLASSPATH when constructing the full classpath.
However, drake currently uses HADOOP_CLASSPATH by itself, so operations
on HDFS will fail.

This patch removes the check for the existence of HADOOP_CLASSPATH in
the environment and unconditionally calls `hadoop classpath` to obtain
the full classpath from hadoop itself. This does remove the capability
to set HADOOP_CLASSPATH without the presence of a local hadoop binary,
but that seems like a rare case in practice.
---
 bin/drake | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/bin/drake b/bin/drake
index 12faca9..8429224 100755
--- a/bin/drake
+++ b/bin/drake
@@ -42,11 +42,9 @@ fi
 # Tries to include local hadoop library on classpath for HDFS support.
 # If Hadoop client is not installed locally, defaults to built-in hadoop version.
 # If you don't need HDFS support with your Drake workflows, none of this matters.
-if [ -z ${HADOOP_CLASSPATH:+x} ]; then
-  if [[ `which hadoop` ]]; then
-    HADOOP_CLASSPATH=`hadoop classpath 2>/dev/null`
-    HADOOP_VERSION=`hadoop version | head -1`
-  fi
+if [[ `which hadoop` ]]; then
+  HADOOP_CLASSPATH=`hadoop classpath 2>/dev/null`
+  HADOOP_VERSION=`hadoop version | head -1`
 fi
 
 if [ "$1" = "--hadoop-version" ]; then
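
Note: the lookup this patch relies on can be sketched in isolation as below. This is an illustration only, not code from bin/drake. The helper name `resolve_hadoop_classpath` and the variable `full_cp` are hypothetical, and `command -v` is used in place of `which` purely as a portability choice. The sketch assumes, as the commit message describes, that `hadoop classpath` prints the core hadoop libraries along with any entries already in HADOOP_CLASSPATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bin/drake): print the full classpath as
# reported by a local hadoop client, or fail if no client is installed.
resolve_hadoop_classpath() {
  if command -v hadoop >/dev/null 2>&1; then
    # Ask hadoop itself, so the core libraries are included along with
    # whatever the user already put in HADOOP_CLASSPATH.
    hadoop classpath 2>/dev/null
  else
    # No local hadoop client: signal failure so the caller can fall back
    # to the built-in hadoop version.
    return 1
  fi
}

# Overwrite HADOOP_CLASSPATH only when hadoop produced a value; otherwise
# leave the environment untouched.
if full_cp=$(resolve_hadoop_classpath); then
  HADOOP_CLASSPATH=$full_cp
fi
echo "HADOOP_CLASSPATH=${HADOOP_CLASSPATH:-}"
```

Going through a temporary variable means an existing HADOOP_CLASSPATH is not clobbered with an empty string when no hadoop binary is found, which mirrors the behaviour of the patched script in that case.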