Skip to content
This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

[WIP] Add Hadoop configuration files to driver pod. #190

Closed
wants to merge 1 commit into from

Conversation

mccheah
Copy link

@mccheah mccheah commented Mar 17, 2017

Closes #130

@mccheah
Copy link
Author

mccheah commented Mar 17, 2017

The idea is here but missing adding the configuration files to the executors as well as tests.

@ash211
Copy link

ash211 commented Mar 17, 2017

@kimoonkim do you often see credentials in your *-site.xml files? We've tried pretty hard to eliminate sensitive material from those files over here and don't have any remaining, so aren't sure how sensitive to be treating these files in general.

@foxish are k8s config maps a safe enough way to send these files to the driver pod (pre-launch) or, if these contain sensitive materials, should we be using secrets instead?

.filter { file =>
val isFile = file.isFile
// We could theoretically make a way to load directories, but this isn't supported by YARN
// and simplifies the logic greatly. Might be worth considering in a future iteration.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding /etc/hadoop/conf to the classpath makes new Configuration(true) load defaults from files in that directory. Would have to verify that

mountPath: String): MountedHadoopConfiguration = {
val confFilesContents: Map[String, String] = sys.env.get(confEnvVar)
.map(new File(_))
.filter(_.isDirectory)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind popping an INFO log line in here somewhere saying something along the lines of Sending the contents of /path/to/dir as HADOOP_CONF_DIR to the driver

@kimoonkim
Copy link
Member

@kimoonkim do you often see credentials in your *-site.xml files? We've tried pretty hard to eliminate sensitive material from those files over here and don't have any remaining, so aren't sure how sensitive to be treating these files in general.

@ash211 The majority of the time, YARN uses them for storing cluster-wide information, and I think that's the context here as well. If yes, I don't think we see a lot of sensitive information in these files. (@ssuchter would know more) Some uneducated users might mistake these for storing job specific information which could be potentially sensitive, but I am not sure what we can do about that.

@ssuchter
Copy link
Member

ssuchter commented Mar 17, 2017 via email

@ash211
Copy link

ash211 commented Mar 17, 2017

Based on that I think it's safe to defer on confidentially transferring these files to the pods -- in practical terms config maps probably the right choice here, and k8s secrets or more intricate mechanisms not worth the effort.

getMountedHadoopConfiguration(
confEnvVar = "HADOOP_CONF_DIR",
configMapName = s"$kubernetesAppId-hadoopConfDir",
mountPath = "/etc/hadoop/hadoopConfDir"),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not /etc/hadoop/conf ? I think it's the default hadoop conf dir for most hadoop distros.

logWarning(s"Failed to list files from $confEnvVar at ${files._1.getAbsolutePath}")
}
isNull
}.flatMap(_._2.toSeq)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a scala compilation error here.

@foxish
Copy link
Member

foxish commented Mar 21, 2017

@foxish are k8s config maps a safe enough way to send these files to the driver pod (pre-launch) or, if these contain sensitive materials, should we be using secrets instead?

Neither of them is opaque really, but I'd go with secrets and not configmaps for credentials.

@mccheah
Copy link
Author

mccheah commented May 16, 2017

Will re-do this on top of submission v2.

@mccheah mccheah closed this May 16, 2017
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants