Ship Hadoop configuration files to the driver and add to its classpath #130

Open
mccheah opened this issue Feb 21, 2017 · 8 comments

mccheah commented Feb 21, 2017

Currently we do not ship these files, so the only way Hadoop configuration options can be set is by setting spark.hadoop.* parameters on the Spark configuration.
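
For example, today every option has to be mirrored onto the Spark conf, roughly like this (a minimal sketch; values are placeholders):

```scala
import org.apache.spark.SparkConf

// Without shipped *-site.xml files, each Hadoop option must be duplicated
// on the Spark conf under the spark.hadoop. prefix. Values are placeholders.
val conf = new SparkConf()
  .set("spark.hadoop.fs.defaultFS", "hdfs://namenode:8020")
  .set("spark.hadoop.dfs.replication", "2")
```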

ash211 commented Jun 9, 2017

@kimoonkim as you've been testing the HDFS locality changes recently, are you passing the *-site.xml config files into Spark in some way? Are you passing all the configuration as spark.hadoop.*, or is there no required config in the clusters you're testing?

@kimoonkim

@ash211 Good question. For the HDFS node-level locality tests so far, Spark only needed the namenode address. I passed it as spark.hadoop.fs.defaultFS.

But we plan to work on the rack locality part soon, which involves more. The Spark driver needs several config keys for the rack topology plugin, and it also needs access to a script or text file that the topology plugin refers to. (There are multiple topology plugin choices.) Those files are usually in the same Hadoop conf dir.

So it would be better to pass the *-site.xml and other files in the Hadoop conf dir. I haven't come up with a good approach yet, though. Do you have specific ideas?
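
For reference, the rack-awareness settings involved look roughly like this (both keys are standard Hadoop config; the script path is just an example):

```scala
import org.apache.spark.SparkConf

// ScriptBasedMapping is one of the standard topology plugins; it reads
// net.topology.script.file.name, which usually points into the Hadoop conf dir.
val conf = new SparkConf()
  .set("spark.hadoop.net.topology.node.switch.mapping.impl",
    "org.apache.hadoop.net.ScriptBasedMapping")
  .set("spark.hadoop.net.topology.script.file.name",
    "/etc/hadoop/conf/topology.sh") // example path
```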

mccheah commented Jun 9, 2017

We can build a ConfigMap instance, or allow the user to specify an existing one, which contains the core-site.xml that the job should use. Then, based on the path we mount the files to, we can set the HADOOP_CONF_DIR environment variable accordingly on our containers.

Depending on whether or not we expect core-site.xml to contain sensitive data, we might want to use a Secret instead.
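
A rough sketch of the ConfigMap construction, using the fabric8 client the submission client already depends on (the ConfigMap name and paths here are illustrative, not final):

```scala
import java.nio.file.{Files, Paths}
import scala.collection.JavaConverters._

import io.fabric8.kubernetes.api.model.ConfigMapBuilder

// Pack every file from the user's Hadoop conf dir into a ConfigMap,
// keyed by file name. "spark-hadoop-conf" is an illustrative name.
val confDir = Paths.get(sys.env("HADOOP_CONF_DIR"))
val data = Files.list(confDir).iterator().asScala
  .filter(p => Files.isRegularFile(p))
  .map(p => p.getFileName.toString -> new String(Files.readAllBytes(p)))
  .toMap

val hadoopConfigMap = new ConfigMapBuilder()
  .withNewMetadata().withName("spark-hadoop-conf").endMetadata()
  .addToData(data.asJava)
  .build()
// For sensitive files the same data could go into a Secret instead,
// with the values base64-encoded.
```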

ifilonenko self-assigned this Jul 11, 2017

ifilonenko commented Jul 11, 2017

In what cases would we see core-site.xml or hdfs-site.xml containing sensitive data that would need to be stored in a Secret? Any thoughts on why a ConfigMap wouldn't work, or why such .xml files can't simply be distributed via the resource staging server?

mccheah commented Jul 11, 2017

One case where the XML files might have sensitive data is when configuring Spark to communicate with S3; in that case, the XML files might contain AWS credentials.
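
For example, with the S3A filesystem the credentials are plain configuration keys, so a core-site.xml carrying them is sensitive (values are placeholders):

```scala
import org.apache.spark.SparkConf

// fs.s3a.access.key and fs.s3a.secret.key are standard S3A settings;
// if they appear in core-site.xml, that file must be treated as secret.
val conf = new SparkConf()
  .set("spark.hadoop.fs.s3a.access.key", "AKIA...")  // placeholder
  .set("spark.hadoop.fs.s3a.secret.key", "<secret>") // placeholder
```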

@ifilonenko

If the user isn't specifying an existing ConfigMap, how should they specify the file locations from which the submission client will create the ConfigMap?

mccheah commented Jul 11, 2017

The submission client can create the ConfigMap for the user and set HADOOP_CONF_DIR accordingly as well.
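
Something along these lines, sketched with the fabric8 builders (the mount path and names are illustrative):

```scala
import io.fabric8.kubernetes.api.model.{ContainerBuilder, VolumeBuilder}

// Expose the generated ConfigMap to the driver pod as a volume.
val hadoopConfVolume = new VolumeBuilder()
  .withName("hadoop-conf")
  .withNewConfigMap().withName("spark-hadoop-conf").endConfigMap()
  .build()

// Mount it in the driver container and point HADOOP_CONF_DIR at it.
val driverContainer = new ContainerBuilder()
  .withName("spark-kubernetes-driver")
  .addNewVolumeMount()
    .withName("hadoop-conf")
    .withMountPath("/etc/hadoop/conf")
  .endVolumeMount()
  .addNewEnv()
    .withName("HADOOP_CONF_DIR")
    .withValue("/etc/hadoop/conf")
  .endEnv()
  .build()
```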

@ifilonenko

#373 should handle this.

ifilonenko pushed a commit to ifilonenko/spark that referenced this issue Feb 25, 2019