Prevent module shadowing in pyspark_runner.py #2232

Merged
merged 1 commit into from
Sep 19, 2017

Commits on Sep 11, 2017

  1. Prevent module shadowing in pyspark_runner.py

    PySparkTask uses spark-submit to run the script pyspark_runner.py.
    Because it is run as a script, its own directory (luigi/contrib/) is
    placed first in `sys.path`, so the modules and packages in that
    directory shadow the globally installed ones. In particular, the
    `hdfs` package used by webhdfs_client.py is shadowed by the
    `luigi.contrib.hdfs` package, so PySparkTask does not work together
    with webhdfs.
    
    The problem is resolved by moving the script's directory to the end
    of the path list `sys.path`.
    adaitche committed Sep 11, 2017
    Commit 3744700
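
The fix described in the commit message can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact code in the commit: the helper name `demote_script_dir` and the example paths are hypothetical, and the sketch assumes the script's directory is the first entry of `sys.path`, which is where Python places it when a file is run as a script.

```python
def demote_script_dir(path):
    """Return a copy of `path` with its first entry moved to the end.

    When a file is run as a script, Python puts the script's directory
    first in sys.path, so same-named modules there win over globally
    installed ones. Moving that entry to the end reverses the priority.
    """
    entries = list(path)
    if not entries:
        return entries
    return entries[1:] + entries[:1]


# Hypothetical search path as seen by a script living in luigi/contrib/.
example_path = [
    "/opt/luigi/luigi/contrib",          # script directory (assumed first)
    "/usr/lib/python3/site-packages",    # where a global `hdfs` would live
]

reordered = demote_script_dir(example_path)
print(reordered)
```

In a real script the same reordering would be applied in place, for example with `sys.path[:] = demote_script_dir(sys.path)`, before any import that could be shadowed. Appending rather than deleting the entry keeps sibling modules in the script's directory importable, only at a lower priority.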