Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use relpath? #284

Closed
floriankrb opened this issue Feb 16, 2022 · 4 comments
Closed

Do not use relpath? #284

floriankrb opened this issue Feb 16, 2022 · 4 comments

Comments

@floriankrb
Copy link
Member

This use of relpath :

return os.path.relpath(self.fieldset.path)

is causing an issue on window when path and start are on different drives.

Indeed in the os.path.relpath documentation:

Return a relative filepath to path either from the current directory or from an optional start directory. 
This is a path computation: the filesystem is not accessed to confirm the existence or nature of path or start. 
On Windows, [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) is raised 
when path and start are on different drives.

This happens in real use cases:

https://github.com/Climdyn/climetlab-eumetnet-postprocessing-benchmark/runs/5217438193?check_suite_focus=true#step:8:508

ValueError: path is on mount 'C:', start on mount 'D:'

Perhaps a try/catch would help here?

@iainrussell
Copy link
Member

Hi @floriankrb,

So what's the correct thing to do here? Something like this?

def source(self) -> str:
    try:
        return os.path.relpath(self.fieldset.path)
    except ValueError:
        return self.fieldset.path

@alexamici
Copy link
Contributor

@iainrussell your fix is one possibility, the other option is to return "N/A".

The idea after the relpath is that the source is written in the history attribute and it may end up leaking the folder structure of the machine (possibly including the user home folder and username) in a netCDF file. This is especially a problem in production environments.

@floriankrb
Copy link
Member Author

Yes, I think that any of these two solutions would work.

I would go for what @iainrussell proposed, based on the (very biased) assumption that if you use windows in a production environment, security and information leak are not your main concern.

@iainrussell
Copy link
Member

Actually, after reading what @alexamici wrote, I think I'm more inclined to return basename(path) in this case, so we get the filename of the GRIB file. In many cases this is probably what relpath(path) would return anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants