File/Dir - cannot use path made of non-utf8 bytestrings #79

Open
gabriel-v opened this issue Aug 18, 2023 · 0 comments

gabriel-v commented Aug 18, 2023

All the code in redun.File and friends assumes the path is a single valid UTF-8 string.

But Python also accepts bytes objects as paths. This is needed when working with filesystems that encode filenames in something other than UTF-8.
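To make the bytes-path case concrete, here is a minimal sketch using only the standard library. The mount point `/mnt/legacy` and the helper `is_valid_utf8` are made up for illustration, and nothing touches the disk. It shows that `os.path` functions take bytes and return bytes, and that a Latin-1 filename is not valid UTF-8.

```python
import os

# A Latin-1 encoded filename: b"\xe9" is "é" in Latin-1 but is not valid UTF-8.
name = b"caf\xe9.txt"
directory = b"/mnt/legacy"  # hypothetical mount point, never accessed here

# os.path functions accept bytes and return bytes,
# as long as every component is bytes.
path = os.path.join(directory, name)
stem, ext = os.path.splitext(os.path.basename(path))


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(type(path).__name__, stem, ext, is_valid_utf8(name))
```

The same bytes-in, bytes-out convention applies to calls such as `os.listdir()` and `open()`, which is why a bytes path is a legitimate input for a file abstraction.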

Some functions crash when File is given a bytes path:

  • File: get_filesystem_class() -> get_proto() -> the urlparse call fails on non-utf8 byte strings
  • Dir: all of the above, plus concatenating the glob pattern fails with TypeError: Can't mix strings and bytes in path components
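Both failure modes can be reproduced with the standard library alone, without redun. The sketch below is not redun's code: `describe_failure` is a hypothetical helper, and it assumes the glob concatenation behaves like `os.path.join` with a str pattern. As far as I can tell, `urlparse` decodes bytes input as strict ASCII, so any non-ASCII byte triggers the error, not only invalid UTF-8.

```python
import os
from urllib.parse import urlparse

raw_path = b"/mnt/legacy/caf\xe9.txt"  # hypothetical non-UTF-8 path


def describe_failure(func, *args):
    """Call func(*args); return (exception type name, message), or None on success."""
    try:
        func(*args)
    except (UnicodeDecodeError, TypeError) as exc:
        return type(exc).__name__, str(exc)
    return None


# urlparse cannot decode the byte 0xe9, so parsing the bytes path raises.
urlparse_failure = describe_failure(urlparse, raw_path)

# Joining a bytes directory with a str glob pattern mixes the two types.
join_failure = describe_failure(os.path.join, raw_path, "*")

print(urlparse_failure)
print(join_failure)
```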

The workaround is to hack:

I also tried changing self.classes.File, but it can't be overwritten (it uses getitem), so one would have to replace the whole FileClasses thing.

I think one of three things can be done here:

  • either fix File, Dir and friends to work with non-utf8 bytestring paths
  • or allow the user of the library to override FileClasses, get_filesystem_class and friends, without so much monkeypatching
  • or refactor the whole thing to only use pathlib.Path, as requested in Use pathlib.Path instead of strings for path #8
    • though I think get_proto() and urlparse would still crash when given non-utf8 bytestrings
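One possible direction for the first option, sketched under assumptions: decode bytes paths to str with the `surrogateescape` error handler at the boundary, and encode back the same way before touching the filesystem. The helper names `bytes_path_to_str` and `str_path_to_bytes` are hypothetical and not part of redun. The sketch also shows that pathlib rejects bytes outright, so the pathlib refactor would need a similar conversion step.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def bytes_path_to_str(raw: bytes) -> str:
    """Decode a bytes path; undecodable bytes become lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def str_path_to_bytes(text: str) -> bytes:
    """Inverse of bytes_path_to_str; restores the original bytes exactly."""
    return text.encode("utf-8", errors="surrogateescape")


raw_path = b"/mnt/legacy/caf\xe9.txt"  # hypothetical non-UTF-8 path
text_path = bytes_path_to_str(raw_path)

# The str form passes through urlparse and pathlib without errors.
parsed = urlparse(text_path)
name = PurePosixPath(text_path).name

# The original bytes are recovered losslessly.
round_tripped = str_path_to_bytes(text_path)

# pathlib itself does not accept bytes paths.
try:
    PurePosixPath(raw_path)
    pathlib_accepts_bytes = True
except TypeError:
    pathlib_accepts_bytes = False
```

The caveat is that such a string cannot be encoded as strict UTF-8, so any later step that encodes the path (hashing, serialization, logging) would need the same error handler. On POSIX, `os.fsdecode()` and `os.fsencode()` perform the same round trip using the filesystem encoding.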

What do you think?
