Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fsspec's infer_storage_options #1714

Closed
wants to merge 2 commits into from

Conversation

antonymilne
Copy link
Contributor

@antonymilne antonymilne commented Jul 19, 2022

Description

Closes: #1632.

Thanks to fsspec/filesystem_spec#988 we no longer need to write our own version of fsspec's infer_storage_options. However, we do need to wait until they do a release before we can use that code.

The tests in this PR will only pass once fsspec do that release and we edit our requirements.txt to include that new version of fsspec. So we need to keep an eye on fsspec to see when that release happens. When it does:

  • edit requirements.txt as in the TODO to get the new fsspec version in as a dependency
  • re-run tests
  • assuming they pass, remove test_parse_filepath completely (it's no longer needed since it will all be tested on the fsspec side)
  • merge this PR

Note that removing CLOUD_PROTOCOLS was only ever used in _parse_filepath and never advertised anywhere, so even though it's not got _ in front of it I think it's fine to remove. If anyone wanted to use it to modify the list then they would have raised an issue to modify it on the kedro side (which has happened in the past). So I think we can safely remove this variable.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

@antonymilne antonymilne requested a review from idanov as a code owner July 19, 2022 20:01
@merelcht
Copy link
Member

merelcht commented Aug 2, 2022

Looks like fsspec has done a release a couple of days ago 😄

@antonymilne
Copy link
Contributor Author

antonymilne commented Aug 2, 2022

@MerelTheisenQB indeed, but unfortunately it got reverted a couple of days later since apparently it broke other stuff 😬 I'm trying to figure out whether there's a solution on their side: fsspec/filesystem_spec#1002. If not then I'm afraid we'll have to keep our own infer_storage_options function :(

@merelcht
Copy link
Member

merelcht commented Aug 3, 2022

@MerelTheisenQB indeed, but unfortunately it got reverted a couple of days later since apparently it broke other stuff 😬 I'm trying to figure out whether there's a solution on their side: fsspec/filesystem_spec#1002. If not then I'm afraid we'll have to keep our own infer_storage_options function :(

Ah that's a pity.. 😞

@antonymilne
Copy link
Contributor Author

antonymilne commented Sep 9, 2022

Unfortunately I don't think this is going to be fixed on the fsspec side soon, so we will continue to maintain our own infer_storage_options function. tbh I don't understand why that apparently works when the fix on their side doesn't :/

The immediate call for this was to add the gdrive protocol, which I'm doing here: #1708

@antonymilne antonymilne closed this Sep 9, 2022
@antonymilne antonymilne deleted the fix/use-fsspec-infer-storage-options branch September 9, 2022 15:53
@noklam
Copy link
Contributor

noklam commented Sep 26, 2022

The original issue of replacing infer_storage_option
https://github.com/quantumblacklabs/private-kedro/pull/545

Quote

We need to split filepath without using infer_storage_options() from fsspec since it returns different results for different protocols that doesn't satisfy our needs (we are already making exceptions for HTTP protocol, we should avoid piling on exceptions e.g. for Azure).

I feel like if we just move this line it should work?

        if protocol in CLOUD_PROTOCOLS:

@noklam noklam mentioned this pull request Sep 26, 2022
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a feature to allow user customize CLOUD_PROTOCOLS when developing new fsspec protocols
4 participants