Catch error for issues with reading/writing data to object stores (e.g. S3) #1768

Closed
merelcht opened this issue Aug 8, 2022 · 0 comments · Fixed by #1881

merelcht commented Aug 8, 2022

Original Discord discussion: https://discord.com/channels/778216384475693066/989568455512555540

Context

williamc — 06/23/2022
Let's say I just cloned my kedro project repo to another machine, and its datasets are versioned and configured to use S3 for storage. If I try to run a pipeline that depends on those datasets I get the infamous kedro.io.core.VersionNotFoundError . Bucket has versions all the way up to 2022-06-07T22.04.39.460Z/ and the error says 2022-06-23T16.20.52.945Z. Is this the intended behavior? Thanks

Solution

The user reports that this was resolved after changing the permission from s3:*Object to s3:*. I suspect we only need ListObjects or ListObjectsV2.

What's the problem?

I would expect it to throw an error like Insufficient permissions to list objects. This logic is likely handled by fsspec and the glob_function we pass into VersionedDataSet. Some more investigation is needed.
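To make the expected behaviour concrete, here is a minimal sketch of separating "listing was denied" from "no versions exist". It is not Kedro's implementation: `fetch_latest_version` and `InsufficientPermissionsError` are hypothetical names, `VersionNotFoundError` is redefined locally so the snippet is self-contained, and it assumes the glob function raises `PermissionError` when listing is denied. Whether fsspec actually surfaces the failure that way, or silently returns an empty listing, is exactly what still needs investigating.

```python
from typing import Callable, List


class VersionNotFoundError(Exception):
    """Local stand-in for kedro.io.core.VersionNotFoundError."""


class InsufficientPermissionsError(Exception):
    """Hypothetical error for a listing that failed because of permissions."""


def fetch_latest_version(glob_func: Callable[[str], List[str]], pattern: str) -> str:
    """Return the newest path matching `pattern`, keeping permission errors distinct."""
    try:
        paths = glob_func(pattern)
    except PermissionError as exc:
        # Assumption: the filesystem layer raises PermissionError on access denied.
        raise InsufficientPermissionsError(
            f"Insufficient permissions to list objects matching '{pattern}'"
        ) from exc

    if not paths:
        raise VersionNotFoundError(f"Did not find any versions for '{pattern}'")

    # Version directories are timestamps, so the lexicographic maximum is the newest.
    return max(paths)
```

With this split, a user who can see the files in the bucket but lacks list permission would get a message about permissions rather than one about missing versions.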

Action:

  1. Show a better error like Insufficient permissions to list objects instead of VersionNotFoundError? The current error is confusing because the user can see the files but kedro complains that the file isn't there. It is also currently not clear when an error doesn't come from Kedro directly but from e.g. fsspec.
  2. Check which policies are needed for kedro to work (p.s. * is definitely not a good suggestion!) and document this clearly.
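As a starting point for the second action item, here is a sketch of a narrower IAM policy, built as a Python dict so it can be dumped to JSON. The `s3:*Object` wildcard covers object-level actions such as `s3:GetObject` and `s3:PutObject` but not `s3:ListBucket`, which is the action that authorises ListObjects and ListObjectsV2 calls. The bucket name is hypothetical, and the action list is an untested assumption: it has not been confirmed that these three actions are sufficient for every Kedro dataset operation.

```python
import json

BUCKET = "my-kedro-bucket"  # hypothetical bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Listing is granted on the bucket itself, not on its objects.
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}"],
        },
        {
            # Reading and writing are granted on the objects in the bucket.
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Note that the list statement targets the bucket ARN while the object statement targets `bucket/*`; mixing these up is a common reason a policy appears correct but listing still fails.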
merelcht changed the title from "VersionedDataSet not found error - Potentially permission issue" to "Catch error for issues with reading/writing data to object stores (e.g. S3)" on Aug 8, 2022
ankatiyar self-assigned this on Aug 15, 2022