
Add anonymous access to s3 storage #65

Open
Nyanraltotlapun opened this issue Nov 21, 2019 · 8 comments
Comments

@Nyanraltotlapun

Perhaps I am missing something, but I cannot find a way to query a public bucket.

Amazon states that "Every interaction with Amazon S3 is either authenticated or anonymous":
https://docs.aws.amazon.com/en_us/AmazonS3/latest/dev/MakingRequests.html

@greghendershott
Owner

greghendershott commented Nov 21, 2019

When a bucket is configured to allow anonymous public access, I think the intent is for the usual plain old HTTP HEAD or GET requests to work, from the usual generic tools like curl, wget, or a web browser.

In that case -- where an Authorization header with an Amazon v4 signature is not required -- things from aws/s3 like ls and get/bytes don't add much value compared to simply using net/url or net/http-client.

As a result, I don't think it even occurred to me to support the non-authenticated scenario.

It is an interesting enhancement idea. I'll tag it with that label.

(I'm not sure how many people would use it... so I'm not sure it's worth the time or the risk of breaking something where authentication is desired... and I'm not sure when/if this might get added. So in the meantime, if you need to do this, I'd suggest using net/url or net/http-client or similar.)
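For what it's worth, here is a minimal sketch of the anonymous-GET approach using net/http-client; the bucket name and object path are made up:

#lang racket
;; Minimal sketch, not part of aws/s3: anonymous GET of an object in a
;; public bucket using net/http-client. The bucket and path are made up.
(require net/http-client
         racket/port)

(define-values (status headers body-port)
  (http-sendrecv "some-public-bucket.s3.amazonaws.com"
                 "/path/to/object"
                 #:ssl? #t))

status                   ;; e.g. #"HTTP/1.1 200 OK"
(port->bytes body-port)  ;; the object's bytes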

@Nyanraltotlapun
Author

Thank you.

It is logical to access an object store through the corresponding programming API. Parsing these things may seem easy, but writing such code over and over again introduces errors and takes time that could be spent elsewhere.
This is why I (and, I think, lots of other people) really appreciate libraries like this.

I have not looked inside the library just yet, but I can suggest implementing this by adding a (credentials-anonymous!) method.
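Just to sketch the idea (this is not actual library code), it could be a parameter that the request code checks before signing:

#lang racket
;; Hypothetical sketch only -- not the library's actual API. The suggested
;; switch could be a parameter that request-signing code consults in order
;; to skip adding the Authorization header.
(define credentials-anonymous? (make-parameter #f))

(define (credentials-anonymous!)
  (credentials-anonymous? #t))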

@greghendershott
Owner

Do you have an example public bucket in mind? If so, have you gone ahead and tried to use the library? (I don't and haven't.)

I ask because maybe it already works?! The private and public keys default to "". So if you don't call any credentials-from-XXX function, probably an Authorization header with sigv4 using those values will be added. And maybe S3 will ignore that header when the bucket allows public/anon access?

[Even if that happens to work, it would be good to document that and preserve that behavior going forward. i.e. I'm asking if you can help by doing a quick experiment -- not proposing this as the final answer.]

@greghendershott
Owner

greghendershott commented Nov 22, 2019

Oh never mind. I forgot. The code has many ensure-have-keys calls that check for the keys being "" and error -- to help users understand the situation where things won't work because they haven't set the keys.

That's the kind of thing I was talking about initially. The package currently assumes authenticated use, and relies on that assumption to e.g. provide helpful error messages. Of course it would be possible to somehow preserve that and also support anonymous access. It's not rocket science. It's "only" the time to do it, update the docs, and find and fix the resulting bugs.
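To illustrate what I mean, here is a self-contained sketch of the kind of guard described above; the package's actual ensure-have-keys differs:

#lang racket
;; Self-contained sketch, not the package's actual code: the keys default
;; to "" and the helper errors while they are still blank, so users get a
;; clear message instead of a failed request.
(define public-key  (make-parameter ""))
(define private-key (make-parameter ""))

(define (ensure-have-keys)
  (when (or (string=? (public-key) "")
            (string=? (private-key) ""))
    (error 'aws "Please set the AWS keys (see the credentials-from-... functions)")))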

@greghendershott
Owner

I did a few quick hacks to experiment with not supplying any Authorization header at all, when the public or private keys are blank.

It works, but the only thing that S3 allows anonymously seems to be "getter" functions like get/bytes. Not "listers" like ls. (And obviously not "putters".)


If you know a bucket name and object path, forming the URI is simply:

(string-append "https://" bucket "." endpoint "/" path)

where endpoint is e.g. "s3.amazonaws.com" or maybe a specific location endpoint.

And you give that URI to net/url or net/http-client or curl or wget or whatever and... that's all you need to do.
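For instance, a minimal sketch with net/url; the bucket, endpoint, and path here are placeholders:

#lang racket
;; Minimal sketch: build the URI as above and fetch the object anonymously
;; with net/url. The bucket and path are placeholders.
(require net/url
         racket/port)

(define bucket   "some-public-bucket")
(define endpoint "s3.amazonaws.com")
(define path     "path/to/object")

(define uri
  (string->url (string-append "https://" bucket "." endpoint "/" path)))

(port->bytes (get-pure-port uri))  ;; the object's bytes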

People pulling in this whole AWS package just for that string-append seems like... not the best use case to spend time supporting?

@greghendershott
Owner

p.s. If you think it would be helpful, I'd be happy to add to the documentation something like: "Tip: If you only need to get things from a public S3 bucket, then you don't need this package; instead simply "?

@Nyanraltotlapun
Author

Do you have an example public bucket in mind? If so, have you gone ahead and tried to use the library? (I don't and haven't.)

https://s3-eu-west-1.amazonaws.com/public.bitmex.com?delimiter=/&prefix=data/

I ask because maybe it already works?!

I tried:

#lang racket
(require aws/s3)

(s3-host "s3.eu-west-1.amazonaws.com")   ; region-specific endpoint
(s3-region "eu-west-1")
(s3-scheme "https")
(ls "public.bitmex.com/data/trade")      ; list objects under this bucket/prefix

This gives me:

open-input-file: cannot open input file
path: /home/test/.aws/credentials

@pschmied

pschmied commented May 7, 2022

If it is useful, there are petabytes of open data in various formats in open S3 buckets listed at https://registry.opendata.aws/
