Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for upstream/#116 #118

Merged
merged 3 commits into from
Jul 24, 2023
Merged

Conversation

stefanofornari
Copy link
Contributor

Issue
#116

Description of changes:
Fix for #116

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@markjschreiber
Copy link
Contributor

Hmm,

Fun fact, S3 URIs aren't strictly speaking URIs because they can contain characters that are illegal in URIs. For example in an S3 bucket you can create

"file with spaces.txt"
"file+with+spaces.txt"
"file%20with%20spaces.txt"

All of these will be unique files and their respective URIs are:

s3://bucket/file with spaces.txt
s3://bucket/file+with+spaces.txt
s3://bucket/file%20with%20spaces.txt

And S3 will not consider these to be the same URI (which is fun)

As you have spotted, implementations of the Path interface do need to return URI and that class will require URL encoding. URI is final (as it should be) so we cannot override this.

I need to add a note into the README.md to explain the potential differences that people may see and advise that, where possible people should where possible avoid using characters that need to be URL encoded so they don't see these conflicts.

@@ -612,10 +615,32 @@ private boolean isEmpty(){
*/
@Override
public URI toUri() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a note in the Java doc how this URI might not be equal to the "S3 URI" that will be seen in the S3 console due to the need for the URI to be encoded to avoid illegal characters?

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. If you can add a note to the JavaDocs about how true URIs and S3 URIs might differ then we can merge this.

Having a similar note in the README.md would also be great.

@stefanofornari
Copy link
Contributor Author

stefanofornari commented Jul 20, 2023

Fun fact, S3 URIs aren't strictly speaking URIs because they can contain characters that are illegal in URIs. For example in an S3 bucket you can create

ah interesting. we need to think a little bit more then. Paths.get(URI.create("s3://bucket/file with space")); breaks because URI.create() fails. We need to figure out then how Paths.get(uri) translates into a S3Path object and make sure it works if the urls is encoded.

@stefanofornari
Copy link
Contributor Author

I amended the javadoc, @markjschreiber does it work?
What I think we should be clear about is that a URI has a defined syntax, specified in one or more RFCs. It is somehow improper to say that "s3://something with space" is a S3 URI (of course we can use that expression for simplicity).
Anyway, like I tried to say in the comment, the current implementation is coherent:

S3Path p = (S3Path)Paths.get("s3://mybucket/with+blank+and+%25"); // -> s3://mybucket/with blank and %
String s = p.toString; // -> /mybucket/with blank and %
URI u = p.toUri(); --> // -> s3://mybucket/with+blank+and+%25
...
String s = p.getFileSystem().get("with space").toString(); // -> /with space 

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! Thanks for the contribution

@markjschreiber markjschreiber merged commit 8ae86f8 into awslabs:main Jul 24, 2023
marco-luzzara pushed a commit to marco-luzzara/aws-java-nio-spi-for-s3 that referenced this pull request Jul 28, 2023
* fix for upstream/awslabs#116

* improved javadoc

---------

Co-authored-by: sterapido <[email protected]>
@stefanofornari stefanofornari deleted the upstream#116 branch November 7, 2023 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants