Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressing issue #30 about NIO FileSystem lifecycle #133

Merged
merged 20 commits into from
Sep 6, 2023

Conversation

stefanofornari
Copy link
Contributor

@stefanofornari stefanofornari commented Jul 29, 2023

Issue #, if available:
#30

Description of changes:
As per the discussion on #30 we want to implement NIO FileSystem semantic, which is lead by the methods newFileSystem() and getFileSystem(); the former shall return a new instance or throw an exception, while the latter should return an existing instance. This means that S3FileSystem instances shall be cached (as opposed to the current implementation that caches AWSClients).
Another change introduced and discussed in other thread is the use of S3ClientProvider at S3FileSystem level in order to better decouple file system logic from the aws client creation logic (discussed in #50). This obsoletes IMHO the methods in S3FileSystemProvider that receives a AWSAsyncClient, which I would remove in future version to keep S3FileSystemProvider cleaner and simpler.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

PS: as often gradle complains, this time about javadoc I think :( it builds locally with both gradle and maven ...

…plemented, improved test

- using assertj everywhere
- FilsSystem lifecycle
- adding S3ClientProvider to isolate how the aws client is built
@stefanofornari
Copy link
Contributor Author

Hi @markjschreiber, do you mind looking at this one? It's probably the last one with remarkable architectural changes to the current codebase. After that we should be moving smoothly with the S3X provider.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting in this PR. I have some changes requested but I think we can get there.

@stefanofornari
Copy link
Contributor Author

thanks! I am going to work on it in the next days

@stefanofornari
Copy link
Contributor Author

@markjschreiber I should be done with your comments, thanks a lot for your detailed review. I left open just one comment that was not clear to me. Let me know if I can do anything else.

@markjschreiber
Copy link
Contributor

Hi, thanks for this update. I will be out of the office for the next week and with very limited connectivity. I will attempt to provide reviews as I am able.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. A couple of minor comments and some suggestions on possible refactoring of some of the protected/ utility pieces.

src/main/java/software/amazon/nio/spi/s3/S3FileSystem.java Outdated Show resolved Hide resolved
@@ -323,6 +402,9 @@ public Iterator<Path> iterator() {
* @param listObjectsV2Publisher the publisher that returns objects and common prefixes that are iterated on.
* @return an iterator for {@code Path}s constructed from the {@code ListObjectsV2Publisher}s responses.
*/
//
// TODO: shall this be private?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative way to make this testable would be to refactor it out of this class and into a helper/ utility class as a public method. That will avoid this class being polluted with protected method that isn't central to the function of the class.

Copy link
Contributor Author

@stefanofornari stefanofornari Aug 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually found the proper (I think) solution: cover the cases by using newDirectoryStream and getting the iterator from it:

Iterator<Path> pathIterator =
            provider.newDirectoryStream(Paths.get(URI.create(pathUri+"/")), filter).iterator();

HOWEVER, I was not able to make the test on pagination to work. I think I have done it quite right, but I miss how pagination is controlled. Can you please have a look at it (method pathIteratorForPublisher_withPagination)? For now I disabled it and added a TODO.

@stefanofornari
Copy link
Contributor Author

Hi, thanks for this update. I will be out of the office for the next week and with very limited connectivity. I will attempt to provide reviews as I am able.

have fun!

@stefanofornari
Copy link
Contributor Author

Hi, thanks for this update. I will be out of the office for the next week and with very limited connectivity. I will attempt to provide reviews as I am able.

have fun!

Hey @markjschreiber , I hope you had a good time! can we resume this work? The PR should be ready to be merged IMHO, but I am indeed available for updates.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in the review. A few small changes requested but it should be quick to fix them.

@stefanofornari
Copy link
Contributor Author

stefanofornari commented Aug 22, 2023

@markjschreiber, I should have done with everything. let me know.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the pagination test disabled?

@stefanofornari stefanofornari deleted the upstream#30 branch August 28, 2023 05:12
@stefanofornari
Copy link
Contributor Author

stefanofornari commented Aug 29, 2023

sorry @markjschreiber , I closed it by miskate... can you still see the changes and are you able to merge?

@markjschreiber
Copy link
Contributor

Re-opening

@markjschreiber
Copy link
Contributor

I just did some quick checks with the applications in examples. None of them currently work so something has broken in this PR for (at least) S3 native buckets/ URIs.

@stefanofornari
Copy link
Contributor Author

uh... unexpected : ) can you give me a bucket I can try with?

@stefanofornari
Copy link
Contributor Author

do you get the NPE: Exception in thread "main" java.lang.NullPointerException: Cannot invoke "software.amazon.nio.spi.s3.S3ClientProvider.generateAsyncClient(String) ?

@stefanofornari
Copy link
Contributor Author

yes, there is a bug here I believe. I am working on it

@markjschreiber
Copy link
Contributor

Yes, I get a null pointer exception for main

The other two example apps give "no such filesystem" errors. For testing purposes any of the buckets listed in the "registry of open data" are usually OK. I tend to use s3://broad-references/ or some text file therein.

@stefanofornari
Copy link
Contributor Author

ok, I fixed the issue. I introduced couple of regressions not captured by unit tests. I tested it with sample applications and it works now.

@markjschreiber
Copy link
Contributor

ok, I fixed the issue. I introduced couple of regressions not captured by unit tests. I tested it with sample applications and it works now.

Nice. If you haven't already can you open an issue(s) that outline the code parts/ scenarios where we need to add unit tests to spot similar issues in the future? It would be better to rely on automated testing rather than manually running the example apps.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Will squash and merge.

@markjschreiber markjschreiber merged commit 88c6308 into awslabs:main Sep 6, 2023
1 check passed
@stefanofornari
Copy link
Contributor Author

ok, I fixed the issue. I introduced couple of regressions not captured by unit tests. I tested it with sample applications and it works now.

Nice. If you haven't already can you open an issue(s) that outline the code parts/ scenarios where we need to add unit tests to spot similar issues in the future? It would be better to rely on automated testing rather than manually running the example apps.

#165

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants