-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add S3 tests to the regular integration test suite #6783
Comments
Thanks @landreev - I'm very interested in us working on this. @scolapasta I added this to the board in "Needs discussion before ready" feel free to discuss briefly in a tech hour, add an estimate, and move to ready. |
As an added bonus, we can use this to test the failure case for #6558 (validate physical files on publish): we can create an S3-stored dataset, upload a file, then mess with the file - delete it/overwrite it via the standard S3 API directly so that the md5 we have in the database is no longer valid - then attempt to publish the dataset, and watch Dataverse do its preventive magic. |
This might be linked to my current efforts to come up with more integration testing. For #6694 I'm introducing a new Maven profile und JUnit tests using testcontainers.org. You might be interested in taking a look at develop...poikilotherm:6694-refactor-auth-config#diff-2487ecfd7d3cac06640e742876b1e14e for an example using a Postgres container to do an integration test for entity object marshaling/unmarshaling. It's fast and doesn't need a full deployment. |
And this might be a duplicate of #5068 I opened a while ago 😉 |
Thanks @poikilotherm, I'm always interested in closing out duplicate issues. 👍 I'll leave it to someone more technical than me (@landreev @scolapasta ) to check out that issue and to work with you to pull out any details. |
@pdurbin @poikilotherm @landreev I cobbled together a basic Ansible task to stand up an S3 LocalStack container at https://github.com/IQSS/dataverse-ansible/compare/164_localstack Corrections, suggestions, etc. welcomed. |
|
@landreev It looks like LocalStack still exists and can easily be cobbled into Dataverse-Ansible, docker-compose (GitHub actions?), or both. As noted above, @poikilotherm is a big fan of TestContainers though I wonder how many (in)significant differences are present between a LocalStack container and S3 proper. |
I just check with @poikilotherm and he recommends merging this PR of his... ... before we pick up this issue because he has added a lot of testing goodies there such as Testcontainers that should make working on this issue easier. |
I just tested at PR above and Testcontainers magic seems to be working! Hopefully off to QA soon. Also, here are some related issues: |
Just a quick note that these days, thanks to all the MPCONFIG efforts by @poikilotherm it's straightforward to get S3AccessIT to pass locally on my laptop by adding the following environment variables to
Above I'm using the magic trick described at https://guides.dataverse.org/en/6.0/container/app-image.html#tunables Here's a screenshot: As part of this effort, we should probably add first class support for these configuration options, rather than depending on the magic trick. Finally, as discussed in Slack, in the EC2 instances spun up by Jenkins, we'd like to keep "file" as the default store but add an additional "s3" store. If it's not possible to do all the configuration already, we'll add APIs as needed. For example, we'll need to be able to configure the collection to use the S3 store. @landreev pointed out we can extend the |
Developers can now test S3 locally by using the Dockerized development environment, which now includes both LocalStack and MinIO. See S3AccessIT which executes API (end to end) tests. In addition, a new integration test test class (not an API test, the new kind launched with `mvn verify`) has been added at S3AccessIOLocalstackIT. It uses Testcontainers to spin up Localstack for S3 testing and does not require Dataverse to be running. Note that the format of docker-compose-dev.yml had to change to allow for JVM options to be added. Finally, docs were improved for listing and setting stores via API.
Developers can now test S3 locally by using the Dockerized development environment, which now includes both LocalStack and MinIO. See S3AccessIT which executes API (end to end) tests. In addition, a new integration test test class (not an API test, the new kind launched with `mvn verify`) has been added at S3AccessIOLocalstackIT. It uses Testcontainers to spin up Localstack for S3 testing and does not require Dataverse to be running. Note that the format of docker-compose-dev.yml had to change to allow for JVM options to be added. Finally, docs were improved for listing and setting stores via API.
We rely heavily on S3 storage, including in our production use, but there are no regularly-run integration tests for the functionality.
We do already have a RestAssured test class (test/java/edu/harvard/iq/dataverse/api/S3AccessIT.java) with one test, written by @bsilverstein (hi!). We are not regularly running it on jenkins.dataverse.org.
S3 needs to be configured on the test bed installation for the test to work. But since our new jenkins infrastructure already depends on utilizing AWS, this is not an issue. We would just need to modify our scripts to copy the AWS credentials to the target EC2 instance and configure an S3 file store there.
So I propose we do that; plus extend the S3 test class above, for some more extensive testing and to test the functionality that's been added since it was written. (for example, as it is now, the test relies on having S3 configured as the default storage driver. We can instead make it create a new dataverse, set S3 as the storage for that specific dataverse, using the new multistore functionality, and proceed testing it)
S3 testing was specifically requested in #5068. But I'm opening a dedicated issue for it since there were other things being discussed there.
The text was updated successfully, but these errors were encountered: