nf-amazon 2.0.0: Unable to load AWS credentials when running on Github Actions #3989

Closed
adamrtalbot opened this issue May 31, 2023 · 4 comments · Fixed by #3992

@adamrtalbot
Collaborator

Bug report

When running on GitHub Actions (and probably on other systems that lack AWS credentials), Nextflow reports the error:

Downloading plugin nf-amazon@2.0.0
ERROR ~ Unable to load AWS credentials from any provider in the chain: [EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)), SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey), WebIdentityTokenCredentialsProvider: You must specify a value for roleArn and roleSessionName, com.amazonaws.auth.profile.ProfileCredentialsProvider@c2dab10: profile file cannot be null, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@32029cd: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/]

Full run details:
https://github.com/nf-core/smrnaseq/actions/runs/5134355097/jobs/9238248639?pr=254

The data is hosted in a public bucket, so it should not require credentials.

Because this ran as a GitHub Actions workflow, we can go back through the run history and see the same code working fine a few weeks ago. Here is a working version:
https://github.com/nf-core/smrnaseq/actions/runs/4958065849/jobs/8870496729

The main difference I can see is in the versions. The working runs used:
Nextflow v22.10.1 and nf-amazon v1.11.0
Nextflow v23.04.1 and nf-amazon v1.16.2

whereas the failing run used:

Nextflow v23.05.0-edge and nf-amazon v2.0.0

I'm trying some combinations here. Some of them aren't compatible with each other, which makes it fairly tricky to get this right just by selecting versions: https://github.com/nf-core/smrnaseq/actions/runs/5135450691/jobs/9240888639

Expected behavior and actual behavior

Tests should pull data from AWS without credentials and not throw an error.

Steps to reproduce the problem

nf-core/smrnaseq#256

Program output

To follow.

Environment

  • Nextflow version: Various (22.10.1, 23.04.1, 23.05.0-edge, multiple nf-amazon versions).
  • Java version: [?]
  • Operating system: [macOS, Linux, etc]
  • Bash version: (use the command $SHELL --version)


@pditommaso
Member

It could be a problem with the handling of anonymous bucket access:

    final String bucketName = S3Path.bucketName(uri);
    final boolean anonymous = "true".equals(props.getProperty("anonymous"));
    if( anonymous ) {
        log.debug("Creating AWS S3 client with anonymous credentials");
        client = new S3Client(new AmazonS3Client(new AnonymousAWSCredentials(), clientConfig));
    }
    else {
        final boolean global = bucketName!=null;
        final AwsClientFactory factory = new AwsClientFactory(awsConfig, Regions.US_EAST_1.getName());
        client = new S3Client(factory.getS3Client(clientConfig, global));
    }

@bentsherman can you please give it a try ?

@drpatelh
Contributor

drpatelh commented Jun 1, 2023

Yep, this wasn't an issue before and is breaking CI tests on nf-core. Seems to be specific to GitHub Actions. One way to try to reproduce it would be to use Nextflow v23.05.0-edge and nf-amazon v2.0.0 in a Gitpod environment and run a simple NF script that stages a path from a public bucket on S3, e.g.

params.input = 's3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt'

process ls_file {

    debug true

    input:
    path input_file

    """
    ls $input_file
    """
}

workflow {
    ls_file(params.input)
}

@adamrtalbot
Collaborator Author

adamrtalbot commented Jun 1, 2023

Here's a reproducible example: https://github.com/adamrtalbot/nf-amazon-bug

Here's the GitHub Actions run: https://github.com/adamrtalbot/nf-amazon-bug/actions/runs/5144138673

@bentsherman
Member

The problem is with the credentials provider chain, which was added in the AWS config refactor:

    return new AWSCredentialsProviderChain(List.of(
            new EnvironmentVariableCredentialsProvider(),
            new SystemPropertiesCredentialsProvider(),
            WebIdentityTokenCredentialsProvider.create(),
            new ProfileCredentialsProvider(configFile(), null),
            new EC2ContainerCredentialsProviderWrapper()))

This chain does not fall back to anonymous creds, so it just fails if none of the providers yields anything.
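To make the missing fallback concrete, here is a minimal, self-contained sketch of the pattern. It deliberately does not use the AWS SDK: `Credentials`, `Provider`, and `resolve` are hypothetical stand-ins invented for illustration, and the actual fix in nf-amazon may look quite different. The sketch only shows the idea: try each provider in order, and if none of them yields credentials, return anonymous credentials instead of throwing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a credentials chain with an anonymous fallback.
// None of these types come from the AWS SDK; they only illustrate the pattern.
class AnonymousFallbackChain {

    /** Simplified credentials holder; anonymous means "do not sign requests". */
    static final class Credentials {
        final String accessKey;
        final boolean anonymous;

        Credentials(String accessKey, boolean anonymous) {
            this.accessKey = accessKey;
            this.anonymous = anonymous;
        }
    }

    /** A provider returns credentials, or throws if it has none to offer. */
    interface Provider {
        Credentials resolve();
    }

    /** Tries each provider in order; optionally falls back to anonymous credentials. */
    static Credentials resolve(List<Provider> providers, boolean anonymousFallback) {
        List<String> errors = new ArrayList<>();
        for (Provider provider : providers) {
            try {
                return provider.resolve();
            } catch (RuntimeException e) {
                // Remember why this provider failed and move on to the next one.
                errors.add(e.getMessage());
            }
        }
        if (anonymousFallback) {
            return new Credentials(null, true);
        }
        throw new IllegalStateException(
            "Unable to load credentials from any provider in the chain: " + errors);
    }

    public static void main(String[] args) {
        // Both providers fail, as on a CI runner with no AWS configuration.
        List<Provider> providers = Arrays.asList(
            () -> { throw new IllegalStateException("no environment variables"); },
            () -> { throw new IllegalStateException("no profile file"); });

        Credentials creds = resolve(providers, true);
        System.out.println("anonymous = " + creds.anonymous);
    }
}
```

With `anonymousFallback` set to `false`, the same call throws with the collected provider errors, which mirrors the behaviour reported in this issue.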

You can also replicate it with the AWS CLI if you remove your credentials first. You can then make it work by adding --no-sign-request or by downloading over HTTP:

$ aws s3 cp s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt -
download failed: s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt to - Unable to locate credentials

$ aws s3 cp --no-sign-request s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt -
The contents of the annotation directories were downloaded from Ensembl on: July 17, 2015.

Gene annotation files were downloaded from Ensembl release 75. SmallRNA annotation files were downloaded from miRBase release 21.

$ curl https://ngi-igenomes.s3.amazonaws.com/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt 
The contents of the annotation directories were downloaded from Ensembl on: July 17, 2015.

Gene annotation files were downloaded from Ensembl release 75. SmallRNA annotation files were downloaded from miRBase release 21.

So, the quick workaround is to use the HTTP URL in your Nextflow pipeline, and the actual solution is to figure out how to fall back to anonymous creds in Nextflow. Clearly it used to do this, because the public S3 URLs used to work.
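For the HTTP workaround, the mapping from an `s3://` URI to a public URL can be sketched as below. This is an illustration only: the `S3HttpUrl` class and `toHttps` method are hypothetical names, and the sketch assumes the virtual-hosted-style global endpoint (`<bucket>.s3.amazonaws.com`) used in the `curl` command above. Buckets whose names contain dots, or that require a region-specific endpoint, may need a different form.

```java
import java.net.URI;

// Hypothetical helper: rewrites s3://bucket/key as an HTTPS URL for a public bucket.
class S3HttpUrl {

    /** Converts s3://bucket/key to https://bucket.s3.amazonaws.com/key. */
    static String toHttps(String s3Uri) {
        URI uri = URI.create(s3Uri);
        // The bucket name is the authority part of the s3:// URI.
        String bucket = uri.getAuthority();
        if (!"s3".equalsIgnoreCase(uri.getScheme()) || bucket == null) {
            throw new IllegalArgumentException("Not an s3:// URI: " + s3Uri);
        }
        return "https://" + bucket + ".s3.amazonaws.com" + uri.getRawPath();
    }

    public static void main(String[] args) {
        System.out.println(toHttps(
            "s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt"));
    }
}
```

The resulting string can then be used as the pipeline input in place of the `s3://` path until the anonymous fallback is restored.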

I'm not 100% sure, but from my research, it should be possible to add this fallback to our custom provider chain...
