Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race in credentials. #3524

Closed
3 tasks done
chuzui opened this issue Sep 7, 2020 · 8 comments · Fixed by #3607
Closed
3 tasks done

Data race in credentials. #3524

chuzui opened this issue Sep 7, 2020 · 8 comments · Fixed by #3607
Labels
bug This issue is a bug.

Comments

@chuzui
Copy link

chuzui commented Sep 7, 2020

Confirm by changing [ ] to [x] below to ensure that it's a bug:

Describe the bug
I found some data race warnning of credentials in our test server.

 WARNING: DATA RACE
 Write at 0x00c000080a08 by goroutine 104:
   github.com/aws/aws-sdk-go/aws/credentials.(*Expiry).SetExpiration()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/credentials.go:176 +0x404
   github.com/aws/aws-sdk-go/aws/credentials/endpointcreds.(*Provider).RetrieveWithContext()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/endpointcreds/provider.go:132 +0x357
   github.com/aws/aws-sdk-go/aws/credentials/endpointcreds.(*Provider).Retrieve()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/endpointcreds/provider.go:119 +0xa4
   github.com/aws/aws-sdk-go/aws/credentials.(*ChainProvider).Retrieve()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/chain_provider.go:75 +0x118
   github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).singleRetrieve()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/credentials.go:263 +0x4d6
   github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).GetWithContext.func1()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/credentials.go:244 +0xbb
   github.com/aws/aws-sdk-go/internal/sync/singleflight.(*Group).doCall()
       /home/go/pkg/mod/github.com/aws/[email protected]/internal/sync/singleflight/singleflight.go:97 +0x51
 Previous read at 0x00c000080a08 by goroutine 87:
   github.com/aws/aws-sdk-go/aws/credentials.(*Expiry).IsExpired()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/credentials.go:188 +0x92
   github.com/aws/aws-sdk-go/aws/credentials/endpointcreds.(*Provider).IsExpired()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/endpointcreds/provider.go:113 +0x65
   github.com/aws/aws-sdk-go/aws/credentials.(*ChainProvider).IsExpired()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/chain_provider.go:96 +0x7b
   github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).isExpired()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/credentials.go:305 +0x9c1
   github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).GetWithContext()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/credentials/credentials.go:236 +0x68
   github.com/aws/aws-sdk-go/aws/signer/v4.Signer.signWithBody()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/signer/v4/v4.go:343 +0x3a4
   github.com/aws/aws-sdk-go/aws/signer/v4.SignSDKRequestWithCurrentTime()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/signer/v4/v4.go:481 +0x456
   github.com/aws/aws-sdk-go/aws/signer/v4.SignSDKRequest()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/signer/v4/v4.go:428 +0x55
   github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/request/handlers.go:267 +0xd4
   github.com/aws/aws-sdk-go/aws/request.(*Request).Sign()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/request/request.go:429 +0x156
   github.com/aws/aws-sdk-go/aws/request.(*Request).Send()
       /home/go/pkg/mod/github.com/aws/[email protected]/aws/request/request.go:526 +0x1a4
   github.com/aws/aws-sdk-go/service/sns.(*SNS).Unsubscribe()
       /home/go/pkg/mod/github.com/aws/[email protected]/service/sns/api.go:3390 +0x5e

Version of AWS SDK for Go?
v1.33.0

Version of Go (go version)?
go1.15

To Reproduce (observed behavior)
There are multiple aws clients such like sns.New(session.New()) in our code.

Expected behavior
No data race.

It seems that the code in credentials is not thread safe?

func (e *Expiry) SetExpiration(expiration time.Time, window time.Duration) {
e.expiration = expiration
if window > 0 {
e.expiration = e.expiration.Add(-window)
}
}
// IsExpired returns if the credentials are expired.
func (e *Expiry) IsExpired() bool {
curTime := e.CurrentTime
if curTime == nil {
curTime = time.Now
}
return e.expiration.Before(curTime())
}

@chuzui chuzui added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 7, 2020
@bitcodr
Copy link

bitcodr commented Sep 8, 2020

I'm interested to work on it.

@jasdel
Copy link
Contributor

jasdel commented Sep 16, 2020

Thanks for reporting this issue @bitcodr. Could you provide a example of how the SDK is being configured to better help us investigate this issue? An example that reproduces this issue would also be very helpful.

@jasdel jasdel added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Sep 16, 2020
@bitcodr
Copy link

bitcodr commented Sep 17, 2020

@jasdel I just wanted to work on it, @chuzui found the issue

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Sep 18, 2020
@rittneje
Copy link
Contributor

rittneje commented Sep 21, 2020

I think I've found the underlying issue here.

func (c *Credentials) GetWithContext(ctx Context) (Value, error) {
if curCreds := c.creds.Load(); !c.isExpired(curCreds) {
return curCreds.(Value), nil
}

// isExpired helper method wrapping the definition of expired credentials.
func (c *Credentials) isExpired(creds interface{}) bool {
return creds == nil || creds.(Value) == Value{} || c.provider.IsExpired()
}

There is no lock held during the call to isExpired. Consequently, it is entirely possibly that the following situation unfolds.

  1. Thread A calls GetWithContext. The creds are expired, so this will trigger a call to Provider.RetrieveWithContext.
  2. Thread B calls GetWithContext. To check if the creds are expired, it calls Provider.IsExpired.
  3. In parallel with (2), thread A finishes fetching creds and calls SetExpiration. Thus the expiration is being written by thread A at the exact same time that thread B is trying to read it.

One possible implication of this is that the second thread might erroneously use the expiration time of the newer creds after fetching the older creds from the sync.Value, and thus continue to use the older creds thinking they are still valid.

@rittneje
Copy link
Contributor

There is already an attempted fix in #3448. However, this fix has a problem - using a lock in this manner means that if there is an in-progress call to GetWithContext, the second thread will block indefinitely until it completes, potentially in violation of the incoming context's cancellation/deadline. To avoid this issue, we must not hold a lock during the call to RetrieveWithContext.

One way I think the issue can be mitigated is to first clear out the existing credentials after discovering they are expired, but before fetching new ones. That should prevent the second thread from seeing the old credentials with the new expiration.

// The provider will record the expiration time internally without holding the write lock.
// However, we only fetch credentials while c.creds is the zero value (Value{}), and
// isExpired will only consult the provider if c.creds is NOT the zero value.
// Additionally, singleRetrieve sets c.creds while holding the write lock after the provider has returned.
// Thus holding the read lock here is sufficient for thread-safety.
c.m.RLock()
if curCreds := c.creds; !c.isExpired(curCreds) {
    c.m.RUnlock()
    return curCreds, nil
}
c.m.RUnlock()

...

func (c *Credentials) singleRetrieve(ctx Context) (creds interface{}, err error) {
    // Another thread may have gotten new credentials between this thread releasing the read lock
    // and acquiring the write lock. Double check the credentials are still expired to be safe.
    c.m.Lock()
    if curCreds := c.creds; !c.isExpired(curCreds) {
        c.m.Unlock()
        return curCreds, nil
    }
    c.creds = Value{}
    c.m.Unlock()
    
    ...
    
    if err == nil {
        c.m.Lock()
        c.creds = creds
        c.m.Unlock()
    }
}

@jasdel
Copy link
Contributor

jasdel commented Oct 20, 2020

Updated PRhttps://github.com//pull/3448 to integrate your updates, and made few tweaks to the test. Once this is merged in it should ensure this race condition doesn't happen anymore.

@jasdel
Copy link
Contributor

jasdel commented Oct 20, 2020

Fixed in #3448. Thanks for creating this bug request and PR contributing to its fix. We've merged in the change and it will be included in the SDK's next release. Let us know if you have any additional issues, or feedback.

@jasdel jasdel closed this as completed Oct 20, 2020
@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

aws-sdk-go-automation pushed a commit that referenced this issue Oct 21, 2020
===

### Service Client Updates
* `service/cloudfront`: Updates service API and documentation
  * CloudFront adds support for managing the public keys for signed URLs and signed cookies directly in CloudFront (it no longer requires the AWS root account).
* `service/ec2`: Updates service API and documentation
  * instance-storage-info nvmeSupport added to DescribeInstanceTypes API
* `service/globalaccelerator`: Updates service API and documentation
* `service/glue`: Updates service API and documentation
  * AWS Glue crawlers now support incremental crawls for the Amazon Simple Storage Service (Amazon S3) data source.
* `service/kendra`: Updates service API and documentation
  * This release adds custom data sources: a new data source type that gives you full control of the documents added, modified or deleted during a data source sync while providing run history metrics.
* `service/organizations`: Updates service documentation
  * AWS Organizations renamed the 'master account' to 'management account'.

### SDK Bugs
* `aws/credentials`: Fixed a race condition checking if credentials are expired. ([#3448](#3448))
  * Fixes [#3524](#3524)
* `internal/ini`: Fixes ini file parsing for cases when Right Hand Value is missed in the last statement of the ini file ([#3596](#3596))
  * related to [#2800](#2800)
aws-sdk-go-automation added a commit that referenced this issue Oct 21, 2020
Release v1.35.12 (2020-10-21)
===

### Service Client Updates
* `service/cloudfront`: Updates service API and documentation
  * CloudFront adds support for managing the public keys for signed URLs and signed cookies directly in CloudFront (it no longer requires the AWS root account).
* `service/ec2`: Updates service API and documentation
  * instance-storage-info nvmeSupport added to DescribeInstanceTypes API
* `service/globalaccelerator`: Updates service API and documentation
* `service/glue`: Updates service API and documentation
  * AWS Glue crawlers now support incremental crawls for the Amazon Simple Storage Service (Amazon S3) data source.
* `service/kendra`: Updates service API and documentation
  * This release adds custom data sources: a new data source type that gives you full control of the documents added, modified or deleted during a data source sync while providing run history metrics.
* `service/organizations`: Updates service documentation
  * AWS Organizations renamed the 'master account' to 'management account'.

### SDK Bugs
* `aws/credentials`: Fixed a race condition checking if credentials are expired. ([#3448](#3448))
  * Fixes [#3524](#3524)
* `internal/ini`: Fixes ini file parsing for cases when Right Hand Value is missed in the last statement of the ini file ([#3596](#3596)) 
  * related to [#2800](#2800)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug.
Projects
None yet
4 participants