Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sshContext thread safe and fix the data race bug #211

Merged
merged 1 commit into from
Sep 27, 2023
Merged

Make sshContext thread safe and fix the data race bug #211

merged 1 commit into from
Sep 27, 2023

Conversation

wxiaoguang
Copy link
Contributor

@wxiaoguang wxiaoguang commented Aug 9, 2023

Fix #160

The problem in old code, suppose there are 2 goroutines:

  1. Goroutine-A calls the context.Context interface functions of sshCtx, then it reads sshCtx.Context
  2. Goroutine-B calls sshCtx.SetValue, then it writes sshCtx.Context
  3. Then, data race occurs

This PR makes the sshCtx.Context immutable, there won't be data race issue.

The Mutex.Lock/Unlock is quite fast when there is no concurrency, it only calls atomic.CompareAndSwapInt32 / atomic.AddInt32.

@gustavosbarreto
Copy link
Collaborator

@wxiaoguang Could you please improve the commit message?

What is the difference from #209?

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 9, 2023

@wxiaoguang Could you please improve the commit message?

Done with a force push.

What is the difference from #209?

More details in #160 (comment) , IMO 209 is incomplete, because all functions of context.Context interface should be thread-safe. ps: this approach is lock-free (if you like "mutex", I can rewrite it with mutex)

@silverwind
Copy link

silverwind commented Aug 10, 2023

I suspect that this race was exposed by some change in the go1.21 compiler. These test failures appeared in the last 1-2 days for gitea and our CI did automatically upgrade to 1.21 when it went live. Never seen those before.

context.go Outdated Show resolved Hide resolved
@wxiaoguang wxiaoguang changed the title Fix data race by using atomic value Make sshContext thread safe and fix the data race bug Aug 10, 2023
@KN4CK3R
Copy link

KN4CK3R commented Aug 10, 2023

Now it's the same as #209 ?

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 10, 2023

Now it's the same as #209 ?

No. #211 (comment)

More details in #160 (comment) , IMO 209 is incomplete, because all functions of context.Context interface should be thread-safe.

Functions like Err , Deadline are still not thread-safe with 209

@KN4CK3R
Copy link

KN4CK3R commented Aug 10, 2023

But they are not thread-safe in this PR too.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 10, 2023

But they are not thread-safe in this PR too.

This one is safe, while that one isn't

See the comment #160 (comment)

All functions of context.Context interface should be thread-safe.
Every call to sshCtx.Context (including Err() and others) all causes Data Race.

And this PR's description:

  1. Goroutine-A calls the context.Context interface functions of sshCtx, then it reads sshCtx.Context
  2. Goroutine-B calls sshCtx.SetValue, then it writes sshCtx.Context
  3. Then, data race occurs

The struct: type sshContext struct { context.Context }, then, calling sshCtx.Err() means { return sshCtx.Context.Err() }

  • This PR: sshCtx.Context is immutable, so it is thread-safe.
  • That PR: sshCtx.Context is mutable and will be changed by SetValue, its Err / Deadline are not protected by mutex.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 10, 2023

And one more thing: changing a "ctx" field again and again is an anti-pattern of Golang design. So, it's good to make it immutable.

@KN4CK3R
Copy link

KN4CK3R commented Aug 10, 2023

* This PR: `sshCtx.Context` is immutable, so it is thread-safe.

Understand.

With the original Go WithValue it's not possible to retrieve multiple values added with the same key, you always get the last added. So you could use a map to get the same behaviour and cleaner code. Currently you always get the oldest value added with the same key.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 10, 2023

Currently you always get the oldest value added with the same key.

Nope, it (SetValue) behaves exactly as original Go WithValue, see the "for" loop in SetValue. I will add a test for it.

Update: test has been added.

@KN4CK3R
Copy link

KN4CK3R commented Aug 10, 2023

Why use a list then if you replace the previous value? That's the same behaviour as a map.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 10, 2023

Why use a list then if you replace the previous value? That's the same behaviour as a map.

The reasons IMO are:

  1. That's the same behavior as WithValue
  2. That's the same behavior as the old SetValue
  3. The context "key" type is any which is not comparable (defined by Value / WithValue), can't be used as a map key, so it can't use a map (update: I misunderstood and was wrong here, see latest commit)

@KN4CK3R
Copy link

KN4CK3R commented Aug 10, 2023

  1. That's the same behavior as WithValue

What behaviour are you talking about? The current implementation behaves like a map. The Go WithValue keeps a reference to all added values, with the current implementation they may get GCed (not a real world problem I think).

  1. The context "key" type is any which is not comparable (defined by Value / WithValue), can't be used as a map key, so it can't use a map

any is comparable since Go 1.20.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 10, 2023

  1. That's the same behavior as WithValue

What behaviour are you talking about? The current implementation behaves like a map. The Go WithValue keeps a reference to all added values, with the current implementation they may get GCed (not a real world problem I think).

The behavior:

ctx = ctx.WithValue(key, v1)
ctx = ctx.WithValue(key, v2)
then ctx.Value(key) == v2, and the Value function is virtually a "recursion loop" of traversing all the keys.
ctx.SetValue(key, v1)
ctx.SetValue(key, v2)
then ctx.Value(key) == v2, and the sshContext.Value function is real loop of traversing all the keys.

For the GC, I do not see a real bad case for it. Go developers should seldom depend on this behavior.

  1. The context "key" type is any which is not comparable (defined by Value / WithValue), can't be used as a map key, so it can't use a map

any is comparable since Go 1.20.

For this project: Golang 1.12/1.13

image

image

@KN4CK3R
Copy link

KN4CK3R commented Aug 10, 2023

The behavior:

Yes, that's the normal behaviour and would be identical with map. I thought you are talking about something else.

For this project:

That's sad

@wxiaoguang
Copy link
Contributor Author

For this project:

That's sad

Hmm, I think you are right for the map with interface{} key. Although Go 1.20 makes the "any" to be "comparable", while the "any/interface{}" could be used as a map key in very old Go releases, the CI passes.

@wxiaoguang
Copy link
Contributor Author

@gustavosbarreto do you have more questions?

@gustavosbarreto
Copy link
Collaborator

@gustavosbarreto do you have more questions?

LGTM

@gustavosbarreto
Copy link
Collaborator

Can you please squash the commits so we have a clean history without fix commits being applied on top of an original patch.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Aug 12, 2023

Can you please squash the commits so we have a clean history without fix commits being applied on top of an original patch.

Done, while IMO it's also good to do "squash" when the mergers are doing the final merge (into master), GitHub has that choice on the web UI, then contributors don't need to squash and the commit history is clear for reviewers.

@wxiaoguang
Copy link
Contributor Author

Kindly ping ~~~

@gustavosbarreto
Copy link
Collaborator

Kindly ping ~~~

@belak Can you please take a look into this?

@wxiaoguang
Copy link
Contributor Author

Kindly ping (2 weeks) ~~~

@wxiaoguang
Copy link
Contributor Author

Kindly ping (another 2 weeks) ~~~

I don't like keeping stale PRs. If this PR doesn't get interest or this project is dead, I will close it.

@wxiaoguang
Copy link
Contributor Author

Hmm, another week, no interest. Close and feel free to take over.

@gustavosbarreto
Copy link
Collaborator

@wxiaoguang I apologize for the delay. I could have merged your PR, but I preferred to wait for other long-time contributors to approve it as well. Since no one has responded to my requests, I will proceed with the merge because I believe this fix is important.

@gustavosbarreto gustavosbarreto merged commit ece6c79 into gliderlabs:master Sep 27, 2023
@wxiaoguang wxiaoguang deleted the fix-data-race branch September 28, 2023 01:00
@belak
Copy link
Collaborator

belak commented Oct 5, 2023

Sorry about the delay in reviewing - I forgot I set up a filter to put all Github PRs in a folder. Either way, this looks good to me - thanks for submitting it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data race on sshContext
6 participants