Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a stats race condition #303

Closed

Conversation

reedobrien
Copy link

@reedobrien reedobrien commented Jul 21, 2016

After seeing a data race in testing and digging in it appears there
is a race in mgo. The Stats object in the global stats variable could change due
to a SetStats call or in a reassignment in ResetStats.

This commit fixes that by:

  • making stats an object,
  • embedding enabled as a private attribute of Stats rather than relying on nil checks
  • and changing the reset logic to zero out the unwanted values rather
    than replacing the pointer with a new pointer.

Running the mgo tests against v2-unstable I am sporadically seeing a failure in stats counting, see line 1154. Running against this fix-data-race-in-stats branch, I have not seen it yet.

cc: @mjs @wallyworld

After seeing a data race in testing[1] and digging in it appears there
is a race in mgo. The Stats object in the global stats variable could change due
to a SetStats call or in a reassignment in ResetStats.

This commit fixes that by:
 - making stats an object,
 - embedding enabled as a private attribute of Stats rather than relying on nil checks
 - and changing the reset logic to zero out the unwanted values rather
   than replacing the pointer with a new pointer.

[1] https://bugs.launchpad.net/juju-core/+bug/1604817
@kat-co
Copy link

kat-co commented Jul 22, 2016

Just to clearly state the race:

Scenario 1

  • We dereference the stats pointer before ResetStats has a chance to point it to a new instance.
  • The stats we are trying to set are lost.

End result: we're trying to set stats on something we're resetting and we don't care.

Scenario 2

  • We dereference the stats pointer after ResetStats points it to a new instance.
  • We enter the critical section by locking the mutex and {inc,dec}rement the correct memory.

End result: we're settings stats on something we thought had been reset and we receive an incorrect result.

@reedobrien
Copy link
Author

Hello, @niemeyer any thoughts on this or when it might make it to a stable release? It is triggering farily regularly. Thanks!

cc @mjs @AlexisBruemmer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants