Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix datarace on WriterProxy stop while TimedEvent being triggered [16341] #3097

Merged
merged 6 commits into from
Dec 13, 2022

Conversation

juanlofer-eprosima
Copy link
Contributor

Description

This PR is a revision of #3046, in which the data race between WriterProxy::stop and perform_initial_acknack was addressed by forcing the thread invoking the former to wait for the completion of the latter. However, a deadlock may occur as the reader waiting for the event to end does so with Endpoint::mp_mutex locked, which is also locked when sending an acknack at perform_initial_acknack.

Two different solutions are suggested:

  • Introducing a new mutex in StatefulReader which is taken before reader's, and released after modifying the writer proxy's state and before waiting for the executing event to finish.
  • Releasing reader's mutex before stopping a WriterProxy, and locking it again after doing so. This solution implies that StatefulReader::matched_writer_add and StatefulReader::matched_writer_remove may no longer be executed "atomically", a relevant change that should be carefully analyzed.

In addition, this PR also addresses the potential (rarely or never seen in our tests) datarace that may occur between WriterProxy::stop and perform_heartbeat_response, as StatefulReader::send_acknack reads attributes from WriterProxy which are modified when being stopped.

Contributor Checklist

  • Commit messages follow the project guidelines.
  • The code follows the style guidelines of this project.
  • Tests that thoroughly check the new feature have been added/Regression tests checking the bug and its fix have been added.
  • Any new/modified methods have been properly documented using Doxygen.
  • Fast DDS test suite has been run locally.
  • Changes are ABI compatible.
  • Changes are API compatible.
  • Documentation builds and tests pass locally.
  • New feature has been added to the versions.md file (if applicable).
  • New feature has been documented/Current behavior is correctly described in the documentation.

Reviewer Checklist

  • Check contributor checklist is correct.
  • Check CI results: changes do not issue any warning.
  • Check CI results: failing tests are unrelated with the changes.

@juanlofer-eprosima juanlofer-eprosima changed the title Fix datarace on WriterProxy stop while TimedEvent being triggered Fix datarace on WriterProxy stop while TimedEvent being triggered [16341] Nov 28, 2022
@EduPonz EduPonz added this to the v2.9.0 milestone Dec 7, 2022
This reverts commit a374872.
Signed-off-by: Juan López Fernández <[email protected]>
Signed-off-by: Juan López Fernández <[email protected]>
This reverts commit b18a5272dd8f7d2bc1f01eda0134b8d572ea86a1.
@jparisu jparisu force-pushed the datarace/perform-initial-acknack branch from d3a4b3a to 8f7bc26 Compare December 7, 2022 08:50
@jsan-rt
Copy link
Contributor

jsan-rt commented Dec 7, 2022

@richiprosima please test this

@jsan-rt jsan-rt added the ci-pending PR which CI is running label Dec 7, 2022
jsan-rt
jsan-rt previously approved these changes Dec 9, 2022
@EduPonz
Copy link

EduPonz commented Dec 10, 2022

@richiprosima please test this

@MiguelCompany
Copy link
Member

MiguelCompany commented Dec 12, 2022

CI for 2.9.0-rc:

  • Linux Build Status
  • Mac Build Status
  • Windows Build Status
  • Discovery server Build Status

@MiguelCompany MiguelCompany merged commit edf8ef7 into master Dec 13, 2022
@MiguelCompany MiguelCompany deleted the datarace/perform-initial-acknack branch December 13, 2022 09:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-pending PR which CI is running
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants