Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock between nvWave stop and feed #11886

Merged
merged 2 commits into from
Dec 8, 2020
Merged

Conversation

feerrenrut
Copy link
Contributor

@feerrenrut feerrenrut commented Dec 1, 2020

Link to issue number:

fixes #11591

Summary of the issue:

On at least one user's system, a deadlock seems to occur while using NVDA with the oneCore synth. This results in a freeze and a stack dump by the watchdog.

Stacks to note, from log:

Python stack for thread 3128 (Dummy-82):
  File "synthDrivers\oneCore.pyc", line 365, in _callback
  File "nvwave.pyc", line 326, in feed
  File "nvwave.pyc", line 341, in _feedUnbuffered_handleErrors
  File "nvwave.pyc", line 369, in _feedUnbuffered
  File "nvwave.pyc", line 386, in sync
  File "winKernel.pyc", line 225, in waitForSingleObject

Python stack for thread 9300 (MainThread):
  File "nvda.pyw", line 215, in <module>
  File "core.pyc", line 550, in main
  File "wx\core.pyc", line 2134, in MainLoop
  File "gui\__init__.pyc", line 1050, in Notify
  File "core.pyc", line 520, in run
  File "queueHandler.pyc", line 88, in pumpAll
  File "queueHandler.pyc", line 55, in flushQueue
  File "speech\__init__.pyc", line 146, in cancelSpeech
  File "speech\manager.pyc", line 737, in cancel
  File "synthDrivers\oneCore.pyc", line 209, in cancel
  File "nvwave.pyc", line 473, in stop
  File "nvwave.pyc", line 435, in _idleUnbuffered

Because I can't reproduce this, I'm having to reason about how the code got into this state. What seems to have happened is that feed and stop have been called at the same time from the synth thread and the main thread respectively. These have both progressed to sub-routines, sync and _idleUnbuffered respectively. _idleUnbuffered is blocked waiting to acquire the self._lock, and sync has acquired the self._lock mutex and is blocked by waitForSingleObject.

feed has no locks until it calls into _feedUnbuffered, however stop does acquire locks that _feedUnbuffered needs, then releases them and then calls to _idleUnbuffered which has contention with _feedUnbuffered again. It seems to me that the only way to get into this situation is if stop acquires the locks it needs first. It calls waveOutReset, then when it releases the locks and before they are acquired by _idleUnbuffered they are instead acquired by _feedUnbuffered. This results in waveOutWrite being called before progressing to sync and waiting on waitForSingleObject

It's still not entirely clear to me why this set of events causes a deadlock. My best guess is that on this particular system, the whdr (Wave Out Header) which includes dwFlags that should have the WHDR_DONE bit set when the associated buffer has finished being played is not getting updated due to the call to reset. Thus the synth thread gets stuck in a loop calling waitForSingleObject and checking _prev_whdr.

Alternatively, it may be that the stack is not representative of the freeze?

Description of how this pull request fixes the issue:

Because I'm not 100% sure of the cause of the problem, I have covered a few potentials:

  • waitForSingleObject now has a timeout, it then will check if nvWave has been stopped before waiting again.
  • When resetting the device, the event is explicitly signaled. The documentation for winm isn't explicit about whether this is handled.
  • When feed is called, don't send data to the device if cancel has already been called.

Relevant Info from Docs:
When opening winm device, the event signal args:

If fdwOpen contains the CALLBACK_EVENT flag, dwCallback is a handle to an event. The event is signaled whenever the state of the waveform buffer changes. The application can use WaitForSingleObject or WaitForMultipleObjects to wait for the event. When the event is signaled, you can get the current state of the waveform buffer by checking the dwFlags member of the WAVEHDR structure. (See waveOutPrepareHeader.)

waveOutReset:

The waveOutReset function stops playback on the given waveform-audio output device and resets the current position to zero. All pending playback buffers are marked as done (WHDR_DONE) and returned to the application.

The locking logic for nvWave is complicated, however given plans to re write nvWave with modern Windows API's I think we should hold off on doing more than we need to resolve this specific issue

Testing performed:

Ran locally with oneCore

Known issues with pull request:

None

Change log entry:

None

with self._global_waveout_lock:
winmm.waveOutPrepareHeader(self._waveout, LPWAVEHDR(whdr), sizeof(WAVEHDR))
winmm.waveOutWrite(self._waveout, LPWAVEHDR(whdr), sizeof(WAVEHDR))
self.sync() # sync must still be called even if stopping, so that waveOutUnprepareHeader can be called
self._prev_whdr = whdr
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bug, in the case that nvWave is stopping, we shouldn't save this unprepared header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what then would unprepare that header? saving it off means that feed / sync will unprepare it next time (I think) not saving it, and setting _prev_whdr to None, you'd have to specifically call waveOutUnprepare for hdr I think?

michaelDCurran
michaelDCurran previously approved these changes Dec 7, 2020
Copy link
Member

@michaelDCurran michaelDCurran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@feerrenrut feerrenrut changed the base branch from master to beta December 8, 2020 03:00
@feerrenrut feerrenrut dismissed michaelDCurran’s stale review December 8, 2020 03:00

The base branch was changed.

@feerrenrut
Copy link
Contributor Author

This should be in 2020.4. @michaelDCurran are you still happy with this?

@michaelDCurran
Copy link
Member

@feerrenrut I'm happy for this to be in 2020.4, yes.

@feerrenrut feerrenrut merged commit de7fb88 into beta Dec 8, 2020
@feerrenrut feerrenrut deleted the fixNVWaveDeadlock branch December 8, 2020 05:41
@nvaccessAuto nvaccessAuto added this to the 2021.1 milestone Dec 8, 2020
@feerrenrut feerrenrut modified the milestones: 2021.1, 2020.4 Dec 8, 2020
feerrenrut added a commit that referenced this pull request Dec 11, 2020
Merge remote-tracking branch 'origin/beta'
Including:
- Update translations.
- Update liblouis to 3.16.1 (PR #11888)
- Fix deadlock between nvWave stop and feed (PR #11886)
- Detect Humanware Brailliant BI 40X and 20X via both USB and Bluetooth (PR #11819)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants