Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash: recursive Pa_Initialize(), FlexAsio #766

Closed
daschuer opened this issue Feb 2, 2023 · 14 comments · Fixed by #794
Closed

Crash: recursive Pa_Initialize(), FlexAsio #766

daschuer opened this issue Feb 2, 2023 · 14 comments · Fixed by #794
Labels
enhancement New feature or request P2 Priority: High src-common Common sources in /src/common

Comments

@daschuer
Copy link
Contributor

daschuer commented Feb 2, 2023

Describe the bug
Mixxxx crashes unconditional with an out of stack memory issue when FlexAsio, itself using Portaudio is installed.
This has been reported and drilled down here:
mixxxdj/mixxx#10081

The issue is a recursive Pa_Initialize() loop.

To Reproduce
Start a Portaudio based Application with FlexAsio installed, like Mixxxx.

Expected behavior
There should be no crash, FlexAsio should not be available.

Actual behavior
Crash, seevthf original report.

Desktop (please complete the following information):

  • OS windows
  • PA: 2020-02-20 and 19.7

Additional context

I think there should be a guard that prevents recursive calls a secure solution for all such cases. We may also consider to add FexAudio to the ignore list

@dechamps
Copy link
Contributor

dechamps commented Feb 2, 2023

This is not supposed to happen.

In dechamps/FlexASIO#47, a similar issue was encountered and I fixed it in dechamps/FlexASIO@b5a24c8. The way the fix works is by changing the behavior of the Windows DLL loader using a side-by-side assembly. The basic idea is that when the FlexASIO DLL is loaded, the Windows DLL loader will link the FlexASIO DLL with the PortAudio DLL that is bundled with FlexASIO, even if another DLL of the same name was already loaded in the app process (in this case, by mixxx). This is precisely supposed to prevent exactly this kind of problem, and as far as I know it is a robust fix.

Because the PortAudio DLL bundled with FlexASIO does not have ASIO built in, there should be no infinite recursion, as there is no way for FlexASIO to recurse back into itself.

I am not sure why this solution doesn't seem to work in your particular case.

I would encourage you to file an issue against FlexASIO (please make sure to include a FlexASIO log). Reports like "mixxx 2.3.3 on x64 windows is crashing right after FlexASIO 1.9 install" in mixxxdj/mixxx#10081 indicate this is something that should be looked at and investigated on the FlexASIO side.

It looks like the reason you filed this issue against PortAudio is because you have a feature request for PortAudio where you'd like Pa_Initialize() to detect that it's reentering itself and automatically stop if that is the case. I guess that makes sense to some extent, but I'm really not sure it's a good idea to "hide" broken setups in that way. This situation should never happen in the first place, and even if you manage to stop the infinite recursion, incorrect mixing and matching of PortAudio DLLs is bound to cause problems down the line. FlexASIO does not make any promises about running against a PortAudio DLL other than the one it was bundled with, and I would presume the same holds for most other PortAudio clients. It might be better to fail early and loudly than try to soldier on in a highly questionable environment.

@daschuer
Copy link
Contributor Author

daschuer commented Feb 2, 2023

Fail early is a good idea in case we are also able to inform the user reasonable. However in this case we just crash. The normal user has no hint what cause the issue and how to solve it.

Because the PortAudio DLL bundled with FlexASIO does not have ASIO built in.

The windows linker is not able to use two versions if the same library in one executable.
When you look at the backtrack you see that the Portaudio calls at FlexASIO are at the same address as the Mixxxx calls.

I don't see a solution on the FlexAsio side, because they are not able to patch the Portaudio library shipped with Mixxxx. I think we need to patch the Mixxxx Portaudio library.

I have reported it here to find a solution that can go finally into the Portaudio repository.

@dechamps
Copy link
Contributor

dechamps commented Feb 2, 2023

Fail early is a good idea in case we are also able to inform the user reasonable. However in this case we just crash. The normal user has no hint what cause the issue and how to solve it.

Sure, I guess we could improve Pa_Initialize()'s error reporting to detect reentrancy and immediately return an error in that case. This is a nice to have, but it's not a proper fix for your particular problem. As soon as the FlexASIO DLL ends up linked against a foreign PortAudio DLL, you are already in an unsupported situation, even before you call Pa_Initialize(). For example the FlexASIO DLL could refer to PortAudio entry points that don't exist in the foreign DLL because of version mismatch, leading to an immediate failure at DLL loading time. Or there could be subtle ABI inconsistencies between the two DLLs. We should never get into this situation in the first place - by the time FlexASIO calls Pa_Initialize() in a foreign DLL you already lost the game and all bets are off.

It would be nice if we could solve this cleanly instead of papering over the problem.

The windows linker is not able to use two versions if the same library in one executable.

Yes it can. This is precisely what side-by-side assemblies are for, and FlexASIO does make use of that feature precisely to prevent this particular problem.

This solution was demonstrated to work in dechamps/FlexASIO#47 for at least one app (Polyphone). However it looks like it doesn't work for your case (mixxx). It would be nice to understand why before rushing into a half-fix.

I don't see a solution on the FlexAsio side

Even if for some reason the side-by-side assembly solution doesn't work, there are still ways to work around this problem in FlexASIO, such that the FlexASIO DLL doesn't end up linking against a foreign (app-provided) PortAudio DLL. I can see at least two ways this can be done on the FlexASIO side: (1) we could build the PortAudio DLL with a different name (e.g. portaudio_flexasio.dll) to make this kind of naming clash less likely; or (2) FlexASIO could link PortAudio statically instead of using a DLL.

@daschuer
Copy link
Contributor Author

daschuer commented Feb 2, 2023

Original this issue was reported with FlexASIO 1.4 and could no longer reproduced in FlexASIO 1.5.
Unfortunately it has been reported back with FlexASIO 1.9.

In general I would prefer to have his fixed in Portaudio, because that is under control of the Mixxx installation. We have no control which version of FlexASIO is installed on the users machine. Everyone will balme Mixxx when Mixxx crashes.

@dechamps
Copy link
Contributor

dechamps commented Feb 2, 2023

Original this issue was reported with FlexASIO 1.4 and could no longer reproduced in FlexASIO 1.5.
Unfortunately it has been reported back with FlexASIO 1.9.

Yeah this is surprising. Filed dechamps/FlexASIO#182 to track this on the FlexASIO side.

In general I would prefer to have his fixed in Portaudio

You cannot fix this cleanly in PortAudio. You do not want to get into a state where FlexASIO is attempting to link against a different PortAudio DLL than the one it was bundled with, regardless of what Pa_Initialize() does. You enter unsupported territory as soon as the linking is attempted, even if no calls are made.

It looks like what you're trying to do is mitigate this problem by attempting to detect this situation from within the PortAudio DLL that is bundled with mixxx, so that at least mixxx won't crash if people load an ASIO driver that suffers from this reentrency issue. Now that I understand where you're coming from, I agree that your proposed PortAudio feature request makes sense. I just want to stress that this is a mitigation strategy, not a clean fix, and is not guaranteed to work in all cases. But I do agree it's better than nothing.

I wonder if another way to mitigate this from the application side is to use a side-by-side manifest (similar to FlexASIO's) on the application itself, to treat the application's PortAudio DLL as private to the application EXE. I'm not entirely sure this will work though - it's been a while since I looked into how DLL loading works with SxS manifests. You might want to look into that on the mixxx side, as it would only require application changes, not PortAudio changes.

By the way, the way PortAudio handles ASIO drivers (see #696) is once again a big contributing factor in this issue: because PortAudio insists in loading every single ASIO driver present on the system at Pa_Initialize() time, it just takes one crashy ASIO driver (in this case FlexASIO, but it could have been any other driver) to crash the entire application at startup, even if the user never intended to use that particular ASIO driver. I understand this ship has sailed already, but as I wrote in my comment in #696, this is a textbook example as to why the ASIO code in PortAudio is an absolute minefield that application developers should stay clear of as much as possible.

@RossBencina
Copy link
Collaborator

RossBencina commented Feb 13, 2023

Sorry I have not read this whole thread in detail, but: PortAudio is not coded in such a way that it would handle multiple "instances" of itself. It uses global and static variables all over the place and is essentially a singleton. Recusive calls to Pa_Initialize are never going to work with the current PA implementation. The only way I can see this working is if you have a second copy of the PortAudio DLL with non-colliding symbols. Importantly you need to arrange to have two copies of the global+static state of PortAudio.

Another way to do it is to have FlexASIO statically link to PortAudio and not export any PortAudio symbols.

@RossBencina RossBencina changed the title Crash: recursive Pa_Initialize(), FexAsio Crash: recursive Pa_Initialize(), FlexAsio Feb 13, 2023
@RossBencina RossBencina added the P4 Priority: Low label Feb 13, 2023
@RossBencina
Copy link
Collaborator

We recommend that this is closed as infeasible. The correct ticket to track this is on the FlexASIO side: dechamps/FlexASIO#182

Any objections?

@daschuer
Copy link
Contributor Author

A fix with FlexASIO is certainly the right place.

Unfortunately this is out of control if the Mixxxx team. It is up to the users to figure out that they have a faulty version of FlexASIO and fix it. That is nearly impossible for non tech unseres.

The current situation is like this: Install Mixxx, experience the crash, call it rubbish and advance to an other tool blaming the Mixxxx team for poor quality.

The legacy has proven that it had happened twice, and it can happen again with any other wrapper library in future using Portaudio.

This issue is about helping the user to fix the situation. A single if statement with a reasonable error message will point the user to the right fix.
That is my demand here.

@RossBencina
Copy link
Collaborator

@daschuer thanks for explaining. I agree that it's a reasonable precaution for PortAudio to take, given that a single PortAudio global state does not support recursive instancing.

A single if statement with a reasonable error message will point the user to the right fix.

Please feel free to submit a patch.

@RossBencina RossBencina added enhancement New feature or request src-common Common sources in /src/common P2 Priority: High and removed P4 Priority: Low labels Feb 20, 2023
RossBencina pushed a commit that referenced this issue Apr 4, 2023
This fixes a stack overflow when a driver such as FlexAsio also uses PortAudio. Fixes: #766
@dmitrykos
Copy link
Collaborator

dmitrykos commented Apr 4, 2023

To my view this addition added unnecessary complexity to PA. The reason of the misbehavior fixed by this workaround lies in the faulty application/library design which is using PA library, i.e. in the incorrect management of dependencies. If multiple processes can instantiate PA simultaneously then it may create other problems in the future in those apps,

PA really should not try to fix bad external designs to my view.

On the other hand a reasonable advancement on PA side would be an implementation of the Library Context which would solve all issues similar to the this one once and for all. It is also possible to preserve backwards compatibility with apps not using new API by providing legacy PA API with statically instantiated Library Context.

@dechamps
Copy link
Contributor

dechamps commented Apr 5, 2023

On the other hand a reasonable advancement on PA side would be an implementation of the Library Context which would solve all issues similar to the this one once and for all.

No, it would not.

When a piece of code is unintentionally interacting with a PortAudio DLL that is not the one it was built for, there is potential for ABI incompatibility (e.g. missing entry point in the DLL import table), which can cause problems during runtime linking before PortAudio functions are even called. Introducing a "context" concept will not help with this, as by the time you get to use it it's already too late. The correct fix is to ensure we never get into this situation in the first place (e.g. using unique DLL names, or using side-by-side assemblies).

As was discussed at length above, the point of the change is to introduce a best-effort error reporting path so that when problems like these do occur, at least there is some chance that a recoverable error will be reported, instead of a 100% chance of the entire application process crashing.

@dmitrykos
Copy link
Collaborator

@dechamps I get your point but I have expressed my opinion - "PA really should not try to fix bad external designs". Adding some static variable for catching situation described in this thread simply delays the problem for the user of the application. It is better to fix the app/lib rather than increase complexity of PA which is not guilty in the problem and should not be used as a tool for fixing the problem of bad design.

@daschuer
Copy link
Contributor Author

daschuer commented Apr 5, 2023

As explained above this fixes a real live user issue in Mixxx that can not be be fixed in the Mixxx code alone.

Apart from the discussion of programming principles, do you have any real live issues with the fix? If not, you can just ignore it and carry on.

@dmitrykos
Copy link
Collaborator

@daschuer you can follow up #809.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request P2 Priority: High src-common Common sources in /src/common
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants