This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
RWLock fallbacks seem to cause lockup in Firefox #32
Comments
I found a single try_lock call in Firefox and was able to remove it easily, but the issue still occurs. I also confirmed that hex editing out the Try part still makes the browser work, so the improvement doesn't come from my fix.
I keep the changes to windows_sys.lst/rs as small as possible to make rebasing easier in the future. Unused declarations don't make it into the binary.
You could try forcing different Mutex kinds here and see if the behavior changes:
I have tested the basic lock detection on Win 98SE (CreateMutex), Win XP SP3 (CriticalSection), and Win 11 (SRWLocks) via the sample program, insofar as they seem to work fine and lock and unlock properly. There have not been any stress tests, so I can't guarantee that there isn't something wrong.
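To make the detection order easier to reason about, here is a minimal sketch of the fallback chain as described in this thread. It is a model only, not the rust9x code: `MutexKind`, `Available`, `pick_mutex_kind`, and both flags are hypothetical names, and the assumption is that the choice hinges on whether the Try variants can be resolved at runtime.

```rust
/// Hypothetical model of the three lock kinds mentioned in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MutexKind {
    SrwLock,         // SRW locks, including the Try variants
    CriticalSection, // critical sections
    Legacy,          // CreateMutex-based fallback
}

/// Hypothetical record of which functions were found at runtime.
#[derive(Debug, Clone, Copy)]
struct Available {
    try_acquire_srw_lock: bool,
    try_enter_critical_section: bool,
}

/// Picks the most capable kind whose Try variant is available.
/// Assumption: this mirrors the order described in the thread.
fn pick_mutex_kind(api: Available) -> MutexKind {
    if api.try_acquire_srw_lock {
        MutexKind::SrwLock
    } else if api.try_enter_critical_section {
        MutexKind::CriticalSection
    } else {
        MutexKind::Legacy
    }
}
```

Under this model, a system that has SRW locks but lacks TryAcquireSRWLock would end up with `MutexKind::CriticalSection`, which is the case the issue is about.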
No idea if it's just firefox, but you could trace the loading/init process and hook the init code or even
Only things that will be added to the statically dllimported (= land in the import table header of the binary) have to be added to Notice how every function (excluding WinSock) I added to |
These seem to conflict, but I guess it works.
Next time I build Rust, I'll try that.
Debug builds don't work too well on Vista, but I figure worst case I can just mangle the TryAcquireSRWLock function names by hex editing so they no longer match on 7+, and test it that way.
Let me know if you figure out more! 🙏
I forced it to CriticalSection, and I get the issue on both Vista and 10. Hex editing the TryAcquireSRWLock call does not fix it. I'll try Legacy tomorrow; I am too tired right now.
Legacy mutexes seemed to last longer, but I still got the same problem. The browser works for anywhere from a few seconds to a few minutes before freezing. I did get the funniest error in testing though
I wonder if Firefox (probably rightly?) assumes that an RWLock can be read-locked multiple times. The fallbacks to a critical section or mutex of course won't allow that, which might cause a deadlock. In that case Rust9x would need to implement a completely separate fallback RWLock implementation, e.g. based on |
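The assumption that several readers can hold the lock at once can be checked with a small sketch like the one below. `overlapping_readers` is a hypothetical helper name. Each thread takes a read lock and then waits on a barrier while still holding it, so the function only returns if all read guards are held at the same time. With a fallback that treats a read lock like an exclusive mutex, this would be expected to hang. It uses separate threads on purpose: std documents that `read` might panic if the current thread already holds the lock.

```rust
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

/// Spawns `n` threads that each hold a read guard while waiting on a
/// shared barrier. Returns the number of threads that got past the barrier.
fn overlapping_readers(n: usize) -> usize {
    let lock = Arc::new(RwLock::new(0u32));
    let barrier = Arc::new(Barrier::new(n));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let lock = Arc::clone(&lock);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Take the read lock and keep the guard alive.
                let _guard = lock.read().unwrap();
                // The barrier only releases once all `n` threads are here,
                // i.e. once all `n` read guards are held simultaneously.
                barrier.wait();
                1usize
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Calling `overlapping_readers(4)` from `main` on a system with a working RwLock should return promptly; a hang would point at readers excluding each other.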
You might be right. I was also experimenting with the Rust used to create Mypal68 (Firefox 68 for XP), and I noticed semaphores are used in it. If you want to see the patch for it, here it is. It is 1.45.2 though, and I think I would need at minimum 1.66.0 for 115 and 1.76.0 for 128. I originally wanted to make a fork of 1.77.2, since that works for both 115 and 128, but I couldn't figure out how to port the changes to it. I was going about it the wrong way in the first place though, so I could probably do it if I tried again.
Some of the implementations in that commit look super interesting! The RWLock impl there, however, doesn't seem to use semaphores or allow multiple read locks at the same time either; it also just falls back to std's old
I also suspect that the freeze is GFX related, and I know modern Firefox uses WebRender, which Mypal68 doesn't have. I would love to try building an older Firefox version that still allowed for disabling WebRender, however finding all the dependencies for that sounds like hell.
Hmmm, I'm pretty sure my current RwLock fallback implementation is just wrong (and the one in your linked commit too!). This example should panic immediately, as the fallbacks guard against reentrancy. I've created #33 for it. Not sure when I'll get to updating rust9x though.
I guess I can look at the rwlock implementations in Firefox and see if anything would cause a crash or have this issue.
I don't know what I am looking for. I've traced it to this file, as it's the only WebRender file to use std::sync::RwLock, but I'm not sure what in it would be the cause.
I mean the RWLock fallback implementations are just incorrect. You could try replacing the Arc<RwLock<...>> with an Arc<Mutex<...>> and test again. If the broken RWLock fallback is the cause, the Arc<Mutex<...>> variant should also lock up on modern systems.
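As a rough sketch of what that swap looks like, assuming a simplified stand-in for the shared state (`Shared`, `SharedHandle`, `read_counter`, and `bump` are hypothetical names, not the actual WebRender types): every `read()` and `write()` call becomes a `lock()` call, so readers become exclusive on all systems.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the state shared through the lock.
struct Shared {
    counter: u64,
}

// Before the swap: type SharedHandle = Arc<RwLock<Shared>>;
type SharedHandle = Arc<Mutex<Shared>>;

/// Read access. Before the swap: `h.read().unwrap().counter`.
fn read_counter(h: &SharedHandle) -> u64 {
    h.lock().unwrap().counter
}

/// Write access. Before the swap: `h.write().unwrap()`.
fn bump(h: &SharedHandle) -> u64 {
    let mut guard = h.lock().unwrap();
    guard.counter += 1;
    guard.counter
}
```

If any code path holds a read guard while another reader must make progress, the Mutex version should stall in the same place, which is what makes it useful as a test.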
Well, yeah, but it still partially works. So I would want to mitigate the issue for now.
I tried to fork Rust9x to completely remove TryAcquireSRWLock and either replace the two functions with non-Try versions or critical sections.
However when I look into the code, it seems it tries to use them first, and if it can't find the functions, it then falls back to critical sections.
Something is odd here though: in Firefox compiled with rust9x, on a system that lacks TryAcquireSRWLock but otherwise has SRW locks, the browser just freezes after a minute. If I hex edit out the Try part, the browser works fine.
So either the fallback fails and causes the whole thing to freeze, or something fails once it hits critical sections, or it's something else, idk.
If this is only an issue with Firefox, please give me some pointers as to how I could trace the issue, because I literally have no mentions of TryAcquireSRWLock anywhere in the source code of my modified version. This call is only from Rust.
Also, idk if this is intentional, but when replicating commit 08798a9 I noticed that when you add TryEnterCriticalSection in c.rs, you don't also add it in windows_sys.rs. The code for the function itself looks the same as any other, so I assumed that if it's in c.rs, it doesn't need to be in windows_sys.rs. However, I also noticed you left all the SRW lock and ConditionVariable functions in windows_sys.rs. I'm not sure if this is a mistake or intentional, but it doesn't look right to me.