Hangs on Apple Silicon #349

Closed
cimes-isi opened this issue Apr 4, 2022 · 23 comments · Fixed by #375
Labels
bug Something isn't working macOS NEXT

Comments

@cimes-isi

As in #231, I first tried installing using pip, but as noted there, filprofiler isn't yet available for Apple Silicon. I then installed with pip install git+https://github.com/pythonspeed/filprofiler.git#egg=filprofiler as instructed. I can run my application, but it hangs at the end with:

=fil-profile= Preparing to write to fil-result/2022-04-04T13-29-32_759

Per the docs [1], it sounds like a results window is supposed to appear, but nothing happens. I cannot ctrl-c at this point and have to kill the process by its PID.

[1] https://pythonspeed.com/fil/docs/trying.html

Cheers.

@itamarst itamarst added the bug Something isn't working label Apr 4, 2022
@itamarst
Collaborator

itamarst commented Apr 4, 2022

Thanks for the bug report!

@cimes-isi
Author

I'll add that a couple of files are created (totaling about 57 MB), in case that helps narrow things down a bit:

$ ls fil-result/2022-04-04T13-29-32_759/
peak-memory-source.prof	peak-memory.prof

@itamarst
Collaborator

itamarst commented Apr 4, 2022

That's... a strange place to freeze. At that point it's just doing some in-memory data processing; there's no fancy stuff going on.

I will need to order an M1 Mac before I can debug this further.

@itamarst
Collaborator

itamarst commented Apr 4, 2022

OK, ordered. Not sure exactly when I'll have time after it arrives, but hopefully not too far from now.

@nkov

nkov commented May 18, 2022

@itamarst I'm experiencing the same issue but not on an M1 mac (MBP 2019 16") running 10.15.7, installed from pip.

More details: symptoms are the same as described above (output of "preparing to write") and 2 files generated in the directory, but no additional output. I'm running a long-running flask app and using kill -s SIGUSR2 18334 to (attempt to) receive the output.

Is there a workaround to read the output of those files?

@itamarst
Collaborator

OK, I guess I'll make a public promise just to hold myself accountable: tomorrow is the day I unpack the Mac Mini, set it up, and try to debug macOS Fil stuff.

@nkov

nkov commented May 18, 2022

For the record, I tried running the example.py from here as well as a custom (terminating) script, and both worked on my machine (non Apple silicon) as expected. It's just the Flask app with the kill -s SIGUSR2 which didn't work.
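
For readers following along, the kill -s SIGUSR2 <pid> step mentioned above can also be issued from Python. This is only a minimal sketch of the signalling mechanism: request_dump and the stand-in handler are hypothetical names, and Fil's real handler is not shown anywhere in this thread.

```python
import os
import signal
import time


def request_dump(pid: int) -> None:
    """Python equivalent of `kill -s SIGUSR2 <pid>` (POSIX only)."""
    os.kill(pid, signal.SIGUSR2)


# Stand-in for whatever the profiled process does on SIGUSR2 (hypothetical).
received = []


def _demo_handler(signum, frame):
    received.append(signum)


def demo() -> list:
    """Install the stand-in handler, signal this process, and wait briefly."""
    previous = signal.signal(signal.SIGUSR2, _demo_handler)
    try:
        request_dump(os.getpid())
        # Python-level handlers run in the main thread between bytecodes,
        # so poll for a short while instead of assuming immediate delivery.
        deadline = time.monotonic() + 2.0
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        # `previous` is None if the old handler was not installed from Python.
        signal.signal(signal.SIGUSR2, previous if previous is not None else signal.SIG_DFL)
    return list(received)
```

In the scenario described above, the target would be the Flask process's PID rather than os.getpid().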

@itamarst
Collaborator

OK, so reading through the above, there are (maybe) two issues: not finishing on SIGUSR2, and not finishing at end of process. Or maybe it's the same issue. Going to add notes as I try things, in case I have to put debugging on hold.

I have set up the ARM Mac Mini; running with Python 3.9, the tests all pass.

@itamarst
Collaborator

My first theory was that there is a deadlock/reentrancy issue involving memory allocations in the report generation. However, the dumping entrypoint always increments the reentrancy counter, so in theory memory allocations should not be tracked.
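
To illustrate the idea of a reentrancy counter, here is a minimal Python sketch. It is not Fil's code (none is shown in this thread); every name below (untracked, on_allocation, dump_report) is hypothetical. The point is only that while the per-thread counter is above zero, allocations made by the profiler itself are skipped rather than recorded.

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # one counter per thread
tracked = []                # sizes of allocations the "profiler" recorded


def _depth() -> int:
    return getattr(_state, "depth", 0)


@contextmanager
def untracked():
    """Increment the reentrancy counter for the duration of the block."""
    _state.depth = _depth() + 1
    try:
        yield
    finally:
        _state.depth -= 1


def on_allocation(size: int) -> bool:
    """Hypothetical allocation hook; returns True if the allocation was recorded."""
    if _depth() > 0:
        # We are inside profiler code: do not track (and do not re-enter).
        return False
    tracked.append(size)
    return True


def dump_report() -> list:
    """Hypothetical dumping entrypoint: always runs with tracking disabled."""
    with untracked():
        on_allocation(4096)  # an allocation made by the report code itself
        return list(tracked)
```

With this guard, calling dump_report() after on_allocation(10) returns [10]; the 4096-byte allocation made during the dump is ignored.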

@itamarst
Collaborator

@cimes-isi do you remember if both written files were a few MB, or was one of them an empty file?

@itamarst
Collaborator

> @itamarst I'm experiencing the same issue but not on an M1 mac (MBP 2019 16") running 10.15.7, installed from pip.
>
> More details: symptoms are the same as described above (output of "preparing to write") and 2 files generated in the directory, but no additional output. I'm running a long-running flask app and using kill -s SIGUSR2 18334 to (attempt to) receive the output.
>
> Is there a workaround to read the output of those files?

As a workaround, you can pipe the non-source file (peak-memory.prof) to flamegraph.pl (https://github.com/brendangregg/FlameGraph) or inferno-flamegraph (https://github.com/jonhoo/inferno).
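
A minimal sketch of that workaround, driven from Python. It assumes inferno-flamegraph (or flamegraph.pl) is installed and on PATH, and that the tool reads stack data on stdin and writes an SVG to stdout; render_flamegraph and the output file name are hypothetical.

```python
import subprocess
from typing import Sequence


def render_flamegraph(
    prof_path: str,
    svg_path: str,
    command: Sequence[str] = ("inferno-flamegraph",),
) -> None:
    """Pipe a .prof file into a flamegraph tool and save what it prints.

    `command` is the tool to run; pass ("flamegraph.pl",) to use the Perl
    script instead. Raises CalledProcessError if the tool exits non-zero.
    """
    with open(prof_path, "rb") as src, open(svg_path, "wb") as dst:
        subprocess.run(list(command), stdin=src, stdout=dst, check=True)


# Example call (paths taken from the report earlier in this thread):
# render_flamegraph(
#     "fil-result/2022-04-04T13-29-32_759/peak-memory.prof",
#     "peak-memory.svg",
# )
```

The resulting SVG can then be opened in a browser.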

@itamarst
Collaborator

@nkov how are you running your Flask app?

@cimes-isi
Author

> @cimes-isi do you remember if both written files were a few MB, or was one of them an empty file?

Both files appear to have some content:

$ ls -lh fil-result/2022-05-19T12-54-56_762/
total 39136
-rw-r--r--  1 cimes  staff   3.4M May 19 12:55 peak-memory-source.prof
-rw-r--r--  1 cimes  staff    15M May 19 12:54 peak-memory.prof

I haven't updated at all (since I'm not actively using the SW), so this is still on revision 812a750.

@itamarst
Collaborator

Starting to think this is the same issue as #365.

@cimes-isi what kind of application were you profiling exactly? Was it multi-threaded?

@cimes-isi
Author

You might be right. My application is using PyTorch, and I see that they have some SIGUSR2 handling in their communication code [1].

Works:

print('foo')

Doesn't work:

import torch
print('foo')

[1] https://github.com/pytorch/pytorch/blob/6bb33d93ab94bb268d7cfb600c700585720bcdde/c10/util/signal_handler.cpp#L186-L240
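
Because the hung process could not be interrupted with ctrl-c, a small timeout harness can make the two cases above easier to compare. This is a sketch under assumptions: hangs is a hypothetical helper, repro.py stands for a file containing one of the two snippets, and the timeout value is arbitrary.

```python
import subprocess
from typing import Sequence


def hangs(command: Sequence[str], timeout_s: float) -> bool:
    """Return True if `command` is still running after `timeout_s` seconds.

    subprocess.run kills the direct child when the timeout expires; any
    grandchild processes are not handled by this sketch.
    """
    try:
        subprocess.run(
            list(command),
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return True
    return False


# Example call, assuming the snippet is saved as repro.py (hypothetical name):
# print(hangs(["fil-profile", "run", "repro.py"], timeout_s=120))
```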

@itamarst
Collaborator

@cimes-isi were you using SIGUSR2 originally?

@cimes-isi
Author

No, I'm just letting the program run to completion. I did not issue kill -s SIGUSR2, nor does the application do anything itself with signals (except what is implicitly inherited from PyTorch after import).

@itamarst
Collaborator

OK, so the SIGUSR2 thing is not so much a cause as a different way to trigger what is probably the same bug. I will have a potential fix available for testing shortly.

@itamarst
Collaborator

@cimes-isi can you try revision 045773b and see if it fixes the problems you were having?

@itamarst
Collaborator

(If it's too much work, I should have macOS ARM wheels available at some point now that I have a Mac, but probably not today at this rate.)

@itamarst
Collaborator

@nkov I believe I will have the SIGUSR2 bug fixed in the next release, at least.

@cimes-isi
Author

> @cimes-isi can you try revision 045773b and see if it fixes the problems you were having?

It appears to be fixed - the program now runs to completion and results appear in the browser (though I've no idea how to parse them since I never really got to use the tool!).

@itamarst
Collaborator

Great! There are some docs at the end of the report, and at https://pythonspeed.com/fil/docs/interpreting-output.html. If it's still not clear after a few minutes, maybe start a new discussion in the Discussions tab with a screenshot (if you can share it) and I'll try to explain.
