
Client Socket Calls ignored when calling quickly. #1107

Closed
xiamaz opened this issue Dec 25, 2021 · 11 comments
Labels
bug Something isn't working

Comments

@xiamaz

xiamaz commented Dec 25, 2021

I have implemented a custom Python client that opens yabai's UNIX socket directly: https://github.com/xiamaz/python-yabai-client

This works fairly well for single calls, but I have noticed that consecutive calls in the same script are sometimes ignored by yabai without any error or other indication of failure.

This was tested on yabai 3.3.10.

Reproduction

This can be reproduced with the aforementioned client and the following sample Python script.

from yabai_client import YabaiClient

yc = YabaiClient()

# Two queries in quick succession; the second one is sometimes dropped.
display_info = yc._send_message("query --displays")
spaces_info = yc._send_message("query --spaces")

print(spaces_info)

This script will sometimes fail when run directly. When the error occurs, verbose logging in yabai reveals only that the second call never appears in the logs.

Current fixes

This error does not occur when a small delay is introduced between the two calls, either via a retrying mechanism (such as the one implemented in the client) or by inserting a sleep between the two messages.
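The retry workaround can be sketched as follows. This is a minimal illustration, not the client's actual code; `send` is a hypothetical stand-in for any callable (like `_send_message`) that returns None on failure:

```python
import time

def send_with_retry(send, message, retries=3, delay=0.05):
    """Retry a socket call a few times, sleeping briefly between attempts.

    `send` is any callable that returns None on failure; the short delay
    gives the server time to accept the next connection.
    """
    for attempt in range(retries):
        result = send(message)
        if result is not None:
            return result
        time.sleep(delay)  # back off before retrying
    raise RuntimeError(f"no reply after {retries} attempts: {message!r}")
```

A sleep between the two calls in the reproduction script has the same effect; the retry form just avoids paying the delay when the first attempt succeeds.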

Conclusion

A proper fix would be great, since I believe there might be some kind of race condition in the current code. I would like to investigate this further myself, but maybe there is also some kind of bug in my UNIX socket client implementation?

@FelixKratz

This is exactly what I also found in SketchyBar (which uses the same client/server code); I mitigated it very rudimentarily: https://github.com/FelixKratz/SketchyBar/blob/f3d58959b8f1b7c58dcd157a98b50130e150b759/src/sketchybar.c#L46

@koekeishiya
Owner

Resolved on master.

@koekeishiya koekeishiya added addressed on master; not released Fixed upstream, but not yet released bug Something isn't working labels Dec 26, 2021
@xiamaz
Author

xiamaz commented Dec 27, 2021

Thanks a lot! The new fix works for me 👍

@FelixKratz

I have now completely moved away from the UNIX socket messaging approach in favor of XNU mach messages, since I still had occasional hiccups: FelixKratz/SketchyBar#172. It seems to be faster, more reliable, and more efficient. Maybe this could also be interesting for yabai, though yabai probably sends far fewer messages?

@koekeishiya
Owner

koekeishiya commented Mar 2, 2022

Your socket implementation still had the bug I fixed in my commit referenced above:

https://github.com/FelixKratz/SketchyBar/pull/172/files#diff-cbf91a87d3f07d2ad455ab70d64c16f55b7dc82be33346014bf25f91296d1e9eL688 (search for: fclose)

and

https://github.com/FelixKratz/SketchyBar/pull/172/files#diff-d7b85dd13ad37f50a9fc23e68961ebe40065e3254f15aa8d6fd4e8b9d51f0f80L105 (search for: socket_close)

These resulted in a double close of the socket file descriptor, which caused an incoming socket connection to sometimes be dropped instantly because the descriptor had been reused by the system.
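The failure mode hinges on POSIX handing out the lowest unused descriptor number, so a stale second close tears down a completely unrelated resource. A small Python illustration of the reuse (not the actual C code from the commits above):

```python
import os

# Open a resource, close its descriptor, then open another one:
# POSIX hands out the lowest unused descriptor number, so the
# number is immediately reused.
fd1 = os.open("/dev/null", os.O_RDONLY)
os.close(fd1)                      # first close: fine
fd2 = os.open("/dev/null", os.O_RDONLY)
reused = (fd1 == fd2)              # the same number now names a *new* resource

# A second, stale close of fd1 at this point would close fd2's resource
# instead -- which is how an incoming connection's freshly accepted
# descriptor could be dropped instantly.
os.close(fd2)
```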

As for XPC/mach messages vs. UNIX domain sockets: I suspect the difference in performance doesn't really matter much here, as the largest overhead is in spawning the process that sends the message, and the number of messages sent is relatively low for yabai. I am also of the impression that it is easier for people to re-create the UNIX socket client than a Mach interface when building third-party tools.
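For illustration, a minimal UNIX domain socket round trip in Python might look like the sketch below. The tiny echo server stands in for the daemon; the actual wire format yabai expects is not shown here:

```python
import socket
import threading

def serve_once(path):
    """Tiny echo server: accept one connection, echo the payload back."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    conn, _ = srv.accept()
    data = conn.recv(4096)
    conn.sendall(data)
    conn.close()
    srv.close()

def send_message(path, payload):
    """Connect, send, shut down the write side, then read the full reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall(payload)
        s.shutdown(socket.SHUT_WR)  # signal end-of-message to the server
        chunks = []
        while True:
            chunk = s.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
```

The simplicity of this pattern, compared with setting up a Mach port and message headers, is presumably what makes the socket interface easier for third-party tooling to re-create.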

@FelixKratz

Maybe communication with the scripting addition might benefit, as there is no process-spawning overhead. I mainly implemented it for academic reasons but ended up liking it.

@koekeishiya
Owner

Maybe communication with the scripting addition might benefit, as there is no process-spawning overhead

That is probably true, yeah.

@koekeishiya koekeishiya removed the addressed on master; not released Fixed upstream, but not yet released label Mar 16, 2022
@koekeishiya
Owner

@FelixKratz

Maybe communication with the scripting addition might benefit, as there is no process-spawning overhead. I mainly implemented it for academic reasons but ended up liking it.

Did you profile this to get data that shows actual gain in performance compared to using unix domain sockets?

(I am aware of the technical differences in the techniques they use and why mach messages are generally faster; I'm just interested to see how it plays out in practice with all the other overhead included.)

I'm thinking about moving the SA to a pure user-space shared memory model instead of the current unix domain socket implementation. This should be faster than mach messages, so if mach messages are a substantial improvement over the unix sockets, it would be worth doing.

@FelixKratz

FelixKratz commented Aug 27, 2022

@FelixKratz

Maybe communication with the scripting addition might benefit, as there is no process-spawning overhead. I mainly implemented it for academic reasons but ended up liking it.

Did you profile this to get data that shows actual gain in performance compared to using unix domain sockets?

(I am aware of the technical differences in the techniques they use and why mach messages are generally faster; I'm just interested to see how it plays out in practice with all the other overhead included.)

I'm thinking about moving the SA to a pure user-space shared memory model instead of the current unix domain socket implementation. This should be faster than mach messages, so if mach messages are a substantial improvement over the unix sockets, it would be worth doing.

Yes, I did some tests with the yabai scripting addition. The mach implementation from my fork (https://github.com/FelixKratz/yabai, copied over from SketchyBar) takes 20–85 microseconds to transmit a message to the scripting addition, while the implementation using UNIX domain sockets takes 250–1000 microseconds. So an order of magnitude in speed can be gained here.

The implementation I am using already employs out-of-line (OOL) data with memory deallocation, so mach messages probably won't get much faster than what I have shown here:

The power of OOL data, and its advantage over inline data, comes from integration with the virtual memory system. A sender can share entire memory pages with the receiver without manually copying the data into temporary buffers. It is especially beneficial when transferring a large amount of data. The kernel can directly operate on virtual memory mappings to make the transfer as fast as possible and minimize memory usage. The sender can even choose to deallocate the memory regions from its address space during sending, allowing for even more optimizations.

My testing is based on the comparison of pre-send:

    struct timespec tv;
    clock_gettime(CLOCK_REALTIME, &tv);

and the post-receive clock_gettime, on an M1 Pro. I have no background in computer science, so this might be a bit hand-wavy.
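The same pre-send/post-receive measurement can be sketched in Python, assuming a POSIX system where `time.clock_gettime` exposes `CLOCK_REALTIME` (a rough wall-clock harness, not the benchmark actually used):

```python
import time

def elapsed_us(fn, *args):
    """Measure the wall-clock time of one call in microseconds,
    mirroring the pre-send / post-receive CLOCK_REALTIME comparison."""
    t0 = time.clock_gettime(time.CLOCK_REALTIME)
    fn(*args)
    t1 = time.clock_gettime(time.CLOCK_REALTIME)
    return (t1 - t0) * 1e6
```

Because CLOCK_REALTIME is wall-clock time, single measurements are noisy; taking the minimum over many runs (as the 20–85 µs range above suggests) gives a more stable lower bound.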

@koekeishiya
Owner

Great, sounds like something that would be worth doing at some point then. I think a pure shared-memory solution would be even faster because, as far as I can tell, mach messages still go through the kernel, whereas a pure shared-memory model would not.
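As a rough sketch of the pure user-space idea (using Python's `multiprocessing.shared_memory` rather than the SA's C code; a real implementation would still need a semaphore or similar to signal the peer that data is ready):

```python
from multiprocessing import shared_memory

# Writer side: create a named region and place a message in it.
shm = shared_memory.SharedMemory(create=True, size=64)
msg = b"focus space 2"
shm.buf[:len(msg)] = msg

# Reader side (normally another process): attach by name. Both sides map
# the same physical pages, so no per-message copy through the kernel.
peer = shared_memory.SharedMemory(name=shm.name)
received = bytes(peer.buf[:len(msg)])

peer.close()
shm.close()
shm.unlink()
```

After setup, reads and writes are plain memory accesses; the kernel is only involved in creating and mapping the region, which is the advantage over per-message syscalls.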

@mathisto

mathisto commented Oct 9, 2022

...I think a pure shared memory solution would be even faster...

@koekeishiya @FelixKratz I have been chewing on this exact idea for a while now. I believe you're right: shared memory is the fastest possible solution, bar none. I, too, was tossing messages around with JSON IPC and then Mach messages, but something felt wrong as I started to orchestrate and manage the sheer number of lines crossing as I integrated more actors:

  • yabai
  • sketchybar
  • svim
  • mpv
  • hammerspoon

All of a sudden I went from bespoke dotfiles to FUTEX/MUTEX hell.

It looks like your use of Mach messages already has you sneaking up on single digits. Shared memory is the fastest possible way to share state, and the only way I am aware of that could breach the microsecond barrier. The Python interpreter does some sneaky semaphore IPC to handle process locking. I think you would likely be forced into using the slightly more arcane System V semaphores; last time I checked, the macOS POSIX affordances were still janky/incomplete.

Just writing this got me thinking about mmap. It should be portable
enough to fulfill your needs. In fact, get this:

Mac OS X specific: the file descriptor used for creating MAP_ANON regions can be used to pass some Mach VM flags, and can be specified as -1 if no such flags are associated with the region. Mach VM flags are defined in <mach/vm_statistics.h>

So maybe Mach messages are already taking this route?
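For reference, an anonymous (`MAP_ANON`-style) mapping shared across `fork()` can be sketched in Python; the Mach VM flag passing quoted above is macOS-specific and not used here:

```python
import mmap
import os

# An anonymous mapping (file descriptor -1, i.e. MAP_ANON) with the default
# MAP_SHARED flags: parent and child see the same pages after fork(),
# without any file backing.
buf = mmap.mmap(-1, 4096)

pid = os.fork()
if pid == 0:                       # child: write into the shared pages
    buf[:5] = b"hello"
    os._exit(0)

os.waitpid(pid, 0)                 # parent: wait, then read the child's write
data = bytes(buf[:5])
buf.close()
```

This is the simplest shared-memory setup, but it only works between related processes; unrelated daemons would need a named region (shm_open or a file-backed mmap) instead.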
