OMD 5.11.20230318-labs-edition seems to freeze/block the livestatus socket #163
Comments
Hmm, we've just had this happen sporadically, with just the two sockets showing in
strace of the two pids: one is very busy, one is not:
This might be linked to the recent changes in naemon comment/downtime handling, but it needs more investigation.
Do you have any suggestions for how we can gather more information?
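One possible way to gather more information, as a minimal sketch rather than an official tool: probe the Livestatus socket directly with a timeout, so that a freeze can be detected and timestamped independently of Thruk. The function name `probe_livestatus` is hypothetical, and the socket path in the usage note is an assumption about a typical OMD site layout; neither comes from this thread.

```python
import socket


def probe_livestatus(socket_path,
                     query="GET status\nColumns: program_version\n\n",
                     timeout=5.0):
    """Send one Livestatus query to a Unix socket.

    Returns (ok, payload). ok is False if connecting, sending, or
    reading fails or exceeds `timeout` seconds; payload then holds
    the exception text instead of the response.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        # Half-close the connection so the server knows the request is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: response finished
                break
            chunks.append(data)
        return True, b"".join(chunks).decode("utf-8", errors="replace")
    except OSError as exc:  # includes socket.timeout and connection errors
        return False, f"{type(exc).__name__}: {exc}"
    finally:
        sock.close()
```

Run as the site user from cron or a loop, for example `probe_livestatus("/omd/sites/SITENAME/tmp/run/live")` (path assumed, adjust to your site), and log the timestamp whenever `ok` is False. That would show when the socket stops answering and whether it lines up with the downtime action.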
There is something going wrong... I just updated the patch, since it wasn't the latest version of that patch anyway. You could try tomorrow's daily.
This is unfortunately a little vague at the moment; however, it seems that when we put PG001 (host) into downtime on 5.11.20230318, naemon ends up locking up or getting broken in some way.
Under normal circumstances this returns this:
However when it is broken (i.e. thruk is timing out communicating with the socket) lsof shows:
Which looks as though the same naemon has spun up another file handle to the socket, or something similar.
Thruk has the following errors:
There is nothing significant, or anything that looks like an error, in naemon.log itself or in livestatus.log.
Our resolution for the problem is:
We are not convinced that the downtime action is actually what is causing it; it may just be that it has correlated with the event multiple times.