Infinite loop in ssl_tls_write()
#3229
Comments
I haven't tested but the fix will be like this. Maybe the same change should be applied in other places. @matt335672 What do you think?
diff --git a/common/os_calls.c b/common/os_calls.c
index afd5a958..1952d425 100644
--- a/common/os_calls.c
+++ b/common/os_calls.c
@@ -1567,7 +1567,7 @@ g_sck_can_send(int sck, int millis)
pollfd.revents = 0;
if (poll(&pollfd, 1, millis) > 0)
{
- if ((pollfd.revents & POLLOUT) != 0)
+ if ((pollfd.revents & (POLLOUT | POLLHUP | POLLERR)) != 0)
{
rv = 1;
}
diff --git a/common/ssl_calls.c b/common/ssl_calls.c
index 70d2d7c8..63636770 100644
--- a/common/ssl_calls.c
+++ b/common/ssl_calls.c
@@ -1362,11 +1362,11 @@ ssl_tls_write(struct ssl_tls *tls, const char *data, int length)
* SSL_ERROR_WANT_WRITE
*/
case SSL_ERROR_WANT_READ:
- g_sck_can_recv(tls->trans->sck, SSL_WANT_READ_WRITE_TIMEOUT);
- continue;
+ break_flag = g_sck_can_recv(tls->trans->sck, SSL_WANT_READ_WRITE_TIMEOUT);
+ break;
case SSL_ERROR_WANT_WRITE:
- g_sck_can_send(tls->trans->sck, SSL_WANT_READ_WRITE_TIMEOUT);
- continue;
+ break_flag = g_sck_can_send(tls->trans->sck, SSL_WANT_READ_WRITE_TIMEOUT);
+ break;
/* socket closed */
case SSL_ERROR_ZERO_RETURN: |
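To illustrate the idea behind the patch above outside of xrdp, here is a minimal standalone sketch. It is not xrdp code: `wait_writable`, `send_when_ready` and the `WAIT_*` values are hypothetical names made up for this example, and treating a timeout as a failure is an assumption of the sketch, not something the patch does. It shows a wait helper that reports POLLERR/POLLHUP as a distinct result, and a caller that stops retrying when the wait does not succeed. `MSG_DONTWAIT` is not in POSIX but is available on Linux and the BSDs.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch only */
enum wait_result
{
    WAIT_FAILED = -1,  /* poll() failed, or POLLERR/POLLHUP/POLLNVAL seen */
    WAIT_TIMEOUT = 0,  /* nothing happened within the timeout */
    WAIT_READY = 1     /* socket is writable */
};

/* Wait up to millis for fd to become writable */
int
wait_writable(int fd, int millis)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    n = poll(&pfd, 1, millis);
    if (n < 0)
    {
        return WAIT_FAILED;
    }
    if (n == 0)
    {
        return WAIT_TIMEOUT;
    }
    /* Error conditions are checked first. Some systems report
     * POLLOUT together with POLLHUP/POLLERR */
    if ((pfd.revents & (POLLERR | POLLHUP | POLLNVAL)) != 0)
    {
        return WAIT_FAILED;
    }
    return ((pfd.revents & POLLOUT) != 0) ? WAIT_READY : WAIT_TIMEOUT;
}

/* Non-blocking send that retries only while the wait succeeds.
 * Returns the number of bytes sent, or -1 on error or timeout */
int
send_when_ready(int fd, const char *data, size_t len)
{
    for (;;)
    {
        ssize_t n = send(fd, data, len, MSG_NOSIGNAL | MSG_DONTWAIT);
        if (n >= 0)
        {
            return (int)n;
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK)
        {
            return -1; /* hard error, e.g. EPIPE after the peer closed */
        }
        /* The result of the wait is checked rather than discarded,
         * so a dead peer ends the loop */
        if (wait_writable(fd, 100) != WAIT_READY)
        {
            return -1;
        }
    }
}
```

The point of the sketch is only that the loop has an exit for every outcome of the wait. Whether a timeout should end the loop or be retried a bounded number of times is a separate decision.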
I've had a look at the Linux/FreeBSD issue. The language used in both is more complicated than it needs to be. The POSIX manpage says:-
POLLERR An error has occurred on the device or stream. This flag is only valid in the revents bitmask; it shall be ignored in the events member.
POLLHUP A device has been disconnected, or a pipe or FIFO has been closed by the last process that had it open for writing. Once set, the hangup state of a FIFO shall persist until some process opens the FIFO for writing or until all read-only file descriptors for the FIFO are closed. This event and POLLOUT are mutually-exclusive; a stream can never be writable if a hangup has occurred. However, this event and POLLIN, POLLRDNORM, POLLRDBAND, or POLLPRI are not mutually-exclusive. This flag is only valid in the revents bitmask; it shall be ignored in the events member.
I think these are all saying the same thing, which is that we don't need to set
As for the existing code, this looks OK to me. The
So I don't think this needs changing. What does need changing is that something isn't checking the status of a read or write. It's not obvious looking at the code where this would be. A couple of questions:-
Thanks. |
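The POSIX text quoted above can be checked with a small standalone program. This is a sketch only, not xrdp code: `revents_after_writer_close` is a hypothetical helper written for this example. It polls the read end of a pipe after the write end has been closed and returns whatever `poll()` put in `revents`, so the caller can see that POLLHUP is reported even when the `events` mask does not ask for it.

```c
#include <poll.h>
#include <unistd.h>

/* Poll the read end of a pipe whose only writer has gone away.
 * 'requested' is the events mask handed to poll().
 * Returns the reported revents, or 0 if nothing was reported
 * (pipe() failed, poll() failed or poll() timed out) */
int
revents_after_writer_close(short requested)
{
    int fds[2];
    struct pollfd pfd;
    int rv = 0;

    if (pipe(fds) != 0)
    {
        return 0;
    }
    close(fds[1]); /* last writer closes: hangup on the read end */

    pfd.fd = fds[0];
    pfd.events = requested;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) > 0)
    {
        rv = pfd.revents;
    }

    close(fds[0]);
    return rv;
}
```

Calling `revents_after_writer_close(0)` passes an empty `events` mask. If the result still has POLLHUP set, the flag did not need to be requested, which is how I read the manpage text.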
I think I saw something like this once during GFX testing. My guess is we've sent the client something on the GFX channel that it doesn't like. The client is closing the GFX channel (but not the main channel), and then we're logging errors for everything else we send on that channel. We do do a lot of error checking in those code paths. If that's right, we need to figure out what the client is objecting to. |
Thanks for the info. I've been busy for a while. I haven't asked the end-user if the issue is GFX related but they've taken a stack trace. The stack trace taken whilst the xrdp process is still running shows
strace output shows evidence that the socket is disconnected (
|
That's useful - thanks. That's the code that sends GFX updates when the backend isn't running in GFX mode. My suspicion is one of these lines. Neither has error checking on it:- Lines 586 to 587 in 364790b
Lines 694 to 695 in 364790b
If you're stuck for time at the moment, let me know if you'd like a patch. |
@matt335672 Could you work on a patch? I'm still stuck for time for a while. |
Will do. Should get to it Monday at the latest. |
I think I can (sort of) reproduce this now with this patch:- --- a/xrdp/xrdp_wm.c
+++ b/xrdp/xrdp_wm.c
@@ -833,6 +833,13 @@ xrdp_wm_init(struct xrdp_wm *self)
{
LOG(LOG_LEVEL_DEBUG, " xrdp_wm_init: no autologin / auto run detected, draw login window");
xrdp_login_wnd_create(self);
+static int first_time = 1;
+if (first_time)
+{
+ LOG(LOG_LEVEL_INFO, "Sleeping....");
+ g_sleep(10 * 1000);
+ LOG(LOG_LEVEL_INFO, "Done sleeping....");
+}
/* clear screen */
xrdp_bitmap_invalidate(self->screen, 0);
xrdp_wm_set_focused(self, self->login_window);
I then connect and kill the connection before the sleep finishes. I don't get an infinite loop, but I get a lot of logging along these lines:-
Despite my earlier comments, it's nothing to do with GFX - in fact in GFX mode things seem to terminate more quickly. I'll step through this and figure out where the best place to detect the connection has failed is. Adding a check to each potential write is a lot of work and probably unnecessary. There's probably something that can be done at the top level. |
After spending quite a lot of time looking into this, I think I'm chasing my own tail. I'm sure I'm missing something. The very first comment contains this line :-
That has to be generated by The stacktrace is related to the login screen being drawn. I didn't think that was possible after we've authenticated and are trying to connect to a session. This looks like two separate problems. To stop even more flailing about on my part, I'm going to ignore the stacktrace for now. Going back to the logging above, we've got this message:-
This is generated here in Lines 1534 to 1540 in 1c33f3d
Following the call stack up:-
After @metalefty - can you check a couple of things with the user:-
|
Yes.
Yes, trying to reproduce with |
An enterprise user reports that they're seeing a problem where xrdp 0.10.1 is stuck in a logging loop.
I haven't reproduced the issue. I guess there are 2 issues.
1. POLLERR/POLLHUP is not checked
2. g_sck_can_send/recv() is not checked
Regarding issue 1, POLLERR/POLLHUP is always checked on FreeBSD but not on GNU/Linux.
Linux
https://man7.org/linux/man-pages/man2/poll.2.html
FreeBSD
https://man.freebsd.org/cgi/man.cgi?query=poll&apropos=0&sektion=0&manpath=FreeBSD+14.1-RELEASE+and+Ports&arch=default&format=html