Occasionally, my sshj client hits a timeout exception almost as soon as the main thread starts waiting for the key exchange to complete, even though the timeout has not been reached.
Based on the logs and code, this can happen during the following sequence of events between the main thread and the reader thread:
[2023-11-06T21:59:52,561][DEBUG][net.schmizz.sshj.transport.KeyExchanger][sshj-Reader-localhost/127.0.0.1:22-1699307992560][KeyExchanger.handle] Received SSH_MSG_KEXINIT
[2023-11-06T21:59:52,561][DEBUG][net.schmizz.sshj.transport.KeyExchanger][sshj-Reader-localhost/127.0.0.1:22-1699307992560][KeyExchanger.startKex] Initiating key exchange
[2023-11-06T21:59:52,562][DEBUG][net.schmizz.concurrent.Promise][Test worker][Promise.tryRetrieve] Awaiting <<kex done>>
[2023-11-06T21:59:52,562][DEBUG][net.schmizz.concurrent.Promise][sshj-Reader-localhost/127.0.0.1:22-1699307992560][Promise.deliver] Setting <<kex done>> to `null`
This is a race condition. What happens under the covers is:
1. The key exchange is initiated by the reader thread when it receives `SSH_MSG_KEXINIT` from the server, before the main thread has a chance to initiate that exchange.
2. Before the reader thread calls `done.clear()` in `KeyExchanger.startKex`, the main thread also calls `KeyExchanger.startKex`, skips the key exchange initiation (because it's already ongoing) and calls `done.await` with a timeout to wait for the exchange to complete.
3. `done.await` calls `Promise.retrieve`, which calls `Promise.tryRetrieve`, which waits for the associated condition via `cond.await`.
4. The reader thread calls `done.clear()`, which delivers a value of `null` to the `done` promise.
5. This causes `cond.await` to wake up the main thread, and `Promise.tryRetrieve` returns this `null` value.
6. `Promise.retrieve` throws a timeout exception because the retrieved value is `null`, even though there wasn't a timeout.
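
The sequence above can be reproduced in isolation with a small stand-alone model. This is only a sketch: `SimplePromise`, `hasWaiter`, `Result` and `run` are hypothetical names made up for illustration, and the class is a simplified stand-in for the behaviour described in this issue, not sshj's actual `Promise` code. In the sketch, the `waiter` thread plays the role of the main thread and the thread calling `run()` plays the role of the reader thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class SpuriousTimeoutSketch {

    // Hypothetical, simplified stand-in for a promise; NOT sshj's actual class.
    static final class SimplePromise<V> {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition cond = lock.newCondition();
        private V val;

        // Stores the value and wakes every waiter, even when the value is null.
        void deliver(V v) {
            lock.lock();
            try {
                val = v;
                cond.signalAll();
            } finally {
                lock.unlock();
            }
        }

        // Resetting the promise is modelled as delivering null.
        void clear() {
            deliver(null);
        }

        // Waits at most once; returns null on timeout OR when woken without a value.
        V tryRetrieve(long timeout, TimeUnit unit) throws InterruptedException {
            lock.lock();
            try {
                if (val != null) return val;
                if (!cond.await(timeout, unit)) return null; // genuine timeout
                return val; // may still be null if clear() woke us up
            } finally {
                lock.unlock();
            }
        }

        // Treats every null as a timeout, which is the problem being illustrated.
        V retrieve(long timeout, TimeUnit unit) throws InterruptedException, TimeoutException {
            V v = tryRetrieve(timeout, unit);
            if (v == null) throw new TimeoutException("Timeout expired");
            return v;
        }

        // Test helper: true once some thread is parked in cond.await.
        boolean hasWaiter() {
            lock.lock();
            try {
                return lock.hasWaiters(cond);
            } finally {
                lock.unlock();
            }
        }
    }

    static final class Result {
        final boolean timedOut;
        final long elapsedMillis;

        Result(boolean timedOut, long elapsedMillis) {
            this.timedOut = timedOut;
            this.elapsedMillis = elapsedMillis;
        }
    }

    static Result run() throws InterruptedException {
        SimplePromise<Boolean> done = new SimplePromise<>();
        boolean[] timedOut = {false};
        long[] elapsed = {0};

        // Plays the main thread: waits for "kex done" with a generous 30 s timeout.
        Thread waiter = new Thread(() -> {
            long start = System.nanoTime();
            try {
                done.retrieve(30, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                timedOut[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            elapsed[0] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        });
        waiter.start();

        // Plays the reader thread: calls clear() only after the waiter is parked.
        while (waiter.isAlive() && !done.hasWaiter()) {
            Thread.sleep(1);
        }
        done.clear();
        waiter.join();
        return new Result(timedOut[0], elapsed[0]);
    }

    public static void main(String[] args) throws InterruptedException {
        Result r = run();
        System.out.println("timedOut=" + r.timedOut + ", elapsedMillis=" + r.elapsedMillis);
    }
}
```

In this model the waiter reports a timeout long before the 30 second limit, because `retrieve` cannot tell a `null` delivered by `clear()` apart from a `null` returned after the wait actually expired.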