-
Notifications
You must be signed in to change notification settings - Fork 260
[Pal/lib] Add spinlocks to mbedTLS-specific SSL recv/send #2059
base: master
Are you sure you want to change the base?
Conversation
The common header "spinlock.h" contains spinlock implementation. It may be used in LibOS, PAL, and other parts of Graphene. Previously, it included "api.h" which is PAL-specific. This prevented using "spinlock.h" in PAL-agnostic parts of Graphene such as mbedTLS crypto adapters. This commit removes this dependency and fixes a few emerged include-related bugs.
Linux-SGX PAL wraps all pipe/UNIX domain socket communication in TLS sessions. Previously, Graphene assumed that only one thread at a time accesses one TLS session (i.e., no multi-threading support). This commit adds spinlocks to `lib_SSL*` functions to support such (rare) multi-threading scenarios. Note that we cannot rely on pthreads and/or mutexes so we use simple spinlocks. This commit adds a corresponding LibOS test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)
LibOS/shim/test/regression/Makefile, line 173 at r1 (raw file):
CFLAGS-sigaction_per_process += -pthread CFLAGS-signal_multithread += -pthread CFLAGS-pthread_set_get_affinity += -pthread
FYI: This wasn't sorted, so I just added sorting.
Pal/lib/crypto/adapters/mbedtls_adapter.c, line 388 at r1 (raw file):
* recv() and the other will spin and burn CPU cycles. We consider "shared SSL context" * a rare case and use simple spinlocks instead of mutexes. */ assert(spinlock_is_locked(&ssl_ctx->lock));
FYI: I consider the performance overhead of spinlocks here minimal. We never hit multi-threaded pipes until #2058 (actually, I know that Erlang does the same), so most apps won't notice this. And the apps that do use this multi-threaded pattern... well, I just hope it's not on critical path because these are just pipes for intra-process communication.
Pal/src/pal_internal.h, line 12 at r1 (raw file):
#include <stdarg.h> #include <sys/types.h>
FYI: That's because one function here uses ssize_t
type.
Jenkins, retest Jenkins-Debug please (LTP test |
I think protecting mbedtls_ssl_read and mbedtls_ssl_write with the same lock may cause other problems. Read usually blocks for a long time. If the application is trying to send when recv'ing in another thread, it will cause a deadlock. For example, this test C program runs fine under Linux and Graphene SGX before this PR, but with this PR it will hang. I don't know a real world example like this, but I guess it is possible. Looking at mbedTLS code, read and write does not seem to cause conflict, except when handshake or re-negotiation is pending. And I can't think why an application will call recv() from two threads. Therefore we can probably call mbedtls_ssl_read without a lock. |
The attached test program reads and writes on the same file descriptor from two threads. This looks like an unrealistic scenario. Typically applications create a socket pair or a pipe pair and use one end for recv and another end for send.
I definitely don't like the idea of having |
It seems that calling fork() after a thread have started recv() will also cause deadlock. See: Perhaps we can drop the lock before ocall_recv, then re-acquire it after it returns? This should be fine as long as re-negotiation is disabled. Or, at least we can use spinlock with a timeout and print some warning message, so users can see and report the problem when it happens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
Pal/lib/crypto/adapters/mbedtls_adapter.c, line 417 at r1 (raw file):
* a rare case and use simple spinlocks instead of mutexes. */ assert(spinlock_is_locked(&ssl_ctx->lock)); ssize_t ret = ssl_ctx->pal_send_cb(fd, buf, buf_size);
pal_send_cb eventually calls blocking host system call?
it would cause troubles.
Anyway I don't know about MT-requirement for mbedtls. So I leave this comment non-blocking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can drop the lock before ocall_recv, then re-acquire it after it returns? This should be fine as long as re-negotiation is disabled.
But this doesn't help with the SSL context... mbedTLS internally changes its own metadata before and after the recv()
syscall/ocall. So there can be a situation when two threads try to recv on the same SSL context, and one unlocks the SSL context and blocks on recv()
, and the other thread now can proceed, and observes inconsistent state of the SSL context... and stuff breaks.
I googled around but couldn't find a good solution. I have a feeling that it's just not possible to escape the issue of "blocking recv/send syscall while holding the lock". @mkow @boryspoplawski @yamahata Maybe you know how to circumvent this?
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
Pal/lib/crypto/adapters/mbedtls_adapter.c, line 417 at r1 (raw file):
Previously, yamahata wrote…
pal_send_cb eventually calls blocking host system call?
it would cause troubles.
Anyway I don't know about MT-requirement for mbedtls. So I leave this comment non-blocking.
Yes, it will cause troubles (deadlocks in scenarios @lejunzhu mentioned)... I can't come up with a way to fix this.
I can't think why an application needs to use two recv() simultaneously. If it does happen, perhaps we can add a flag in ssl_ctx indicating the first mbedtls_ssl_read() is already running, and just return fail to the second one before it even enters mbedTLS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see one solution: create two TLS contexts, one for each communication direction. Then either:
recv
changes only context internals, but does not send anything on the socket (like some metadata, idk)recv
also sends some data on the socket.
If it's 1) then we could wrap the very samefd
into both context, otherwise (2) we need to create two separate connections to emulate each way of the original one.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw I assumed mbedTLS does not give us any way to implement this (but I'm not that familiar with it, so this needs to be confirmed).
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- sounds like a risky assumption, I guess TLS may sometimes send some data when receiving, e.g. to rotate symmetric keys, in case of long-lasting connections.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe non-blocking I/O can solve this. Roughly, calls to mbedtls_ssl_handshake(), mbedtls_ssl_read() and mbedtls_ssl_write() will look like this:
while (1) {
lock();
ret = mbedtls_ssl_...(); /* handshake/read/write */
unlock();
if (ret == MBEDTLS_ERR_SSL_WANT_WRITE) {
ocall_poll(stream_fd, POLLOUT); /* wait until we can write */
continue;
} else if (ret == MBEDTLS_ERR_SSL_WANT_READ) {
ocall_poll(stream_fd, POLLIN); /* wait until we can read */
continue;
} else {
break; /* Go on and process the return value as usual */
}
}
Threads will only block at ocall_poll(). All calls to mbedTLS API are serialized, therefore the internal state must be consistent.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I didn't understand Borys's idea. (As Michal mentioned, 1 is a risky assumption so we could go with 2, but I don't really understand 2. Sounds very complicated.)
I like @lejunzhu's workaround. @lejunzhu Do I understand correctly that we do this:
- We always create non-blocking host
stream_fd
for TLS connections - If the user app wants non-blocking pipes/sockets, then nothing to be done additionally
- But if the user app wants blocking pipes/sockets, then we additionally do
ocall_poll()
to wait for data to be available for read/write outside of the lock.
Is my understanding correct? From the first glance, this looks like a good workaround. @boryspoplawski @mkow Any drawbacks to this hack?
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also summoning @yamahata
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @lejunzhu)
Yes, this is what I'm suggesting. |
I worked out a commit llly@e4dc52e based on this PR. It's similar to @lejunzhu 's idea,
I don't know what caused this behavior. |
Description of the changes
Linux-SGX PAL wraps all pipe/UNIX domain socket communication in TLS sessions. Previously, Graphene assumed that only one thread at a time accesses one TLS session (i.e., no multi-threading support). This commit adds spinlocks to
lib_SSL*
functions to support such (rare) multi-threading scenarios.Note that we cannot rely on pthreads and/or mutexes so we use simple spinlocks. Because this is in the "common library" used by PAL and LibOS, with no "complex" dependencies allowed.
Also, there is a prerequisite (first) commit in this PR: remove
api.h
fromspinlock.h
. See commit messages for explanations.Fixes #2058.
How to test this PR?
This PR adds a corresponding LibOS test
pipe_multithread
(kudos to @lejunzhu for the initial version of this test and debugging this).This change is