
reschedule not done after mutex unlock #20802

Closed
pdunaj opened this issue Nov 19, 2019 · 7 comments · Fixed by #20919
Labels: area: Kernel, bug

Comments

pdunaj (Collaborator) commented Nov 19, 2019

Describe the bug
I was preparing some training material and noticed behavior that I believe is a bug.
I have three threads; two are cooperative. The higher-priority one is waiting for a mutex and the lower-priority one is holding it. After the mutex is released, the low-priority thread continues executing until it sleeps. If the mutex is instead held by a preemptive thread, releasing it causes an immediate swap to the high-priority thread.

[00:00:00.000,000] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 0 locked
[00:00:00.977,569] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 0 unlocked
[00:00:00.977,600] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 0 waiting <--- [-2] sleeps
[00:00:00.977,630] <inf> sched_demo: thread 0x200103bc: id:1 prio:-1 ... 0 locked <--- [-1] has the lock
[00:00:01.955,108] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 0 waited <--- [-1] is sleeping; [-2] is done sleeping and will execute until mutex_lock
[00:00:01.955,139] <inf> sched_demo: thread 0x200103bc: id:1 prio:-1 ... 0 unlocked  <--- [-1] gets back the CPU as [-2] is waiting for the lock; [-1] releases the lock and continues execution
[00:00:01.955,169] <inf> sched_demo: thread 0x200103bc: id:1 prio:-1 ... 0 waiting <--- continues until sleep
[00:00:01.955,200] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 1 locked <--- [-1] went to sleep so [-2] can take over the CPU
[00:00:02.932,647] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 1 unlocked
[00:00:02.932,678] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 1 waiting
[00:00:02.932,708] <inf> sched_demo: thread 0x200103bc: id:1 prio:-1 ... 0 waited
[00:00:02.932,739] <inf> sched_demo: thread 0x20010340: id:0 prio:1 ... 0 locked
[00:00:03.910,186] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 1 waited
[00:00:03.910,217] <inf> sched_demo: thread 0x20010340: id:0 prio:1 ... 0 unlocked
[00:00:03.910,247] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 2 locked
[00:00:03.910,278] <inf> sched_demo: thread 0x20010340: id:0 prio:1 ... 0 waiting
[00:00:04.887,725] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 2 unlocked
[00:00:04.887,756] <inf> sched_demo: thread 0x20010438: id:2 prio:-2 ... 2 waiting
[00:00:04.887,786] <inf> sched_demo: thread 0x200103bc: id:1 prio:-1 ... 1 locked

To Reproduce
Steps to reproduce the behavior:
Run the simple demo code posted in the comments below.

Expected behavior
The high-priority thread should be swapped in as soon as the lock is released, i.e. k_mutex_unlock() must be a scheduling point!

Impact
Probably a showstopper.

Screenshots or console output
N/A

Environment (please complete the following information):
master

Additional context
N/A

pdunaj added the "bug" label Nov 19, 2019
pdunaj (Collaborator, Author) commented Nov 19, 2019

static void thread_fn(void *p1, void *p2, void *p3)
{
        size_t id                    = (size_t)p1;
        struct thread_config *thread = p2;

        int    prio             = thread->prio;
        s32_t  locked_time_ms   = thread->locked_time_ms;
        s32_t  unlocked_time_ms = thread->unlocked_time_ms;

        size_t i = 0;

        while (true) {
                /* Resource is locked */
                k_mutex_lock(&mutex, K_FOREVER);

                LOG_INF("thread %p: id:%zu prio:%d ... %zu locked",
                        k_current_get(), id, prio, i);

                k_sleep(locked_time_ms);

                LOG_INF("thread %p: id:%zu prio:%d ... %zu unlocked",
                        k_current_get(), id, prio, i);

                k_mutex_unlock(&mutex);

                /* Thread not holding resources */
                LOG_INF("thread %p: id:%zu prio:%d ... %zu waiting",
                        k_current_get(), id, prio, i);

                k_sleep(unlocked_time_ms);

                LOG_INF("thread %p: id:%zu prio:%d ... %zu waited",
                        k_current_get(), id, prio, i);

                i++;
        }
}
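
[Editor's note: the snippet above references a mutex, a struct thread_config, and the sched_demo log module that are not shown in the issue. The following is a rough sketch of the assumed surrounding setup, not part of the original report; field names are inferred from thread_fn, and stack sizes, timings, and priorities are illustrative, loosely matching the log output.]

/* Hypothetical context for thread_fn above -- not from the issue.
 * Place thread_fn between these definitions and main().
 */
#include <zephyr.h>
#include <logging/log.h>

LOG_MODULE_REGISTER(sched_demo);

K_MUTEX_DEFINE(mutex);

struct thread_config {
        int   prio;
        s32_t locked_time_ms;
        s32_t unlocked_time_ms;
};

/* Illustrative values only; the log shows prios 1, -1 and -2. */
static struct thread_config cfg[] = {
        { .prio =  1, .locked_time_ms = 1000, .unlocked_time_ms = 1000 },
        { .prio = -1, .locked_time_ms = 1000, .unlocked_time_ms = 1000 },
        { .prio = -2, .locked_time_ms = 1000, .unlocked_time_ms = 1000 },
};

K_THREAD_STACK_ARRAY_DEFINE(stacks, 3, 1024);
static struct k_thread threads[3];

void main(void)
{
        for (size_t i = 0; i < ARRAY_SIZE(cfg); i++) {
                k_thread_create(&threads[i], stacks[i],
                                K_THREAD_STACK_SIZEOF(stacks[i]),
                                thread_fn, (void *)i, &cfg[i], NULL,
                                cfg[i].prio, 0, K_NO_WAIT);
        }
}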

pdunaj (Collaborator, Author) commented Nov 19, 2019

@carlescufi, @ioannisg

dleach02 added the "priority: high" label Nov 19, 2019
andyross (Contributor) commented Nov 20, 2019

Indeed, there's no reschedule done inside k_mutex_unlock(). I'm not sure this is a bug, though. Note that k_sem_give() has exactly the same behavior, as does k_poll_signal_raise() and a bunch of other "write-like" IPC APIs we have.

But not all: I note that an unblocking insert to a queue (e.g. k_lifo/k_fifo) does do a synchronous reschedule. We've never been good about documenting what our reschedule points are.
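
[Editor's note: to make the contrast concrete, here is a hedged sketch, not from the issue; names are hypothetical. It assumes a higher-priority cooperative thread is pending on each object, and mirrors the two behaviors described in the comment above.]

K_SEM_DEFINE(sem, 0, 1);
K_FIFO_DEFINE(fifo);

struct msg {
        void *fifo_reserved; /* first word is reserved for the kernel */
        int payload;
};
static struct msg item;

static void low_prio_coop_thread(void)
{
        /* Readies the high-prio waiter but, per the comment above, is
         * not a reschedule point: this thread keeps the CPU until it
         * sleeps or yields. */
        k_sem_give(&sem);

        /* Unblocking queue insert: per the comment above, this one does
         * a synchronous reschedule, so the waiter runs immediately. */
        k_fifo_put(&fifo, &item);
}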

As far as k_mutex goes, I suspect the reason it's done this way is performance. There's a long-standing controversy in the industry about "fair" locks. IIRC, the last time this came up it was pointed out that default POSIX mutexes on OS X are fair (i.e. they have the behavior you want and always yield if needed) while Linux/glibc ones are not, which is why the latter can minimize the number of context switches and thus blow Apple away in routine benchmarks.

I officially take no position here. If there's desire for the feature, this small patch should work to do it. It doesn't break existing tests, but I haven't tried it with your example above yet.

(Note that it also fixes what seem like really obvious races here -- it's changing mutex state after readying the thread, which means that an SMP core or interrupt might run it before the mutex is correct. I'll submit those separately once we decide what to do here.)

--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -233,18 +233,15 @@ void z_impl_k_mutex_unlock(struct k_mutex *mutex)
                mutex, new_owner, new_owner ? new_owner->base.prio : -1000);
 
        if (new_owner != NULL) {
-               z_ready_thread(new_owner);
-
-               k_spin_unlock(&lock, key);
-
-               arch_thread_return_value_set(new_owner, 0);
-
                /*
                 * new owner is already of higher or equal prio than first
                 * waiter since the wait queue is priority-based: no need to
                 * ajust its priority
                 */
                mutex->owner_orig_prio = new_owner->base.prio;
+               arch_thread_return_value_set(new_owner, 0);
+               z_ready_thread(new_owner);
+               z_reschedule(&lock, key);
        } else {
                mutex->lock_count = 0U;
                k_spin_unlock(&lock, key);

andrewboie (Contributor) commented Nov 20, 2019

My vote is to have a reschedule point here. k_mutex should be considered a special case like the Linux RT-Mutex, apart from the other IPC primitives, which have more traditional wait/wake semantics and aren't supposed to be priority-inheriting.
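
[Editor's note: for background, k_mutex implements priority inheritance, meaning a low-priority holder is temporarily boosted while a higher-priority thread waits on the lock. A minimal illustrative sketch with hypothetical priorities, not from the thread:]

K_MUTEX_DEFINE(pi_mutex);

static void low_prio_holder(void) /* e.g. preemptible, prio 5 */
{
        k_mutex_lock(&pi_mutex, K_FOREVER);

        /* If a prio-1 thread now blocks on pi_mutex, this thread is
         * boosted to prio 1 so mid-priority threads cannot starve it
         * while it holds the lock. */
        k_busy_wait(1000);

        /* The boost ends here; this issue is about whether unlock is
         * also the point where the scheduler swaps the waiter in. */
        k_mutex_unlock(&pi_mutex);
}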

aescolar added the "priority: medium" label and removed the "priority: high" label Nov 21, 2019
aescolar (Member) commented

Lowering priority. Based on @andyross's evaluation, even considering this a medium bug may be too much.

andyross pushed a commit to andyross/zephyr that referenced this issue Nov 21, 2019
The k_mutex is a priority-inheriting mutex, so on unlock it's possible
that a thread's priority will be lowered.  Make this a reschedule
point so that reasoning about thread priorities is easier (possibly at
the cost of performance): most users are going to expect that the
priority elevation stops at exactly the moment of unlock.

Note that this also reorders the code to fix what appear to be obvious
race conditions.  After the call to z_ready_thread(), that thread may
be run (e.g. by an interrupt preemption or on another SMP core), yet
the return value and mutex weren't correctly set yet.  The spinlock
was also prematurely released.

Fixes zephyrproject-rtos#20802

Signed-off-by: Andy Ross <[email protected]>
aescolar added the "has-pr" label and removed the "bug" and "priority: medium" labels Nov 25, 2019
pabigot (Collaborator) commented Dec 1, 2019

While this is currently labeled a question, it is blocking for #18970, as described in #20919 (review): until it's answered I can't identify the conditions under which a cooperative thread may transition from running to another state.

andrewboie (Contributor) commented

I think we're converging on a consensus that this is working as designed for co-op threads (even if we add a z_reschedule(), it's a no-op if the caller is coop) but definitely broken for preempt threads.
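
[Editor's note: as a practical workaround sketch, not something proposed in this thread: a cooperative thread that wants the unlock to hand over the CPU immediately can follow it with an explicit yield, since k_yield() is a reschedule point even for coop threads.]

        k_mutex_unlock(&mutex);
        k_yield(); /* a ready higher-priority waiter is swapped in here */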

nashif added the "bug" label and removed the "question" label Dec 3, 2019
dleach02 pushed a commit that referenced this issue Dec 3, 2019
(Same commit message as quoted above; fixes #20802.)