Updates for locking and atomics:

The regular pile:

  - A few improvements to the mutex code

  - Documentation updates for atomics to clarify the difference between
    cmpxchg() and try_cmpxchg() and to explain the forward progress
    expectations (the two retry idioms are sketched below this list).

  - Simplification of the atomics fallback generator

  - The addition of arch_atomic_long*() variants and generic arch_*()
    bitops based on them.

  - Addition of the missing might_sleep() invocations to the down*()
    operations of semaphores.
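
  To illustrate the cmpxchg() vs. try_cmpxchg() distinction, a minimal
  sketch of the two retry idioms. The compute_new() helper and the
  update_*() wrappers are made up for the example; only the atomic_*()
  operations are the real API:

      /* cmpxchg() returns the old value; the caller has to compare and
       * reload it for the retry. */
      static void update_cmpxchg(atomic_t *v)
      {
              int old = atomic_read(v);

              for (;;) {
                      int new = compute_new(old);
                      int tmp = atomic_cmpxchg(v, old, new);

                      if (tmp == old)
                              break;          /* update succeeded */
                      old = tmp;              /* lost the race, retry */
              }
      }

      /* try_cmpxchg() returns a bool and updates 'old' on failure, which
       * avoids the extra compare and reload in the retry loop. */
      static void update_try_cmpxchg(atomic_t *v)
      {
              int old = atomic_read(v), new;

              do {
                      new = compute_new(old);
              } while (!atomic_try_cmpxchg(v, &old, new));
      }

  Neither loop is wait-free for the individual caller: it retries as long
  as other CPUs keep winning the race, which is the kind of forward
  progress expectation the documentation update spells out.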

The PREEMPT_RT locking core:

  - Scheduler updates to support the state preserving mechanism for
    'sleeping' spin- and rwlocks on RT. This mechanism carefully preserves
    the state of the task when it blocks on a 'sleeping' spin- or rwlock
    and takes regular wake-ups targeted at the same task into account. The
    preserved or updated (via a regular wakeup) state is restored when the
    lock has been acquired.

  - Restructuring of the rtmutex code so it can be utilized and extended
    for the RT specific lock variants.

  - Restructuring of the ww_mutex code to allow sharing of the ww_mutex
    specific functionality for rtmutex based ww_mutexes.

  - Header file disentangling to allow substitution of the regular lock
    implementations with the PREEMPT_RT variants without creating an
    unmaintainable #ifdef mess.

  - Shared base code for the PREEMPT_RT specific rw_semaphore and rwlock
    implementations. Unlike the regular rw_semaphores and rwlocks, the
    PREEMPT_RT implementation is writer unfair because it is infeasible to
    do priority inheritance on multiple readers. Experience over the years
    has shown that real-time workloads are usually not the workloads which
    are sensitive to writer starvation. The alternative of allowing only a
    single reader has been tried and discarded because it is a major
    bottleneck, especially for mmap_sem. Aside from that, many of the
    writer starvation critical usage sites have been converted to a writer
    side mutex/spinlock plus RCU read side protection over the past decade,
    so the issue is less prominent than it used to be.

  - The actual rtmutex based lock substitutions for PREEMPT_RT enabled
    kernels which affect mutex, ww_mutex, rw_semaphore, spinlock_t and
    rwlock_t. The spin/rw_lock*() functions disable migration across the
    critical section to preserve the existing semantics vs. per-CPU
    variables (see the sketch after this list).

  - Rework of the futex REQUEUE_PI mechanism to handle the case of early
    wake-ups which interleave with a re-queue operation, to prevent the
    situation where a task would be blocked on both the rtmutex associated
    with the outer futex and the rtmutex based hash bucket spinlock.

    While this situation cannot happen on !RT enabled kernels, the changes
    make the underlying concurrency problems easier to understand in
    general. As a result the difference between !RT and RT kernels is
    reduced to the handling of waiting for the critical section. !RT
    kernels simply spin-wait as before and RT kernels utilize rcuwait.

  - The substitution of local_lock on PREEMPT_RT with a spinlock which
    protects the critical section while staying preemptible. The CPU
    locality is established by disabling migration (see the sketch at the
    end of this section).
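
  To illustrate the spin/rw_lock*() substitution above, a minimal sketch
  of per-CPU data under a spinlock_t which keeps working unmodified on
  PREEMPT_RT. The stats_lock/stats_count names are invented for the
  example; the locking and per-CPU APIs are the real ones:

      static DEFINE_SPINLOCK(stats_lock);
      static DEFINE_PER_CPU(unsigned long, stats_count);

      static void count_event(void)
      {
              spin_lock(&stats_lock);
              /*
               * !RT: preemption is disabled as before.
               * RT:  the lock is a preemptible, rtmutex based 'sleeping'
               *      lock, but migration is disabled across the critical
               *      section, so the task stays on this CPU and
               *      this_cpu_inc() keeps hitting the same CPU's counter.
               */
              this_cpu_inc(stats_count);
              spin_unlock(&stats_lock);
      }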

  The underlying concepts of this code have been in use in PREEMPT_RT for
  way more than a decade. The code has been refactored several times over
  the years and this final incarnation has been optimized once again to be
  as non-intrusive as possible, i.e. the RT specific parts are mostly
  isolated.

  It has been extensively tested in the 5.14-rt patch series and it has
  been verified that !RT kernels are not affected by these changes.
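
  For the local_lock substitution, a sketch of the usage pattern. The
  event_cache structure and account_event() are invented for the example;
  local_lock_t, INIT_LOCAL_LOCK() and the this_cpu_*() accessors are the
  real API:

      struct event_cache {
              local_lock_t lock;
              unsigned int nr_events;
              u64 last_stamp;
      };

      static DEFINE_PER_CPU(struct event_cache, event_cache) = {
              .lock = INIT_LOCAL_LOCK(lock),
      };

      static void account_event(u64 stamp)
      {
              /*
               * !RT: local_lock() disables preemption on this CPU.
               * RT:  it takes the substituted per-CPU spinlock and
               *      disables migration, so the section stays preemptible
               *      while CPU locality and the consistency of the two
               *      fields are preserved.
               */
              local_lock(&event_cache.lock);
              this_cpu_inc(event_cache.nr_events);
              this_cpu_write(event_cache.last_stamp, stamp);
              local_unlock(&event_cache.lock);
      }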

locking/rtmutex: Return success on deadlock for ww_mutex waiters

ww_mutexes can legitimately cause a deadlock situation in the lock graph
which is resolved afterwards by the wait/wound mechanics. The rtmutex chain
walk can detect such a deadlock and returns EDEADLK, which in turn skips
the wait/wound mechanism and returns EDEADLK to the caller. That's wrong
because both lock chains might get EDEADLK or the wrong waiter would back
out.

Detect that situation and return 'success' in case the waiter which
initiated the chain walk is a ww_mutex waiter with a context. This allows
the wait/wound mechanics to resolve the situation according to the rules.
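
A sketch of the idea (paraphrased, not the actual hunk): report_deadlock()
is a made-up helper standing in for the point where the chain walk has
detected a cycle, and it assumes the waiter carries the ww_ctx pointer
added by the commit referenced in the Fixes tag below.

    static int report_deadlock(struct rt_mutex_waiter *orig_waiter)
    {
            /*
             * A genuine cycle was found. If the waiter which initiated
             * the chain walk is a ww_mutex waiter with a context, do not
             * report EDEADLK. Return success and let the wait/wound
             * machinery pick the victim according to its rules.
             */
            if (orig_waiter && orig_waiter->ww_ctx)
                    return 0;

            return -EDEADLK;
    }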

[ tglx: Split it apart and added changelog ]

Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
Fixes: add461325ec5 ("locking/rtmutex: Extend the rtmutex core to support ww_mutex")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/YSeWjCHoK4v5OcOt@hirez.programming.kicks-ass.net