lglock - local/global locks for mostly local access patterns
------------------------------------------------------------

Origin: Nick Piggin's VFS scalability series introduced during
	2.6.35++ [1] [2]
Location: kernel/locking/lglock.c
	include/linux/lglock.h
Users: currently only the VFS and stop_machine related code

Design Goal:
------------

Improve the scalability of globally used large data sets that are
distributed over all CPUs as per_cpu elements.

lglock is used to manage global data structures that are partitioned
across all CPUs as per_cpu elements but can mostly be handled through
CPU-local actions: the majority of accesses are CPU-local reads,
writes are occasional and also CPU-local, and global write access is
very infrequent.


 * deal with things locally whenever possible
     - very fast access to the local per_cpu data
     - reasonably fast access to specific per_cpu data on a different
       CPU
 * while making global action possible when needed
     - by expensive access to all CPUs' locks - effectively
       resulting in a globally visible critical section.

Design:
-------

Basically it is an array of per_cpu spinlocks: lg_local_lock/unlock
access the local CPU's lock object and lg_local_lock_cpu/unlock_cpu
access a remote CPU's lock object. lg_local_lock has to disable
preemption as migration protection so that the reference to the local
CPU's lock does not go out of scope. Because lg_local_lock/unlock
only touch CPU-local resources they are fast. Taking the local lock
of a different CPU is more expensive but still relatively cheap.

One can relax the migration constraints by acquiring the current
CPU's lock with lg_local_lock_cpu, remembering the CPU, and releasing
that lock at the end of the critical section even after migrating.
This gives most of the performance benefit without inhibiting
migration, but it needs careful consideration of lglock nesting and
of possible deadlocks with lg_global_lock.

The lg_global_lock/unlock primitives lock all underlying spinlocks of
all possible CPUs (including those that are off-line). The preemption
disable/enable is needed on non-RT kernels to prevent deadlocks like:

			on cpu 1

	task A          task B
	lg_global_lock
	  got cpu 0 lock
		<<<< preempt <<<<
			lg_local_lock_cpu for cpu 0
			  spin on cpu 0 lock

On -RT this deadlock scenario is resolved by replacing the
arch_spin_locks in the lglocks with rt_mutexes, which resolve the
above deadlock by boosting the lock-holder.


Implementation:
---------------

The initial lglock implementation from Nick Piggin used some complex
macros to generate the lglock/brlock in lglock.h - they were later
turned into a set of functions by Andi Kleen [7]. The change to
functions was motivated by the presence of multiple lock users and by
functions being easier to maintain than the generating macros. This
change to functions is also the basis for eliminating the restriction
that lglocks cannot be declared in kernel modules (the remaining
problem is that the locks are not explicitly initialized - see
lockdep-design.txt).

Declaration and initialization:
-------------------------------

  #include <linux/lglock.h>

  DEFINE_LGLOCK(name)
  or:
  DEFINE_STATIC_LGLOCK(name);

  lg_lock_init(&name, "lockdep_name_string");

On UP this is mapped to DEFINE_SPINLOCK(name) in both cases. Note
also that as of 3.18-rc6 all declarations in use are of the _STATIC_
variant (and it seems that the non-static variant was never used).
lg_lock_init initializes only the lockdep map.

Usage:
------

Semantically it is a spinlock; it could be called a locality-aware
spinlock. lg_local_* behaves like a per_cpu spinlock and lg_global_*
like a global spinlock. No surprises in the API.

  lg_local_lock(*lglock);
     access to protected per_cpu object on this CPU
  lg_local_unlock(*lglock);

  lg_local_lock_cpu(*lglock, cpu);
     access to protected per_cpu object on other CPU cpu
  lg_local_unlock_cpu(*lglock, cpu);

  lg_global_lock(*lglock);
     access all protected per_cpu objects on all CPUs
  lg_global_unlock(*lglock);

There are no _trylock variants of the lglocks.

Note that lg_global_lock/unlock has to iterate over all possible
CPUs rather than just the CPUs actually present - otherwise a CPU
could go off-line while holding a lock [4] - and that makes it very
expensive. A discussion of these issues can be found at [5].

Constraints:
------------

  * currently the declaration of lglocks in kernel modules is not
    possible, though this should be doable with little change.
  * lglocks are not recursive.
  * suitable for code that can do most operations on the CPU-local
    data and will very rarely need the global lock
  * lg_global_lock/unlock is *very* expensive and does not scale
  * on UP systems all lg_* primitives are simply spinlocks
  * in PREEMPT_RT the spinlock becomes an rt_mutex and can sleep but
    does not change the task's state while sleeping [6].
  * in PREEMPT_RT the preempt_disable/enable in lg_local_lock/unlock
    is downgraded to a migrate_disable/enable; the other
    preempt_disable/enable pairs are downgraded to barriers [6].
    The deadlock noted for non-RT above is resolved due to rt_mutexes
    boosting the lock-holder in this case, which arch_spin_locks do
    not do.

lglocks were designed for very specific problems in the VFS and are
probably only the right answer in those corner cases. Any new user
considering lglocks should probably look at the seqlock and RCU
alternatives first. There are also efforts to resolve the RCU issues
that currently prevent using RCU in place of the few remaining
lglocks.

Note on brlock history:
-----------------------

The 'Big Reader' read-write spinlocks were originally introduced by
Ingo Molnar in 2000 (2.4/2.5 kernel series) and removed in 2003. They
were later reintroduced by the VFS scalability patch set in the 2.6
series as the "big reader lock" brlock [2] variant of lglock, which
was replaced by seqlock or RCU based primitives in the 3.13 kernel
series, as had been suggested in [3] in 2003. The brlock was entirely
removed in the 3.13 kernel series.

Link: 1 http://lkml.org/lkml/2010/8/2/81
Link: 2 http://lwn.net/Articles/401738/
Link: 3 http://lkml.org/lkml/2003/3/9/205
Link: 4 https://lkml.org/lkml/2011/8/24/185
Link: 5 http://lkml.org/lkml/2011/12/18/189
Link: 6 https://www.kernel.org/pub/linux/kernel/projects/rt/
	patch series - lglocks-rt.patch.patch
Link: 7 http://lkml.org/lkml/2012/3/5/26