Merge branches 'doc.2023.12.13a', 'torture.2023.11.23a', 'fixes.2023.12.13a', 'rcu-tasks.2023.12.12b' and 'srcu.2023.12.13a' into rcu-merge.2023.12.13a
diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index bd3c58c..2d42998 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -241,15 +241,22 @@
srcu_struct. The rules for the expedited RCU grace-period-wait
primitives are the same as for their non-expedited counterparts.
- If the updater uses call_rcu_tasks() or synchronize_rcu_tasks(),
- then the readers must refrain from executing voluntary
- context switches, that is, from blocking. If the updater uses
- call_rcu_tasks_trace() or synchronize_rcu_tasks_trace(), then
- the corresponding readers must use rcu_read_lock_trace() and
- rcu_read_unlock_trace(). If an updater uses call_rcu_tasks_rude()
- or synchronize_rcu_tasks_rude(), then the corresponding readers
- must use anything that disables preemption, for example,
- preempt_disable() and preempt_enable().
+ Similarly, it is necessary to correctly use the RCU Tasks flavors:
+
+ a. If the updater uses synchronize_rcu_tasks() or
+ call_rcu_tasks(), then the readers must refrain from
+ executing voluntary context switches, that is, from
+ blocking.
+
+ b. If the updater uses call_rcu_tasks_trace()
+ or synchronize_rcu_tasks_trace(), then the
+ corresponding readers must use rcu_read_lock_trace()
+ and rcu_read_unlock_trace().
+
+ c. If an updater uses call_rcu_tasks_rude() or
+ synchronize_rcu_tasks_rude(), then the corresponding
+ readers must use anything that disables preemption,
+ for example, preempt_disable() and preempt_enable().
Mixing things up will result in confusion and broken kernels, and
has even resulted in an exploitable security issue. Therefore,
diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst
index 3b739f6..659d591 100644
--- a/Documentation/RCU/rcu_dereference.rst
+++ b/Documentation/RCU/rcu_dereference.rst
@@ -3,13 +3,26 @@
PROPER CARE AND FEEDING OF RETURN VALUES FROM rcu_dereference()
===============================================================
-Most of the time, you can use values from rcu_dereference() or one of
-the similar primitives without worries. Dereferencing (prefix "*"),
-field selection ("->"), assignment ("="), address-of ("&"), addition and
-subtraction of constants, and casts all work quite naturally and safely.
+Proper care and feeding of address and data dependencies is critically
+important to correct use of things like RCU. To this end, the pointers
+returned from the rcu_dereference() family of primitives carry address and
+data dependencies. These dependencies extend from the rcu_dereference()
+macro's load of the pointer to the later use of that pointer to compute
+either the address of a later memory access (representing an address
+dependency) or the value written by a later memory access (representing
+a data dependency).
-It is nevertheless possible to get into trouble with other operations.
-Follow these rules to keep your RCU code working properly:
+Most of the time, these dependencies are preserved, permitting you to
+freely use values from rcu_dereference(). For example, dereferencing
+(prefix "*"), field selection ("->"), assignment ("="), address-of
+("&"), casts, and addition or subtraction of constants all work quite
+naturally and safely. However, because current compilers do not take
+either address or data dependencies into account it is still possible
+to get into trouble.
+
+Follow these rules to preserve the address and data dependencies emanating
+from your calls to rcu_dereference() and friends, thus keeping your RCU
+readers working properly:
- You must use one of the rcu_dereference() family of primitives
to load an RCU-protected pointer, otherwise CONFIG_PROVE_RCU
diff --git a/Documentation/RCU/torture.rst b/Documentation/RCU/torture.rst
index b3b6dfa..49e7bee 100644
--- a/Documentation/RCU/torture.rst
+++ b/Documentation/RCU/torture.rst
@@ -185,7 +185,7 @@
Not all changes require that all scenarios be run. For example, a change
to Tree SRCU might run only the SRCU-N and SRCU-P scenarios using the
--configs argument to kvm.sh as follows: "--configs 'SRCU-N SRCU-P'".
-Large systems can run multiple copies of of the full set of scenarios,
+Large systems can run multiple copies of the full set of scenarios,
for example, a system with 448 hardware threads can run five instances
of the full set concurrently. To make this happen::
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65731b0..b72e204 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5302,6 +5302,12 @@
Dump ftrace buffer after reporting RCU CPU
stall warning.
+ rcupdate.rcu_cpu_stall_notifiers= [KNL]
+ Provide RCU CPU stall notifiers, but see the
+ warnings in the RCU_CPU_STALL_NOTIFIER Kconfig
+ option's help text. TL;DR: You almost certainly
+ do not want rcupdate.rcu_cpu_stall_notifiers.
+
rcupdate.rcu_cpu_stall_suppress= [KNL]
Suppress RCU CPU stall warning messages.
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index d414e14..4202174 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -396,10 +396,11 @@
(2) Address-dependency barriers (historical).
- [!] This section is marked as HISTORICAL: For more up-to-date
- information, including how compiler transformations related to pointer
- comparisons can sometimes cause problems, see
- Documentation/RCU/rcu_dereference.rst.
+ [!] This section is marked as HISTORICAL: it covers the long-obsolete
+ smp_read_barrier_depends() macro, the semantics of which are now
+ implicit in all marked accesses. For more up-to-date information,
+ including how compiler transformations can sometimes break address
+ dependencies, see Documentation/RCU/rcu_dereference.rst.
An address-dependency barrier is a weaker form of read barrier. In the
case where two loads are performed such that the second depends on the
@@ -560,9 +561,11 @@
ADDRESS-DEPENDENCY BARRIERS (HISTORICAL)
----------------------------------------
-[!] This section is marked as HISTORICAL: For more up-to-date information,
-including how compiler transformations related to pointer comparisons can
-sometimes cause problems, see Documentation/RCU/rcu_dereference.rst.
+[!] This section is marked as HISTORICAL: it covers the long-obsolete
+smp_read_barrier_depends() macro, the semantics of which are now implicit
+in all marked accesses. For more up-to-date information, including
+how compiler transformations can sometimes break address dependencies,
+see Documentation/RCU/rcu_dereference.rst.
As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for
DEC Alpha, which means that about the only people who need to pay attention
diff --git a/include/linux/rcu_notifier.h b/include/linux/rcu_notifier.h
index ebf3713..5640f02 100644
--- a/include/linux/rcu_notifier.h
+++ b/include/linux/rcu_notifier.h
@@ -13,7 +13,7 @@
#define RCU_STALL_NOTIFY_NORM 1
#define RCU_STALL_NOTIFY_EXP 2
-#ifdef CONFIG_RCU_STALL_COMMON
+#if defined(CONFIG_RCU_STALL_COMMON) && defined(CONFIG_RCU_CPU_STALL_NOTIFIER)
#include <linux/notifier.h>
#include <linux/types.h>
@@ -21,12 +21,12 @@
int rcu_stall_chain_notifier_register(struct notifier_block *n);
int rcu_stall_chain_notifier_unregister(struct notifier_block *n);
-#else // #ifdef CONFIG_RCU_STALL_COMMON
+#else // #if defined(CONFIG_RCU_STALL_COMMON) && defined(CONFIG_RCU_CPU_STALL_NOTIFIER)
// No RCU CPU stall warnings in Tiny RCU.
static inline int rcu_stall_chain_notifier_register(struct notifier_block *n) { return -EEXIST; }
static inline int rcu_stall_chain_notifier_unregister(struct notifier_block *n) { return -ENOENT; }
-#endif // #else // #ifdef CONFIG_RCU_STALL_COMMON
+#endif // #else // #if defined(CONFIG_RCU_STALL_COMMON) && defined(CONFIG_RCU_CPU_STALL_NOTIFIER)
#endif /* __LINUX_RCU_NOTIFIER_H */
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index d29740b..3dc1e58 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -355,7 +355,7 @@
})
/**
- * list_next_or_null_rcu - get the first element from a list
+ * list_next_or_null_rcu - get the next element from a list
* @head: the head for the list.
* @ptr: the list head to take the next element from.
* @type: the type of the struct this is embedded in.
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index f7206b2..0746b1b 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -34,9 +34,6 @@
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
-#define ulong2long(a) (*(long *)(&(a)))
-#define USHORT_CMP_GE(a, b) (USHRT_MAX / 2 >= (unsigned short)((a) - (b)))
-#define USHORT_CMP_LT(a, b) (USHRT_MAX / 2 < (unsigned short)((a) - (b)))
/* Exported common interfaces */
void call_rcu(struct rcu_head *head, rcu_callback_t func);
@@ -301,6 +298,11 @@
lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
}
+static inline void rcu_try_lock_acquire(struct lockdep_map *map)
+{
+ lock_acquire(map, 0, 1, 2, 0, NULL, _THIS_IP_);
+}
+
static inline void rcu_lock_release(struct lockdep_map *map)
{
lock_release(map, _THIS_IP_);
@@ -315,6 +317,7 @@
#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
# define rcu_lock_acquire(a) do { } while (0)
+# define rcu_try_lock_acquire(a) do { } while (0)
# define rcu_lock_release(a) do { } while (0)
static inline int rcu_read_lock_held(void)
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 127ef3b..236610e 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -229,7 +229,7 @@
srcu_check_nmi_safety(ssp, true);
retval = __srcu_read_lock_nmisafe(ssp);
- rcu_lock_acquire(&ssp->dep_map);
+ rcu_try_lock_acquire(&ssp->dep_map);
return retval;
}
diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 2984de6..9b0b52e 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -105,6 +105,31 @@
The boot option rcupdate.rcu_cpu_stall_cputime has the same function
as this one, but will override this if it exists.
+config RCU_CPU_STALL_NOTIFIER
+ bool "Provide RCU CPU-stall notifiers"
+ depends on RCU_STALL_COMMON
+ depends on DEBUG_KERNEL
+ depends on RCU_EXPERT
+ default n
+ help
+ WARNING: You almost certainly do not want this!!!
+
+ Enable RCU CPU-stall notifiers, which are invoked just before
+ printing the RCU CPU stall warning. As such, bugs in notifier
+ callbacks can prevent stall warnings from being printed.
+ And the whole reason that a stall warning is being printed is
+ that something is hung up somewhere. Therefore, the notifier
+ callbacks must be written extremely carefully, preferably
+ containing only lockless code. After all, it is quite possible
+ that the whole reason that the RCU CPU stall is happening in
+ the first place is that someone forgot to release whatever lock
+ that you are thinking of acquiring. In which case, having your
+ notifier callback acquire that lock will hang, preventing the
+ RCU CPU stall warning from appearing.
+
+ Say Y here if you want RCU CPU stall notifiers (you don't want them)
+ Say N if you are unsure.
+
config RCU_TRACE
bool "Enable tracing for RCU"
depends on DEBUG_KERNEL
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index b531c33..f94f658 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -262,6 +262,8 @@
return rcu_cpu_stall_suppress_at_boot && !rcu_inkernel_boot_has_ended();
}
+extern int rcu_cpu_stall_notifiers;
+
#ifdef CONFIG_RCU_STALL_COMMON
extern int rcu_cpu_stall_ftrace_dump;
@@ -659,10 +661,10 @@
bool rcu_cpu_beenfullyonline(int cpu);
#endif
-#ifdef CONFIG_RCU_STALL_COMMON
+#if defined(CONFIG_RCU_STALL_COMMON) && defined(CONFIG_RCU_CPU_STALL_NOTIFIER)
int rcu_stall_notifier_call_chain(unsigned long val, void *v);
-#else // #ifdef CONFIG_RCU_STALL_COMMON
+#else // #if defined(CONFIG_RCU_STALL_COMMON) && defined(CONFIG_RCU_CPU_STALL_NOTIFIER)
static inline int rcu_stall_notifier_call_chain(unsigned long val, void *v) { return NOTIFY_DONE; }
-#endif // #else // #ifdef CONFIG_RCU_STALL_COMMON
+#endif // #else // #if defined(CONFIG_RCU_STALL_COMMON) && defined(CONFIG_RCU_CPU_STALL_NOTIFIER)
#endif /* __LINUX_RCU_H */
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index a0b2520..7567ca8 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2450,10 +2450,12 @@
unsigned long stop_at;
VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
- ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
- if (ret)
- pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
- __func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
+ if (rcu_cpu_stall_notifiers) {
+ ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
+ if (ret)
+ pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
+ __func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
+ }
if (stall_cpu_holdoff > 0) {
VERBOSE_TOROUT_STRING("rcu_torture_stall begin holdoff");
schedule_timeout_interruptible(stall_cpu_holdoff * HZ);
@@ -2497,7 +2499,7 @@
cur_ops->readunlock(idx);
}
pr_alert("%s end.\n", __func__);
- if (!ret) {
+ if (rcu_cpu_stall_notifiers && !ret) {
ret = rcu_stall_chain_notifier_unregister(&rcu_torture_stall_block);
if (ret)
pr_info("%s: rcu_stall_chain_notifier_unregister() returned %d.\n", __func__, ret);
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 560e99e..0351a4e 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -772,20 +772,10 @@
*/
static void srcu_gp_start(struct srcu_struct *ssp)
{
- struct srcu_data *sdp;
int state;
- if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
- sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
- else
- sdp = this_cpu_ptr(ssp->sda);
lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
- spin_lock_rcu_node(sdp); /* Interrupts already disabled. */
- rcu_segcblist_advance(&sdp->srcu_cblist,
- rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
- WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
- spin_unlock_rcu_node(sdp); /* Interrupts remain disabled. */
WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
@@ -1271,9 +1261,11 @@
* period (gp_num = X + 8). So acceleration fails.
*/
s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
- rcu_segcblist_advance(&sdp->srcu_cblist,
- rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
- WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s) && rhp);
+ if (rhp) {
+ rcu_segcblist_advance(&sdp->srcu_cblist,
+ rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+ WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
+ }
if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
sdp->srcu_gp_seq_needed = s;
needgp = true;
@@ -1723,6 +1715,11 @@
WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
rcu_segcblist_advance(&sdp->srcu_cblist,
rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+ /*
+ * Although this function is theoretically re-entrant, concurrent
+ * callbacks invocation is disallowed to avoid executing an SRCU barrier
+ * too early.
+ */
if (sdp->srcu_cblist_invoking ||
!rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
spin_unlock_irq_rcu_node(sdp);
@@ -1753,6 +1750,7 @@
sdp->srcu_cblist_invoking = false;
more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
spin_unlock_irq_rcu_node(sdp);
+ /* An SRCU barrier or callbacks from previous nesting work pending */
if (more)
srcu_schedule_cbs_sdp(sdp, 0);
}
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index f54d578..732ad5b 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -975,7 +975,7 @@
t->rcu_tasks_nvcsw != READ_ONCE(t->nvcsw) ||
!rcu_tasks_is_holdout(t) ||
(IS_ENABLED(CONFIG_NO_HZ_FULL) &&
- !is_idle_task(t) && t->rcu_tasks_idle_cpu >= 0)) {
+ !is_idle_task(t) && READ_ONCE(t->rcu_tasks_idle_cpu) >= 0)) {
WRITE_ONCE(t->rcu_tasks_holdout, false);
list_del_init(&t->rcu_tasks_holdout_list);
put_task_struct(t);
@@ -993,7 +993,7 @@
t, ".I"[is_idle_task(t)],
"N."[cpu < 0 || !tick_nohz_full_cpu(cpu)],
t->rcu_tasks_nvcsw, t->nvcsw, t->rcu_tasks_holdout,
- t->rcu_tasks_idle_cpu, cpu);
+ data_race(t->rcu_tasks_idle_cpu), cpu);
sched_show_task(t);
}
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 3ac3c84..1ae8517 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2338,6 +2338,8 @@
struct rcu_node *rnp;
struct rcu_node *rnp_old = NULL;
+ if (!rcu_gp_in_progress())
+ return;
/* Funnel through hierarchy to reduce memory contention. */
rnp = raw_cpu_read(rcu_data.mynode);
for (; rnp != NULL; rnp = rnp->parent) {
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index ac8e86b..5d66642 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -1061,6 +1061,7 @@
}
early_initcall(rcu_sysrq_init);
+#ifdef CONFIG_RCU_CPU_STALL_NOTIFIER
//////////////////////////////////////////////////////////////////////////////
//
@@ -1081,7 +1082,13 @@
*/
int rcu_stall_chain_notifier_register(struct notifier_block *n)
{
- return atomic_notifier_chain_register(&rcu_cpu_stall_notifier_list, n);
+ int rcsn = rcu_cpu_stall_notifiers;
+
+ WARN(1, "Adding %pS() to RCU stall notifier list (%s).\n", n->notifier_call,
+ rcsn ? "possibly suppressing RCU CPU stall warnings" : "failed, so all is well");
+ if (rcsn)
+ return atomic_notifier_chain_register(&rcu_cpu_stall_notifier_list, n);
+ return -EEXIST;
}
EXPORT_SYMBOL_GPL(rcu_stall_chain_notifier_register);
@@ -1115,3 +1122,5 @@
{
return atomic_notifier_call_chain(&rcu_cpu_stall_notifier_list, val, v);
}
+
+#endif // #ifdef CONFIG_RCU_CPU_STALL_NOTIFIER
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index c534d68..46aaaa9 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -538,9 +538,15 @@
EXPORT_SYMBOL_GPL(torture_sched_setaffinity);
#endif
+int rcu_cpu_stall_notifiers __read_mostly; // !0 = provide stall notifiers (rarely useful)
+EXPORT_SYMBOL_GPL(rcu_cpu_stall_notifiers);
+
#ifdef CONFIG_RCU_STALL_COMMON
int rcu_cpu_stall_ftrace_dump __read_mostly;
module_param(rcu_cpu_stall_ftrace_dump, int, 0644);
+#ifdef CONFIG_RCU_CPU_STALL_NOTIFIER
+module_param(rcu_cpu_stall_notifiers, int, 0444);
+#endif // #ifdef CONFIG_RCU_CPU_STALL_NOTIFIER
int rcu_cpu_stall_suppress __read_mostly; // !0 = suppress stall warnings.
EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress);
module_param(rcu_cpu_stall_suppress, int, 0644);