rcu: Add full memory barrier in __wait_rcu_gp()

RCU grace periods have extremely strong any-to-any ordering
requirements that are met by placing full barriers in various places
in the grace-period computation.  However, normal grace period requests
might be subjected to a "fly-by" wakeup in which the requesting process
doesn't actually sleep and in which the corresponding CPU is not yet
aware that the grace period has ended.  In this case, loads in the code
immediately following the synchronize_rcu() call might see the values
that were in place before stores that preceded the grace period on
other CPUs.

This is an unusual use case, because RCU readers normally only read.
However, they can do writes, and if they do, we need post-grace-period
reads to see those writes.

This commit therefore adds an smp_mb() to the end of __wait_rcu_gp().
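
For illustration only, the shape of the problem can be sketched in
userspace with C11 atomics and pthreads.  This is an analogy under
stated assumptions, not the kernel code: reader_data, gp_done,
reader_thread(), wait_then_read() and run_flyby_demo() are all
hypothetical names, the spin loop stands in for a waiter that never
sleeps, and atomic_thread_fence(memory_order_seq_cst) stands in for
smp_mb().  It shows only the simple message-passing case, which is
weaker than the any-to-any ordering that RCU grace periods must provide.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
static atomic_int reader_data;	/* written by the "reader" in its critical section */
static atomic_int gp_done;	/* set once the simulated grace period has ended */

/* Plays the role of an RCU reader that, unusually, does a write. */
static void *reader_thread(void *arg)
{
	(void)arg;
	atomic_store_explicit(&reader_data, 1, memory_order_relaxed);
	/* Stand-in for the full barriers in the grace-period computation. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&gp_done, 1, memory_order_relaxed);
	return NULL;
}

/* Plays the role of a waiter on the fly-by path: it polls and never sleeps. */
static int wait_then_read(void)
{
	while (!atomic_load_explicit(&gp_done, memory_order_relaxed))
		;	/* spin until the simulated grace period ends */
	/* Analogue of the smp_mb() added at the end of __wait_rcu_gp(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&reader_data, memory_order_relaxed);
}

/* Runs one reader and one waiter; returns the value the waiter observed. */
int run_flyby_demo(void)
{
	pthread_t tid;
	int ret, seen;

	atomic_store(&reader_data, 0);
	atomic_store(&gp_done, 0);
	ret = pthread_create(&tid, NULL, reader_thread, NULL);
	assert(ret == 0);
	seen = wait_then_read();
	ret = pthread_join(tid, NULL);
	assert(ret == 0);
	return seen;
}
```

With the fence in wait_then_read(), the C11 memory model requires the
waiter to observe the reader's store.  Without that fence, the relaxed
load of reader_data would be permitted to return the old value.  Build
with -pthread.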

Many thanks to Joel Fernandes for the series of questions leading to my
realizing that this bug exists!

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
1 file changed