ksoftirqd: Enable IRQs and call cond_resched() before poking RCU

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):

    [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around
25%, spiking as high as 50%. Before the change, the usage would never
exceed 5%.  With this patch applied, aggregate softirq CPU usage on a
heavily loaded Proxygen server decreases by roughly 10% (relative) for
the same amount of traffic.  The patch also produces statistically
significant performance wins at higher loads on webservers: about a 1%
reduction in overall CPU utilization and improved latency metrics.

Moving the call to rcu_note_context_switch() after the cond_resched()
call, as it was before the hotplug patch, completely eliminated this
problem.  However, the new cond_resched_rcu_qs() primitive makes the
code shorter and avoids notifying RCU twice in the case where
cond_resched() really did a context switch.
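
The ordering difference can be illustrated with a small userspace model.
This is only a sketch, not the kernel code: every mock_* function and the
counters are hypothetical stand-ins invented for illustration, and the two
run_ksoftirqd_*() bodies are assumptions about the shape of the loop before
and after this change.  It shows only whether RCU is notified while the
(simulated) IRQ flag is off or on.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel state; none of this is real kernel API. */
static bool irqs_enabled = true;
static bool softirq_pending = true;
static int qs_with_irqs_on;	/* RCU notified while IRQs were enabled */
static int qs_with_irqs_off;	/* RCU notified while IRQs were disabled */

static void mock_local_irq_disable(void) { irqs_enabled = false; }
static void mock_local_irq_enable(void)  { irqs_enabled = true; }
static void mock_do_softirq(void)        { softirq_pending = false; }

/* Record the simulated IRQ state at the moment RCU is poked. */
static void mock_rcu_note(void)
{
	if (irqs_enabled)
		qs_with_irqs_on++;
	else
		qs_with_irqs_off++;
}

/* Stand-in for cond_resched_rcu_qs(): here it only notifies RCU. */
static void mock_cond_resched_rcu_qs(void)
{
	mock_rcu_note();
}

/* Assumed old ordering: RCU is poked before IRQs are re-enabled. */
static void run_ksoftirqd_old(void)
{
	mock_local_irq_disable();
	if (softirq_pending) {
		mock_do_softirq();
		mock_rcu_note();		/* IRQs still disabled */
		mock_local_irq_enable();
		/* cond_resched() would be called here */
		return;
	}
	mock_local_irq_enable();
}

/* Assumed new ordering: enable IRQs first, then one combined call. */
static void run_ksoftirqd_new(void)
{
	mock_local_irq_disable();
	if (softirq_pending) {
		mock_do_softirq();
		mock_local_irq_enable();
		mock_cond_resched_rcu_qs();	/* IRQs enabled */
		return;
	}
	mock_local_irq_enable();
}
```

Calling run_ksoftirqd_old() with softirq_pending set bumps
qs_with_irqs_off, while run_ksoftirqd_new() bumps qs_with_irqs_on instead.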

Please note that this commit cannot be backported to kernels older than
v3.20.  Yes, it will build, and it will even boot and run, but it will
be subject to RCU CPU stalls under heavy softirq load.  Instead, backport
Calvin Owens's original patch: https://lkml.org/lkml/2015/1/6/874.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
[ paulmck: Substituted shiny new cond_resched_rcu_qs() primitive. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Added Calvin's measurements on Proxygen server and webservers. ]