Pelt #1

Merged: 127 commits, Jan 24, 2025

Commits
91ce1da
UPSTREAM: cgroup: export list of delegatable control files using sysfs
rgushchin Nov 6, 2017
f688ce2
UPSTREAM: cgroup: export list of cgroups v2 features using sysfs
rgushchin Nov 6, 2017
77c316a
UPSTREAM: cgroup: avoid copying strings longer than the buffers
Dec 12, 2017
3e126f9
UPSTREAM: cgroup: use strlcpy() instead of strscpy() to avoid spuriou…
arndb Dec 15, 2017
2a1706a
BACKPORT: string: drop __must_check from strscpy() and restore strscp…
htejun Jan 9, 2018
8ef167e
UPSTREAM: cgroup: make cgroup.threads delegatable
rgushchin Jan 10, 2018
f40da98
UPSTREAM: cgroup: Update documentation reference
mattrope Dec 29, 2017
80f6151
UPSTREAM: cgroup: Explicitly remove core interface files
htejun Apr 26, 2018
dc872d1
BACKPORT: cgroup: Simplify cgroup_ancestor
rdna Sep 22, 2018
85a5401
UPSTREAM: cgroup: remove unnecessary unlikely()
TinyWindzz Nov 4, 2018
26f4efa
UPSTREAM: cgroup: Add named hierarchy disabling to cgroup_no_v1 boot …
htejun Dec 28, 2018
f617dcb
UPSTREAM: cgroup: saner refcounting for cgroup_root
Jan 12, 2019
bc83bc7
UPSTREAM: cgroup: remove extra cgroup_migrate_finish() call
shakeelb Apr 3, 2019
391a7e3
UPSTREAM: cgroup: rename freezer.c into legacy_freezer.c
rgushchin Apr 19, 2019
4081cda
UPSTREAM: cgroup: implement __cgroup_task_count() helper
rgushchin Apr 19, 2019
387f476
UPSTREAM: cgroup: cgroup v2 freezer
rgushchin Apr 19, 2019
add21f8
UPSTREAM: signal: unconditionally leave the frozen state in ptrace_st…
rgushchin May 16, 2019
6113159
UPSTREAM: cgroup: freezer: fix frozen state inheritance
rgushchin Sep 12, 2019
1ff2fd4
UPSTREAM: cgroup: freezer: call cgroup_enter_frozen() with preemption…
oleg-nesterov Oct 9, 2019
2ca9a2d
UPSTREAM: cgroup: Remove unused cgrp variable
zhangshk Apr 30, 2019
60fcc77
Makefile.lib: Stop calling size_append
wloot Feb 27, 2020
cc98fb9
dtc: Silence warnings
kdrag0n Jul 19, 2019
0cd48a8
kernel/cgroup: Sync with miuicx-v2
TheVoyager0777 Sep 20, 2022
1558a6c
BACKPORT: psi: Fix uaf issue when psi trigger is destroyed while bein…
surenbaghdasaryan Jan 11, 2022
76a871b
ANDROID: cgroup: Fix for a partially backported patch
surenbaghdasaryan Jul 20, 2022
d94b7bd
Revert "sched/cass: Introduce the Capacity Aware Superset Scheduler"
hxsyzl Jan 24, 2025
fff4768
Revert "kernel: Force trivial, unbound kthreads onto low-power CPUs"
hxsyzl Jan 24, 2025
23bf7b0
Revert "kernel: Force trivial, unbound kthreads onto low-power CPUs"
hxsyzl Jan 24, 2025
d4b69a9
Revert "kernel: Add API to mark IRQs and kthreads as performance crit…
hxsyzl Jan 24, 2025
7889ae0
Revert "[PATCH] constgran vanilla-max sched: Make latency / granularity"
hxsyzl Jan 24, 2025
468c2d2
Revert "sched: import BORE Scheduler 5.1.0"
hxsyzl Jan 24, 2025
99c406f
sysctl: promote several nodes out of CONFIG_SCHED_DEBUG
arter97 Oct 22, 2021
7c6432c
ANDROID: sched: fair: balance for single core cluster
weivincewang Sep 21, 2019
7568de8
trace: Lets also track flags when a task is skipped for load balancing
RenderBroken May 26, 2019
bd1e157
sched: reduce softirq conflicts with RT
Jul 8, 2019
18961bb
Revert "sched/core: Fix migration to invalid CPU in __set_cpus_allowe…
Oct 24, 2019
603b51f
sched/fair: Derive the downmigration margin wrt the destination CPU
DefinitelyNOTobscenelyvague Oct 4, 2019
c22d989
sched/walt: Improve the scheduler
Jun 6, 2019
2452169
sched/fair: Refactor packing eligible test
DefinitelyNOTobscenelyvague Apr 30, 2019
1c0f95f
sched: improve the scheduler
Apr 19, 2019
406db02
sched/fair: Allow prev cpu in find best target
Apr 19, 2019
d216a5c
sched/fair: Fix excessive packing on the max capacity CPU
Jun 10, 2019
85951fc
sched/fair: upadte adjust_cpus_for_packing()
DefinitelyNOTobscenelyvague Apr 30, 2019
984e476
sched: clean-up unused/duplicate functions & variables
Mar 13, 2019
021ac39
sched: Improve the scheduler
Mar 18, 2019
806a8a1
sched: walt: Improve the scheduler
Mar 18, 2019
f35571a
sched: Cleanup unused variables in walt
Apr 17, 2019
03f07c0
sched: Improve the scheduler
Mar 12, 2019
cabd77b
sched: Improve the scheduler
Mar 12, 2019
cdbdace
sched: Improve the scheduler
Mar 12, 2019
08bc621
sched: Improve the scheduler
Mar 13, 2019
18f05f6
sched: Improve the scheduler
Feb 26, 2019
c4fd2e8
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague May 7, 2019
4f317c5
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague May 2, 2019
4784c67
sched/walt: Improve the scheduler
May 20, 2019
89326f3
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Jun 3, 2019
1740bdb
sched/walt: Improve the scheduler
May 21, 2019
2f66471
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague May 2, 2019
f3fd79f
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Jun 4, 2019
014aeb5
sched/walt: Improve the scheduler
Jun 19, 2019
52e5553
sched/fair: Fix incorrect CPU access in check_for_migration()
DefinitelyNOTobscenelyvague Jul 2, 2019
debed17
sched/isolcpus: Fix "isolcpus=" boot parameter handling when !CONFIG_…
rmullick Oct 23, 2017
bc22278
sched/walt: drop preferred_cluster from rtg
Jun 27, 2019
342c699
sched/walt: Improve the scheduler
Jul 9, 2019
fba0974
sched/core_ctl: Improve the scheduler
DefinitelyNOTobscenelyvague Mar 8, 2019
d9a5bcd
sched/core_ctl: Improve the scheduler
DefinitelyNOTobscenelyvague Jun 18, 2019
c542a5a
sched: Improve the scheduler
Jul 1, 2019
9e80aa8
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Jul 17, 2019
49ce435
sched/walt: Improve the scheduler
Jul 3, 2019
7fa9bd0
sched: Improve the scheduler
Jul 25, 2019
0d8d17e
arch_topology: Add possible sibling cpu mask for cpu_topology
Mar 18, 2019
a3d7af6
sched: core: Fix usage of cpu core group mask
Jul 19, 2019
20e4ec6
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Jul 12, 2019
f3b06a5
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Sep 3, 2019
6412352
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Sep 3, 2019
3774e47
sched: Introduce sched_busy_hysteresis_enable_cpus tunable
DefinitelyNOTobscenelyvague Nov 29, 2018
a921f07
sched: Use bitmask for sched_busy_hysteresis_enable_cpus tunable
DefinitelyNOTobscenelyvague May 28, 2019
bdd72d8
sched: Remove unused code in sched_avg.c
DefinitelyNOTobscenelyvague Jun 25, 2019
9dba211
sched/walt: Improve the scheduler
Jul 23, 2019
ff57c5c
sched/walt: Improve the scheduler
Sep 10, 2019
723acd1
sched: walt: Improve the Scheduler
Aug 29, 2019
15c6e58
sched: walt: remove unused variable
Sep 27, 2019
caaf7da
sched: improve the scheduler
Sep 25, 2019
031721a
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Oct 10, 2019
6a57e55
sched/walt: Improve the scheduler
Oct 25, 2019
c0ffe6f
sched/walt: Improve the scheduler
Oct 30, 2019
df26c83
sched/walt: Improve the scheduler
Nov 9, 2019
efdae46
sched: walt: Dump walt status on BUG_ON
Nov 9, 2019
fee1e0e
sched/walt: Improve the scheduler
Nov 8, 2019
4ac0781
sched/walt: Improve the scheduler
Nov 1, 2019
708d50e
sched: walt: fix sched_cluster initialization
Nov 8, 2019
4024269
sched: core: Use sched_clusters for updown migration handler
Nov 8, 2019
b5b9cea
sched: walt: Improve the scheduler
Nov 29, 2019
f364ea9
sched/walt: Avoid walt irq work in offlined cpu
sixtaku Aug 13, 2019
beaf02d
sched/walt: Improve the scheduler
Nov 21, 2019
6684c68
sched/walt: cleanup unused code
RenderBroken Apr 23, 2020
ce2296f
sched: walt: improve the scheduler
Dec 6, 2019
7a0fe20
sched/walt: Improve the scheduler
Dec 4, 2019
4a87fd4
sched: walt: Improve the scheduler
Dec 5, 2019
627ebd7
sched: walt: Improve the scheduler
Dec 17, 2019
a18b8a9
sched/walt: Fix kernel panic issue by uninitialized data
Jan 6, 2020
f7ea628
sched: core_ctl: Improve the scheduler
Dec 30, 2019
c00c3cf
sched/fair: Don't place wakee on waker cpu if colocate enabled
Dec 19, 2019
794fefd
sched/cpufreq_schedutil: create a function for common steps
May 22, 2019
0c44e73
cpufreq: schedutil: Queue sugov irq work on policy online cpu
sixtaku Aug 13, 2019
294a201
cpufreq: Avoid leaving stale IRQ work items during CPU offline
rafaeljw Dec 11, 2019
2fe53c5
sched: fair: Stop running idle_balance on active migration kick
Jul 3, 2019
62a3930
sched: Improve the scheduler
May 21, 2019
536cb63
sched: fair: Improve the scheduler
Aug 8, 2019
bef97e6
sched/fair: remove unused variable
Oct 5, 2019
a763367
sched/fair: Cleanup for incoming upstream changes
RenderBroken Apr 23, 2020
b28bd7a
sched/fair: Cleanup for incoming upstream changes
RenderBroken Apr 23, 2020
6710e45
sched/fair: Force gold cpus to do idle lb when silver has big tasks
DefinitelyNOTobscenelyvague Apr 2, 2019
97a8a7a
sched/fair: Avoid force newly idle load balance if have iowait task
sixtaku Apr 22, 2019
db68901
sched: Add support to spread tasks
Jan 8, 2020
b12b540
sched/walt: Improve the scheduler
Jul 3, 2019
eea9043
sched/walt: Improve the scheduler
DefinitelyNOTobscenelyvague Apr 21, 2020
5a35bf6
kernel/sched: Introduce newest Util Clamp from Linux 5.19
TheVoyager0777 Aug 30, 2022
b7bf8e2
kernel/sched: Go back to WALT
TheVoyager0777 Aug 30, 2022
11324b4
sched/energy: Checkout to branch android-4.14 of https://android.goog…
0ctobot Nov 26, 2019
be7601c
kernel/sched : fix build
hxsyzl Jan 24, 2025
5b59b28
cpuidle: lpm-levels: get bias time from scheduler
Sep 11, 2019
83b5064
fs/proc_fs: add register_sysctl_init
TheVoyager0777 Aug 30, 2022
1c42f30
treewide: Fix complie
TheVoyager0777 Sep 20, 2022
aeaa7b1
lib/lz4: sync with Voyager kernel
TheVoyager0777 Sep 1, 2022
617db31
rcu: Squash backport from v5.4
xNombre Dec 16, 2021
04fc752
rcu: Speed up calling of RCU tasks callbacks
rostedt May 24, 2018
18 changes: 11 additions & 7 deletions Documentation/RCU/Design/Expedited-Grace-Periods/ExpSchedFlow.svg

Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html
@@ -73,10 +73,10 @@ <h2><a name="RCU-preempt Expedited Grace Periods">
 in quiescent states.
 Otherwise, the expedited grace period will use
 <tt>smp_call_function_single()</tt> to send the CPU an IPI, which
-is handled by <tt>sync_rcu_exp_handler()</tt>.
+is handled by <tt>rcu_exp_handler()</tt>.
 
 <p>
-However, because this is preemptible RCU, <tt>sync_rcu_exp_handler()</tt>
+However, because this is preemptible RCU, <tt>rcu_exp_handler()</tt>
 can check to see if the CPU is currently running in an RCU read-side
 critical section.
 If not, the handler can immediately report a quiescent state.
@@ -146,19 +146,18 @@ <h2><a name="RCU-sched Expedited Grace Periods">
 <p><img src="ExpSchedFlow.svg" alt="ExpSchedFlow.svg" width="55%">
 
 <p>
-As with RCU-preempt's <tt>synchronize_rcu_expedited()</tt>,
+As with RCU-preempt, RCU-sched's
 <tt>synchronize_sched_expedited()</tt> ignores offline and
 idle CPUs, again because they are in remotely detectable
 quiescent states.
-However, the <tt>synchronize_rcu_expedited()</tt> handler
-is <tt>sync_sched_exp_handler()</tt>, and because the
+However, because the
 <tt>rcu_read_lock_sched()</tt> and <tt>rcu_read_unlock_sched()</tt>
 leave no trace of their invocation, in general it is not possible to tell
 whether or not the current CPU is in an RCU read-side critical section.
-The best that <tt>sync_sched_exp_handler()</tt> can do is to check
+The best that RCU-sched's <tt>rcu_exp_handler()</tt> can do is to check
 for idle, on the off-chance that the CPU went idle while the IPI
 was in flight.
-If the CPU is idle, then tt>sync_sched_exp_handler()</tt> reports
+If the CPU is idle, then <tt>rcu_exp_handler()</tt> reports
 the quiescent state.
 
 <p>
@@ -299,19 +298,18 @@ <h3><a name="Idle-CPU Checks">Idle-CPU Checks</a></h3>
 idle CPUs in the mask passed to <tt>rcu_report_exp_cpu_mult()</tt>.
 
 <p>
-For RCU-sched, there is an additional check for idle in the IPI
-handler, <tt>sync_sched_exp_handler()</tt>.
+For RCU-sched, there is an additional check:
 If the IPI has interrupted the idle loop, then
-<tt>sync_sched_exp_handler()</tt> invokes <tt>rcu_report_exp_rdp()</tt>
+<tt>rcu_exp_handler()</tt> invokes <tt>rcu_report_exp_rdp()</tt>
 to report the corresponding quiescent state.
 
 <p>
 For RCU-preempt, there is no specific check for idle in the
-IPI handler (<tt>sync_rcu_exp_handler()</tt>), but because
+IPI handler (<tt>rcu_exp_handler()</tt>), but because
 RCU read-side critical sections are not permitted within the
-idle loop, if <tt>sync_rcu_exp_handler()</tt> sees that the CPU is within
+idle loop, if <tt>rcu_exp_handler()</tt> sees that the CPU is within
 RCU read-side critical section, the CPU cannot possibly be idle.
-Otherwise, <tt>sync_rcu_exp_handler()</tt> invokes
+Otherwise, <tt>rcu_exp_handler()</tt> invokes
 <tt>rcu_report_exp_rdp()</tt> to report the corresponding quiescent
 state, regardless of whether or not that quiescent state was due to
 the CPU being idle.
@@ -626,6 +624,8 @@ <h3><a name="Mid-Boot Operation">Mid-boot operation</a></h3>
 <p>
 With this refinement, synchronous grace periods can now be used from
 task context pretty much any time during the life of the kernel.
+That is, aside from some points in the suspend, hibernate, or shutdown
+code path.
 
 <h3><a name="Summary">
 Summary</a></h3>
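
For orientation, the decision flow these hunks describe for the unified
rcu_exp_handler() can be sketched as follows. This is an illustrative
sketch only, not kernel source: in_rcu_read_side(), cpu_is_idle(),
defer_qs_to_rcu_read_unlock(), and report_qs() are invented stand-ins
for the real per-flavor machinery.

	/* Hedged sketch of the expedited IPI handler's decision flow. */
	static void rcu_exp_handler_sketch(void)
	{
		if (in_rcu_read_side()) {
			/*
			 * Preemptible RCU: defer the report until the
			 * matching rcu_read_unlock() executes.
			 */
			defer_qs_to_rcu_read_unlock();
			return;
		}
		/*
		 * Otherwise (including the idle case that RCU-sched
		 * checks for), report the quiescent state immediately.
		 */
		report_qs();
	}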
121 changes: 95 additions & 26 deletions Documentation/RCU/Design/Requirements/Requirements.html
@@ -2079,6 +2079,8 @@ <h2><a name="Linux Kernel Complications">Linux Kernel Complications</a></h2>
 <li> <a href="#Hotplug CPU">Hotplug CPU</a>.
 <li> <a href="#Scheduler and RCU">Scheduler and RCU</a>.
 <li> <a href="#Tracing and RCU">Tracing and RCU</a>.
+<li> <a href="#Accesses to User Memory and RCU">
+Accesses to User Memory and RCU</a>.
 <li> <a href="#Energy Efficiency">Energy Efficiency</a>.
 <li> <a href="#Scheduling-Clock Interrupts and RCU">
 Scheduling-Clock Interrupts and RCU</a>.
@@ -2393,30 +2395,9 @@ <h3><a name="Scheduler and RCU">Scheduler and RCU</a></h3>
 <p>
 RCU depends on the scheduler, and the scheduler uses RCU to
 protect some of its data structures.
-This means the scheduler is forbidden from acquiring
-the runqueue locks and the priority-inheritance locks
-in the middle of an outermost RCU read-side critical section unless either
-(1)&nbsp;it releases them before exiting that same
-RCU read-side critical section, or
-(2)&nbsp;interrupts are disabled across
-that entire RCU read-side critical section.
-This same prohibition also applies (recursively!) to any lock that is acquired
-while holding any lock to which this prohibition applies.
-Adhering to this rule prevents preemptible RCU from invoking
-<tt>rcu_read_unlock_special()</tt> while either runqueue or
-priority-inheritance locks are held, thus avoiding deadlock.
-
-<p>
-Prior to v4.4, it was only necessary to disable preemption across
-RCU read-side critical sections that acquired scheduler locks.
-In v4.4, expedited grace periods started using IPIs, and these
-IPIs could force a <tt>rcu_read_unlock()</tt> to take the slowpath.
-Therefore, this expedited-grace-period change required disabling of
-interrupts, not just preemption.
-
-<p>
-For RCU's part, the preemptible-RCU <tt>rcu_read_unlock()</tt>
-implementation must be written carefully to avoid similar deadlocks.
+The preemptible-RCU <tt>rcu_read_unlock()</tt>
+implementation must therefore be written carefully to avoid deadlocks
+involving the scheduler's runqueue and priority-inheritance locks.
 In particular, <tt>rcu_read_unlock()</tt> must tolerate an
 interrupt where the interrupt handler invokes both
 <tt>rcu_read_lock()</tt> and <tt>rcu_read_unlock()</tt>.
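
To make that tolerance requirement concrete, here is a minimal sketch
(illustrative only; gp and do_something_with() are assumed names, and
the comment marks where the interrupt could fire):

	rcu_read_lock();	/* outermost read-side critical section */
	p = rcu_dereference(gp);
	/*
	 * An interrupt can arrive here, and its handler may itself
	 * execute rcu_read_lock() ... rcu_read_unlock().  The outer
	 * rcu_read_unlock() below must tolerate such an interrupt
	 * even if it arrives partway through the unlock itself.
	 */
	do_something_with(p);
	rcu_read_unlock();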
@@ -2425,7 +2406,7 @@ <h3><a name="Scheduler and RCU">Scheduler and RCU</a></h3>
 interrupt handler's use of RCU.
 
 <p>
-This pair of mutual scheduler-RCU requirements came as a
+This scheduler-RCU requirement came as a
 <a href="https://lwn.net/Articles/453002/">complete surprise</a>.
 
 <p>
@@ -2436,9 +2417,28 @@ <h3><a name="Scheduler and RCU">Scheduler and RCU</a></h3>
 <tt>CONFIG_NO_HZ_FULL=y</tt>
 <a href="http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf">did come as a surprise [PDF]</a>.
 RCU has made good progress towards meeting this requirement, even
-for context-switch-have <tt>CONFIG_NO_HZ_FULL=y</tt> workloads,
+for context-switch-heavy <tt>CONFIG_NO_HZ_FULL=y</tt> workloads,
 but there is room for further improvement.
 
+<p>
+In the past, it was forbidden to disable interrupts across an
+<tt>rcu_read_unlock()</tt> unless that interrupt-disabled region
+of code also included the matching <tt>rcu_read_lock()</tt>.
+Violating this restriction could result in deadlocks involving the
+scheduler's runqueue and priority-inheritance spinlocks.
+This restriction was lifted when interrupt-disabled calls to
+<tt>rcu_read_unlock()</tt> started deferring the reporting of
+the resulting RCU-preempt quiescent state until the end of that
+interrupts-disabled region.
+This deferred reporting means that the scheduler's runqueue and
+priority-inheritance locks cannot be held while reporting an RCU-preempt
+quiescent state, which lifts the earlier restriction, at least from
+a deadlock perspective.
+Unfortunately, real-time systems using RCU priority boosting may
+need this restriction to remain in effect because deferred
+quiescent-state reporting also defers deboosting, which in turn
+degrades real-time latencies.
+
 <h3><a name="Tracing and RCU">Tracing and RCU</a></h3>
 
 <p>
@@ -2453,6 +2453,75 @@ <h3><a name="Tracing and RCU">Tracing and RCU</a></h3>
 The tracing folks both located the requirement and provided the
 needed fix, so this surprise requirement was relatively painless.
 
+<h3><a name="Accesses to User Memory and RCU">
+Accesses to User Memory and RCU</a></h3>
+
+<p>
+The kernel needs to access user-space memory, for example, to access
+data referenced by system-call parameters.
+The <tt>get_user()</tt> macro does this job.
+
+<p>
+However, user-space memory might well be paged out, which means
+that <tt>get_user()</tt> might well page-fault and thus block while
+waiting for the resulting I/O to complete.
+It would be a very bad thing for the compiler to reorder
+a <tt>get_user()</tt> invocation into an RCU read-side critical
+section.
+For example, suppose that the source code looked like this:
+
+<blockquote>
+<pre>
+1 rcu_read_lock();
+2 p = rcu_dereference(gp);
+3 v = p-&gt;value;
+4 rcu_read_unlock();
+5 get_user(user_v, user_p);
+6 do_something_with(v, user_v);
+</pre>
+</blockquote>
+
+<p>
+The compiler must not be permitted to transform this source code into
+the following:
+
+<blockquote>
+<pre>
+1 rcu_read_lock();
+2 p = rcu_dereference(gp);
+3 get_user(user_v, user_p); // BUG: POSSIBLE PAGE FAULT!!!
+4 v = p-&gt;value;
+5 rcu_read_unlock();
+6 do_something_with(v, user_v);
+</pre>
+</blockquote>
+
+<p>
+If the compiler did make this transformation in a
+<tt>CONFIG_PREEMPT=n</tt> kernel build, and if <tt>get_user()</tt> did
+page fault, the result would be a quiescent state in the middle
+of an RCU read-side critical section.
+This misplaced quiescent state could result in line&nbsp;4 being
+a use-after-free access, which could be bad for your kernel's
+actuarial statistics.
+Similar examples can be constructed with the call to <tt>get_user()</tt>
+preceding the <tt>rcu_read_lock()</tt>.
+
+<p>
+Unfortunately, <tt>get_user()</tt> doesn't have any particular
+ordering properties, and in some architectures the underlying <tt>asm</tt>
+isn't even marked <tt>volatile</tt>.
+And even if it was marked <tt>volatile</tt>, the above access to
+<tt>p-&gt;value</tt> is not volatile, so the compiler would not have any
+reason to keep those two accesses in order.
+
+<p>
+Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
+and <tt>rcu_read_unlock()</tt> must act as compiler barriers,
+at least for outermost instances of <tt>rcu_read_lock()</tt> and
+<tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
+sections.
+
 <h3><a name="Energy Efficiency">Energy Efficiency</a></h3>
 
 <p>
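
One way to see the compiler-barrier requirement at the end of the new
section: in a CONFIG_PREEMPT=n build, rcu_read_lock() boils down to
preempt_disable(), which acts as a compiler barrier. A simplified
sketch under that assumption (the real definitions vary by
configuration and live in the rcupdate/preempt headers):

	/*
	 * Sketch for CONFIG_PREEMPT=n only: barrier() keeps the
	 * compiler from migrating get_user(), or anything else,
	 * into or out of the read-side critical section.
	 */
	#define sketch_rcu_read_lock()		barrier()
	#define sketch_rcu_read_unlock()	barrier()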
61 changes: 52 additions & 9 deletions Documentation/RCU/stallwarn.txt
@@ -230,15 +230,58 @@ handlers are no longer able to execute on this CPU. This can happen if
 the stalled CPU is spinning with interrupts are disabled, or, in -rt
 kernels, if a high-priority process is starving RCU's softirq handler.
 
-For CONFIG_RCU_FAST_NO_HZ kernels, the "last_accelerate:" prints the
-low-order 16 bits (in hex) of the jiffies counter when this CPU last
-invoked rcu_try_advance_all_cbs() from rcu_needs_cpu() or last invoked
-rcu_accelerate_cbs() from rcu_prepare_for_idle(). The "nonlazy_posted:"
-prints the number of non-lazy callbacks posted since the last call to
-rcu_needs_cpu(). Finally, an "L" indicates that there are currently
-no non-lazy callbacks ("." is printed otherwise, as shown above) and
-"D" indicates that dyntick-idle processing is enabled ("." is printed
-otherwise, for example, if disabled via the "nohz=" kernel boot parameter).
+The "fqs=" shows the number of force-quiescent-state idle/offline
+detection passes that the grace-period kthread has made across this
+CPU since the last time that this CPU noted the beginning of a grace
+period.
+
+The "detected by" line indicates which CPU detected the stall (in this
+case, CPU 32), how many jiffies have elapsed since the start of the grace
+period (in this case 2603), the grace-period sequence number (7075), and
+an estimate of the total number of RCU callbacks queued across all CPUs
+(625 in this case).
+
+In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed
+for each CPU:
+
+	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 Nonlazy posted: ..D
+
+The "last_accelerate:" prints the low-order 16 bits (in hex) of the
+jiffies counter when this CPU last invoked rcu_try_advance_all_cbs()
+from rcu_needs_cpu() or last invoked rcu_accelerate_cbs() from
+rcu_prepare_for_idle(). The "Nonlazy posted:" indicates lazy-callback
+status, so that an "l" indicates that all callbacks were lazy at the start
+of the last idle period and an "L" indicates that there are currently
+no non-lazy callbacks (in both cases, "." is printed otherwise, as
+shown above) and "D" indicates that dyntick-idle processing is enabled
+("." is printed otherwise, for example, if disabled via the "nohz="
+kernel boot parameter).
+
+If the grace period ends just as the stall warning starts printing,
+there will be a spurious stall-warning message, which will include
+the following:
+
+	INFO: Stall ended before state dump start
+
+This is rare, but does happen from time to time in real life. It is also
+possible for a zero-jiffy stall to be flagged in this case, depending
+on how the stall warning and the grace-period initialization happen to
+interact. Please note that it is not possible to entirely eliminate this
+sort of false positive without resorting to things like stop_machine(),
+which is overkill for this sort of problem.
+
+If all CPUs and tasks have passed through quiescent states, but the
+grace period has nevertheless failed to end, the stall-warning splat
+will include something like the following:
+
+	All QSes seen, last rcu_preempt kthread activity 23807 (4297905177-4297881370), jiffies_till_next_fqs=3, root ->qsmask 0x0
+
+The "23807" indicates that it has been more than 23 thousand jiffies
+since the grace-period kthread ran. The "jiffies_till_next_fqs"
+indicates how frequently that kthread should run, giving the number
+of jiffies between force-quiescent-state scans, in this case three,
+which is way less than 23807. Finally, the root rcu_node structure's
+->qsmask field is printed, which will normally be zero.
 
 If the relevant grace-period kthread has been unable to run prior to
 the stall warning, the following additional line is printed:
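
For the "last_accelerate: a345/d342" example above, both halves are
low-order 16 bits of jiffies values, so their 16-bit difference gives
the elapsed time. A hedged sketch of the relationship (illustrative,
not the kernel's actual stall-warning code):

	/*
	 * a345 = jiffies at the last (re)acceleration, d342 = jiffies
	 * when the warning printed, both masked to 16 bits, so about
	 * 0xd342 - 0xa345 = 0x2ffd (12285) jiffies elapsed, modulo 65536.
	 */
	pr_info("last_accelerate: %04lx/%04lx",
		last_accelerate & 0xffff, jiffies & 0xffff);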
10 changes: 6 additions & 4 deletions Documentation/RCU/whatisRCU.txt
@@ -210,17 +210,17 @@ synchronize_rcu()
 
 rcu_assign_pointer()
 
-typeof(p) rcu_assign_pointer(p, typeof(p) v);
+void rcu_assign_pointer(p, typeof(p) v);
 
 Yes, rcu_assign_pointer() -is- implemented as a macro, though it
 would be cool to be able to declare a function in this manner.
 (Compiler experts will no doubt disagree.)
 
 The updater uses this function to assign a new value to an
 RCU-protected pointer, in order to safely communicate the change
-in value from the updater to the reader. This function returns
-the new value, and also executes any memory-barrier instructions
-required for a given CPU architecture.
+in value from the updater to the reader. This macro does not
+evaluate to an rvalue, but it does execute any memory-barrier
+instructions required for a given CPU architecture.
 
 Perhaps just as important, it serves to document (1) which
 pointers are protected by RCU and (2) the point at which a
@@ -815,11 +815,13 @@ RCU list traversal:
 list_next_rcu
 list_for_each_entry_rcu
 list_for_each_entry_continue_rcu
+list_for_each_entry_from_rcu
 hlist_first_rcu
 hlist_next_rcu
 hlist_pprev_rcu
 hlist_for_each_entry_rcu
 hlist_for_each_entry_rcu_bh
+hlist_for_each_entry_from_rcu
 hlist_for_each_entry_continue_rcu
 hlist_for_each_entry_continue_rcu_bh
 hlist_nulls_first_rcu
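
Since this hunk extends the RCU list-traversal API listing, a minimal
usage sketch of the pattern those primitives support may help (assuming
a caller-defined struct foo { int value; struct list_head list; }, a
list head, update-side locking, and a hypothetical do_something_with()):

	/*
	 * Updater: initialize first, then publish with list_add_rcu(),
	 * which supplies the needed memory barrier.
	 */
	p = kmalloc(sizeof(*p), GFP_KERNEL);
	p->value = 42;
	list_add_rcu(&p->list, &head);

	/*
	 * Reader: the traversal must remain within the read-side
	 * critical section.
	 */
	rcu_read_lock();
	list_for_each_entry_rcu(p, &head, list)
		do_something_with(p->value);
	rcu_read_unlock();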