)]}'
{
  "commit": "318e18ed22e89397635e15095c014accaf47ed30",
  "tree": "82d817cd7321fa2ce4ec97bc83293371a5a8e818",
  "parents": [
    "1f382215119a0bc165e766e5bc424b3d3e8dae35"
  ],
  "author": {
    "name": "Pingfan Liu",
    "email": "piliu@redhat.com",
    "time": "Wed Nov 19 17:55:25 2025 +0800"
  },
  "committer": {
    "name": "Tejun Heo",
    "email": "tj@kernel.org",
    "time": "Thu Nov 20 06:57:58 2025 -1000"
  },
  "message": "sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug\n\n*** Bug description ***\nWhen testing kexec-reboot on a 144-CPU machine with\nisolcpus\u003dmanaged_irq,domain,1-71,73-143 on the kernel command line, I\nencountered the following bug:\n\n[   97.114759] psci: CPU142 killed (polled 0 ms)\n[   97.333236] Failed to offline CPU143 - error\u003d-16\n[   97.333246] ------------[ cut here ]------------\n[   97.342682] kernel BUG at kernel/cpu.c:1569!\n[   97.347049] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP\n[...]\n\nIn essence, the issue originates from the CPU hot-removal process and is\nnot limited to kexec. It can be reproduced by writing a SCHED_DEADLINE\nprogram that waits indefinitely on a semaphore, spawning multiple\ninstances to ensure some run on CPU 72, and then offlining CPUs 1–143\none by one. When attempting this, CPU 143 failed to go offline.\n  bash -c \u0027taskset -cp 0 $$ \u0026\u0026 for i in {1..143}; do echo 0 \u003e /sys/devices/system/cpu/cpu$i/online 2\u003e/dev/null; done\u0027\n\nTracking down this issue, I found that dl_bw_deactivate() returned\n-EBUSY, which caused sched_cpu_deactivate() to fail on the last CPU.\nBut that -EBUSY is spurious, and results from the following factors:\nWhen a CPU is inactive, cpu_rq()-\u003erd is set to def_root_domain. A\nblocked-state deadline task (in this case, \"cppc_fie\") was not\nmigrated to CPU0, and its task_rq() information is stale. So its rq-\u003erd\npoints to def_root_domain instead of the one shared with CPU0.  
As a\nresult, its bandwidth is accounted to the wrong root domain\nduring domain rebuild.\n\n*** Issue ***\nThe key point is that root_domain is only tracked through active rq-\u003erd.\nTo avoid using a global data structure to track all root_domains in the\nsystem, there should be a method to locate an active CPU within the\ncorresponding root_domain.\n\n*** Solution ***\nTo locate the active CPU, the following rules of the deadline\nsubsystem are useful:\n  -1. any CPU belongs to a unique root domain at a given time\n  -2. the DL bandwidth checker ensures that the root domain has active CPUs.\n\nNow, let\u0027s examine the blocked-state task P.\nIf P is attached to a cpuset that is a partition root, it is\nstraightforward to find an active CPU.\nIf P is attached to a cpuset that has changed from \u0027root\u0027 to \u0027member\u0027,\nthe active CPUs are grouped into the parent root domain. Naturally, the\nCPUs\u0027 capacity and reserved DL bandwidth are taken into account in the\nancestor root domain. (In practice, it may be unsafe to attach P to an\narbitrary root domain, since that domain may lack sufficient DL\nbandwidth for P.) Again, it is straightforward to find an active CPU in\nthe ancestor root domain.\n\nThis patch groups CPUs into isolated and housekeeping sets. 
For the\nhousekeeping group, it walks up the cpuset hierarchy to find active CPUs\nin P\u0027s root domain and retrieves the valid rd from cpu_rq(cpu)-\u003erd.\n\nSigned-off-by: Pingfan Liu \u003cpiliu@redhat.com\u003e\nCc: Waiman Long \u003clongman@redhat.com\u003e\nCc: Chen Ridong \u003cchenridong@huaweicloud.com\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Juri Lelli \u003cjuri.lelli@redhat.com\u003e\nCc: Pierre Gondois \u003cpierre.gondois@arm.com\u003e\nCc: Ingo Molnar \u003cmingo@redhat.com\u003e\nCc: Vincent Guittot \u003cvincent.guittot@linaro.org\u003e\nCc: Dietmar Eggemann \u003cdietmar.eggemann@arm.com\u003e\nCc: Steven Rostedt \u003crostedt@goodmis.org\u003e\nCc: Ben Segall \u003cbsegall@google.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Valentin Schneider \u003cvschneid@redhat.com\u003e\nTo: linux-kernel@vger.kernel.org\nSigned-off-by: Tejun Heo \u003ctj@kernel.org\u003e\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "7b7671060bf9ed51312edfba2cf130f44de8c39b",
      "old_mode": 33188,
      "old_path": "kernel/sched/deadline.c",
      "new_id": "194a341e85864c561a60650ba1f96c2e31782024",
      "new_mode": 33188,
      "new_path": "kernel/sched/deadline.c"
    }
  ]
}
