linux-stable/kernel
Waiman Long 34c3bc762b cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key
[ Upstream commit 65f97cc81b ]

The following lockdep splat was observed.

[  812.359086] ============================================
[  812.359089] WARNING: possible recursive locking detected
[  812.359097] --------------------------------------------
[  812.359100] runtest.sh/30042 is trying to acquire lock:
[  812.359105] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0xe/0x20
[  812.359131]
[  812.359131] but task is already holding lock:
[  812.359134] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_write_resmask+0x98/0xa70
     :
[  812.359267] Call Trace:
[  812.359272]  <TASK>
[  812.359367]  cpus_read_lock+0x3c/0xe0
[  812.359382]  static_key_enable+0xe/0x20
[  812.359389]  check_insane_mems_config.part.0+0x11/0x30
[  812.359398]  cpuset_write_resmask+0x9f2/0xa70
[  812.359411]  cgroup_file_write+0x1c7/0x660
[  812.359467]  kernfs_fop_write_iter+0x358/0x530
[  812.359479]  vfs_write+0xabe/0x1250
[  812.359529]  ksys_write+0xf9/0x1d0
[  812.359558]  do_syscall_64+0x5f/0xe0

Since commit d74b27d63a ("cgroup/cpuset: Change cpuset_rwsem
and hotplug lock order"), the ordering of the cpu hotplug lock
and cpuset_mutex has been reversed. That patch correctly
used the cpuslocked version of the static branch API to enable
cpusets_pre_enable_key and cpusets_enabled_key, but it didn't do the
same for cpusets_insane_config_key.

The cpusets_insane_config_key can be enabled in
check_insane_mems_config(), which is called from update_nodemask()
or cpuset_hotplug_update_tasks() with both the cpu hotplug lock and
cpuset_mutex held. A deadlock can happen when a pending hotplug event
tries to acquire the cpu hotplug write lock, which blocks any further
cpus_read_lock() attempt from check_insane_mems_config(). Fix that by
switching to static_branch_enable_cpuslocked().
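
The locking pattern described above can be illustrated with a small userspace model. This is only a sketch and not kernel code: struct hotplug_model and every model_* function are hypothetical names invented for illustration. It assumes a single task and represents "would block forever" as a false return value. The plain variant takes the read lock itself, like static_branch_enable(), while the cpuslocked variant relies on the caller already holding it, like static_branch_enable_cpuslocked().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-task model of cpu_hotplug_lock. NOT kernel code. */
struct hotplug_model {
    int  read_depth;      /* read-side holds by the current task */
    bool writer_pending;  /* a hotplug writer is queued behind the reader */
};

/*
 * Returns false where a real cpus_read_lock() would block forever:
 * the task already holds the read side and a writer is waiting for it,
 * so a second read acquisition queues behind that writer.
 */
static bool model_cpus_read_lock(struct hotplug_model *m)
{
    if (m->read_depth > 0 && m->writer_pending)
        return false;
    m->read_depth++;
    return true;
}

static void model_cpus_read_unlock(struct hotplug_model *m)
{
    assert(m->read_depth > 0);
    m->read_depth--;
}

/* Stand-in for static_branch_enable(): takes the read lock itself. */
static bool model_enable(struct hotplug_model *m, bool *key)
{
    if (!model_cpus_read_lock(m))
        return false;           /* recursive acquisition: deadlock */
    *key = true;
    model_cpus_read_unlock(m);
    return true;
}

/*
 * Stand-in for static_branch_enable_cpuslocked(): the caller must
 * already hold the read lock, so no second acquisition is attempted.
 */
static bool model_enable_cpuslocked(struct hotplug_model *m, bool *key)
{
    assert(m->read_depth > 0);
    *key = true;
    return true;
}
```

In this model, once the caller holds the read lock and a writer becomes pending, model_enable() fails while model_enable_cpuslocked() succeeds, which mirrors why the fix switches variants.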

Fixes: d74b27d63a ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28 16:28:47 +02:00
bpf bpf: Make reg_not_null() true for CONST_PTR_TO_MAP 2025-08-28 16:28:24 +02:00
cgroup cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key 2025-08-28 16:28:47 +02:00
configs
debug
dma dma/contiguous: avoid warning about unused size_bytes 2025-05-02 07:50:42 +02:00
entry
events perf/core: Prevent VMA split of buffer mappings 2025-08-15 12:09:05 +02:00
futex
gcov
irq genirq: Make handle_enforce_irqctx() unconditionally available 2025-02-08 09:51:48 +01:00
kcsan kcsan: test: Initialize dummy variable 2025-08-15 12:08:49 +02:00
livepatch
locking locking/lockdep: Decrease nr_unused_locks if lock unused in zap_class() 2025-04-25 10:45:29 +02:00
module module: Prevent silent truncation of module name in delete_module(2) 2025-08-28 16:28:28 +02:00
power PM: sleep: console: Fix the black screen issue 2025-08-28 16:28:17 +02:00
printk printk: Check CON_SUSPEND when unblanking a console 2025-06-04 14:42:00 +02:00
rcu rcu: Fix racy re-initialization of irq_work causing hangs 2025-08-28 16:28:32 +02:00
sched sched/fair: Fix frequency selection for non-invariant case 2025-08-28 16:28:43 +02:00
time clocksource: Fix the CPUs' choice in the watchdog per CPU verification 2025-06-27 11:08:51 +01:00
trace tracing: Limit access to parser->buffer when trace_get_user failed 2025-08-28 16:28:46 +02:00
.gitignore
acct.c acct: block access to kernel internal filesystems 2025-02-27 04:10:52 -08:00
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c
audit.c
audit.h audit,module: restore audit logging in load failure case 2025-08-15 12:08:39 +02:00
auditfilter.c
auditsc.c audit,module: restore audit logging in load failure case 2025-08-15 12:08:39 +02:00
backtracetest.c
bounds.c
capability.c
cfi.c
compat.c
configs.c
context_tracking.c
cpu_pm.c
cpu.c hrtimers: Handle CPU state correctly on hotplug 2025-01-23 17:21:17 +01:00
crash_core.c
crash_dump.c
cred.c
delayacct.c
dma.c
exec_domain.c
exit.c perf: Fix sample vs do_exit() 2025-06-27 11:09:03 +01:00
extable.c
fail_function.c
fork.c mm: drop the assumption that VM_SHARED always implies writable 2025-08-28 16:28:39 +02:00
freezer.c sched,freezer: Remove unnecessary warning in __thaw_task 2025-08-15 12:09:07 +02:00
gen_kheaders.sh kheaders: Ignore silly-rename files 2025-01-23 17:21:13 +01:00
groups.c
hung_task.c
iomem.c
irq_work.c
jump_label.c jump_label: Fix static_key_slow_dec() yet again 2024-10-10 11:57:13 +02:00
kallsyms_internal.h
kallsyms_selftest.c
kallsyms_selftest.h
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.kexec
Kconfig.locks
Kconfig.preempt
kcov.c kcov: mark in_softirq_really() as __always_inline 2025-01-09 13:32:07 +01:00
kexec_core.c
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-04-10 14:37:34 +02:00
kexec_file.c
kexec_internal.h
kexec.c
kheaders.c
kprobes.c
ksyms_common.c
ksysfs.c
kthread.c kthread: unpark only parked kthread 2024-10-17 15:24:37 +02:00
latencytop.c
Makefile
module_signature.c
notifier.c
nsproxy.c
numa.c
padata.c padata: do not leak refcount in reorder_work 2025-06-04 14:42:19 +02:00
panic.c objtool, panic: Disable SMAP in __stack_chk_fail() 2025-05-02 07:50:55 +02:00
params.c module: ensure that kobject_put() is safe for module type kobjects 2025-05-18 08:24:08 +02:00
pid_namespace.c
pid_sysctl.h
pid.c
profile.c
ptrace.c
range.c
reboot.c
regset.c
relay.c
resource_kunit.c
resource.c resource: fix false warning in __request_region() 2025-08-01 09:47:31 +01:00
rseq.c rseq: Fix segfault on registration when rseq_cs is non-zero 2025-07-17 18:35:22 +02:00
scftorture.c
scs.c
seccomp.c
signal.c posix-timers: Target group sigqueue to current task only if not exiting 2024-12-09 10:33:11 +01:00
smp.c
smpboot.c
smpboot.h
softirq.c lockdep: Fix wait context check on softirq for PREEMPT_RT 2025-06-04 14:41:55 +02:00
stackleak.c
stacktrace.c
static_call_inline.c x86/static-call: provide a way to do very early static-call updates 2024-12-19 18:11:36 +01:00
static_call.c
stop_machine.c
sys_ni.c
sys.c hrtimer: Use and report correct timerslack values for realtime tasks 2025-03-22 12:50:37 -07:00
sysctl-test.c
sysctl.c
task_work.c task_work: make TWA_NMI_CURRENT handling conditional on IRQ_WORK 2024-11-01 01:58:34 +01:00
taskstats.c
torture.c
tracepoint.c
tsacct.c
ucount.c ucount: fix atomic_long_inc_below() argument type 2025-08-15 12:08:57 +02:00
uid16.c
uid16.h
umh.c
up.c
user_namespace.c
user-return-notifier.c
user.c
usermode_driver.c
utsname_sysctl.c
utsname.c
vhost_task.c
watch_queue.c watch_queue: fix pipe accounting mismatch 2025-04-10 14:37:25 +02:00
watchdog_buddy.c
watchdog_perf.c
watchdog.c watchdog: fix watchdog may detect false positive of softlockup 2025-06-27 11:08:49 +01:00
workqueue_internal.h
workqueue.c workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker 2025-01-17 13:36:25 +01:00