linux-stable/kernel
Christian Loehle f9b8d4dba8 sched: Fix sched_numa_find_nth_cpu() if mask offline
commit 5ebf512f33 upstream.

sched_numa_find_nth_cpu() uses a bsearch to look for the 'closest'
CPU in sched_domains_numa_masks and given cpus mask. However they
might not intersect if all CPUs in the cpus mask are offline. bsearch
will return NULL in that case, bail out instead of dereferencing a
bogus pointer.

The previous behaviour lead to this bug when using maxcpus=4 on an
rk3399 (LLLLbb) (i.e. booting with all big CPUs offline):

[    1.422922] Unable to handle kernel paging request at virtual address ffffff8000000000
[    1.423635] Mem abort info:
[    1.423889]   ESR = 0x0000000096000006
[    1.424227]   EC = 0x25: DABT (current EL), IL = 32 bits
[    1.424715]   SET = 0, FnV = 0
[    1.424995]   EA = 0, S1PTW = 0
[    1.425279]   FSC = 0x06: level 2 translation fault
[    1.425735] Data abort info:
[    1.425998]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[    1.426499]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    1.426952]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    1.427428] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000004a9f000
[    1.428038] [ffffff8000000000] pgd=18000000f7fff403, p4d=18000000f7fff403, pud=18000000f7fff403, pmd=0000000000000000
[    1.429014] Internal error: Oops: 0000000096000006 [#1]  SMP
[    1.429525] Modules linked in:
[    1.429813] CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc4-dirty #343 PREEMPT
[    1.430559] Hardware name: Pine64 RockPro64 v2.1 (DT)
[    1.431012] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.431634] pc : sched_numa_find_nth_cpu+0x2a0/0x488
[    1.432094] lr : sched_numa_find_nth_cpu+0x284/0x488
[    1.432543] sp : ffffffc084e1b960
[    1.432843] x29: ffffffc084e1b960 x28: ffffff80078a8800 x27: ffffffc0846eb1d0
[    1.433495] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[    1.434144] x23: 0000000000000000 x22: fffffffffff7f093 x21: ffffffc081de6378
[    1.434792] x20: 0000000000000000 x19: 0000000ffff7f093 x18: 00000000ffffffff
[    1.435441] x17: 3030303866666666 x16: 66663d736b73616d x15: ffffffc104e1b5b7
[    1.436091] x14: 0000000000000000 x13: ffffffc084712860 x12: 0000000000000372
[    1.436739] x11: 0000000000000126 x10: ffffffc08476a860 x9 : ffffffc084712860
[    1.437389] x8 : 00000000ffffefff x7 : ffffffc08476a860 x6 : 0000000000000000
[    1.438036] x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
[    1.438683] x2 : 0000000000000000 x1 : ffffffc0846eb000 x0 : ffffff8000407b68
[    1.439332] Call trace:
[    1.439559]  sched_numa_find_nth_cpu+0x2a0/0x488 (P)
[    1.440016]  smp_call_function_any+0xc8/0xd0
[    1.440416]  armv8_pmu_init+0x58/0x27c
[    1.440770]  armv8_cortex_a72_pmu_init+0x20/0x2c
[    1.441199]  arm_pmu_device_probe+0x1e4/0x5e8
[    1.441603]  armv8_pmu_device_probe+0x1c/0x28
[    1.442007]  platform_probe+0x5c/0xac
[    1.442347]  really_probe+0xbc/0x298
[    1.442683]  __driver_probe_device+0x78/0x12c
[    1.443087]  driver_probe_device+0xdc/0x160
[    1.443475]  __driver_attach+0x94/0x19c
[    1.443833]  bus_for_each_dev+0x74/0xd4
[    1.444190]  driver_attach+0x24/0x30
[    1.444525]  bus_add_driver+0xe4/0x208
[    1.444874]  driver_register+0x60/0x128
[    1.445233]  __platform_driver_register+0x24/0x30
[    1.445662]  armv8_pmu_driver_init+0x28/0x4c
[    1.446059]  do_one_initcall+0x44/0x25c
[    1.446416]  kernel_init_freeable+0x1dc/0x3bc
[    1.446820]  kernel_init+0x20/0x1d8
[    1.447151]  ret_from_fork+0x10/0x20
[    1.447493] Code: 90022e21 f000e5f5 910de2b5 2a1703e2 (f8767803)
[    1.448040] ---[ end trace 0000000000000000 ]---
[    1.448483] note: swapper/0[1] exited with preempt_count 1
[    1.449047] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.449741] SMP: stopping secondary CPUs
[    1.450105] Kernel Offset: disabled
[    1.450419] CPU features: 0x000000,00080000,20002001,0400421b
[    1.450935] Memory Limit: none
[    1.451217] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Yury: with the fix, the function returns cpu == nr_cpu_ids, and later in

	smp_call_function_any ->
	  smp_call_function_single ->
	     generic_exec_single

we test the cpu for '>= nr_cpu_ids' and return -ENXIO. So everything is
handled correctly.

Fixes: cd7f55359c ("sched: add sched_numa_find_nth_cpu()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-09 18:56:26 +02:00
..
bpf bpf: Fix oob access in cgroup local storage 2025-09-09 18:56:19 +02:00
cgroup cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key 2025-08-28 16:28:47 +02:00
configs
debug
dma dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted 2025-09-04 15:30:27 +02:00
entry
events perf/core: Prevent VMA split of buffer mappings 2025-08-15 12:09:05 +02:00
futex
gcov
irq genirq: Make handle_enforce_irqctx() unconditionally available 2025-02-08 09:51:48 +01:00
kcsan kcsan: test: Initialize dummy variable 2025-08-15 12:08:49 +02:00
livepatch
locking locking/lockdep: Decrease nr_unused_locks if lock unused in zap_class() 2025-04-25 10:45:29 +02:00
module module: Prevent silent truncation of module name in delete_module(2) 2025-08-28 16:28:28 +02:00
power PM: sleep: console: Fix the black screen issue 2025-08-28 16:28:17 +02:00
printk printk: Check CON_SUSPEND when unblanking a console 2025-06-04 14:42:00 +02:00
rcu rcu: Fix racy re-initialization of irq_work causing hangs 2025-08-28 16:28:32 +02:00
sched sched: Fix sched_numa_find_nth_cpu() if mask offline 2025-09-09 18:56:26 +02:00
time clocksource: Fix the CPUs' choice in the watchdog per CPU verification 2025-06-27 11:08:51 +01:00
trace ftrace: Fix potential warning in trace_printk_seq during ftrace_dump 2025-09-04 15:30:19 +02:00
.gitignore
acct.c acct: block access to kernel internal filesystems 2025-02-27 04:10:52 -08:00
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c
audit.c
audit.h audit,module: restore audit logging in load failure case 2025-08-15 12:08:39 +02:00
auditfilter.c
auditsc.c audit,module: restore audit logging in load failure case 2025-08-15 12:08:39 +02:00
backtracetest.c
bounds.c
capability.c
cfi.c
compat.c
configs.c
context_tracking.c
cpu_pm.c
cpu.c hrtimers: Handle CPU state correctly on hotplug 2025-01-23 17:21:17 +01:00
crash_core.c
crash_dump.c
cred.c
delayacct.c
dma.c
exec_domain.c
exit.c perf: Fix sample vs do_exit() 2025-06-27 11:09:03 +01:00
extable.c
fail_function.c
fork.c mm: drop the assumption that VM_SHARED always implies writable 2025-08-28 16:28:39 +02:00
freezer.c sched,freezer: Remove unnecessary warning in __thaw_task 2025-08-15 12:09:07 +02:00
gen_kheaders.sh kheaders: Ignore silly-rename files 2025-01-23 17:21:13 +01:00
groups.c
hung_task.c
iomem.c
irq_work.c
jump_label.c jump_label: Fix static_key_slow_dec() yet again 2024-10-10 11:57:13 +02:00
kallsyms_internal.h
kallsyms_selftest.c
kallsyms_selftest.h
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.kexec
Kconfig.locks
Kconfig.preempt
kcov.c kcov: mark in_softirq_really() as __always_inline 2025-01-09 13:32:07 +01:00
kexec_core.c
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-04-10 14:37:34 +02:00
kexec_file.c
kexec_internal.h
kexec.c
kheaders.c
kprobes.c
ksyms_common.c
ksysfs.c
kthread.c kthread: unpark only parked kthread 2024-10-17 15:24:37 +02:00
latencytop.c
Makefile
module_signature.c
notifier.c
nsproxy.c
numa.c
padata.c padata: do not leak refcount in reorder_work 2025-06-04 14:42:19 +02:00
panic.c objtool, panic: Disable SMAP in __stack_chk_fail() 2025-05-02 07:50:55 +02:00
params.c module: ensure that kobject_put() is safe for module type kobjects 2025-05-18 08:24:08 +02:00
pid_namespace.c
pid_sysctl.h
pid.c
profile.c
ptrace.c
range.c
reboot.c
regset.c
relay.c
resource_kunit.c
resource.c resource: fix false warning in __request_region() 2025-08-01 09:47:31 +01:00
rseq.c rseq: Fix segfault on registration when rseq_cs is non-zero 2025-07-17 18:35:22 +02:00
scftorture.c
scs.c
seccomp.c
signal.c posix-timers: Target group sigqueue to current task only if not exiting 2024-12-09 10:33:11 +01:00
smp.c
smpboot.c
smpboot.h
softirq.c lockdep: Fix wait context check on softirq for PREEMPT_RT 2025-06-04 14:41:55 +02:00
stackleak.c
stacktrace.c
static_call_inline.c x86/static-call: provide a way to do very early static-call updates 2024-12-19 18:11:36 +01:00
static_call.c
stop_machine.c
sys_ni.c
sys.c hrtimer: Use and report correct timerslack values for realtime tasks 2025-03-22 12:50:37 -07:00
sysctl-test.c
sysctl.c
task_work.c task_work: make TWA_NMI_CURRENT handling conditional on IRQ_WORK 2024-11-01 01:58:34 +01:00
taskstats.c
torture.c
tracepoint.c
tsacct.c
ucount.c ucount: fix atomic_long_inc_below() argument type 2025-08-15 12:08:57 +02:00
uid16.c
uid16.h
umh.c
up.c
user_namespace.c
user-return-notifier.c
user.c
usermode_driver.c
utsname_sysctl.c
utsname.c
vhost_task.c
watch_queue.c watch_queue: fix pipe accounting mismatch 2025-04-10 14:37:25 +02:00
watchdog_buddy.c
watchdog_perf.c
watchdog.c watchdog: fix watchdog may detect false positive of softlockup 2025-06-27 11:08:49 +01:00
workqueue_internal.h
workqueue.c workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker 2025-01-17 13:36:25 +01:00