linux-stable/mm
Gavin Guo 22f6368768 mm/huge_memory: fix dereferencing invalid pmd migration entry
commit be6e843fc5 upstream.

When migrating a THP, concurrent access to the PMD migration entry during
a deferred split scan can lead to an invalid address access, as
illustrated below.  To prevent this invalid access, it is necessary to
check the PMD migration entry and return early.  In this context, there is
no need to use pmd_to_swp_entry and pfn_swap_entry_to_page to verify the
equality of the target folio.  Since the PMD migration entry is locked, it
cannot be the target.
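
The early return described above can be modelled in userspace as a rough sketch. This is not the kernel patch: `struct model_pmd`, `enum pmd_kind`, and `model_should_split()` are hypothetical stand-ins invented for illustration, and the real code works on `pmd_t` under the page table lock. The sketch only shows the order of the checks: when a target folio is given, a migration entry is rejected before anything tries to resolve the entry to a folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct folio {
	int id;
};

enum pmd_kind {
	PMD_NONE,
	PMD_TRANS_HUGE,
	PMD_MIGRATION,
};

struct model_pmd {
	enum pmd_kind kind;
	struct folio *mapped;	/* meaningful only for PMD_TRANS_HUGE */
};

/*
 * Returns true if the split should go ahead for this PMD.
 * target == NULL means "split whatever is mapped here".
 */
static bool model_should_split(const struct model_pmd *pmd,
			       const struct folio *target)
{
	bool is_migration = (pmd->kind == PMD_MIGRATION);

	if (pmd->kind != PMD_TRANS_HUGE && !is_migration)
		return false;

	if (target) {
		/*
		 * A migration entry is not a present mapping, so it is
		 * never resolved to a folio here; return early instead.
		 */
		if (is_migration)
			return false;
		/* Weed out a different folio mapped at the same address. */
		if (pmd->mapped != target)
			return false;
	}

	return true;
}
```

In this model a present huge PMD is split only when it maps the target folio, while a migration entry is skipped whenever a target is supplied, which mirrors the "check the PMD migration entry and return early" behaviour the commit message describes.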

Mailing list discussion and explanation from Hugh Dickins: "An anon_vma
lookup points to a location which may contain the folio of interest, but
might instead contain another folio: and weeding out those other folios is
precisely what the "folio != pmd_folio(*pmd)" check (and the "risk of
replacing the wrong folio" comment a few lines above it) is for."

BUG: unable to handle page fault for address: ffffea60001db008
CPU: 0 UID: 0 PID: 2199114 Comm: tee Not tainted 6.14.0+ #4 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:split_huge_pmd_locked+0x3b5/0x2b60
Call Trace:
<TASK>
try_to_migrate_one+0x28c/0x3730
rmap_walk_anon+0x4f6/0x770
unmap_folio+0x196/0x1f0
split_huge_page_to_list_to_order+0x9f6/0x1560
deferred_split_scan+0xac5/0x12a0
shrinker_debugfs_scan_write+0x376/0x470
full_proxy_write+0x15c/0x220
vfs_write+0x2fc/0xcb0
ksys_write+0x146/0x250
do_syscall_64+0x6a/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e

The bug was found by syzkaller on an internal kernel, then confirmed on
upstream.

Link: https://lkml.kernel.org/r/20250421113536.3682201-1-gavinguo@igalia.com
Link: https://lore.kernel.org/all/20250414072737.1698513-1-gavinguo@igalia.com/
Link: https://lore.kernel.org/all/20250418085802.2973519-1-gavinguo@igalia.com/
Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Gavin Guo <gavinguo@igalia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Florent Revest <revest@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[gavin: backport the migration checking logic to __split_huge_pmd]
Signed-off-by: Gavin Guo <gavinguo@igalia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27 11:07:38 +01:00
..
damon mm/damon/vaddr: fix issue in damon_va_evenly_split_region() 2024-12-14 19:54:52 +01:00
kasan kasan: make report_lock a raw spinlock 2024-12-14 19:54:50 +01:00
kfence kfence: skip __GFP_THISNODE allocations on NUMA systems 2025-02-21 13:49:48 +01:00
kmsan dma: kmsan: export kmsan_handle_dma() for modules 2025-03-13 12:53:15 +01:00
backing-dev.c
balloon_compaction.c
bootmem_info.c
cma_debug.c
cma_sysfs.c
cma.c
cma.h
compaction.c
debug_page_ref.c
debug_vm_pgtable.c
debug.c
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c mm: fix filemap_get_folios_contig returning batches of identical folios 2025-04-25 10:43:54 +02:00
folio-compat.c
frontswap.c
gup_test.c
gup_test.h
gup.c mm: Fix is_zero_page() usage in try_grab_page() 2025-04-25 10:44:01 +02:00
highmem.c
hmm.c
huge_memory.c mm/huge_memory: fix dereferencing invalid pmd migration entry 2025-06-27 11:07:38 +01:00
hugetlb_cgroup.c
hugetlb_vmemmap.c
hugetlb_vmemmap.h
hugetlb.c mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race 2025-06-27 11:07:38 +01:00
hwpoison-inject.c
init-mm.c
internal.h mm: unconditionally close VMAs on error 2024-11-22 15:37:34 +01:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig
Kconfig.debug
khugepaged.c
kmemleak.c mm: kmemleak: fix upper boundary check for physical address objects 2025-02-21 13:49:49 +01:00
ksm.c
list_lru.c
maccess.c
madvise.c mm,madvise,hugetlb: check for 0-length range after end address adjustment 2025-03-07 16:56:39 +01:00
Makefile
mapping_dirty_helpers.c
memblock.c
memcontrol.c memcg: always call cond_resched() after fn() 2025-06-04 14:40:21 +02:00
memfd.c
memory_hotplug.c hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio 2025-05-22 14:10:09 +02:00
memory-failure.c mm/hwpoison: do not send SIGBUS to processes with recovered clean pages 2025-04-25 10:43:42 +02:00
memory-tiers.c
memory.c mm: fix apply_to_existing_page_range() 2025-04-25 10:44:04 +02:00
mempolicy.c
mempool.c
memremap.c
memtest.c
migrate_device.c
migrate.c mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index 2025-05-22 14:10:08 +02:00
mincore.c
mlock.c
mm_init.c
mm_slot.h
mmap_lock.c
mmap.c mm/hugetlb: unshare page tables during VMA split, not before 2025-06-27 11:07:38 +01:00
mmu_gather.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c
msync.c
nommu.c mm: add nommu variant of vm_insert_pages() 2025-03-28 21:58:53 +01:00
oom_kill.c memcg: fix soft lockup in the OOM process 2025-03-07 16:56:29 +01:00
page_alloc.c mm/page_alloc.c: avoid infinite retries caused by cpuset race 2025-06-04 14:40:21 +02:00
page_counter.c
page_ext.c
page_idle.c
page_io.c
page_isolation.c
page_owner.c
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c
page_vma_mapped.c
page-writeback.c mm: fix ratelimit_pages update error in dirty_ratio_handler() 2025-06-27 11:07:30 +01:00
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c
pgalloc-track.h
pgtable-generic.c
process_vm_access.c
ptdump.c
readahead.c mm/readahead: fix large folio support in async readahead 2025-01-09 13:30:06 +01:00
rmap.c mm/rmap: reject hugetlb folios in folio_make_device_exclusive() 2025-04-25 10:43:42 +02:00
rodata_test.c
secretmem.c
shmem.c mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling 2024-11-22 15:37:34 +01:00
shrinker_debug.c
shuffle.c
shuffle.h
slab_common.c mm: krealloc: Fix MTE false alarm in __do_krealloc 2024-11-17 15:07:22 +01:00
slab.c
slab.h
slob.c
slub.c
sparse-vmemmap.c
sparse.c
swap_cgroup.c
swap_slots.c
swap_state.c
swap.c mm: page_alloc: move mlocked flag clearance into free_pages_prepare() 2024-12-14 19:54:31 +01:00
swap.h
swapfile.c
truncate.c
usercopy.c
userfaultfd.c
util.c mm: unconditionally close VMAs on error 2024-11-22 15:37:34 +01:00
vmalloc.c mm: don't skip arch_sync_kernel_mappings() in error paths 2025-03-13 12:53:15 +01:00
vmpressure.c
vmscan.c mm: add missing release barrier on PGDAT_RECLAIM_LOCKED unlock 2025-04-25 10:43:42 +02:00
vmstat.c vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event 2024-12-14 19:54:13 +01:00
workingset.c
z3fold.c
zbud.c
zpool.c
zsmalloc.c
zswap.c