Go to file
Yihang Li 2174bbc235 scsi: hisi_sas: Add cond_resched() for no forced preemption model
[ Upstream commit 2233c4a0b9 ]

For no forced preemption model kernel, in the scenario where the
expander is connected to 12 high performance SAS SSDs, the following
call trace may occur:

[  214.409199][  C240] watchdog: BUG: soft lockup - CPU#240 stuck for 22s! [irq/149-hisi_sa:3211]
[  214.568533][  C240] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[  214.575224][  C240] pc : fput_many+0x8c/0xdc
[  214.579480][  C240] lr : fput+0x1c/0xf0
[  214.583302][  C240] sp : ffff80002de2b900
[  214.587298][  C240] x29: ffff80002de2b900 x28: ffff1082aa412000
[  214.593291][  C240] x27: ffff3062a0348c08 x26: ffff80003a9f6000
[  214.599284][  C240] x25: ffff1062bbac5c40 x24: 0000000000001000
[  214.605277][  C240] x23: 000000000000000a x22: 0000000000000001
[  214.611270][  C240] x21: 0000000000001000 x20: 0000000000000000
[  214.617262][  C240] x19: ffff3062a41ae580 x18: 0000000000010000
[  214.623255][  C240] x17: 0000000000000001 x16: ffffdb3a6efe5fc0
[  214.629248][  C240] x15: ffffffffffffffff x14: 0000000003ffffff
[  214.635241][  C240] x13: 000000000000ffff x12: 000000000000029c
[  214.641234][  C240] x11: 0000000000000006 x10: ffff80003a9f7fd0
[  214.647226][  C240] x9 : ffffdb3a6f0482fc x8 : 0000000000000001
[  214.653219][  C240] x7 : 0000000000000002 x6 : 0000000000000080
[  214.659212][  C240] x5 : ffff55480ee9b000 x4 : fffffde7f94c6554
[  214.665205][  C240] x3 : 0000000000000002 x2 : 0000000000000020
[  214.671198][  C240] x1 : 0000000000000021 x0 : ffff3062a41ae5b8
[  214.677191][  C240] Call trace:
[  214.680320][  C240]  fput_many+0x8c/0xdc
[  214.684230][  C240]  fput+0x1c/0xf0
[  214.687707][  C240]  aio_complete_rw+0xd8/0x1fc
[  214.692225][  C240]  blkdev_bio_end_io+0x98/0x140
[  214.696917][  C240]  bio_endio+0x160/0x1bc
[  214.701001][  C240]  blk_update_request+0x1c8/0x3bc
[  214.705867][  C240]  scsi_end_request+0x3c/0x1f0
[  214.710471][  C240]  scsi_io_completion+0x7c/0x1a0
[  214.715249][  C240]  scsi_finish_command+0x104/0x140
[  214.720200][  C240]  scsi_softirq_done+0x90/0x180
[  214.724892][  C240]  blk_mq_complete_request+0x5c/0x70
[  214.730016][  C240]  scsi_mq_done+0x48/0xac
[  214.734194][  C240]  sas_scsi_task_done+0xbc/0x16c [libsas]
[  214.739758][  C240]  slot_complete_v3_hw+0x260/0x760 [hisi_sas_v3_hw]
[  214.746185][  C240]  cq_thread_v3_hw+0xbc/0x190 [hisi_sas_v3_hw]
[  214.752179][  C240]  irq_thread_fn+0x34/0xa4
[  214.756435][  C240]  irq_thread+0xc4/0x130
[  214.760520][  C240]  kthread+0x108/0x13c
[  214.764430][  C240]  ret_from_fork+0x10/0x18

This is because in the hisi_sas driver, both the hardware interrupt
handler and the interrupt thread are executed on the same CPU. In the
performance test scenario, function irq_wait_for_interrupt() will always
return 0 if lots of interrupts occurs and the CPU will be continuously
consumed. As a result, the CPU cannot run the watchdog thread. When the
watchdog time exceeds the specified time, call trace occurs.

To fix it, add cond_resched() to execute the watchdog thread.

Signed-off-by: Yihang Li <liyihang9@huawei.com>
Link: https://lore.kernel.org/r/20241008021822.2617339-8-liyihang9@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14 19:51:40 +01:00
arch s390/cpum_sf: Handle CPU hotplug remove during sampling 2024-12-14 19:51:34 +01:00
block block: fix ordering between checking BLK_MQ_S_STOPPED request adding 2024-12-14 19:51:17 +01:00
certs
crypto crypto: pcrypt - Call crypto layer directly when padata_do_parallel() return -EBUSY 2024-12-14 19:50:44 +01:00
Documentation dt-bindings: serial: rs485: Fix rs485-rts-delay property 2024-12-14 19:51:29 +01:00
drivers scsi: hisi_sas: Add cond_resched() for no forced preemption model 2024-12-14 19:51:40 +01:00
fs jfs: add a check to prevent array-index-out-of-bounds in dbAdjTree 2024-12-14 19:51:39 +01:00
include epoll: annotate racy check 2024-12-14 19:51:34 +01:00
init initramfs: avoid filename buffer overrun 2024-12-14 19:50:42 +01:00
io_uring io_uring: fix possible deadlock in io_register_iowq_max_workers() 2024-11-17 15:06:25 +01:00
ipc
kernel tracing: Use atomic64_inc_return() in trace_clock_counter() 2024-12-14 19:51:40 +01:00
lib lib: string_helpers: silence snprintf() output truncation warning 2024-12-14 19:51:18 +01:00
LICENSES
mm mm: resolve faulty mmap_region() error path behaviour 2024-12-14 19:50:38 +01:00
net netpoll: Use rcu_access_pointer() in __netpoll_setup 2024-12-14 19:51:40 +01:00
samples samples/bpf: Fix a resource leak 2024-12-14 19:51:36 +01:00
scripts modpost: remove incorrect code in do_eisa_entry() 2024-12-14 19:51:21 +01:00
security apparmor: test: Fix memory leak for aa_unpack_strdup() 2024-12-14 19:51:14 +01:00
sound ASoC: hdmi-codec: reorder channel allocation list 2024-12-14 19:51:40 +01:00
tools kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() 2024-12-14 19:51:35 +01:00
usr
virt KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin() 2024-10-22 15:40:41 +02:00
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore Remove *.orig pattern from .gitignore 2024-10-17 15:11:10 +02:00
.mailmap
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS
Makefile Linux 5.15.173 2024-11-17 15:06:26 +01:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.