blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
We're seeing crashes from rq_qos_wake_function that look like this:
BUG: unable to handle page fault for address: ffffafe180a40084
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
Oops: Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
try_to_wake_up+0x5a/0x6a0
rq_qos_wake_function+0x71/0x80
__wake_up_common+0x75/0xa0
__wake_up+0x36/0x60
scale_up.part.0+0x50/0x110
wb_timer_fn+0x227/0x450
...
So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).
p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.
What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:
rq_qos_wait() rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
data->got_token = true;
list_del_init(&curr->entry);
if (data.got_token)
break;
finish_wait(&rqw->wait, &data.wq);
^- returns immediately because
list_empty_careful(&wq_entry->entry)
is true
... return, go do something else ...
wake_up_process(data->task)
(NO LONGER VALID!)-^
Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.
The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.
Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
Analysis and contextual insights are available on OpenCVE Cloud.
No vendor fix or workaround currently provided.
Additional remediation guidance may be available on OpenCVE Cloud.
Tracking
Sign in to view the affected projects.
| Source | ID | Title |
|---|---|---|
Debian DLA |
DLA-4008-1 | linux-6.1 security update |
Debian DLA |
DLA-4075-1 | linux security update |
Ubuntu USN |
USN-7276-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7277-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7288-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7288-2 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7289-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7289-2 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7289-3 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7289-4 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7291-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7293-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7294-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7294-2 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7294-3 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7294-4 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7295-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7305-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7308-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7310-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7331-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7383-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7383-2 | Linux kernel (Real-time) vulnerabilities |
Ubuntu USN |
USN-7384-1 | Linux kernel (Azure) vulnerabilities |
Ubuntu USN |
USN-7384-2 | Linux kernel (Azure) vulnerabilities |
Ubuntu USN |
USN-7385-1 | Linux kernel (IBM) vulnerabilities |
Ubuntu USN |
USN-7386-1 | Linux kernel (OEM) vulnerabilities |
Ubuntu USN |
USN-7388-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7389-1 | Linux kernel (NVIDIA Tegra) vulnerabilities |
Ubuntu USN |
USN-7390-1 | Linux kernel (Xilinx ZynqMP) vulnerabilities |
Ubuntu USN |
USN-7393-1 | Linux kernel (FIPS) vulnerabilities |
Ubuntu USN |
USN-7401-1 | Linux kernel (AWS) vulnerabilities |
Ubuntu USN |
USN-7403-1 | Linux kernel (HWE) vulnerabilities |
Ubuntu USN |
USN-7413-1 | Linux kernel (IoT) vulnerabilities |
Ubuntu USN |
USN-7451-1 | Linux kernel vulnerabilities |
Ubuntu USN |
USN-7458-1 | Linux kernel (IBM) vulnerabilities |
Ubuntu USN |
USN-7468-1 | Linux kernel (Azure, N-Series) vulnerabilities |
Ubuntu USN |
USN-7523-1 | Linux kernel (Raspberry Pi Real-time) vulnerabilities |
Ubuntu USN |
USN-7524-1 | Linux kernel (Raspberry Pi) vulnerabilities |
Ubuntu USN |
USN-7539-1 | Linux kernel (Raspberry Pi) vulnerabilities |
Ubuntu USN |
USN-7540-1 | Linux kernel (Raspberry Pi) vulnerabilities |
Mon, 03 Nov 2025 23:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| References |
|
Mon, 03 Nov 2025 21:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| References |
|
Tue, 15 Jul 2025 13:45:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Metrics |
epss
|
epss
|
Wed, 14 May 2025 02:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| CPEs | cpe:/a:redhat:enterprise_linux:9 cpe:/o:redhat:enterprise_linux:9 |
Thu, 27 Feb 2025 14:45:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Weaknesses | CWE-362 | |
| Metrics |
threat_severity
|
threat_severity
|
Thu, 12 Dec 2024 02:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| First Time appeared |
Redhat
Redhat enterprise Linux |
|
| CPEs | cpe:/a:redhat:enterprise_linux:8::nfv cpe:/o:redhat:enterprise_linux:8 |
|
| Vendors & Products |
Redhat
Redhat enterprise Linux |
Fri, 08 Nov 2024 16:15:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| References |
|
Wed, 30 Oct 2024 16:00:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| First Time appeared |
Linux
Linux linux Kernel |
|
| Weaknesses | NVD-CWE-noinfo | |
| CPEs | cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* cpe:2.3:o:linux:linux_kernel:6.12:rc1:*:*:*:*:*:* cpe:2.3:o:linux:linux_kernel:6.12:rc2:*:*:*:*:*:* cpe:2.3:o:linux:linux_kernel:6.12:rc3:*:*:*:*:*:* |
|
| Vendors & Products |
Linux
Linux linux Kernel |
|
| Metrics |
cvssV3_1
|
cvssV3_1
|
Wed, 30 Oct 2024 02:00:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| References |
| |
| Metrics |
threat_severity
|
cvssV3_1
|
Tue, 29 Oct 2024 01:00:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Description | In the Linux kernel, the following vulnerability has been resolved: blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). | |
| Title | blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race | |
| References |
|
|
Status: PUBLISHED
Assigner: Linux
Published:
Updated: 2026-05-11T20:45:05.420Z
Reserved: 2024-10-21T19:36:19.941Z
Link: CVE-2024-50082
No data.
Status : Modified
Published: 2024-10-29T01:15:05.147
Modified: 2025-11-03T23:16:48.100
Link: CVE-2024-50082
OpenCVE Enrichment
No data.
Debian DLA
Ubuntu USN