Merge latest stable tag v6.12.94#283
Open
gratian wants to merge 1646 commits into
Open
Conversation
[ Upstream commit b003086d76968298f22e7cf62239833b5a3a06b1 ] smb2_oplock_break_noti() and smb2_lease_break_noti() read opinfo->conn into a local with neither READ_ONCE() nor a NULL check. Both run from oplock_break() after opinfo_get_list() has dropped ci->m_lock, so a concurrent SMB2 LOGOFF (session_fd_check()) can set op->conn = NULL under ci->m_lock within that window. ksmbd_conn_r_count_inc(conn) then writes through NULL at offset 0xc4 -- a remotely triggerable oops. Guard both reads the way compare_guid_key() already does: read opinfo->conn with READ_ONCE() and return early if it is NULL, before allocating the work struct so nothing leaks. A NULL conn means the client is gone and the break is moot, so return 0; oplock_break() treats that as success and runs the normal teardown. Fixes: c8efcc7 ("ksmbd: add support for durable handles v1/v2") Assisted-by: Henry (Claude):claude-opus-4 Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c6c5327dd18bec1e1bbf139b2cf5ae53608a9d30 ] With PREEMPT_RCU this triggers a splat because smp_processor_id() can be preempted while inside a RCU critical section. If xt_NFQUEUE target is invoked via nft_compat_eval() path, we are inside a RCU critical section. Just use the raw version instead. Fixes: 0ca743a ("netfilter: nf_tables: add compatibility layer for x_tables") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 193989cc6d80dd8e0460fb3992e69fa03bf0ff9b ] ip_vs_edit_service() while unbinding the old scheduler clears the svc->scheduler ptr after the scheduler module initiates RCU callbacks. This can cause packets to use the old scheduler at the time when svc->sched_data is already freed after RCU grace period. Fix it by clearing the ptr early in ip_vs_unbind_scheduler(), before the done_service method schedules any RCU callbacks. Also, if the new scheduler fails to initialize when replacing the old scheduler, try to restore the old scheduler while still returning the error code. Link: https://sashiko.dev/#/patchset/20260519015506.634185-1-rosenp%40gmail.com Fixes: 05f0050 ("ipvs: fix crash if scheduler is changed") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2fcba19caaeb2a33017459d3430f057967bb91b6 ] As the synproxy infrastructure register netfilter hooks on-demand when a user adds the first iptables target or nftables expression, if done concurrently they can race each other. Introduce a mutex to serialize the refcount control blocks access from both frontends. While a per namespace mutex might be more efficient, it is not needed for target/expression like SYNPROXY. Fixes: ad49d86 ("netfilter: nf_tables: Add synproxy support") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66eba0ffce3b7e11449946b4cbbef8ea36112f56 ] When parsing fails after we've matched the command string we should bail out instead of trying to match a different command. This helper should be deprecated, given prevalence of TLS I doubt it has any relevance in 2026. Fixes: 869f37d ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port") Closes: https://sashiko.dev/#/patchset/20260525182924.28456-1-fw%40strlen.de Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3027ecbdb5fdf9200251c21d4818e4c447ef78e1 ]
I noticed this issue while looking at a historic syzbot report [1].
A rule like the one below is enough to trigger the bug:
table ip t {
chain pre {
type filter hook prerouting priority raw;
ct zone set 1
ct original saddr 1.2.3.4 accept
}
}
The first expression attaches a per-cpu template ct via
nft_ct_set_zone_eval() (nf_ct_tmpl_alloc -> kzalloc, tuple is all
zero, nf_ct_l3num(ct) == 0). The next expression then calls
nft_ct_get_eval() on the same skb, treats the template as a real ct
and hits the 16-byte memcpy path. With dreg at NFT_REG32_15 this
overflows past struct nft_regs on the kernel stack; with smaller
dreg values it silently clobbers adjacent registers.
Reject template ct at the eval entry and in nft_ct_get_fast_eval(),
mirroring the check nft_ct_set_eval() already has. Additionally,
bound the address copy in NFT_CT_SRC / NFT_CT_DST by priv->len
instead of by nf_ct_l3num(ct): nf_ct_get_tuple() zeroes the tuple
before pkt_to_tuple() fills in only the protocol-relevant leading
bytes, so the trailing bytes of tuple->{src,dst}.u3.all are
well-defined zero. priv->len is validated at rule load, so the
copy size is now bounded by the destination register rather than
by an untrusted field on the conntrack.
[1]: https://syzkaller.appspot.com/bug?id=389cf09cb72926114fce90dc85a2c3231dcb647c
Fixes: 45d9bcd ("netfilter: nf_tables: validate len in nft_validate_data_load()")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67ba971ae02514d85818fe0c32549ab4bfa3bf49 ] The ebtables SNAT target keeps the Ethernet source address rewrite behind skb_ensure_writable(skb, 0). This is intentional: at the bridge ebtables hooks the Ethernet header is addressed through skb_mac_header()/eth_hdr(), while skb->data points at the Ethernet payload. Asking skb_ensure_writable() for ETH_HLEN bytes would check the payload, not the Ethernet header, and would reintroduce the small packet regression fixed by commit 63137bc. However, the optional ARP sender hardware address rewrite is different. It writes through skb_store_bits() at an offset relative to skb->data: skb_store_bits(skb, sizeof(struct arphdr), info->mac, ETH_ALEN) skb_header_pointer() only safely reads the ARP header; it does not make the later sender hardware address range writable. If that range is still held in a nonlinear skb fragment backed by a splice-imported file page, skb_store_bits() maps the frag page and copies the new MAC address directly into it. Ensure the ARP SHA range is writable before reading the ARP header and before calling skb_store_bits(). Fixes: 63137bc ("netfilter: ebtables: Fixes dropping of small packets in bridge nat") Reported-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3f0a606b9f278ece8a0df626ded9c4044071235 ]
commit 2d1f7b65f5de ("dm cache policy smq: fix missing locks in
invalidating cache blocks") added mq->lock around the destructive part of
smq_invalidate_mapping(), but left the e->allocated check outside the
critical section.
That leaves a check-then-act race. Two concurrent invalidators can both
observe e->allocated as true before either of them takes mq->lock. The
first invalidator that acquires the lock removes the entry from the
queues and hash table and then calls free_entry(), which clears
e->allocated and puts the entry back on the free list. The second
invalidator can then acquire mq->lock and continue with the stale result
of the unlocked check.
This can corrupt the SMQ queues or hash table by deleting an entry that
is no longer on those structures. It can also hit the allocation check in
free_entry() when the same entry is freed again.
Move the allocation check under mq->lock so the predicate and the
destructive operations are serialized by the same lock.
Fixes: 2d1f7b65f5de ("dm cache policy smq: fix missing locks in invalidating cache blocks")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5057e1aca011e51ef51498c940ef96f3d3e8a305 ] When NEWTFILTER and DELFILTER are run concurrently it is possible to create a race with an associated action. Let's illustrate with CPU0 running NEWTFILTER and CPU1 running DELFILTER: 0: mutex_lock() <-- holds the idr lock 0: rcu_read_lock() 0: p = idr_find(idr, index) <-- action p is valid (RCU protects IDR) 0: mutex_unlock() <-- releases the idr lock 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held 1: idr_remove(idr, index) <-- Action removed from IDR 1: mutex_unlock() <-- mutex released allowing us to delete the action 1: tcf_action_cleanup(p); kfree(p) <-- Kfrees p immediately, no deferral 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- ouch, UAF p points to freed memory This patch fixes the race condition between NEWTFILTER and DELFILTER by adding struct rcu_head to tc_action used in the deferral and introducing a call_rcu() in the delete path to defer the final kfree(). Note: this is a revert of commit d7fb60b ("net_sched: get rid of tcfa_rcu") but also modernization/simplification to directly use kfree_rcu(). Let's illustrate the new restored code path: 0: rcu_read_lock() 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held 1: idr_remove(idr, index) 1: mutex_unlock() 1: call_rcu(&p->tcfa_rcu, tcf_action_rcu_free) <-- defer kfree after grace period 0: p = idr_find(idr, index) 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- fails, refcnt already 0 1: rcu_read_unlock() <-- release so freeing can run after grace period After CPU1 calls idr_remove(), the object is no longer reachable through the IDR. CPU0's subsequent idr_find() will return NULL, and even if it still held a stale pointer, the immediate kfree() is now deferred until after the RCU grace period, so no UAF can occur. Fixes: d7fb60b ("net_sched: get rid of tcfa_rcu") Suggested-by: Jakub Kicinski <kuba@kernel.org> Reported-by: Kyle Zeng <kylebot@openai.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Tested-by: syzbot@syzkaller.appspotmail.com Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Tested-by: Kyle Zeng <kylebot@openai.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20260531160812.68020-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2a58899d11009bffc7b4b32a571858f381121837 ] The second memcpy in lowpan_iphc_mcast_ctx_addr_compress() uses &data[1] as destination and &ipaddr->s6_addr[11] as source, but both should be offset by one: &data[2] and &ipaddr->s6_addr[12] respectively. This off-by-one has two consequences: 1. data[1] is overwritten with s6_addr[11], corrupting the RIID field in the compressed multicast address 2. data[5] is never written, so uninitialized kernel stack memory is transmitted over the network via lowpan_push_hc_data(), leaking kernel stack contents The correct inline data layout must match what the decompression function lowpan_uncompress_multicast_ctx_daddr() expects: data[0..1] = s6_addr[1..2] (flags/scope + RIID) data[2..5] = s6_addr[12..15] (group ID) Also zero-initialize the data array as a defensive measure against similar bugs in the future. Fixes: 5609c18 ("6lowpan: iphc: add support for stateful compression") Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn> Reported-by: Ao Wang <wangao@seu.edu.cn> Reported-by: Xuewei Feng <fengxw06@126.com> Reported-by: Qi Li <qli01@tsinghua.edu.cn> Reported-by: Ke Xu <xuke@tsinghua.edu.cn> Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Acked-by: Alexander Aring <aahringo@redhat.com> Link: https://patch.msgid.link/20260527081806.42747-1-zhaoyz24@mails.tsinghua.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a213a8950414c684999dcf03edeea6c46ede172e ] pppol2tp_ioctl() read sock->sk->sk_user_data directly without any locks or reference counting. If a controllable sleep was induced during copy_from_user() (e.g. via a userfaultfd page fault sleep), a concurrent socket close could trigger pppol2tp_session_close() asynchronously. This frees the l2tp_session structure via the l2tp_session_del_work workqueue. Upon resuming, the ioctl thread dereferences the stale session pointer, resulting in a Use-After-Free (UAF). Fix this by securely fetching the session reference using the RCU-safe, refcounted helper pppol2tp_sock_to_session(sk) on entry. This locks the session's refcount across the sleep. We structured the function to exit via standard err breaks, guaranteeing that l2tp_session_put() is cleanly called on all return paths to drop the reference. To preserve existing behavior we validate the session and its magic signature only for the specific L2TP commands that require it. This ensures that generic/unknown ioctls called on an unconnected socket still return -ENOIOCTLCMD and correctly fall back to generic handlers (e.g. in sock_do_ioctl()). Signed-off-by: Lee Jones <lee@kernel.org> Fixes: fd558d1 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Link: https://patch.msgid.link/20260527133630.2120612-1-lee@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3522b21fd7e1863d0734537737bd59f1b90d0190 ] devlink relation state is normally released from devl_unregister(), which calls devlink_rel_put(). This misses devlink instances that get a nested relation before registration and then fail probe before devl_register() is reached. That flow can happen for SFs. The child devlink gets linked to its parent before registration, then a later probe error calls devlink_free() directly. Since the instance was never registered, devl_unregister() is not called and devlink->rel is leaked. Release any pending relation from devlink_free() as well. The registered path is unchanged because devl_unregister() already clears devlink->rel before devlink_free() runs. Fixes: c137743 ("devlink: introduce object and nested devlink relationship infra") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260528191411.3270532-1-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae0383e5a9a4b12d68c76c4769857def4665deff ] Fix the following W=1 kerneldoc warnings by adding the missing parameter descriptions for @phase0_identity and @nn_interpolation in dcss_scaler_filter_design() and @phase0_identity in dcss_scaler_gaussian_filter() Warning: drivers/gpu/drm/imx/dcss/dcss-scaler.c:173 function parameter 'phase0_identity' not described in 'dcss_scaler_gaussian_filter' Warning: drivers/gpu/drm/imx/dcss/dcss-scaler.c:270 function parameter 'phase0_identity' not described in 'dcss_scaler_filter_design' Warning: drivers/gpu/drm/imx/dcss/dcss-scaler.c:270 function parameter 'nn_interpolation' not described in 'dcss_scaler_filter_design' Fixes: 9021c31 ("drm/imx: Add initial support for DCSS on iMX8MQ") Signed-off-by: Yicong Hui <yiconghui@gmail.com> Reviewed-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com> Link: https://patch.msgid.link/20260406180013.2442096-1-yiconghui@gmail.com Signed-off-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
…diotap [ Upstream commit 6c0cf89f36ac0c0fd8687a4ccdce2efb23a9c663 ] When parsing the radiotap header of an injected frame, ieee80211_parse_tx_radiotap() uses the IEEE80211_RADIOTAP_ANTENNA value directly as a shift count: info->control.antennas |= BIT(*iterator.this_arg); *iterator.this_arg is an 8-bit value taken straight from the frame supplied by userspace, so BIT() can be asked to shift by up to 255. That is undefined behaviour on the unsigned long and is reported by UBSAN: UBSAN: shift-out-of-bounds in net/mac80211/tx.c:2174:30 shift exponent 235 is too large for 64-bit type 'unsigned long' Call Trace: ieee80211_parse_tx_radiotap+0xadb/0x1950 net/mac80211/tx.c:2174 ieee80211_monitor_start_xmit+0xb1f/0x1250 net/mac80211/tx.c:2451 ... packet_sendmsg+0x3eb6/0x50f0 net/packet/af_packet.c:3109 info->control.antennas is a 2-bit bitmap (u8 antennas:2), so only antenna indices 0 and 1 can ever be represented. Ignore any larger value instead of shifting out of bounds. Reported-by: syzbot+8e0622f6d9446420271f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8e0622f6d9446420271f Fixes: ef246a1 ("wifi: mac80211: support antenna control in injection") Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Link: https://patch.msgid.link/20260531011721.102941-1-kartikey406@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 73bf3cca7de6a73f53b6a52dc3b1c82ae5667a4d ] napi_complete_done may call gro_flush_normal (though not currently, as GRO is unsupported at the moment), which may result in packet TX. This will eventually result in calling pcnet32_start_xmit - resulting in a deadlock while trying to re-acquire the already locked spin lock. It is safe to split the spinlock block into two, because the hardware registers are still protected from concurrent access, and the two blocks perform unrelated operations that don't need to happen atomically. Fixes: 5b2ec6f ("pcnet32: use napi_complete_done()") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oscar Maes <oscmaes92@gmail.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20260528140320.5556-1-oscmaes92@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b748765019fe9e9234660327090fc1a9665cdbdd ]
UDP TX skb->destructor() is sock_wfree(), and UDP holds lock_sock()
only for UDP_CORK / MSG_MORE sendmsg().
Otherwise, sk->sk_write_space() may be read locklessly while SOCKMAP
rewrites sk->sk_write_space().
Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_write_space().
Note that the write side is annotated by commit 2ef2b20cf4e0
("net: annotate data-races around sk->sk_{data_ready,write_space}").
Fixes: 7b98cd4 ("bpf: sockmap: Add UDP support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://patch.msgid.link/20260529193941.3897256-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afd0f17ca46258cec3a5cc48b8df9327fe772490 ]
syzbot reported the warning [0] in hsr_addr_is_self(),
whose assumption is simply wrong.
hsr->self_node is cleared in hsr_del_self_node(), which
is called from hsr_dellink().
Since dev->rtnl_link_ops->dellink() is called before
unregister_netdevice_many(), there is a window when
user can find the device but without hsr->self_node.
Let's remove WARN_ONCE() in hsr_addr_is_self().
[0]:
HSR: No self node
WARNING: net/hsr/hsr_framereg.c:39 at hsr_addr_is_self+0x211/0x3f0 net/hsr/hsr_framereg.c:39, CPU#0: syz.4.16848/17220
Modules linked in:
CPU: 0 UID: 0 PID: 17220 Comm: syz.4.16848 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
RIP: 0010:hsr_addr_is_self+0x211/0x3f0 net/hsr/hsr_framereg.c:39
Code: 33 2f 41 0f b7 dd 89 ee 09 de 31 ff e8 c8 b4 c6 f6 09 dd 74 54 e8 0f b0 c6 f6 31 ed eb 53 e8 06 b0 c6 f6 48 8d 3d 2f 50 9c 04 <67> 48 0f b9 3a 31 ed eb 42 e8 c1 13 1f 00 89 c5 31 ff 89 c6 e8 96
RSP: 0018:ffffc900041c70e0 EFLAGS: 00010283
RAX: ffffffff8afdc6ca RBX: ffffffff8afdc4e6 RCX: 0000000000080000
RDX: ffffc90010493000 RSI: 0000000000000948 RDI: ffffffff8f9a1700
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffffc900041c71e8 R11: fffff52000838e3f R12: dffffc0000000000
R13: ffff888041f9e3c0 R14: ffff888086ee3802 R15: 0000000000000000
FS: 00007f6fe985d6c0(0000) GS:ffff888126176000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f80bd437dac CR3: 0000000025096000 CR4: 00000000003526f0
DR0: ffffffffffffffff DR1: 00000000000001f8 DR2: 0000000000000002
DR3: ffffffffefffff15 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
check_local_dest net/hsr/hsr_forward.c:592 [inline]
fill_frame_info net/hsr/hsr_forward.c:728 [inline]
hsr_forward_skb+0xa11/0x2a80 net/hsr/hsr_forward.c:739
hsr_dev_xmit+0x253/0x370 net/hsr/hsr_device.c:236
__netdev_start_xmit include/linux/netdevice.h:5368 [inline]
netdev_start_xmit include/linux/netdevice.h:5377 [inline]
xmit_one net/core/dev.c:3888 [inline]
dev_hard_start_xmit+0x2df/0x860 net/core/dev.c:3904
__dev_queue_xmit+0x1428/0x3900 net/core/dev.c:4870
neigh_output include/net/neighbour.h:556 [inline]
ip_finish_output2+0xcec/0x10b0 net/ipv4/ip_output.c:237
ip_send_skb net/ipv4/ip_output.c:1510 [inline]
ip_push_pending_frames+0x8b/0x110 net/ipv4/ip_output.c:1530
raw_sendmsg+0x1547/0x1a50 net/ipv4/raw.c:659
sock_sendmsg_nosec net/socket.c:787 [inline]
__sock_sendmsg net/socket.c:802 [inline]
____sys_sendmsg+0x7da/0x9c0 net/socket.c:2698
___sys_sendmsg+0x2a5/0x360 net/socket.c:2752
__sys_sendmsg net/socket.c:2784 [inline]
__do_sys_sendmsg net/socket.c:2789 [inline]
__se_sys_sendmsg net/socket.c:2787 [inline]
__x64_sys_sendmsg+0x1c3/0x2a0 net/socket.c:2787
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f6feb62ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6fe985d028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f6feb8a6090 RCX: 00007f6feb62ce59
RDX: 0000000000000000 RSI: 0000200000000000 RDI: 0000000000000004
RBP: 00007f6feb6c2d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f6feb8a6128 R14: 00007f6feb8a6090 R15: 00007ffcf01cc488
</TASK>
Fixes: f266a68 ("net/hsr: Better frame dispatch")
Reported-by: syzbot+652670cf249077eb498b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a1a861e.b111c304.35cd64.0016.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260530064300.340793-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 16e408e607a94b646fb14a2a98422c6877ae4b3c ]
The receive-side GARP attribute parser computes dlen with reversed
operands:
dlen = sizeof(*ga) - ga->len;
ga->len is the on-wire attribute length and includes the GARP attribute
header. For normal attributes with data, ga->len is larger than
sizeof(*ga), so the subtraction underflows in unsigned arithmetic.
The resulting value is later passed to garp_attr_lookup(), whose length
argument is u8. After truncation, the parsed data length usually no
longer matches the length stored for locally registered attributes, so
received Join/Leave events are ignored. This breaks the GARP receive path
for common attributes, such as GVRP VLAN registration attributes.
Compute the data length as the attribute length minus the header length.
Fixes: eca9eba ("net: Add GARP applicant-only participant")
Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn>
Reported-by: Ao Wang <wangao@seu.edu.cn>
Reported-by: Xuewei Feng <fengxw06@126.com>
Reported-by: Qi Li <qli01@tsinghua.edu.cn>
Reported-by: Ke Xu <xuke@tsinghua.edu.cn>
Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260527083200.42861-1-zhaoyz24@mails.tsinghua.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8173d22b211f615015f7b35f48ab11a6dd78dc99 ] VLAN-tagged interfaces on lan743x devices were previously unreachable via SSH and failed to respond to large ping packets (e.g. "ping -s 1469" given MTU=1500). In these scenarios, "ethtool -S" reports non-zero "RX Oversize Frame Errors". According to Microchip AN2948, the MAC_RX FSE (VLAN field size enforcement) bit determines whether frames with VLAN tags exceeding the base MTU plus tag length are discarded. The driver must set the MAC_RX.FSE bit before setting MAC_RX.RXEN to allow VLAN-tagged frames up to the interface MTU, preventing them from being treated as oversized. As a result, both the base and VLAN-tagged interfaces can use the same MTU without receive errors. Fixes: 23f0703 ("lan743x: Add main source files for new lan743x driver") Signed-off-by: David Thompson <davthompson@nvidia.com> Reviewed-by: Thangaraj Samynathan <Thangaraj.s@microchip.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Tested-by: Nicolai Buchwitz <nb@tipi-net.de> # lan7430 on arm64 (RevPi Link: https://patch.msgid.link/20260529210300.433135-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b455410146bf723c7ebcb49ecd5becc0d6611482 ] In fec_resume(), fec_enet_clk_enable() is called before pinctrl_pm_select_default_state() in the non-WoL path, inverting the ordering used in fec_suspend() which correctly switches to the sleep pinctrl state before disabling clocks. For PHYs with the PHY_RST_AFTER_CLK_EN flag (e.g. TI DP83848 or SMSC LAN87xx), fec_enet_clk_enable() triggers a hardware reset pulse via the phy-reset GPIO. With the GPIO pin still in sleep pinctrl state at that point, the GPIO write has no physical effect and the PHY never receives the required reset after clock enable, leading to unreliable link establishment after system resume. Fix by restoring the default pinctrl state before enabling clocks, making resume the proper mirror of suspend. The call is made unconditionally: fec_suspend() only switches to the sleep pinctrl state on the non-WoL path and leaves the pins in the default state when WoL is enabled, so on a WoL resume the device is already in the default state and pinctrl_pm_select_default_state() is a no-op. Fixes: de40ed3 ("net: fec: add Wake-on-LAN support") Signed-off-by: Tapio Reijonen <tapio.reijonen@vaisala.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260529-b4-fec-resume-pinctrl-order-v3-1-6eda0f592fca@vaisala.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43c441edacf953b39517a44f5e5e10a93618b226 ]
rfcomm_get_sock_by_channel() scans rfcomm_sk_list under the list lock,
but returns the selected listener after dropping that lock without
taking a reference. rfcomm_connect_ind() then locks the listener,
queues a child socket on it, and may notify it after unlocking it.
The buggy scenario involves two paths, with each column showing the
order within that path:
rfcomm_connect_ind(): listener close:
1. Find parent in 1. close() enters
rfcomm_get_sock_by_channel() rfcomm_sock_release().
2. Drop rfcomm_sk_list.lock 2. rfcomm_sock_shutdown()
without pinning parent. closes the listener.
3. Call lock_sock(parent) and 3. rfcomm_sock_kill()
bt_accept_enqueue(parent, unlinks and puts parent.
sk, true).
4. Read parent flags and may 4. parent can be freed.
call sk_state_change().
If close wins the race, parent can be freed before
rfcomm_connect_ind() reaches lock_sock(), bt_accept_enqueue(), or the
deferred-setup callback.
Take a reference on the listener before leaving rfcomm_sk_list.lock.
After lock_sock() succeeds, recheck that it is still in BT_LISTEN
before queueing a child, cache the deferred-setup bit while the parent
is locked, and drop the reference after the last parent use.
KASAN reported a slab-use-after-free in lock_sock_nested() from
rfcomm_connect_ind(), with the freeing stack going through
rfcomm_sock_kill() and rfcomm_sock_release().
Fixes: 1da177e ("Linux-2.6.12-rc2")
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit de23fb62259aa01d294f77238ae3b835eb674413 ]
tlv_data_is_valid() reads each advertising data field length from
data[i], then inspects data[i + 1] for managed EIR types before
checking that the current field still fits inside the supplied buffer.
A malformed field whose length byte is the last byte of the buffer can
therefore make the parser read one byte past the advertising data.
KASAN reported the following when a malformed MGMT_OP_ADD_ADVERTISING
request reached that path:
BUG: KASAN: vmalloc-out-of-bounds in tlv_data_is_valid()
Read of size 1
Call trace:
tlv_data_is_valid()
add_advertising()
hci_mgmt_cmd()
hci_sock_sendmsg()
Move the existing element-length check before any type-octet inspection
so each non-empty element is proven to contain its type byte before the
parser looks at data[i + 1].
Fixes: 2bb3687 ("Bluetooth: Unify advertising instance flags check")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23882b828c3c8c51d0c946446a396b10abb3b16b ] The RFCOMM MCC handlers cast skb->data to protocol-specific structs without validating skb->len first. A malicious remote device can send truncated MCC frames and trigger out-of-bounds reads in these handlers. Fix this by using skb_pull_data() to validate and access the required data before dereferencing it. rfcomm_recv_rpn() requires special handling since ETSI TS 07.10 allows 1-byte RPN requests. Handle this by validating only the DLCI byte first, and validating the full struct only when len > 1. Fixes: 1da177e ("Linux-2.6.12-rc2") Suggested-by: Muhammad Bilal <meatuni001@gmail.com> Signed-off-by: SeungJu Cheon <suunj1331@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
…nsion handling [ Upstream commit 72b8deccff17a7644e0367e1aaf1a36cfb014324 ] In bnep_rx_frame(), the BNEP_FILTER_NET_TYPE_SET and BNEP_FILTER_MULTI_ADDR_SET extension header parsing has two bugs: 1) The 2-byte length field is read with *(u16 *)(skb->data + 1), which performs a native-endian read. The BNEP protocol specifies this field in big-endian (network byte order), and the same file correctly uses get_unaligned_be16() for the identical fields in bnep_ctrl_set_netfilter() and bnep_ctrl_set_mcfilter(). 2) The length is multiplied by 2, but unlike BNEP_SETUP_CONN_REQ where the length byte counts UUID pairs (requiring * 2 for two UUIDs per entry), the filter extension length field already represents the total data size in bytes. This is confirmed by bnep_ctrl_set_netfilter() which reads the same field as a byte count and divides by 4 to get the number of filter entries. The bogus * 2 means skb_pull advances twice as far as it should, either dropping valid data from the next header or causing the pull to fail entirely when the doubled length exceeds the remaining skb. Fix by splitting the pull into two steps: first use skb_pull_data() to safely pull and validate the 3-byte fixed header (ctrl type + length), then pull the variable-length data using the properly decoded length. Fixes: bf8b9a9 ("Bluetooth: bnep: Add support to extended headers of control frames") Signed-off-by: Dudu Lu <phx0fer@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6770d3a8acdf9151769180cc3710346c4cfbe6f0 ] A BNEP peer can send a short BNEP SDU. bnep_rx_frame() reads the packet type byte immediately and, for control packets, reads the control opcode and setup UUID-size byte before proving that those bytes are present. bnep_rx_control() also dereferences the control opcode without rejecting an empty control payload. Use skb_pull_data() for the fixed fields in bnep_rx_frame() so a NULL return gates each dereference. Split the control handler so the frame path can pass an opcode that has already been pulled, and keep the byte-buffer wrapper for extension control payloads. For BNEP_SETUP_CONN_REQ, name the UUID-size byte before pulling the setup payload. struct bnep_setup_conn_req carries destination and source service UUIDs after that byte, each uuid_size bytes, so the parser now documents that tuple explicitly instead of leaving the pull length as an opaque multiplication. Validation reproduced this kernel report: KASAN slab-out-of-bounds in bnep_rx_frame.isra.0+0x130c/0x1790 The buggy address belongs to the object at ffff88800c0f7908 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 0 bytes to the right of allocated 1-byte region [ffff88800c0f7908, ffff88800c0f7909) Read of size 1 Call trace: dump_stack_lvl+0xb3/0x140 (?:?) print_address_description+0x57/0x3a0 (?:?) bnep_rx_frame+0x130c/0x1790 (net/bluetooth/bnep/core.c:306) print_report+0xb9/0x2b0 (?:?) __virt_addr_valid+0x1ba/0x3a0 (?:?) srso_alias_return_thunk+0x5/0xfbef5 (?:?) kasan_addr_to_slab+0x21/0x60 (?:?) kasan_report+0xe0/0x110 (?:?) process_one_work+0xfce/0x17e0 (kernel/workqueue.c:3200) worker_thread+0x65c/0xe40 (?:?) __kthread_parkme+0x184/0x230 (?:?) kthread+0x35e/0x470 (?:?) _raw_spin_unlock_irq+0x28/0x50 (?:?) ret_from_fork+0x586/0x870 (?:?) __switch_to+0x74f/0xdc0 (?:?) ret_from_fork_asm+0x1a/0x30 (?:?) Fixes: 1da177e ("Linux-2.6.12-rc2") Assisted-by: Codex:gpt-5.5 Signed-off-by: Zhang Cen <rollkingzzc@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 37b3009bf5976e8ab77c8b9a9bc3bbd7ff49e37f ] Early failures in Bluetooth HCI UART configuration leak SRCU percpu memory. When device initialization fails before hci_register_dev() completes, the HCI_UNREGISTER flag is never set. As a result, when the device reference count reaches zero, bt_host_release() evaluates this flag as false and falls back to a direct kfree(hdev). Because hci_release_dev() is bypassed, the SRCU struct initialized early in hci_alloc_dev() is never cleaned up, resulting in a leak of percpu memory. Fix the leak by explicitly calling cleanup_srcu_struct() in the fallback (unregistered) branch of bt_host_release() before freeing the device. Reported-by: syzbot+535ecc844591e50588a5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=535ecc844591e50588a5 Tested-by: syzbot+535ecc844591e50588a5@syzkaller.appspotmail.com Fixes: 1d61231 ("Bluetooth: hci_core: Fix use-after-free in vhci_flush()") Signed-off-by: Bharath Reddy <kbreddy.rpbc@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5cbf290b79351971f20c7a533247e8d58a3f970c ] hci_get_route() returns a reference-counted hci_dev pointer via hci_dev_hold(). The function exits normally or with an error without ever releasing it. Fixes: 07a9342 ("Bluetooth: ISO: Send BIG Create Sync via hci_sync") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5842c01 ] Currently bc_sid is being ignore when acting as Broadcast Source role, so this fix it by passing the bc_sid and then use it when programming the PA: < HCI Command: LE Set Exte.. (0x08|0x0036) plen 25 Handle: 0x01 Properties: 0x0000 Min advertising interval: 140.000 msec (0x00e0) Max advertising interval: 140.000 msec (0x00e0) Channel map: 37, 38, 39 (0x07) Own address type: Random (0x01) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Filter policy: Allow Scan Request from Any, Allow Connect Request from Any (0x00) TX power: Host has no preference (0x7f) Primary PHY: LE 1M (0x01) Secondary max skip: 0x00 Secondary PHY: LE 2M (0x02) SID: 0x01 Scan request notifications: Disabled (0x00) Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 9ca7053d6215 ("Bluetooth: ISO: Fix data-race on iso_pi fields in hci_get_route calls") Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ca7053d6215d89c33f28893bfd1625a32919d3f ] iso_connect_bis(), iso_connect_cis(), iso_listen_bis(), and iso_conn_big_sync() call hci_get_route() using iso_pi(sk)->dst, iso_pi(sk)->src, and iso_pi(sk)->src_type without holding lock_sock(). These fields may be modified concurrently by connect() or setsockopt() on the same socket, resulting in data-races reported by KCSAN. Fix this by snapshotting the required fields under lock_sock() before calling hci_get_route(). BUG: KCSAN: data-race in memcmp+0x45/0xb0 race at unknown origin, with read to 0xffff8880122135cf of 1 bytes by task 333 on cpu 1: memcmp+0x45/0xb0 hci_get_route+0x27e/0x490 iso_connect_cis+0x4c/0xa10 iso_sock_connect+0x60e/0xb30 __sys_connect_file+0xbd/0xe0 __sys_connect+0xe0/0x110 __x64_sys_connect+0x40/0x50 x64_sys_call+0xcad/0x1c60 do_syscall_64+0x133/0x590 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 241f519 ("Bluetooth: ISO: Avoid circular locking dependency") Signed-off-by: SeungJu Cheon <suunj1331@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 149324fc762c2a7acef9c26790566f81f475e51f ] bluetoothd has a bug with makes it send extra bytes as part of MGMT_OP_ADD_EXT_ADV_DATA which are now being checked to be the exact the expected length, relax this so only when the expected length is greater than the data length to cause an error since that would result in accessing invalid memory, otherwise just ignore the extra bytes. Link: https://lore.kernel.org/linux-bluetooth/20260602204749.210857-1-luiz.dentz@gmail.com/T/#u Fixes: d3f7d17960ed ("Bluetooth: MGMT: validate Add Extended Advertising Data length") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit badad6fad60def1b9805559dd81dbab3d97b82aa ] If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be re-evaluated to ensure it is properly pinned as RW. Since the umem is hidden inside each driver's mr struct add a ib_umem_check_rereg() function that each driver has to call before processing IB_MR_REREG_ACCESS. mlx4 has to retain its duplicate ib_access_writable check because it implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items in place sequentially while the MR is live, so it will continue to not support this combination. Cc: stable@vger.kernel.org Fixes: b40656a ("RDMA/umem: remove FOLL_FORCE usage") Link: https://patch.msgid.link/r/0-v1-06fb1a2d6cf5+107-rereg_access_jgg@nvidia.com Reported-by: Philip Tsukerman <philiptsukerman@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ff46d1392750444fab5ae5a0194764ffdc4ac0d2 ] Add or correct kernel-doc comments to eliminate warnings: Warning: include/rdma/ib_umem.h:104 function parameter 'biter' not described in 'rdma_umem_for_each_dma_block' Warning: include/rdma/ib_umem.h:140 function parameter 'pgsz_bitmap' not described in 'ib_umem_find_best_pgoff' Warning: include/rdma/ib_umem.h:141 No description found for return value of 'ib_umem_find_best_pgoff' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260224003120.3173892-1-rdunlap@infradead.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Stable-dep-of: 15fe76e23615 ("RDMA/umem: Fix truncation for block sizes >= 4G") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6094ea64c69520ed1e770e7c79c43412de202bfa ] The DMA iterator logic was mixed into verbs and umem-specific code, forcing all users to include rdma/ib_umem.h. Move the block iterator logic into iter.c and rdma/iter.h so that rdma/ib_umem.h and rdma/ib_verbs.h can be separated in a follow-up patch. Link: https://patch.msgid.link/20260213-refactor-umem-v1-1-f3be85847922@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Stable-dep-of: 15fe76e23615 ("RDMA/umem: Fix truncation for block sizes >= 4G") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 15fe76e23615f502d051ef0768f86babaf08746c ] When the iommu is used the linearization of the mapping can give a single block that is very large split across multiple SG entries. When __rdma_block_iter_next() reassembles the split SG entries it is overflowing the 32 bit stack values and computed the wrong DMA addresses for blocks after the truncation. Use the right types to hold DMA addresses. Link: https://patch.msgid.link/r/1-v1-88303e9e509f+f7-ib_umem_types_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: a808273 ("RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b4aea43cd37afad714b5684fe9fdfcb0e78dba26 ] Commit 081056d ("mm/hugetlb: unshare page tables during VMA split, not before") changed the locking model around hugetlbfs PMD unsharing on VMA split, but did not update the function which asserts the locks, hugetlb_vma_assert_locked(). This function asserts that either the hugetlb VMA lock is held (if a shared mapping) or that the reservation map lock is held (if private). If you get an unfortunate race between something which results in one of these locks being released and a hugetlb VMA split and you have CONFIG_LOCKDEP enabled, you can therefore see a false positive assertion arise when there is in fact no issue. Since this change introduced a new take_locks parameter to hugetlb_unshare_pmds(), which, when set to false, indicates that locking is sufficient, simply pass this to the unsharing logic and predicate the lock assertions on this. This is safe, as we already asserted the file rmap lock and the VMA write lock prior to this (implying exclusive mmap write lock), so we cannot be raced by either rmap or page fault page table walkers which the asserted locks are intended to protect against (we don't mind GUP-fast). Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a check_locks parameter, and update hugetlb_unshare_pmds() to pass this parameter to it. This leaves all other callers of huge_pmd_unshare() still correctly asserting the locks. The below reproducer will trigger the assert in a kernel with CONFIG_LOCKDEP enabled by racing process teardown (which will release the hugetlb lock) against a hugetlb split. void execute_one(void) { void *ptr; pid_t pid; /* * Create a hugetlb mapping spanning a PUD entry. * * We force the hugetlb page allocation with populate and * noreserve. * * |---------------------| * | | * |---------------------| * 0 PUD boundary */ ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_SHARED | MAP_ANON | MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE, -1, 0); if (ptr == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); } /* * Fork but with a bogus stack pointer so we try to execute code in * a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap(). * * The clone will cause PMD page table sharing between the * processes first via: * copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share() * * Then tear down and release the hugetlb 'VMA' lock via: * exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free() */ pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0); if (pid < 0) { perror("clone"); exit(EXIT_FAILURE); } if (pid == 0) { /* Pop stack... */ return; } /* * We are the parent process. * * Race the child process's teardown with a PMD unshare. * * We do this by triggering: * * __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds() * * Which, importantly, doesn't hold the hugetlb VMA lock (nor can * it), meaning we assert in hugetlb_vma_assert_locked(). * * . * |----------.----------| * | . | * |----------.----------| * 0 . PUD boundary */ mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0); } int main(void) { int i; /* Kick off fork children. */ for (i = 0; i < NUM_FORKS; i++) { pid_t pid = fork(); if (pid < 0) { perror("fork"); exit(EXIT_FAILURE); } /* Fork children do their work and exit. */ if (!pid) { int j; for (j = 0; j < NUM_ITERS; j++) execute_one(); return EXIT_SUCCESS; } } /* If we succeeded, wait on children. */ for (i = 0; i < NUM_FORKS; i++) wait(NULL); return EXIT_SUCCESS; } [ljs@kernel.org: account for the !CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING case] Link: https://lore.kernel.org/agWZsPGYid08uU6O@lucifer Link: https://lore.kernel.org/20260513085658.45264-1-ljs@kernel.org Fixes: 081056d ("mm/hugetlb: unshare page tables during VMA split, not before") Signed-off-by: Lorenzo Stoakes <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Jann Horn <jannh@google.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Lorenzo Stoakes <ljs@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d8d28738f24b75616d6ca7a27cb4aed88520343 ] The mptcp_recvmsg() can fill MPTCP socket receive queue via mptcp_move_skbs(), but currently does not try to wakeup any listener, because the same process is going to check the receive queue soon. When multiple threads are reading from the same fd, the above can cause stall. Add the missing wakeup. Fixes: 6771bfd ("mptcp: update mptcp ack sequence from work queue") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260602-net-mptcp-misc-fixes-7-1-rc7-v2-1-856831229976@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91eb7ec7261254b6875909df767185838598e21e upstream.
A section was in {} that didn't need to be, move the variable
definition to the top and set th eindentino properly.
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8aebe93a4938c0ca1941eeaae821738f869be3d upstream. Cleanup code was checking the thread for NULL, but it was possibly a PTR_ERR() in one spot. Spotted with static analysis. Link: https://sourceforge.net/p/openipmi/mailman/message/59324676/ Fixes: 75c486cb1bca ("ipmi:ssif: Clean up kthread on errors") Cc: <stable@vger.kernel.org> # 91eb7ec72612: ipmi:ssif: Remove unnecessary indention Cc: stable@vger.kernel.org Signed-off-by: Corey Minyard <corey@minyard.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05cfe9863ef049d98141dc2969eefde72fb07625 upstream. Protocol checksum validation fails for IPv6 if there are extension headers before the protocol header. iph->len already contains its offset, so use it to fix the problem. Fixes: 2906f66 ("ipvs: SCTP Trasport Loadbalancing Support") Fixes: 0bbdd42 ("IPVS: Extend protocol DNAT/SNAT and state handlers") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Nazar Kalashnikov <nazarkalashnikov0@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 059b7dbd20a6f0c539a45ddff1573cb8946685b5 upstream. virtio_transport_inc_rx_pkt() checks vvs->rx_bytes + len > vvs->buf_alloc. virtio_transport_recv_enqueue() skips coalescing for packets with VIRTIO_VSOCK_SEQ_EOM. If fed with packets with len == 0 and VIRTIO_VSOCK_SEQ_EOM, a very large number of packets can be queued because vvs->rx_bytes stays at 0. Fix this by estimating the skb metadata size: (Number of skbs in the queue) * SKB_TRUESIZE(0) Fixes: 0777061 ("virtio/vsock: don't use skbuff state to account credit") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: "Eugenio Pérez" <eperezma@redhat.com> Cc: virtualization@lists.linux.dev Link: https://patch.msgid.link/20260430122653.554058-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c6087c5aaad6d1b8be1a1a641e0a422218ade911 upstream.
After commit 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb
queue"), virtio_transport_inc_rx_pkt() subtracts per-skb overhead from
buf_alloc when checking whether a new packet fits. This reduces the
effective receive buffer below what the user configured via
SO_VM_SOCKETS_BUFFER_SIZE, causing legitimate data packets to be
silently dropped and applications that rely on the full buffer size
to deadlock.
Also, the reduced space is not communicated to the remote peer, so
its credit calculation accounts more credit than the receiver will
actually accept, causing data loss (there is no retransmission).
With this approach we currently have failures in
tools/testing/vsock/vsock_test.c. Test 18 sometimes fails, while
test 22 always fails in this way:
18 - SOCK_STREAM MSG_ZEROCOPY...hash mismatch
22 - SOCK_STREAM virtio credit update + SO_RCVLOWAT...send failed:
Resource temporarily unavailable
Fix by allowing at most `buf_alloc * 2` as the total budget for payload
plus skb overhead in virtio_transport_inc_rx_pkt(), similar to how
SO_RCVBUF is doubled to reserve space for sk_buff metadata.
This preserves the full buf_alloc for payload under normal operation,
while still bounding the skb queue growth.
With this patch, all tests in tools/testing/vsock/vsock_test.c are
now passing again.
Fixes: 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue")
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260518090656.134588-3-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 836efd35c472d89c838d7b17ef339ddb3286ffc5 upstream.
Shin'ichiro reported hard to reproduce unaligned write errors with zoned
block devices. Under normal operation conditions (e.g. running XFS on an
SMR disk), these errors are nearly impossible to trigger. But using a
"slow" kernel with many debug options enables and some specific use
cases (e.g. fio zbd test case 46), the errors can be reproduced fairly
easily.
The unaligned write errors come from mishandling a valid reference
counting pattern of zone write plugs. Such pattern triggers for instance
if a process A writes a zone (not necessarilly to the full state),
another process B immediately resets the zone and immediately following
the completion of the zone reset, starts issuing writes to the zone.
With such pattern, in some cases, the zone write plugs worker thread of
the device may still be holding a reference to the zone write plug of
the zone taken when process A was writing to the zone. The following
zone reset from process B marks the zone as dead but does not remove the
zone write plug from the device hash table as a reference to the plug
still exist. Once process B starts issuing new writes, the zone write
plug is seen as dead and the writes from process B are immediately
failed, despite this write pattern being perfectly legal.
Fix this by allowing restoring a dead zone write plug to a live state if
a write is issued to the zone when the zone is: marked as dead, empty
and the write sector corresponds to the first sector of the zone (that
is, the write is aligned to the zone write pointer). This is done with
the new helper function disk_check_zone_wplug_dead(), which restores a
dead zone write plug to a live state by clearing the BLK_ZONE_WPLUG_DEAD
flag and restoring the initial reference to the zone write plug taken
when the plug was added to the device hash table.
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: b7d4ffb51037 ("block: fix zone write plug removal")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://patch.msgid.link/20260513111129.108809-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[ context conflict due to different line offsets in blk-zoned.c ]
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e185c8a upstream. Add cpu part and model macro definitions for NVIDIA Olympus core. Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> [Mark: backport to v6.12.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60349e64a6c65f9f0aa118af711b3c7e137f07ff upstream. Add cputype definitions for C1-Ultra. These will be used for errata detection in subsequent patches. These values can be found in the C1-Ultra TRM: https://developer.arm.com/documentation/108014/0100/ ... in section A.5.1 ("MIDR_EL1, Main ID Register"). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> [Mark: backport to v6.12.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d28413bfc5a255957241f1df5d7fd0c2cd74fe18 upstream. Add cputype definitions for C1-Premium. These will be used for errata detection in subsequent patches. These values can be found in the C1-Premium TRM: https://developer.arm.com/documentation/109416/0100/ ... in section A.5.1 ("MIDR_EL1, Main ID Register"). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> [Mark: backport to v6.12.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfd391e74134db664feb499d43af286380b10ba8 upstream. A number of CPUs developed by Arm suffer from errata whereby a broadcast TLBI;DSB sequence may complete before the global observation of writes which are translated by an affected TLB entry. These errata ONLY affect the completion of memory accesses which have been translated by an invalidated TLB entry, and these errata DO NOT affect the actual invalidation of TLB entries. TLB entries are removed correctly. This issue has been assigned CVE ID CVE-2025-10263. To mitigate this issue, Arm recommends that software follows any affected TLBI;DSB sequence with an additional TLBI;DSB, which will ensure that all memory write effects affected by the first TLBI have been globally observed. The additional TLBI can use any operation that is broadcast to affected CPUs, and the additional DSB can use any option that is sufficient to complete the additional TLBI. The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the issue. Enable this workaround for affected CPUs, and update the silicon errata documentation accordingly. Note that due to the manner in which Arm develops IP and tracks errata, some CPUs share a common erratum number. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> [Mark: backport to v6.12.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec7216f92e4ebd485b1c6dc6aa3f6064b71a5768 upstream. NVIDIA Olympus cores are affected by the TLBI completion issue tracked as CVE-2025-10263. The existing ARM64_ERRATUM_4118414 handling already uses ARM64_WORKAROUND_REPEAT_TLBI to issue an additional broadcast TLBI;DSB sequence and ensure affected memory write effects are globally observed. Add MIDR_NVIDIA_OLYMPUS to the repeat-TLBI match list so the same mitigation is enabled on affected Olympus systems. Also document the NVIDIA Olympus erratum in the arm64 silicon errata table and list it in the Kconfig help text. Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org> [Mark: backport to v6.12.y] Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1940e70a8144bf75e6df26bf6f600862ea7f7ea1 upstream. Commit fb091ff ("arm64: Subscribe Microsoft Azure Cobalt 100 to ARM Neoverse N2 errata") states that Microsoft Azure Cobalt 100 CPU "is a Microsoft implemented CPU based on r0p0 of the ARM Neoverse N2 CPU, and therefore suffers from all the same errata.". So enable the workaround for the latest broadcast TLB invalidation bug on these parts. Signed-off-by: Will Deacon <will@kernel.org> [Mark: backport to v6.12.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 54568a8 ] We have many EXPORT_SYMBOL(x) in networking tree because IPv6 can be built as a module. CONFIG_IPV6=y is becoming the norm. Define a EXPORT_IPV6_MOD(x) which only exports x for modular IPv6. Same principle applies to EXPORT_IPV6_MOD_GPL() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Link: https://patch.msgid.link/20250212132418.1524422-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 54568a8) [needed as dependency for tcp: secure_seq: add back ports to TS offset] Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6dc4c25 ] Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need to be exported unless CONFIG_IPV6=m tcp_hashinfo and tcp_openreq_init_rwin() are no longer used from any module anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Link: https://patch.msgid.link/20250212132418.1524422-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 6dc4c25) [needed as dependency for tcp: secure_seq: add back ports to TS offset] Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 165573e41f2f66ef98940cf65f838b2cb575d9d1 ] This reverts 28ee1b7 ("secure_seq: downgrade to per-host timestamp offsets") tcp_tw_recycle went away in 2017. Zhouyan Deng reported off-path TCP source port leakage via SYN cookie side-channel that can be fixed in multiple ways. One of them is to bring back TCP ports in TS offset randomization. As a bonus, we perform a single siphash() computation to provide both an ISN and a TS offset. Fixes: 28ee1b7 ("secure_seq: downgrade to per-host timestamp offsets") Reported-by: Zhouyan Deng <dengzhouyan_nwpu@163.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Acked-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20260302205527.1982836-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 165573e41f2f66ef98940cf65f838b2cb575d9d1) [kept the DCCP functions in the header, as DCCP was not retired yet in 6.12] Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14e9fea30b68fc75b2b3d97396a7e6adb544bd2a upstream. The userspace PM increments extra_subflows after __mptcp_subflow_connect() succeeds, but __mptcp_subflow_connect() calls mptcp_pm_close_subflow() on failure to roll back the pre-increment done by the kernel PM's fill_*() helpers. Because the userspace PM hasn't incremented yet at that point, this decrement is spurious and causes extra_subflows to underflow. Fix it by aligning the userspace PM with the kernel PM: increment extra_subflows before calling __mptcp_subflow_connect(), so the existing error path in subflow.c correctly rolls it back on failure. Also simplify the error handling by taking pm.lock only when needed for cleanup. Fixes: 77e4b94 ("mptcp: update userspace pm infos") Cc: stable@vger.kernel.org Signed-off-by: Tao Cui <cuitao@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260602-net-mptcp-misc-fixes-7-1-rc7-v2-5-856831229976@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…tions" This reverts commit fa36156, which is commit 3d07b69 upstream. The cited commit allows testptp to set a configurable clock_id. That is done via a PTP_SYS_OFFSET_EXTENDED ioctl call, whose argument is struct ptp_sys_offset_extended, where the clock_id is set. However, this Linux version does not support the ptp_sys_offset_extended.clockid field, and the test case cannot be built against this tree's own UAPI headers. The reverted commit was introduced to resolve a missing dependency of commit c6dc458227a3 ("testptp: Add option to open PHC in readonly mode"), which is 7686864 upstream. My suspicion is that the only conflict between the two is the getopt string, and there is otherwise no direct dependency between the two. This patch therefore reverts the cited commit, with hand-resolving the getopt string to include 'r' (as introduced by c6dc458227a3), but not 'y' (introduced by 06954f715deb). Reported-by: Yong Wang <yongwang@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4157501b9a8ff1bbe32ff5a7d8aece7ab18eff40 upstream.
On 32-bit architectures, both skb_queue_len() and SKB_TRUESIZE(0) evaluate
to 32-bit values. The multiplication can overflow before being assigned to
the u64 skb_overhead variable, making the skb overhead check ineffective.
Cast skb_queue_len() to u64 so the multiplication is always performed in
64-bit arithmetic.
This issue was reported by Sashiko while reviewing another patch.
Fixes: 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue")
Closes: https://sashiko.dev/#/patchset/20260518090656.134588-1-sgarzare%40redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20260521124732.125771-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 62443dc21114c0bbc476fa62973db89743f2f137 ] `ip6t_eui64`, `xt_mac`, the `bitmap:ip,mac`, `hash:ip,mac`, and `hash:mac` ipset types, and `nf_log_syslog` access `eth_hdr(skb)` after either assuming that the skb is associated with an Ethernet device or checking only that the `ETH_HLEN` bytes at `skb_mac_header(skb)` lie between `skb->head` and `skb->data`. Make these paths first verify that the skb is associated with an Ethernet device, that the MAC header was set, and that it spans at least a full Ethernet header before accessing `eth_hdr(skb)`. Suggested-by: Florian Westphal <fw@strlen.de> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Link: https://lore.kernel.org/r/20260616145044.869532709@linuxfoundation.org Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Tested-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Pavel Machek (CIP) <pavel@nabladev.com> Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is the 6.12.94 stable release [gratian: fix do_sect_fault() conflict with fed889e ("ARM: fix branch predictor hardening") by enabling interrupts after the harden_branch_predictor() call] Signed-off-by: Gratian Crisan <gratian.crisan@emerson.com>
Commit fed889e ("ARM: fix branch predictor hardening") converted do_translation_fault() to use do_kernel_address_page_fault() and call harden_branch_predictor() for translation faults. Calling harden_branch_predictor() assumes interrupts are disabled. This creates a conflict with commit 35aa8d4 ("ARM: enable irq in translation/section permission fault handlers") which added a conditional local_irq_enable() early in do_translation_fault(). Move local_irq_enable() after the harden_branch_predictor() call but before __do_user_fault(). This satisfies the requirement that the branch predictor should be called with interrupts disabled but retains the behavior of enabling interrupts before executing the fault handler. The change was modeled after the v6.18-rt commit 2ab18da ("ARM: Sync with latest upstream submission") which follows a similar pattern. Signed-off-by: Gratian Crisan <gratian.crisan@emerson.com>
chaitu236
approved these changes
Jun 25, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is an out-of-cycle merge for the latest available
stabletagv6.12.94in order to uptake latest kernel CVE fixes.A non-trivial conflict between commit 35aa8d4 ("ARM: enable irq in translation/section permission fault handlers") and commit fed889e ("ARM: fix branch predictor hardening") was resolved by adding commit 91411b5 ("ARM: move irq enable after harden_branch_predictor").
AB#3909929
Testing