Changelog in Linux kernel 6.1.127

 
ACPI: resource: acpi_dev_irq_override(): Check DMI match last
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Sat Dec 28 17:52:53 2024 +0100

    ACPI: resource: acpi_dev_irq_override(): Check DMI match last
    
    [ Upstream commit cd4a7b2e6a2437a5502910c08128ea3bad55a80b ]
    
    acpi_dev_irq_override() gets called approximately 30 times during boot
    (15 legacy IRQs * 2 override_table entries). Of these 30 calls, at most 1
    will match the non-DMI checks done by acpi_dev_irq_override(). The
    dmi_check_system() check is by far the most expensive check done by
    acpi_dev_irq_override(), so make this call the last check so that it
    will be called at most 1 time instead of 30 times.
    
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://patch.msgid.link/20241228165253.42584-1-hdegoede@redhat.com
    [ rjw: Subject edit ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
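
The reordering described in this commit can be illustrated with a small user-space sketch. Everything in it is a hypothetical stand-in: `struct override_entry`, `expensive_dmi_check()`, `irq_override_applies()` and `count_matches()` are illustrative names, not the kernel's code. It only shows the short-circuit idea: run the cheap comparisons first so that the costly lookup runs only for an entry that already matches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one override_table entry. */
struct override_entry {
    int irq;
    int triggering;
    int polarity;
};

/* Counts how often the costly check runs, so the effect is observable. */
static int expensive_calls;

/* Stand-in for an expensive lookup such as a DMI table scan. */
static bool expensive_dmi_check(const struct override_entry *e)
{
    (void)e;
    expensive_calls++;
    return true;
}

/* Cheap field comparisons first; && short-circuits, so the expensive
 * check runs only when all of them match. */
static bool irq_override_applies(const struct override_entry *e,
                                 int irq, int triggering, int polarity)
{
    return e->irq == irq &&
           e->triggering == triggering &&
           e->polarity == polarity &&
           expensive_dmi_check(e);
}

/* Probe every (legacy IRQ, table entry) pair, similar in shape to the
 * roughly 30 boot-time calls mentioned in the commit message. */
static int count_matches(const struct override_entry *table, int n_entries)
{
    int matches = 0;

    assert(n_entries >= 0);
    for (int irq = 1; irq <= 15; irq++)
        for (int i = 0; i < n_entries; i++)
            if (irq_override_applies(&table[i], irq, 1, 0))
                matches++;
    return matches;
}

/* Usage: 30 probes, but the expensive check runs once. */
static void demo(void)
{
    const struct override_entry table[2] = { { 1, 1, 0 }, { 12, 0, 1 } };

    expensive_calls = 0;
    assert(count_matches(table, 2) == 1);
    assert(expensive_calls == 1);
}
```

Placing the expensive predicate last keeps the result identical, because all conditions must hold anyway; only the amount of work changes.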

 
ALSA: hda/realtek: Add support for Ayaneo System using CS35L41 HDA
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date:   Thu Jan 9 16:54:48 2025 +0000

    ALSA: hda/realtek: Add support for Ayaneo System using CS35L41 HDA
    
    commit de5afaddd5a7af6b9c48900741b410ca03e453ae upstream.
    
    Add support for Ayaneo Portable Game System.
    
    The system uses 2 CS35L41 amps with HDA, using internal boost, over I2C.
    
    Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20250109165455.645810-1-sbinding@opensource.cirrus.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
block: fix uaf for flush rq while iterating tags
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Mon Nov 4 19:00:05 2024 +0800

    block: fix uaf for flush rq while iterating tags
    
    commit 3802f73bd80766d70f319658f334754164075bc3 upstream.
    
    blk_mq_clear_flush_rq_mapping() checks blk_queue_init_done() so that it
    is not run during SCSI probe. However, QUEUE_FLAG_INIT_DONE is cleared
    in del_gendisk() by commit aec89dc5d421 ("block: keep q_usage_counter in
    atomic mode after del_gendisk"). Hence, for disks such as SCSI, the
    following blk_mq_destroy_queue() will not clear the flush rq from
    tags->rqs[] either, causing the following UAF, found by our syzkaller
    on v6.6:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in blk_mq_find_and_get_req+0x16e/0x1a0 block/blk-mq-tag.c:261
    Read of size 4 at addr ffff88811c969c20 by task kworker/1:2H/224909
    
    CPU: 1 PID: 224909 Comm: kworker/1:2H Not tainted 6.6.0-ga836a5060850 #32
    Workqueue: kblockd blk_mq_timeout_work
    Call Trace:
    
    __dump_stack lib/dump_stack.c:88 [inline]
    dump_stack_lvl+0x91/0xf0 lib/dump_stack.c:106
    print_address_description.constprop.0+0x66/0x300 mm/kasan/report.c:364
    print_report+0x3e/0x70 mm/kasan/report.c:475
    kasan_report+0xb8/0xf0 mm/kasan/report.c:588
    blk_mq_find_and_get_req+0x16e/0x1a0 block/blk-mq-tag.c:261
    bt_iter block/blk-mq-tag.c:288 [inline]
    __sbitmap_for_each_set include/linux/sbitmap.h:295 [inline]
    sbitmap_for_each_set include/linux/sbitmap.h:316 [inline]
    bt_for_each+0x455/0x790 block/blk-mq-tag.c:325
    blk_mq_queue_tag_busy_iter+0x320/0x740 block/blk-mq-tag.c:534
    blk_mq_timeout_work+0x1a3/0x7b0 block/blk-mq.c:1673
    process_one_work+0x7c4/0x1450 kernel/workqueue.c:2631
    process_scheduled_works kernel/workqueue.c:2704 [inline]
    worker_thread+0x804/0xe40 kernel/workqueue.c:2785
    kthread+0x346/0x450 kernel/kthread.c:388
    ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
    ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:293
    
    Allocated by task 942:
    kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
    kasan_set_track+0x25/0x30 mm/kasan/common.c:52
    ____kasan_kmalloc mm/kasan/common.c:374 [inline]
    __kasan_kmalloc mm/kasan/common.c:383 [inline]
    __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:380
    kasan_kmalloc include/linux/kasan.h:198 [inline]
    __do_kmalloc_node mm/slab_common.c:1007 [inline]
    __kmalloc_node+0x69/0x170 mm/slab_common.c:1014
    kmalloc_node include/linux/slab.h:620 [inline]
    kzalloc_node include/linux/slab.h:732 [inline]
    blk_alloc_flush_queue+0x144/0x2f0 block/blk-flush.c:499
    blk_mq_alloc_hctx+0x601/0x940 block/blk-mq.c:3788
    blk_mq_alloc_and_init_hctx+0x27f/0x330 block/blk-mq.c:4261
    blk_mq_realloc_hw_ctxs+0x488/0x5e0 block/blk-mq.c:4294
    blk_mq_init_allocated_queue+0x188/0x860 block/blk-mq.c:4350
    blk_mq_init_queue_data block/blk-mq.c:4166 [inline]
    blk_mq_init_queue+0x8d/0x100 block/blk-mq.c:4176
    scsi_alloc_sdev+0x843/0xd50 drivers/scsi/scsi_scan.c:335
    scsi_probe_and_add_lun+0x77c/0xde0 drivers/scsi/scsi_scan.c:1189
    __scsi_scan_target+0x1fc/0x5a0 drivers/scsi/scsi_scan.c:1727
    scsi_scan_channel drivers/scsi/scsi_scan.c:1815 [inline]
    scsi_scan_channel+0x14b/0x1e0 drivers/scsi/scsi_scan.c:1791
    scsi_scan_host_selected+0x2fe/0x400 drivers/scsi/scsi_scan.c:1844
    scsi_scan+0x3a0/0x3f0 drivers/scsi/scsi_sysfs.c:151
    store_scan+0x2a/0x60 drivers/scsi/scsi_sysfs.c:191
    dev_attr_store+0x5c/0x90 drivers/base/core.c:2388
    sysfs_kf_write+0x11c/0x170 fs/sysfs/file.c:136
    kernfs_fop_write_iter+0x3fc/0x610 fs/kernfs/file.c:338
    call_write_iter include/linux/fs.h:2083 [inline]
    new_sync_write+0x1b4/0x2d0 fs/read_write.c:493
    vfs_write+0x76c/0xb00 fs/read_write.c:586
    ksys_write+0x127/0x250 fs/read_write.c:639
    do_syscall_x64 arch/x86/entry/common.c:51 [inline]
    do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
    entry_SYSCALL_64_after_hwframe+0x78/0xe2
    
    Freed by task 244687:
    kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
    kasan_set_track+0x25/0x30 mm/kasan/common.c:52
    kasan_save_free_info+0x2b/0x50 mm/kasan/generic.c:522
    ____kasan_slab_free mm/kasan/common.c:236 [inline]
    __kasan_slab_free+0x12a/0x1b0 mm/kasan/common.c:244
    kasan_slab_free include/linux/kasan.h:164 [inline]
    slab_free_hook mm/slub.c:1815 [inline]
    slab_free_freelist_hook mm/slub.c:1841 [inline]
    slab_free mm/slub.c:3807 [inline]
    __kmem_cache_free+0xe4/0x520 mm/slub.c:3820
    blk_free_flush_queue+0x40/0x60 block/blk-flush.c:520
    blk_mq_hw_sysfs_release+0x4a/0x170 block/blk-mq-sysfs.c:37
    kobject_cleanup+0x136/0x410 lib/kobject.c:689
    kobject_release lib/kobject.c:720 [inline]
    kref_put include/linux/kref.h:65 [inline]
    kobject_put+0x119/0x140 lib/kobject.c:737
    blk_mq_release+0x24f/0x3f0 block/blk-mq.c:4144
    blk_free_queue block/blk-core.c:298 [inline]
    blk_put_queue+0xe2/0x180 block/blk-core.c:314
    blkg_free_workfn+0x376/0x6e0 block/blk-cgroup.c:144
    process_one_work+0x7c4/0x1450 kernel/workqueue.c:2631
    process_scheduled_works kernel/workqueue.c:2704 [inline]
    worker_thread+0x804/0xe40 kernel/workqueue.c:2785
    kthread+0x346/0x450 kernel/kthread.c:388
    ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
    ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:293
    
    Other than blk_mq_clear_flush_rq_mapping(), the flag is only used in
    blk_register_queue() on the initialization path, hence it is safe not to
    clear the flag in del_gendisk(). And since QUEUE_FLAG_REGISTERED already
    makes sure that a queue is only registered once, there is no need
    to test the flag as well.
    
    Fixes: 6cfeadbff3f8 ("blk-mq: don't clear flush_rq from tags->rqs[]")
    Depends-on: commit aec89dc5d421 ("block: keep q_usage_counter in atomic mode after del_gendisk")
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Reviewed-by: Ming Lei <ming.lei@redhat.com>
    Link: https://lore.kernel.org/r/20241104110005.1412161-1-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: BRUNO VERNAY <bruno.vernay@se.com>
    Signed-off-by: Hugo SIMELIERE <hsimeliere.opensource@witekio.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
bpf: Fix bpf_sk_select_reuseport() memory leak
Author: Michal Luczaj <mhal@rbox.co>
Date:   Fri Jan 10 14:21:55 2025 +0100

    bpf: Fix bpf_sk_select_reuseport() memory leak
    
    [ Upstream commit b3af60928ab9129befa65e6df0310d27300942bf ]
    
    As pointed out in the original comment, lookup in sockmap can return a TCP
    ESTABLISHED socket. Such TCP socket may have had SO_ATTACH_REUSEPORT_EBPF
    set before it was ESTABLISHED. In other words, a non-NULL sk_reuseport_cb
    does not imply a non-refcounted socket.
    
    Drop sk's reference in both error paths.
    
    unreferenced object 0xffff888101911800 (size 2048):
      comm "test_progs", pid 44109, jiffies 4297131437
      hex dump (first 32 bytes):
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        80 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace (crc 9336483b):
        __kmalloc_noprof+0x3bf/0x560
        __reuseport_alloc+0x1d/0x40
        reuseport_alloc+0xca/0x150
        reuseport_attach_prog+0x87/0x140
        sk_reuseport_attach_bpf+0xc8/0x100
        sk_setsockopt+0x1181/0x1990
        do_sock_setsockopt+0x12b/0x160
        __sys_setsockopt+0x7b/0xc0
        __x64_sys_setsockopt+0x1b/0x30
        do_syscall_64+0x93/0x180
        entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    Fixes: 64d85290d79c ("bpf: Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH")
    Signed-off-by: Michal Luczaj <mhal@rbox.co>
    Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
    Link: https://patch.msgid.link/20250110-reuseport-memleak-v1-1-fa1ddab0adfe@rbox.co
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
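
The "drop the reference in both error paths" pattern can be sketched in user space as follows. This is a minimal illustration under stated assumptions: `struct fake_sock`, `lookup_get()`, `fake_sock_put()` and `select_sock()` are hypothetical names, the error codes are illustrative choices, and a plain integer stands in for the real reference count. What happens to the reference on success is outside the scope of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a socket. */
struct fake_sock {
    int refcnt;
    bool has_group;   /* stand-in for "sk_reuseport_cb is non-NULL" */
    int group_id;
};

/* A lookup that hands back a referenced object. */
static struct fake_sock *lookup_get(struct fake_sock *sk)
{
    sk->refcnt++;
    return sk;
}

static void fake_sock_put(struct fake_sock *sk)
{
    assert(sk->refcnt > 0);
    sk->refcnt--;
}

/* Returns 0 and keeps the reference on success; every error return
 * releases the reference taken by the lookup. */
static int select_sock(struct fake_sock *candidate, int wanted_group)
{
    struct fake_sock *sk = lookup_get(candidate);

    if (!sk->has_group) {
        fake_sock_put(sk);          /* first error path */
        return -ENOENT;
    }
    if (sk->group_id != wanted_group) {
        fake_sock_put(sk);          /* second error path */
        return -EPROTOTYPE;
    }
    return 0;
}

/* Usage: after a failed selection the count is back where it started. */
static void demo(void)
{
    struct fake_sock a = { 1, false, 0 };
    struct fake_sock b = { 1, true, 3 };

    assert(select_sock(&a, 7) == -ENOENT && a.refcnt == 1);
    assert(select_sock(&b, 7) == -EPROTOTYPE && b.refcnt == 1);
}
```

If either `fake_sock_put()` call were missing, the count would stay elevated after the error and the object could never be freed, which is the kind of leak the kmemleak report above shows.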

 
cachefiles: Parse the "secctx" immediately
Author: Max Kellermann <max.kellermann@ionos.com>
Date:   Fri Dec 13 13:50:05 2024 +0000

    cachefiles: Parse the "secctx" immediately
    
    [ Upstream commit e5a8b6446c0d370716f193771ccacf3260a57534 ]
    
    Instead of storing an opaque string, call security_secctx_to_secid()
    right in the "secctx" command handler and store only the numeric
    "secid".  This eliminates an unnecessary string allocation and allows
    the daemon to receive errors when writing the "secctx" command instead
    of postponing the error to the "bind" command handler.  For example,
    if the kernel was built without `CONFIG_SECURITY`, "bind" will return
    `EOPNOTSUPP`, but the daemon doesn't know why.  With this patch, the
    "secctx" will instead return `EOPNOTSUPP` which is the right context
    for this error.
    
    This patch adds a boolean flag `have_secid` because I'm not sure if we
    can safely assume that zero is the special secid value for "not set".
    This appears to be true for SELinux, Smack and AppArmor, but since
    this attribute is not documented, I'm unable to derive a stable
    guarantee for that.
    
    Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/20241209141554.638708-1-max.kellermann@ionos.com/
    Link: https://lore.kernel.org/r/20241213135013.2964079-6-dhowells@redhat.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
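
The idea of parsing at the point of the command, together with an explicit "is set" flag, can be sketched in user space. All names here (`struct cache_cfg`, `ctx_to_secid()`, `handle_secctx()`) are hypothetical, the ID mapping is a toy, and the `-EEXIST` rejection of a repeated command is an illustrative choice rather than the driver's behaviour. The toy mapping deliberately produces 0 for some input to show why a separate flag is used instead of treating 0 as "not set".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct cache_cfg {
    bool have_secid;   /* explicit flag: 0 may be a valid ID */
    uint32_t secid;
};

/* Hypothetical stand-in for a context-to-ID conversion;
 * security_enabled mimics build-time support being present. */
static int ctx_to_secid(const char *ctx, bool security_enabled,
                        uint32_t *secid)
{
    if (!security_enabled)
        return -EOPNOTSUPP;
    if (ctx == NULL || ctx[0] == '\0')
        return -EINVAL;
    *secid = (uint32_t)strlen(ctx) - 1;   /* toy mapping, can yield 0 */
    return 0;
}

/* Convert immediately so the caller sees the error at this command,
 * and store only the numeric ID. */
static int handle_secctx(struct cache_cfg *cfg, const char *ctx,
                         bool security_enabled)
{
    uint32_t secid;
    int err;

    if (cfg->have_secid)
        return -EEXIST;   /* illustrative: reject a second "secctx" */

    err = ctx_to_secid(ctx, security_enabled, &secid);
    if (err)
        return err;

    cfg->secid = secid;
    cfg->have_secid = true;
    return 0;
}

/* Usage: the unsupported case fails right away and leaves no state. */
static void demo(void)
{
    struct cache_cfg cfg = { false, 0 };

    assert(handle_secctx(&cfg, "x", false) == -EOPNOTSUPP);
    assert(!cfg.have_secid);
    assert(handle_secctx(&cfg, "x", true) == 0);
    assert(cfg.have_secid && cfg.secid == 0);
}
```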

 
drm/amd/display: Fix out-of-bounds access in 'dcn21_link_encoder_create'
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Wed Sep 25 20:04:15 2024 +0530

    drm/amd/display: Fix out-of-bounds access in 'dcn21_link_encoder_create'
    
    commit 63de35a8fcfca59ae8750d469a7eb220c7557baf upstream.
    
    An issue was identified in the dcn21_link_encoder_create function where
    an out-of-bounds access could occur when the hpd_source index was used
    to reference the link_enc_hpd_regs array. This array has a fixed size
    and the index was not being checked against the array's bounds before
    accessing it.
    
    This fix adds a conditional check to ensure that the hpd_source index is
    within the valid range of the link_enc_hpd_regs array. If the index is
    out of bounds, the function now returns NULL to prevent undefined
    behavior.
    
    References:
    
    [   65.920507] ------------[ cut here ]------------
    [   65.920510] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn21/dcn21_resource.c:1312:29
    [   65.920519] index 7 is out of range for type 'dcn10_link_enc_hpd_registers [5]'
    [   65.920523] CPU: 3 PID: 1178 Comm: modprobe Tainted: G           OE      6.8.0-cleanershaderfeatureresetasdntipmi200nv2132 #13
    [   65.920525] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS WMJ0429N_Weekly_20_04_2 04/29/2020
    [   65.920527] Call Trace:
    [   65.920529]  <TASK>
    [   65.920532]  dump_stack_lvl+0x48/0x70
    [   65.920541]  dump_stack+0x10/0x20
    [   65.920543]  __ubsan_handle_out_of_bounds+0xa2/0xe0
    [   65.920549]  dcn21_link_encoder_create+0xd9/0x140 [amdgpu]
    [   65.921009]  link_create+0x6d3/0xed0 [amdgpu]
    [   65.921355]  create_links+0x18a/0x4e0 [amdgpu]
    [   65.921679]  dc_create+0x360/0x720 [amdgpu]
    [   65.921999]  ? dmi_matches+0xa0/0x220
    [   65.922004]  amdgpu_dm_init+0x2b6/0x2c90 [amdgpu]
    [   65.922342]  ? console_unlock+0x77/0x120
    [   65.922348]  ? dev_printk_emit+0x86/0xb0
    [   65.922354]  dm_hw_init+0x15/0x40 [amdgpu]
    [   65.922686]  amdgpu_device_init+0x26a8/0x33a0 [amdgpu]
    [   65.922921]  amdgpu_driver_load_kms+0x1b/0xa0 [amdgpu]
    [   65.923087]  amdgpu_pci_probe+0x1b7/0x630 [amdgpu]
    [   65.923087]  local_pci_probe+0x4b/0xb0
    [   65.923087]  pci_device_probe+0xc8/0x280
    [   65.923087]  really_probe+0x187/0x300
    [   65.923087]  __driver_probe_device+0x85/0x130
    [   65.923087]  driver_probe_device+0x24/0x110
    [   65.923087]  __driver_attach+0xac/0x1d0
    [   65.923087]  ? __pfx___driver_attach+0x10/0x10
    [   65.923087]  bus_for_each_dev+0x7d/0xd0
    [   65.923087]  driver_attach+0x1e/0x30
    [   65.923087]  bus_add_driver+0xf2/0x200
    [   65.923087]  driver_register+0x64/0x130
    [   65.923087]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
    [   65.923087]  __pci_register_driver+0x61/0x70
    [   65.923087]  amdgpu_init+0x7d/0xff0 [amdgpu]
    [   65.923087]  do_one_initcall+0x49/0x310
    [   65.923087]  ? kmalloc_trace+0x136/0x360
    [   65.923087]  do_init_module+0x6a/0x270
    [   65.923087]  load_module+0x1fce/0x23a0
    [   65.923087]  init_module_from_file+0x9c/0xe0
    [   65.923087]  ? init_module_from_file+0x9c/0xe0
    [   65.923087]  idempotent_init_module+0x179/0x230
    [   65.923087]  __x64_sys_finit_module+0x5d/0xa0
    [   65.923087]  do_syscall_64+0x76/0x120
    [   65.923087]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    [   65.923087] RIP: 0033:0x7f2d80f1e88d
    [   65.923087] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
    [   65.923087] RSP: 002b:00007ffc7bc1aa78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    [   65.923087] RAX: ffffffffffffffda RBX: 0000564c9c1db130 RCX: 00007f2d80f1e88d
    [   65.923087] RDX: 0000000000000000 RSI: 0000564c9c1e5480 RDI: 000000000000000f
    [   65.923087] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000002
    [   65.923087] R10: 000000000000000f R11: 0000000000000246 R12: 0000564c9c1e5480
    [   65.923087] R13: 0000564c9c1db260 R14: 0000000000000000 R15: 0000564c9c1e54b0
    [   65.923087]  </TASK>
    [   65.923927] ---[ end trace ]---
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Roman Li <roman.li@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Bin Lan <lanbincn@qq.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
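
The kind of check this commit describes, validating an index against the array size before using it and returning NULL otherwise, can be sketched as below. The table contents, `struct hpd_regs`, `hpd_table` and the function names are hypothetical; only the array length of 5 and the out-of-range index 7 are taken from the UBSAN report above.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical register descriptor; the real structure differs. */
struct hpd_regs {
    unsigned int base;
};

/* Five entries, matching the "[5]" in the UBSAN message. */
static const struct hpd_regs hpd_table[5] = {
    { 0x10 }, { 0x20 }, { 0x30 }, { 0x40 }, { 0x50 }
};

/* Returns NULL instead of indexing past the end of the table. */
static const struct hpd_regs *hpd_regs_for(unsigned int hpd_source)
{
    if (hpd_source >= ARRAY_LEN(hpd_table))
        return NULL;
    return &hpd_table[hpd_source];
}

/* Usage: index 7 from the report is rejected, index 4 is the last valid one. */
static void demo(void)
{
    assert(hpd_regs_for(7) == NULL);
    assert(hpd_regs_for(4) != NULL);
    assert(hpd_regs_for(4)->base == 0x50);
}
```

Using an unsigned index means a single `>=` comparison is enough; callers must still handle the NULL return.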

 
drm/amdgpu: fix usage slab after free
Author: Vitaly Prosyak <vitaly.prosyak@amd.com>
Date:   Mon Nov 11 17:24:08 2024 -0500

    drm/amdgpu: fix usage slab after free
    
    commit b61badd20b443eabe132314669bb51a263982e5c upstream.
    
    [  +0.000021] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000027] Read of size 8 at addr ffff8881b8605f88 by task amd_pci_unplug/2147
    
    [  +0.000023] CPU: 6 PID: 2147 Comm: amd_pci_unplug Not tainted 6.10.0+ #1
    [  +0.000016] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
    [  +0.000016] Call Trace:
    [  +0.000008]  <TASK>
    [  +0.000009]  dump_stack_lvl+0x76/0xa0
    [  +0.000017]  print_report+0xce/0x5f0
    [  +0.000017]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000019]  ? srso_return_thunk+0x5/0x5f
    [  +0.000015]  ? kasan_complete_mode_report_info+0x72/0x200
    [  +0.000016]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000019]  kasan_report+0xbe/0x110
    [  +0.000015]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000023]  __asan_report_load8_noabort+0x14/0x30
    [  +0.000014]  drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000020]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? __kasan_check_write+0x14/0x30
    [  +0.000016]  ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched]
    [  +0.000020]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? __kasan_check_write+0x14/0x30
    [  +0.000013]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? enable_work+0x124/0x220
    [  +0.000015]  ? __pfx_enable_work+0x10/0x10
    [  +0.000013]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? free_large_kmalloc+0x85/0xf0
    [  +0.000016]  drm_sched_entity_destroy+0x18/0x30 [gpu_sched]
    [  +0.000020]  amdgpu_vce_sw_fini+0x55/0x170 [amdgpu]
    [  +0.000735]  ? __kasan_check_read+0x11/0x20
    [  +0.000016]  vce_v4_0_sw_fini+0x80/0x110 [amdgpu]
    [  +0.000726]  amdgpu_device_fini_sw+0x331/0xfc0 [amdgpu]
    [  +0.000679]  ? mutex_unlock+0x80/0xe0
    [  +0.000017]  ? __pfx_amdgpu_device_fini_sw+0x10/0x10 [amdgpu]
    [  +0.000662]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? __kasan_check_write+0x14/0x30
    [  +0.000013]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? mutex_unlock+0x80/0xe0
    [  +0.000016]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
    [  +0.000663]  drm_minor_release+0xc9/0x140 [drm]
    [  +0.000081]  drm_release+0x1fd/0x390 [drm]
    [  +0.000082]  __fput+0x36c/0xad0
    [  +0.000018]  __fput_sync+0x3c/0x50
    [  +0.000014]  __x64_sys_close+0x7d/0xe0
    [  +0.000014]  x64_sys_call+0x1bc6/0x2680
    [  +0.000014]  do_syscall_64+0x70/0x130
    [  +0.000014]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? irqentry_exit_to_user_mode+0x60/0x190
    [  +0.000015]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? irqentry_exit+0x43/0x50
    [  +0.000012]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? exc_page_fault+0x7c/0x110
    [  +0.000015]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [  +0.000014] RIP: 0033:0x7ffff7b14f67
    [  +0.000013] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
    [  +0.000026] RSP: 002b:00007fffffffe378 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
    [  +0.000019] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
    [  +0.000014] RDX: 0000000000000000 RSI: 00007ffff7f6f47a RDI: 0000000000000003
    [  +0.000014] RBP: 00007fffffffe3a0 R08: 0000555555569890 R09: 0000000000000000
    [  +0.000014] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
    [  +0.000013] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
    [  +0.000020]  </TASK>
    
    [  +0.000016] Allocated by task 383 on cpu 7 at 26.880319s:
    [  +0.000014]  kasan_save_stack+0x28/0x60
    [  +0.000008]  kasan_save_track+0x18/0x70
    [  +0.000007]  kasan_save_alloc_info+0x38/0x60
    [  +0.000007]  __kasan_kmalloc+0xc1/0xd0
    [  +0.000007]  kmalloc_trace_noprof+0x180/0x380
    [  +0.000007]  drm_sched_init+0x411/0xec0 [gpu_sched]
    [  +0.000012]  amdgpu_device_init+0x695f/0xa610 [amdgpu]
    [  +0.000658]  amdgpu_driver_load_kms+0x1a/0x120 [amdgpu]
    [  +0.000662]  amdgpu_pci_probe+0x361/0xf30 [amdgpu]
    [  +0.000651]  local_pci_probe+0xe7/0x1b0
    [  +0.000009]  pci_device_probe+0x248/0x890
    [  +0.000008]  really_probe+0x1fd/0x950
    [  +0.000008]  __driver_probe_device+0x307/0x410
    [  +0.000007]  driver_probe_device+0x4e/0x150
    [  +0.000007]  __driver_attach+0x223/0x510
    [  +0.000006]  bus_for_each_dev+0x102/0x1a0
    [  +0.000007]  driver_attach+0x3d/0x60
    [  +0.000006]  bus_add_driver+0x2ac/0x5f0
    [  +0.000006]  driver_register+0x13d/0x490
    [  +0.000008]  __pci_register_driver+0x1ee/0x2b0
    [  +0.000007]  llc_sap_close+0xb0/0x160 [llc]
    [  +0.000009]  do_one_initcall+0x9c/0x3e0
    [  +0.000008]  do_init_module+0x241/0x760
    [  +0.000008]  load_module+0x51ac/0x6c30
    [  +0.000006]  __do_sys_init_module+0x234/0x270
    [  +0.000007]  __x64_sys_init_module+0x73/0xc0
    [  +0.000006]  x64_sys_call+0xe3/0x2680
    [  +0.000006]  do_syscall_64+0x70/0x130
    [  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    [  +0.000015] Freed by task 2147 on cpu 6 at 160.507651s:
    [  +0.000013]  kasan_save_stack+0x28/0x60
    [  +0.000007]  kasan_save_track+0x18/0x70
    [  +0.000007]  kasan_save_free_info+0x3b/0x60
    [  +0.000007]  poison_slab_object+0x115/0x1c0
    [  +0.000007]  __kasan_slab_free+0x34/0x60
    [  +0.000007]  kfree+0xfa/0x2f0
    [  +0.000007]  drm_sched_fini+0x19d/0x410 [gpu_sched]
    [  +0.000012]  amdgpu_fence_driver_sw_fini+0xc4/0x2f0 [amdgpu]
    [  +0.000662]  amdgpu_device_fini_sw+0x77/0xfc0 [amdgpu]
    [  +0.000653]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
    [  +0.000655]  drm_minor_release+0xc9/0x140 [drm]
    [  +0.000071]  drm_release+0x1fd/0x390 [drm]
    [  +0.000071]  __fput+0x36c/0xad0
    [  +0.000008]  __fput_sync+0x3c/0x50
    [  +0.000007]  __x64_sys_close+0x7d/0xe0
    [  +0.000007]  x64_sys_call+0x1bc6/0x2680
    [  +0.000007]  do_syscall_64+0x70/0x130
    [  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    [  +0.000014] The buggy address belongs to the object at ffff8881b8605f80
                   which belongs to the cache kmalloc-64 of size 64
    [  +0.000020] The buggy address is located 8 bytes inside of
                   freed 64-byte region [ffff8881b8605f80, ffff8881b8605fc0)
    
    [  +0.000028] The buggy address belongs to the physical page:
    [  +0.000011] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b8605
    [  +0.000008] anon flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
    [  +0.000007] page_type: 0xffffefff(slab)
    [  +0.000009] raw: 0017ffffc0000000 ffff8881000428c0 0000000000000000 dead000000000001
    [  +0.000006] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000
    [  +0.000006] page dumped because: kasan: bad access detected
    
    [  +0.000012] Memory state around the buggy address:
    [  +0.000011]  ffff8881b8605e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
    [  +0.000015]  ffff8881b8605f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
    [  +0.000015] >ffff8881b8605f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
    [  +0.000013]                       ^
    [  +0.000011]  ffff8881b8606000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
    [  +0.000014]  ffff8881b8606080: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb
    [  +0.000013] ==================================================================
    
    The issue was reproduced on VG20 during the IGT pci_unplug test.
    The root cause of the issue is that the function drm_sched_fini is called before drm_sched_entity_kill.
    In drm_sched_fini, the drm_sched_rq structure is freed, but this structure is later accessed by
    each entity within the run queue, leading to invalid memory access.
    To resolve this, the order of cleanup calls is updated:
    
        Before:
            amdgpu_fence_driver_sw_fini
            amdgpu_device_ip_fini
    
        After:
            amdgpu_device_ip_fini
            amdgpu_fence_driver_sw_fini
    
    This updated order ensures that all entities in the IPs are cleaned up first, followed by proper
    cleanup of the schedulers.
    
    Additional Investigation:
    
    During debugging, another issue was identified in the amdgpu_vce_sw_fini function. The vce.vcpu_bo
    buffer must be freed only as the final step in the cleanup process to prevent any premature
    access during earlier cleanup stages.
    
    v2: Following Christian's suggestion, call drm_sched_entity_destroy before drm_sched_fini.
    
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Alva Lan <alvalan9@foxmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/i915/fb: Relax clear color alignment to 64 bytes
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Fri Nov 29 08:50:11 2024 +0200

    drm/i915/fb: Relax clear color alignment to 64 bytes
    
    commit 1a5401ec3018c101c456cdbda2eaef9482db6786 upstream.
    
    Mesa changed its clear color alignment from 4k to 64 bytes
    without informing the kernel side about the change. This
    is now likely to cause framebuffer creation to fail.
    
    The only thing we do with the clear color buffer in i915 is:
    1. map a single page
    2. read out bytes 16-23 from said page
    3. unmap the page
    
    So the only requirement we really have is that those 8 bytes
    are all contained within one page. Thus we can deal with the
    Mesa regression by reducing the alignment requirement from 4k
    to the same 64 bytes in the kernel. We could even go as low as
    32 bytes, but IIRC 64 bytes is the hardware requirement on
    the 3D engine side so matching that seems sensible.
    
    Note that the Mesa alignment changes were partially undone
    so the regression itself was already fixed on the userspace
    side.
    
    Cc: stable@vger.kernel.org
    Cc: Sagar Ghuge <sagar.ghuge@intel.com>
    Cc: Nanley Chery <nanley.g.chery@intel.com>
    Reported-by: Xi Ruoyao <xry111@xry111.site>
    Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13057
    Closes: https://lore.kernel.org/all/45a5bba8de009347262d86a4acb27169d9ae0d9f.camel@xry111.site/
    Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/17f97a69c13832a6c1b0b3aad45b06f07d4b852f
    Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/888f63cf1baf34bc95e847a30a041dc7798edddb
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20241129065014.8363-2-ville.syrjala@linux.intel.com
    Tested-by: Xi Ruoyao <xry111@xry111.site>
    Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
    (cherry picked from commit ed3a892e5e3d6b3f6eeb76db7c92a968aeb52f3d)
    Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
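
The alignment argument in this commit (bytes 16-23 must sit within one page) can be checked with a short user-space sketch. The helper names and the 4096-byte page size are assumptions made for illustration; this is not the i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */
#define CC_FIRST_BYTE   16u     /* bytes 16-23 are read, per the commit */
#define CC_LAST_BYTE    23u

/* True when bytes 16..23 relative to offset fall inside a single page. */
static bool clear_color_in_one_page(uint64_t offset)
{
    uint64_t first = offset + CC_FIRST_BYTE;
    uint64_t last = offset + CC_LAST_BYTE;

    return first / PAGE_SIZE_BYTES == last / PAGE_SIZE_BYTES;
}

/* Checks every offset with the given alignment across a few pages. */
static bool alignment_is_safe(uint32_t align)
{
    assert(align > 0);
    for (uint64_t off = 0; off < 4 * PAGE_SIZE_BYTES; off += align)
        if (!clear_color_in_one_page(off))
            return false;
    return true;
}

/* Usage: 4k, 64 and 32 byte alignment all keep the 8 bytes in one page. */
static void demo(void)
{
    assert(alignment_is_safe(4096));
    assert(alignment_is_safe(64));
    assert(alignment_is_safe(32));
    /* With 4-byte alignment, offset 4076 puts bytes 4092..4099
     * across a page boundary. */
    assert(!alignment_is_safe(4));
}
```

This agrees with the commit's remark that 32 bytes would also work, while 64 bytes is chosen to match the stated hardware requirement.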

 
drm/v3d: Ensure job pointer is set to NULL after job completion
Author: Maíra Canal <mcanal@igalia.com>
Date:   Mon Jan 13 12:47:40 2025 -0300

    drm/v3d: Ensure job pointer is set to NULL after job completion
    
    [ Upstream commit e4b5ccd392b92300a2b341705cc4805681094e49 ]
    
    After a job completes, the corresponding pointer in the device must
    be set to NULL. Failing to do so triggers a warning when unloading
    the driver, as it appears the job is still active. To prevent this,
    set the job pointer to NULL after completing the job, indicating
    the job has finished.
    
    Fixes: 14d1d1908696 ("drm/v3d: Remove the bad signaled() implementation.")
    Signed-off-by: Maíra Canal <mcanal@igalia.com>
    Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20250113154741.67520-1-mcanal@igalia.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
erofs: handle NONHEAD !delta[1] lclusters gracefully
Author: Gao Xiang <xiang@kernel.org>
Date:   Sat Nov 16 01:36:51 2024 +0800

    erofs: handle NONHEAD !delta[1] lclusters gracefully
    
    commit 0bc8061ffc733a0a246b8689b2d32a3e9204f43c upstream.
    
    syzbot reported a WARNING in iomap_iter_done:
     iomap_fiemap+0x73b/0x9b0 fs/iomap/fiemap.c:80
     ioctl_fiemap fs/ioctl.c:220 [inline]
    
    Generally, NONHEAD lclusters won't have delta[1]==0, except for crafted
    images and filesystems created by pre-1.0 mkfs versions.
    
    Previously, it would immediately bail out if delta[1]==0, which led to
    inadequate decompressed lengths (thus FIEMAP is impacted).  Treat it as
    delta[1]=1 to work around these legacy mkfs versions.
    
    `lclusterbits > 14` is illegal for compact indexes, so error out too.
    
    Reported-by: syzbot+6c0b301317aa0156f9eb@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/r/67373c0c.050a0220.2a2fcc.0079.GAE@google.com
    Tested-by: syzbot+6c0b301317aa0156f9eb@syzkaller.appspotmail.com
    Fixes: d95ae5e25326 ("erofs: add support for the full decompressed length")
    Fixes: 001b8ccd0650 ("erofs: fix compact 4B support for 16k block size")
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Link: https://lore.kernel.org/r/20241115173651.3339514-1-hsiangkao@linux.alibaba.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
erofs: tidy up EROFS on-disk naming
Author: Gao Xiang <xiang@kernel.org>
Date:   Fri Mar 31 14:31:49 2023 +0800

    erofs: tidy up EROFS on-disk naming
    
    commit 1c7f49a76773bcf95d3c840cff6cd449114ddf56 upstream.
    
     - Get rid of all "vle" (variable-length extents) expressions
       since they only expand overall name lengths unnecessarily;
     - Rename COMPRESSION_LEGACY to COMPRESSED_FULL;
     - Move on-disk directory definitions ahead of compression;
     - Drop unused extended attribute definitions;
     - Move inode ondisk union `i_u` out as `union erofs_inode_i_u`.
    
    No actual logical change.
    
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Yue Hu <huyue2@coolpad.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20230331063149.25611-1-hsiangkao@linux.alibaba.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
filemap: avoid truncating 64-bit offset to 32 bits
Author: Marco Nelissen <marco.nelissen@gmail.com>
Date:   Thu Jan 2 11:04:11 2025 -0800

    filemap: avoid truncating 64-bit offset to 32 bits
    
    commit f505e6c91e7a22d10316665a86d79f84d9f0ba76 upstream.
    
    On 32-bit kernels, folio_seek_hole_data() was inadvertently truncating a
    64-bit value to 32 bits, leading to a possible infinite loop when writing
    to an xfs filesystem.
    
    Link: https://lkml.kernel.org/r/20250102190540.1356838-1-marco.nelissen@gmail.com
    Fixes: 54fa39ac2e00 ("iomap: use mapping_seek_hole_data")
    Signed-off-by: Marco Nelissen <marco.nelissen@gmail.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
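
As an illustration of the class of truncation the filemap commit above describes, the following userspace C sketch shows how a mask built from a 32-bit block size can clear the upper half of a 64-bit offset. The function names, and the use of uint32_t to stand in for a 32-bit size type, are assumptions made for this example; it is not the kernel code and does not claim to reproduce the actual patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: advance `start` to the next block boundary.
 * ~(bsz - 1) is evaluated in 32 bits and then zero-extended, so the
 * AND clears bits 32..63 of the 64-bit offset.
 */
static uint64_t demo_next_block_truncating(uint64_t start, uint32_t bsz)
{
	return (start + bsz) & ~(bsz - 1);
}

/* Same computation with the block size widened before the mask is built. */
static uint64_t demo_next_block_widened(uint64_t start, uint32_t bsz)
{
	return (start + bsz) & ~((uint64_t)bsz - 1);
}
```

With start = 0x100000000 and a 4096-byte block, the first variant yields 0x1000, so the position moves backwards instead of forwards. A loop that expects the position to keep increasing may then never terminate.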

 
fs/proc: fix softlockup in __read_vmcore (part 2) [+ + +]
Author: Rik van Riel <riel@surriel.com>
Date:   Fri Jan 10 10:28:21 2025 -0500

    fs/proc: fix softlockup in __read_vmcore (part 2)
    
    commit cbc5dde0a461240046e8a41c43d7c3b76d5db952 upstream.
    
    Since commit 5cbcb62dddf5 ("fs/proc: fix softlockup in __read_vmcore") the
    number of softlockups in __read_vmcore at kdump time has gone down, but
    they still happen sometimes.
    
    In a memory constrained environment like the kdump image, a softlockup is
    not just a harmless message, but it can interfere with things like RCU
    freeing memory, causing the crashdump to get stuck.
    
    The second loop in __read_vmcore has a lot more opportunities for natural
    sleep points, like scheduling out while waiting for a data write to
    happen, but apparently that is not always enough.
    
    Add a cond_resched() to the second loop in __read_vmcore to (hopefully)
    get rid of the softlockups.
    
    Link: https://lkml.kernel.org/r/20250110102821.2a37581b@fangorn
    Fixes: 5cbcb62dddf5 ("fs/proc: fix softlockup in __read_vmcore")
    Signed-off-by: Rik van Riel <riel@surriel.com>
    Reported-by: Breno Leitao <leitao@debian.org>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
fs: fix missing declaration of init_files [+ + +]
Author: Zhang Kunbo <zhangkunbo@huawei.com>
Date:   Tue Dec 17 07:18:36 2024 +0000

    fs: fix missing declaration of init_files
    
    [ Upstream commit 2b2fc0be98a828cf33a88a28e9745e8599fb05cf ]
    
    fs/file.c should include include/linux/init_task.h for the
    declaration of init_files. This fixes the sparse warning:
    
    fs/file.c:501:21: warning: symbol 'init_files' was not declared. Should it be static?
    
    Signed-off-by: Zhang Kunbo <zhangkunbo@huawei.com>
    Link: https://lore.kernel.org/r/20241217071836.2634868-1-zhangkunbo@huawei.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
gpiolib: cdev: Fix use after free in lineinfo_changed_notify [+ + +]
Author: Zhongqiu Han <quic_zhonhan@quicinc.com>
Date:   Sun May 5 22:11:56 2024 +0800

    gpiolib: cdev: Fix use after free in lineinfo_changed_notify
    
    commit 02f6b0e1ec7e0e7d059dddc893645816552039da upstream.
    
    The use-after-free issue occurs as follows: when the GPIO chip device file
    is being closed by invoking gpio_chrdev_release(), watched_lines is freed
    by bitmap_free(), but the unregistration of lineinfo_changed_nb notifier
    chain is blocked waiting for the write rwsem. Additionally, one of the GPIO
    chip's lines is also in the release process and holds the notifier chain's
    read rwsem. Consequently, a race condition leads to the use-after-free of
    watched_lines.
    
    Here is the typical stack when the issue happened:
    
    [free]
    gpio_chrdev_release()
      --> bitmap_free(cdev->watched_lines)                  <-- freed
      --> blocking_notifier_chain_unregister()
        --> down_write(&nh->rwsem)                          <-- waiting rwsem
              --> __down_write_common()
                --> rwsem_down_write_slowpath()
                      --> schedule_preempt_disabled()
                        --> schedule()
    
    [use]
    st54spi_gpio_dev_release()
      --> gpio_free()
        --> gpiod_free()
          --> gpiod_free_commit()
            --> gpiod_line_state_notify()
              --> blocking_notifier_call_chain()
                --> down_read(&nh->rwsem);                  <-- held rwsem
                --> notifier_call_chain()
                  --> lineinfo_changed_notify()
                    --> test_bit(xxxx, cdev->watched_lines) <-- use after free
    
    The side effect of the use-after-free issue is that a GPIO line event is
    being generated for userspace where it shouldn't. However, since the chrdev
    is being closed, userspace won't have the chance to read that event anyway.
    
    To fix the issue, call the bitmap_free() function after the unregistration
    of lineinfo_changed_nb notifier chain.
    
    Fixes: 51c1064e82e7 ("gpiolib: add new ioctl() for monitoring changes in line info")
    Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
    Link: https://lore.kernel.org/r/20240505141156.2944912-1-quic_zhonhan@quicinc.com
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Bruno VERNAY <bruno.vernay@se.com>
    Signed-off-by: Hugo SIMELIERE <hsimeliere.opensource@witekio.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
gtp: Destroy device along with udp socket's netns dismantle. [+ + +]
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Jan 10 10:47:53 2025 +0900

    gtp: Destroy device along with udp socket's netns dismantle.
    
    [ Upstream commit eb28fd76c0a08a47b470677c6cef9dd1c60e92d1 ]
    
    gtp_newlink() links the device to a list in dev_net(dev) instead of
    src_net, where a udp tunnel socket is created.
    
    Even when src_net is removed, the device stays alive on dev_net(dev).
    Then, removing src_net triggers the splat below. [0]
    
    In this example, gtp0 is created in ns2, and the udp socket is created
    in ns1.
    
      ip netns add ns1
      ip netns add ns2
      ip -n ns1 link add netns ns2 name gtp0 type gtp role sgsn
      ip netns del ns1
    
    Let's link the device to the socket's netns instead.
    
    Now, gtp_net_exit_batch_rtnl() needs another netdev iteration to remove
    all gtp devices in the netns.
    
    [0]:
    ref_tracker: net notrefcnt@000000003d6e7d05 has 1/2 users at
         sk_alloc (./include/net/net_namespace.h:345 net/core/sock.c:2236)
         inet_create (net/ipv4/af_inet.c:326 net/ipv4/af_inet.c:252)
         __sock_create (net/socket.c:1558)
         udp_sock_create4 (net/ipv4/udp_tunnel_core.c:18)
         gtp_create_sock (./include/net/udp_tunnel.h:59 drivers/net/gtp.c:1423)
         gtp_create_sockets (drivers/net/gtp.c:1447)
         gtp_newlink (drivers/net/gtp.c:1507)
         rtnl_newlink (net/core/rtnetlink.c:3786 net/core/rtnetlink.c:3897 net/core/rtnetlink.c:4012)
         rtnetlink_rcv_msg (net/core/rtnetlink.c:6922)
         netlink_rcv_skb (net/netlink/af_netlink.c:2542)
         netlink_unicast (net/netlink/af_netlink.c:1321 net/netlink/af_netlink.c:1347)
         netlink_sendmsg (net/netlink/af_netlink.c:1891)
         ____sys_sendmsg (net/socket.c:711 net/socket.c:726 net/socket.c:2583)
         ___sys_sendmsg (net/socket.c:2639)
         __sys_sendmsg (net/socket.c:2669)
         do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
    
    WARNING: CPU: 1 PID: 60 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179)
    Modules linked in:
    CPU: 1 UID: 0 PID: 60 Comm: kworker/u16:2 Not tainted 6.13.0-rc5-00147-g4c1224501e9d #5
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    Workqueue: netns cleanup_net
    RIP: 0010:ref_tracker_dir_exit (lib/ref_tracker.c:179)
    Code: 00 00 00 fc ff df 4d 8b 26 49 bd 00 01 00 00 00 00 ad de 4c 39 f5 0f 85 df 00 00 00 48 8b 74 24 08 48 89 df e8 a5 cc 12 02 90 <0f> 0b 90 48 8d 6b 44 be 04 00 00 00 48 89 ef e8 80 de 67 ff 48 89
    RSP: 0018:ff11000009a07b60 EFLAGS: 00010286
    RAX: 0000000000002bd3 RBX: ff1100000f4e1aa0 RCX: 1ffffffff0e40ac6
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8423ee3c
    RBP: ff1100000f4e1af0 R08: 0000000000000001 R09: fffffbfff0e395ae
    R10: 0000000000000001 R11: 0000000000036001 R12: ff1100000f4e1af0
    R13: dead000000000100 R14: ff1100000f4e1af0 R15: dffffc0000000000
    FS:  0000000000000000(0000) GS:ff1100006ce80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f9b2464bd98 CR3: 0000000005286005 CR4: 0000000000771ef0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     <TASK>
     ? __warn (kernel/panic.c:748)
     ? ref_tracker_dir_exit (lib/ref_tracker.c:179)
     ? report_bug (lib/bug.c:201 lib/bug.c:219)
     ? handle_bug (arch/x86/kernel/traps.c:285)
     ? exc_invalid_op (arch/x86/kernel/traps.c:309 (discriminator 1))
     ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
     ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
     ? ref_tracker_dir_exit (lib/ref_tracker.c:179)
     ? __pfx_ref_tracker_dir_exit (lib/ref_tracker.c:158)
     ? kfree (mm/slub.c:4613 mm/slub.c:4761)
     net_free (net/core/net_namespace.c:476 net/core/net_namespace.c:467)
     cleanup_net (net/core/net_namespace.c:664 (discriminator 3))
     process_one_work (kernel/workqueue.c:3229)
     worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391)
     kthread (kernel/kthread.c:389)
     ret_from_fork (arch/x86/kernel/process.c:147)
     ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
     </TASK>
    
    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Reported-by: Xiao Liang <shaw.leon@gmail.com>
    Closes: https://lore.kernel.org/netdev/20250104125732.17335-1-shaw.leon@gmail.com/
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gtp: use exit_batch_rtnl() method [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Feb 6 14:43:03 2024 +0000

    gtp: use exit_batch_rtnl() method
    
    [ Upstream commit 6eedda01b2bfdcf427b37759e053dc27232f3af1 ]
    
    exit_batch_rtnl() is called while RTNL is held,
    and devices to be unregistered can be queued in the dev_kill_list.
    
    This saves one rtnl_lock()/rtnl_unlock() pair per netns
    and one unregister_netdevice_many() call per netns.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Antoine Tenart <atenart@kernel.org>
    Link: https://lore.kernel.org/r/20240206144313.2050392-8-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 46841c7053e6 ("gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp(). [+ + +]
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Jan 10 10:47:52 2025 +0900

    gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().
    
    [ Upstream commit 46841c7053e6d25fb33e0534ef023833bf03e382 ]
    
    gtp_newlink() links the gtp device to a list in dev_net(dev).
    
    However, even after the gtp device is moved to another netns,
    it stays on the list but should be invisible.
    
    Let's use for_each_netdev_rcu() for netdev traversal in
    gtp_genl_dump_pdp().
    
    Note that gtp_dev_list is no longer used under RCU, so list
    helpers are converted to the non-RCU variant.
    
    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Reported-by: Xiao Liang <shaw.leon@gmail.com>
    Closes: https://lore.kernel.org/netdev/CABAhCOQdBL6h9M2C+kd+bGivRJ9Q72JUxW+-gur0nub_=PmFPA@mail.gmail.com/
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
hfs: Sanity check the root record [+ + +]
Author: Leo Stone <leocstone@gmail.com>
Date:   Sat Nov 30 21:14:19 2024 -0800

    hfs: Sanity check the root record
    
    [ Upstream commit b905bafdea21a75d75a96855edd9e0b6051eee30 ]
    
    In the syzbot reproducer, the hfs_cat_rec for the root dir has type
    HFS_CDR_FIL after being read with hfs_bnode_read() in hfs_super_fill().
    This indicates it should be used as an hfs_cat_file, which is 102 bytes.
    Only the first 70 bytes of that struct are initialized, however,
    because the entrylength passed into hfs_bnode_read() is still the length of
    a directory record. This causes uninitialized values to be used later on,
    when the hfs_cat_rec union is treated as the larger hfs_cat_file struct.
    
    Add a check to make sure the retrieved record has the correct type
    for the root directory (HFS_CDR_DIR), and make sure we load the correct
    number of bytes for a directory record.
    
    Reported-by: syzbot+2db3c7526ba68f4ea776@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=2db3c7526ba68f4ea776
    Tested-by: syzbot+2db3c7526ba68f4ea776@syzkaller.appspotmail.com
    Tested-by: Leo Stone <leocstone@gmail.com>
    Signed-off-by: Leo Stone <leocstone@gmail.com>
    Link: https://lore.kernel.org/r/20241201051420.77858-1-leocstone@gmail.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
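
The two checks described in the hfs commit above (record length and record type) can be sketched in userspace C as follows. The struct layouts, type codes and function name are illustrative assumptions; only the 70-byte and 102-byte record sizes are taken from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_CDR_DIR = 1, DEMO_CDR_FIL = 2 };	/* illustrative type codes */

struct demo_cat_dir  { int8_t type; uint8_t body[69]; };	/* 70 bytes */
struct demo_cat_file { int8_t type; uint8_t body[101]; };	/* 102 bytes */

union demo_cat_rec {
	int8_t type;
	struct demo_cat_dir dir;
	struct demo_cat_file file;
};

/* Returns 0 when the record can be used as the root directory, -1 otherwise. */
static int demo_check_root_rec(const union demo_cat_rec *rec, size_t entrylength)
{
	if (entrylength != sizeof(rec->dir))
		return -1;	/* not the size of a directory record */
	if (rec->type != DEMO_CDR_DIR)
		return -1;	/* would be read as the larger file record */
	return 0;
}
```

Rejecting the record before it is interpreted keeps later code from reading the 32 bytes of the file-sized member that were never filled in.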

 
hrtimers: Handle CPU state correctly on hotplug [+ + +]
Author: Koichiro Den <koichiro.den@canonical.com>
Date:   Fri Dec 20 22:44:21 2024 +0900

    hrtimers: Handle CPU state correctly on hotplug
    
    commit 2f8dea1692eef2b7ba6a256246ed82c365fdc686 upstream.
    
    Consider a scenario where a CPU transitions from CPUHP_ONLINE to halfway
    through a CPU hotunplug down to CPUHP_HRTIMERS_PREPARE, and then back to
    CPUHP_ONLINE:
    
    Since hrtimers_prepare_cpu() does not run, cpu_base.hres_active remains set
    to 1 throughout. However, during a CPU unplug operation, the tick and the
    clockevents are shut down at CPUHP_AP_TICK_DYING. On return to the online
    state, for instance CFS incorrectly assumes that the hrtick is already
    active, and the chance of the clockevent device to transition to oneshot
    mode is also lost forever for the CPU, unless it goes back to a lower state
    than CPUHP_HRTIMERS_PREPARE once.
    
    This round-trip reveals another issue; cpu_base.online is not set to 1
    after the transition, which appears as a WARN_ON_ONCE in enqueue_hrtimer().
    
    Aside from that, the bulk of the per CPU state is not reset either, which
    means there are dangling pointers in the worst case.
    
    Address this by adding a corresponding startup() callback, which resets the
    stale per CPU state and sets the online flag.
    
    [ tglx: Make the new callback unconditionally available, remove the online
            modification in the prepare() callback and clear the remaining
            state in the starting callback instead of the prepare callback ]
    
    Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
    Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20241220134421.3809834-1-koichiro.den@canonical.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
hwmon: (tmp513) Fix division of negative numbers [+ + +]
Author: David Lechner <dlechner@baylibre.com>
Date:   Tue Jan 14 15:45:52 2025 -0600

    hwmon: (tmp513) Fix division of negative numbers
    
    [ Upstream commit e2c68cea431d65292b592c9f8446c918d45fcf78 ]
    
    Fix several issues with division of negative numbers in the tmp513
    driver.
    
    The docs on the DIV_ROUND_CLOSEST macro explain that dividing a negative
    value by an unsigned type is undefined behavior. The driver was doing
    this in several places, i.e. data->shunt_uohms has type of u32. The
    actual "undefined" behavior is that it converts both values to unsigned
    before doing the division, for example:
    
        int ret = DIV_ROUND_CLOSEST(-100, 3U);
    
    results in ret == 1431655732 instead of -33.
    
    Furthermore the MILLI macro has a type of unsigned long. Multiplying a
    signed long by an unsigned long results in an unsigned long.
    
    So, we need to cast both MILLI and data->shunt_uohms to long when
    using the DIV_ROUND_CLOSEST macro.
    
    Fixes: f07f9d2467f4 ("hwmon: (tmp513) Use SI constants from units.h")
    Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
    Signed-off-by: David Lechner <dlechner@baylibre.com>
    Link: https://lore.kernel.org/r/20250114-fix-si-prefix-macro-sign-bugs-v1-1-696fd8d10f00@baylibre.com
    [groeck: Drop some continuation lines]
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
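
The arithmetic in the tmp513 commit above can be reproduced in plain C. The two helpers below are simplified stand-ins for the branches of DIV_ROUND_CLOSEST, with names invented for this example; the sketch assumes a 32-bit int and unsigned int, as on common Linux targets.

```c
/* Mixed signedness: x is converted to unsigned int before the arithmetic. */
static int demo_div_closest_mixed(int x, unsigned int d)
{
	return (int)((x + d / 2) / d);
}

/* Both operands signed: round half away from zero. */
static int demo_div_closest_signed(int x, int d)
{
	if ((x > 0) == (d > 0))
		return (x + d / 2) / d;
	return (x - d / 2) / d;
}
```

Calling demo_div_closest_mixed(-100, 3U) gives 1431655732, the value quoted in the commit message, while demo_div_closest_signed(-100, 3) gives -33. Casting the divisor to a signed type before the division is what selects the second behavior.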

 
i2c: mux: demux-pinctrl: check initial mux selection, too [+ + +]
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date:   Wed Jan 15 08:29:45 2025 +0100

    i2c: mux: demux-pinctrl: check initial mux selection, too
    
    [ Upstream commit ca89f73394daf92779ddaa37b42956f4953f3941 ]
    
    When misconfigured, the initial setup of the current mux channel can
    fail, too. It must be checked as well.
    
    Fixes: 50a5ba876908 ("i2c: mux: demux-pinctrl: add driver")
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i2c: rcar: fix NACK handling when being a target [+ + +]
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date:   Wed Jan 15 13:36:23 2025 +0100

    i2c: rcar: fix NACK handling when being a target
    
    [ Upstream commit 093f70c134f70e4632b295240f07d2b50b74e247 ]
    
    When this controller is a target, the NACK handling had two issues.
    First, the return value from the backend was not checked on the initial
    WRITE_REQUESTED. So, the driver failed to send a NACK in this case.
    Also, the NACK always arrives one byte late on the bus, even in the
    WRITE_RECEIVED case. This seems to be a HW issue. We should then not
    rely on the backend to correctly NACK the superfluous byte as well. Fix
    both issues by introducing a flag which gets set whenever the backend
    requests a NACK and keep sending it until we get a STOP condition.
    
    Fixes: de20d1857dd6 ("i2c: rcar: add slave support")
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iio: adc: rockchip_saradc: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:12 2024 +0100

    iio: adc: rockchip_saradc: fix information leak in triggered buffer
    
    commit 38724591364e1e3b278b4053f102b49ea06ee17c upstream.
    
    The 'data' local struct is used to push data to user space from a
    triggered buffer, but it does not set values for inactive channels, as
    it only uses iio_for_each_active_channel() to assign new values.
    
    Initialize the struct to zero before using it to avoid pushing
    uninitialized information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Bin Lan <lanbincn@qq.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
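
The fix pattern from the rockchip_saradc commit above, zeroing the scan buffer before filling only the active channels, might look like this in a userspace C sketch. The struct layout, channel count and function name are assumptions made for illustration and do not mirror the driver.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_CHANNELS 8

struct demo_scan {
	uint16_t values[DEMO_MAX_CHANNELS];
	int64_t timestamp;
};

/* Hypothetical fill routine: only channels set in active_mask get a value. */
static void demo_fill_scan(struct demo_scan *scan, unsigned int active_mask,
			   const uint16_t *samples)
{
	unsigned int ch, slot = 0;

	memset(scan, 0, sizeof(*scan));	/* clears unused slots and padding */

	for (ch = 0; ch < DEMO_MAX_CHANNELS; ch++) {
		if (active_mask & (1u << ch))
			scan->values[slot++] = samples[ch];
	}
}
```

Without the memset(), every slot that the loop does not write would carry whatever was previously on the stack, and that is what would be copied out to the reader.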

iio: imu: inv_icm42600: fix spi burst write not supported [+ + +]
Author: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Date:   Tue Nov 12 10:30:10 2024 +0100

    iio: imu: inv_icm42600: fix spi burst write not supported
    
    commit c0f866de4ce447bca3191b9cefac60c4b36a7922 upstream.
    
    Burst write with SPI is not working for all icm42600 chips. It was
    only used for setting user offsets with regmap_bulk_write.
    
    Add specific SPI regmap config for using only single write with SPI.
    
    Fixes: 9f9ff91b775b ("iio: imu: inv_icm42600: add SPI driver for inv_icm42600 driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
    Link: https://patch.msgid.link/20241112-inv-icm42600-fix-spi-burst-write-not-supported-v2-1-97690dc03607@tdk.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on [+ + +]
Author: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Date:   Wed Nov 13 21:25:45 2024 +0100

    iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on
    
    commit 65a60a590142c54a3f3be11ff162db2d5b0e1e06 upstream.
    
    Currently suspending while sensors are on will result in timestamping
    continuing without gap at resume. It can work with monotonic clock but
    not with other clocks. Fix that by resetting timestamping.
    
    Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
    Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend-v1-1-dfc77c394173@tdk.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
iomap: avoid avoid truncating 64-bit offset to 32 bits [+ + +]
Author: Marco Nelissen <marco.nelissen@gmail.com>
Date:   Wed Jan 8 20:11:50 2025 -0800

    iomap: avoid avoid truncating 64-bit offset to 32 bits
    
    [ Upstream commit c13094b894de289514d84b8db56d1f2931a0bade ]
    
    On 32-bit kernels, iomap_write_delalloc_scan() was inadvertently using a
    32-bit position due to folio_next_index() returning an unsigned long.
    This could lead to an infinite loop when writing to an xfs filesystem.
    
    Signed-off-by: Marco Nelissen <marco.nelissen@gmail.com>
    Link: https://lore.kernel.org/r/20250109041253.2494374-1-marco.nelissen@gmail.com
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
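
To illustrate the iomap commit above: when a page index held in a 32-bit unsigned type is shifted into a byte position, the shift itself happens in 32 bits unless the index is widened first. The sketch below uses uint32_t to stand in for a 32-bit unsigned long and assumes 4 KiB pages; the names are hypothetical and this is not the kernel code.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Shift performed in 32 bits: bits above 31 are discarded before widening. */
static uint64_t demo_pos_from_index_truncating(uint32_t next_index)
{
	return next_index << DEMO_PAGE_SHIFT;
}

/* Index widened to 64 bits first, so the full byte position is kept. */
static uint64_t demo_pos_from_index_widened(uint32_t next_index)
{
	return (uint64_t)next_index << DEMO_PAGE_SHIFT;
}
```

For index 0x100000 (the page starting at 4 GiB), the first helper returns 0 and the second returns 0x100000000.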

 
irqchip/gic-v3-its: Don't enable interrupts in its_irq_set_vcpu_affinity() [+ + +]
Author: Tomas Krcka <krckatom@amazon.de>
Date:   Mon Dec 30 15:08:25 2024 +0000

    irqchip/gic-v3-its: Don't enable interrupts in its_irq_set_vcpu_affinity()
    
    commit 35cb2c6ce7da545f3b5cb1e6473ad7c3a6f08310 upstream.
    
    The following call-chain leads to enabling interrupts in a nested interrupt
    disabled section:
    
    irq_set_vcpu_affinity()
      irq_get_desc_lock()
         raw_spin_lock_irqsave()   <--- Disable interrupts
      its_irq_set_vcpu_affinity()
         guard(raw_spinlock_irq)   <--- Enables interrupts when leaving the guard()
      irq_put_desc_unlock()        <--- Warns because interrupts are enabled
    
    This was broken in commit b97e8a2f7130, which replaced the original
    raw_spin_[un]lock() pair with guard(raw_spinlock_irq).
    
    Fix the issue by using guard(raw_spinlock).
    
    [ tglx: Massaged change log ]
    
    Fixes: b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
    Signed-off-by: Tomas Krcka <krckatom@amazon.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20241230150825.62894-1-krckatom@amazon.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
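
The nesting problem in the gic-v3-its commit above can be modeled in userspace with a single flag standing in for the CPU interrupt state. All names below are invented for this sketch; it only models the enable/disable side effects of the lock variants, not the locking itself.

```c
#include <stdbool.h>

static bool demo_irqs_enabled = true;

/* irqsave/irqrestore: remember the previous state and put it back. */
static bool demo_lock_irqsave(void)
{
	bool flags = demo_irqs_enabled;

	demo_irqs_enabled = false;
	return flags;
}

static void demo_unlock_irqrestore(bool flags)
{
	demo_irqs_enabled = flags;
}

/* _irq variants: disable on lock, unconditionally enable on unlock. */
static void demo_lock_irq(void)   { demo_irqs_enabled = false; }
static void demo_unlock_irq(void) { demo_irqs_enabled = true; }

/* Plain variants: leave the interrupt state untouched. */
static void demo_lock(void)   { }
static void demo_unlock(void) { }

/* Returns the interrupt state seen just before the outer unlock. */
static bool demo_nested_section(bool inner_uses_irq_variant)
{
	bool flags = demo_lock_irqsave();
	bool seen;

	if (inner_uses_irq_variant) {
		demo_lock_irq();
		demo_unlock_irq();	/* re-enables inside the outer section */
	} else {
		demo_lock();
		demo_unlock();
	}
	seen = demo_irqs_enabled;
	demo_unlock_irqrestore(flags);
	return seen;
}
```

demo_nested_section(true) reports interrupts enabled inside the outer section, which corresponds to the warning in the call chain above; demo_nested_section(false) keeps them disabled until the outer restore.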

 
irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly [+ + +]
Author: Yogesh Lal <quic_ylal@quicinc.com>
Date:   Fri Dec 20 15:09:07 2024 +0530

    irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly
    
    commit 0d62a49ab55c99e8deb4593b8d9f923de1ab5c18 upstream.
    
    When a CPU attempts to enter low power mode, it disables the redistributor
    and Group 1 interrupts and reinitializes the system registers upon wakeup.
    
    If the transition into low power mode fails, then the CPU_PM framework
    invokes the PM notifier callback with CPU_PM_ENTER_FAILED to allow the
    drivers to undo the state changes.
    
    The GIC V3 driver ignores CPU_PM_ENTER_FAILED, which leaves the GIC in
    disabled state.
    
    Handle CPU_PM_ENTER_FAILED in the same way as CPU_PM_EXIT to restore normal
    operation.
    
    [ tglx: Massage change log, add Fixes tag ]
    
    Fixes: 3708d52fc6bb ("irqchip: gic-v3: Implement CPU PM notifier")
    Signed-off-by: Yogesh Lal <quic_ylal@quicinc.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Marc Zyngier <maz@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20241220093907.2747601-1-quic_ylal@quicinc.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
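
A minimal sketch of the notifier logic described in the gic-v3 commit above, written as a pure state function in userspace C. The enum values and function name are hypothetical stand-ins for the CPU PM events named in the commit message.

```c
#include <stdbool.h>

/* Hypothetical event codes mirroring the CPU PM notifier events. */
enum demo_cpu_pm_event {
	DEMO_CPU_PM_ENTER,
	DEMO_CPU_PM_ENTER_FAILED,
	DEMO_CPU_PM_EXIT,
};

/* Returns whether the interrupt controller is enabled after the event. */
static bool demo_gic_next_state(bool enabled, enum demo_cpu_pm_event ev)
{
	switch (ev) {
	case DEMO_CPU_PM_ENTER:
		return false;	/* disabled before low power entry */
	case DEMO_CPU_PM_ENTER_FAILED:	/* entry aborted: undo, same as exit */
	case DEMO_CPU_PM_EXIT:
		return true;
	}
	return enabled;
}
```

If the DEMO_CPU_PM_ENTER_FAILED case were missing, the state would stay false after an aborted entry, matching the "GIC left disabled" symptom described above.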

 
irqchip: Plug a OF node reference leak in platform_irqchip_probe() [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Sun Dec 15 12:39:45 2024 +0900

    irqchip: Plug a OF node reference leak in platform_irqchip_probe()
    
    commit 9322d1915f9d976ee48c09d800fbd5169bc2ddcc upstream.
    
    platform_irqchip_probe() leaks an OF node when irq_init_cb() fails. Fix it
    by declaring par_np with the __free(device_node) cleanup construct.
    
    This bug was found by an experimental static analysis tool that I am
    developing.
    
    Fixes: f8410e626569 ("irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20241215033945.3414223-1-joe@pf.is.s.u-tokyo.ac.jp
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
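
The scope-based cleanup used in the irqchip commit above relies on the compiler's cleanup attribute. The userspace sketch below shows the general mechanism with a toy reference count; it requires GCC or Clang, and the struct, helpers and probe function are hypothetical rather than the kernel's OF API.

```c
/* Requires GCC or Clang: __attribute__((cleanup)) is a compiler extension. */
struct demo_node { int refcount; };

static struct demo_node *demo_node_get(struct demo_node *np)
{
	np->refcount++;
	return np;
}

static void demo_node_put(struct demo_node *np)
{
	if (np)
		np->refcount--;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void demo_node_put_cleanup(struct demo_node **npp)
{
	demo_node_put(*npp);
}

/* Hypothetical probe: returns -1 on failure, 0 on success. */
static int demo_probe(struct demo_node *parent, int init_fails)
{
	struct demo_node *par_np
		__attribute__((cleanup(demo_node_put_cleanup))) =
		demo_node_get(parent);

	if (init_fails)
		return -1;	/* reference is still dropped on this path */
	return 0;
}

/* Runs the probe against a node holding one reference; returns the count left. */
static int demo_refcount_after_probe(int init_fails)
{
	struct demo_node parent = { .refcount = 1 };

	demo_probe(&parent, init_fails);
	return parent.refcount;
}
```

Because the put runs whenever par_np goes out of scope, the early return no longer needs its own release call, and the reference count comes back to 1 on both the failure and the success path.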

 
kheaders: Ignore silly-rename files [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Fri Dec 13 13:50:01 2024 +0000

    kheaders: Ignore silly-rename files
    
    [ Upstream commit 973b710b8821c3401ad7a25360c89e94b26884ac ]
    
    Tell tar to ignore silly-rename files (".__afs*" and ".nfs*") when building
    the header archive.  These occur when a file that is open is unlinked
    locally, but hasn't yet been closed.  Such files are visible to the user
    via the getdents() syscall and so programs may want to do things with them.
    
    During the kernel build, such files may be made during the processing of
    header files and the cleanup may get deferred by fput() which may result in
    tar seeing these files when it reads the directory, but they may have
    disappeared by the time it tries to open them, causing tar to fail with an
    error.  Further, we don't want to include them in the tarball if they still
    exist.
    
    With CONFIG_HEADERS_INSTALL=y, something like the following may be seen:
    
       find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory
       tar: ./include/linux/greybus/.__afs3C95: File removed before we read it
    
    The find warning doesn't seem to cause a problem.
    
    Fix this by telling tar when called from gen_kheaders.sh to exclude such
    files.  This only affects afs and nfs; cifs uses the Windows Hidden
    attribute to prevent the file from being seen.
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/20241213135013.2964079-2-dhowells@redhat.com
    cc: Masahiro Yamada <masahiroy@kernel.org>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    cc: linux-nfs@vger.kernel.org
    cc: linux-kernel@vger.kernel.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Linux: Linux 6.1.127 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Jan 23 17:17:18 2025 +0100

    Linux 6.1.127
    
    Link: https://lore.kernel.org/r/20250121174521.568417761@linuxfoundation.org
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: SeongJae Park <sj@kernel.org>
    Link: https://lore.kernel.org/r/20250122073827.056636718@linuxfoundation.org
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mac802154: check local interfaces before deleting sdata list [+ + +]
Author: Lizhi Xu <lizhi.xu@windriver.com>
Date:   Wed Nov 13 17:51:29 2024 +0800

    mac802154: check local interfaces before deleting sdata list
    
    [ Upstream commit eb09fbeb48709fe66c0d708aed81e910a577a30a ]
    
    syzkaller reported a corrupted list in ieee802154_if_remove. [1]
    
    This happens when an IEEE 802.15.4 network interface is removed after the
    IEEE 802.15.4 hardware device has been unregistered from the system.
    
    CPU0                                    CPU1
    ====                                    ====
    genl_family_rcv_msg_doit                ieee802154_unregister_hw
    ieee802154_del_iface                    ieee802154_remove_interfaces
    rdev_del_virtual_intf_deprecated        list_del(&sdata->list)
    ieee802154_if_remove
    list_del_rcu
    
    The net device has already been unregistered; because of the RCU grace
    period, unregistration must have run before ieee802154_if_remove().
    
    To avoid this issue, add a check for local->interfaces before deleting
    sdata list.
    
    [1]
    kernel BUG at lib/list_debug.c:58!
    Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
    RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
    Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
    RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
    RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
    RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
    RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
    R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
    R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
    FS:  0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     __list_del_entry_valid include/linux/list.h:124 [inline]
     __list_del_entry include/linux/list.h:215 [inline]
     list_del_rcu include/linux/rculist.h:157 [inline]
     ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
     rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
     ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
     genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
     genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
     genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
     netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
     genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
     netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
     netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
     netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
     sock_sendmsg_nosec net/socket.c:729 [inline]
     __sock_sendmsg+0x221/0x270 net/socket.c:744
     ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
     ___sys_sendmsg net/socket.c:2661 [inline]
     __sys_sendmsg+0x292/0x380 net/socket.c:2690
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
    Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
    Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/20241113095129.1457225-1-lizhi.xu@windriver.com
    Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mptcp: be sure to send ack when mptcp-level window re-opens [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Mon Jan 13 16:44:56 2025 +0100

    mptcp: be sure to send ack when mptcp-level window re-opens
    
    commit 2ca06a2f65310aeef30bb69b7405437a14766e4d upstream.
    
    mptcp_cleanup_rbuf() is responsible for sending acks when user space
    reads enough data to update the receive windows significantly.
    
    It tries hard to avoid acquiring the subflow sockets locks by checking
    conditions similar to the ones implemented at the TCP level.
    
    To avoid too much code duplication - the MPTCP protocol can't reuse the
    TCP helpers as part of the relevant status is maintained in the msk
    socket - and multiple costly window size computations, mptcp_cleanup_rbuf()
    uses a rough estimate for the most recently advertised window size:
    the MPTCP receive free space, as recorded at last-ack time.
    
    Unfortunately the above does not allow mptcp_cleanup_rbuf() to detect
    a zero to non-zero window change in some corner cases, skipping the
    tcp_cleanup_rbuf() call and leaving the peer stuck.
    
    After commit ea66758c1795 ("tcp: allow MPTCP to update the announced
    window"), MPTCP actually has cheap access to the announced window value.
    Use it in mptcp_cleanup_rbuf() for more accurate ack generation.
    
    Fixes: e3859603ba13 ("mptcp: better msk receive window updates")
    Cc: stable@vger.kernel.org
    Reported-by: Jakub Kicinski <kuba@kernel.org>
    Closes: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-1-0d986ee7b1b6@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net/mlx5: Clear port select structure when fail to create [+ + +]
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Wed Jan 15 13:39:07 2025 +0200

    net/mlx5: Clear port select structure when fail to create
    
    [ Upstream commit 5641e82cb55b4ecbc6366a499300917d2f3e6790 ]
    
    Clear the port select structure on error so that no stale values are left
    after definers are destroyed. This is needed because mlx5_lag_destroy_definers()
    always tries to destroy all lag definers in the tt_map, so in the flow
    below lag definers get double-destroyed and cause a kernel crash:
    
      mlx5_lag_port_sel_create()
        mlx5_lag_create_definers()
          mlx5_lag_create_definer()     <- Failed on tt 1
            mlx5_lag_destroy_definers() <- definers[tt=0] gets destroyed
      mlx5_lag_port_sel_create()
        mlx5_lag_create_definers()
          mlx5_lag_create_definer()     <- Failed on tt 0
            mlx5_lag_destroy_definers() <- definers[tt=0] gets double-destroyed
    
     Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
     Mem abort info:
       ESR = 0x0000000096000005
       EC = 0x25: DABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
       FSC = 0x05: level 1 translation fault
     Data abort info:
       ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
       CM = 0, WnR = 0, TnD = 0, TagAccess = 0
       GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
     user pgtable: 64k pages, 48-bit VAs, pgdp=0000000112ce2e00
     [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
     Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
     Modules linked in: iptable_raw bonding ip_gre ip6_gre gre ip6_tunnel tunnel6 geneve ip6_udp_tunnel udp_tunnel ipip tunnel4 ip_tunnel rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_fwctl(OE) fwctl(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlxfw(OE) memtrack(OE) mlx_compat(OE) openvswitch nsh nf_conncount psample xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc netconsole overlay efi_pstore sch_fq_codel zram ip_tables crct10dif_ce qemu_fw_cfg fuse ipv6 crc_ccitt [last unloaded: mlx_compat(OE)]
      CPU: 3 UID: 0 PID: 217 Comm: kworker/u53:2 Tainted: G           OE      6.11.0+ #2
      Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
      Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
      pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
      lr : mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
      sp : ffff800085fafb00
      x29: ffff800085fafb00 x28: ffff0000da0c8000 x27: 0000000000000000
      x26: ffff0000da0c8000 x25: ffff0000da0c8000 x24: ffff0000da0c8000
      x23: ffff0000c31f81a0 x22: 0400000000000000 x21: ffff0000da0c8000
      x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000000000
      x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8b0c9350
      x14: 0000000000000000 x13: ffff800081390d18 x12: ffff800081dc3cc0
      x11: 0000000000000001 x10: 0000000000000b10 x9 : ffff80007ab7304c
      x8 : ffff0000d00711f0 x7 : 0000000000000004 x6 : 0000000000000190
      x5 : ffff00027edb3010 x4 : 0000000000000000 x3 : 0000000000000000
      x2 : ffff0000d39b8000 x1 : ffff0000d39b8000 x0 : 0400000000000000
      Call trace:
       mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
       mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
       mlx5_lag_destroy_definers+0xa0/0x108 [mlx5_core]
       mlx5_lag_port_sel_create+0x2d4/0x6f8 [mlx5_core]
       mlx5_activate_lag+0x60c/0x6f8 [mlx5_core]
       mlx5_do_bond_work+0x284/0x5c8 [mlx5_core]
       process_one_work+0x170/0x3e0
       worker_thread+0x2d8/0x3e0
       kthread+0x11c/0x128
       ret_from_fork+0x10/0x20
      Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400)
      ---[ end trace 0000000000000000 ]---
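    
    The double-destroy flow above can be modelled in user space. The sketch
    below is only an illustration: struct port_sel, NUM_TT, create_definers()
    and destroy_definers() are hypothetical stand-ins, not the mlx5 code, and
    it clears each slot individually rather than the whole structure. The
    point is that a destroy path which leaves no stale pointers is safe to
    run again after a later failed create.

```c
#include <stddef.h>
#include <stdlib.h>

#define NUM_TT 4 /* hypothetical number of traffic types */

struct definer { int id; };

struct port_sel {
    struct definer *definers[NUM_TT];
};

static int destroy_calls; /* counts real destructions, for illustration only */

/* Destroy every definer that exists and clear its slot. */
static void destroy_definers(struct port_sel *ps)
{
    for (size_t tt = 0; tt < NUM_TT; tt++) {
        if (!ps->definers[tt])
            continue;
        free(ps->definers[tt]);
        destroy_calls++;
        ps->definers[tt] = NULL; /* no stale pointer survives the error path */
    }
}

/* Create definers in order; fail_tt simulates a failure at that index. */
static int create_definers(struct port_sel *ps, size_t fail_tt)
{
    for (size_t tt = 0; tt < NUM_TT; tt++) {
        if (tt == fail_tt) {
            destroy_definers(ps);
            return -1;
        }
        ps->definers[tt] = calloc(1, sizeof(struct definer));
        if (!ps->definers[tt]) {
            destroy_definers(ps);
            return -1;
        }
    }
    return 0;
}
```

    Failing first on tt 1 and then on tt 0 destroys definers[0] exactly once
    in this model, because the second error path finds only cleared slots.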
    
    Fixes: dc48516ec7d3 ("net/mlx5: Lag, add support to create definers for LAG")
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: Fix RDMA TX steering prio [+ + +]
Author: Patrisious Haddad <phaddad@nvidia.com>
Date:   Wed Jan 15 13:39:04 2025 +0200

    net/mlx5: Fix RDMA TX steering prio
    
    [ Upstream commit c08d3e62b2e73e14da318a1d20b52d0486a28ee0 ]
    
    User-added steering rules at RDMA_TX were being added to the first prio,
    which is the counters prio.
    Fix that so that they are correctly added to the BYPASS_PRIO instead.
    
    Fixes: 24670b1a3166 ("net/mlx5: Add support for RDMA TX steering")
    Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: add exit_batch_rtnl() method [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Feb 6 14:42:57 2024 +0000

    net: add exit_batch_rtnl() method
    
    [ Upstream commit fd4f101edbd9f99567ab2adb1f2169579ede7c13 ]
    
    Many (struct pernet_operations)->exit_batch() methods have
    to acquire rtnl.
    
    In the presence of rtnl mutex pressure, this makes cleanup_net()
    very slow.
    
    This patch adds a new exit_batch_rtnl() method to reduce the
    number of rtnl acquisitions from cleanup_net().
    
    exit_batch_rtnl() handlers are called while rtnl is locked,
    and devices to be killed can be queued in a list provided
    as their second argument.
    
    A single unregister_netdevice_many() is called right
    before rtnl is released.
    
    exit_batch_rtnl() handlers are called before ->exit() and
    ->exit_batch() handlers.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Antoine Tenart <atenart@kernel.org>
    Link: https://lore.kernel.org/r/20240206144313.2050392-2-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 46841c7053e6 ("gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field() [+ + +]
Author: Sudheer Kumar Doredla <s-doredla@ti.com>
Date:   Wed Jan 8 22:54:33 2025 +0530

    net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()
    
    [ Upstream commit 03d120f27d050336f7e7d21879891542c4741f81 ]
    
    CPSW ALE has 75-bit ALE entries stored across three 32-bit words.
    The cpsw_ale_get_field() and cpsw_ale_set_field() functions support
    ALE field entries spanning up to two words at the most.
    
    The cpsw_ale_get_field() and cpsw_ale_set_field() functions work as
    expected when an ALE field spans word1 and word2, but fail when an
    ALE field spans word2 and word3.
    
    For example, while reading an ALE field spanning word2 and word3
    (i.e. bits 62 to 64), the word3 data is shifted to an incorrect position
    because the index becomes zero when it is flipped.
    The same issue occurs when setting an ALE entry.
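    
    As a rough illustration of reading a field that straddles two words, the
    sketch below extracts bits from a three-word entry. It is not the driver
    code: ale_get_field() and ALE_ENTRY_WORDS are hypothetical names, and the
    layout (entry[0] holding the most significant word) is an assumption. The
    shift amounts are derived from the bit position before the word index is
    flipped, so a flipped index of zero cannot corrupt them.

```c
#include <stdint.h>

#define ALE_ENTRY_WORDS 3 /* assumed: three 32-bit words per entry */

/*
 * Read `bits` bits starting at bit `start` (bit 0 is the least significant
 * bit of the whole entry). Assumed layout: entry[0] holds bits 95..64,
 * entry[1] holds bits 63..32, entry[2] holds bits 31..0.
 * Supports fields of up to 32 bits spanning at most two words.
 */
static uint32_t ale_get_field(const uint32_t *entry, uint32_t start,
                              uint32_t bits)
{
    uint32_t lo_idx = start / 32;              /* word holding the first bit */
    uint32_t hi_idx = (start + bits - 1) / 32; /* word holding the last bit */
    uint32_t shift = start % 32;               /* taken from the bit position */
    uint32_t mask = bits >= 32 ? 0xffffffffu : ((1u << bits) - 1u);
    uint32_t val;

    /* Flip the index only when addressing the array. */
    val = entry[ALE_ENTRY_WORDS - 1 - lo_idx] >> shift;
    if (hi_idx != lo_idx)
        /* shift is non-zero here, so (32 - shift) is at most 31 */
        val |= entry[ALE_ENTRY_WORDS - 1 - hi_idx] << (32 - shift);

    return val & mask;
}
```

    With bits 62, 63 and 64 set, ale_get_field(entry, 62, 3) yields 7 under
    the assumed layout.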
    
    This issue has not been seen in practice but will become a problem in the
    future if the driver supports accessing ALE fields spanning word2 and word3.
    
    Fix the methods to handle getting/setting fields spanning up to two words.
    
    Fixes: b685f1a58956 ("net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()")
    Signed-off-by: Sudheer Kumar Doredla <s-doredla@ti.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Roger Quadros <rogerq@kernel.org>
    Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
    Link: https://patch.msgid.link/20250108172433.311694-1-s-doredla@ti.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ethernet: xgbe: re-add aneg to supported features in PHY quirks [+ + +]
Author: Heiner Kallweit <hkallweit1@gmail.com>
Date:   Sun Jan 12 22:59:59 2025 +0100

    net: ethernet: xgbe: re-add aneg to supported features in PHY quirks
    
    commit 6be7aca91009865d8c2b73589270224a6b6e67ab upstream.
    
    In 4.19, before the switch to linkmode bitmaps, PHY_GBIT_FEATURES
    included feature bits for aneg and TP/MII ports.
    
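    #define PHY_DEFAULT_FEATURES    (SUPPORTED_Autoneg | \
                                     SUPPORTED_TP | \
                                     SUPPORTED_MII)
    
    #define PHY_10BT_FEATURES       (SUPPORTED_10baseT_Half | \
                                     SUPPORTED_10baseT_Full)
    
    #define PHY_100BT_FEATURES      (SUPPORTED_100baseT_Half | \
                                     SUPPORTED_100baseT_Full)
    
    #define PHY_1000BT_FEATURES     (SUPPORTED_1000baseT_Half | \
                                     SUPPORTED_1000baseT_Full)
    
    #define PHY_BASIC_FEATURES      (PHY_10BT_FEATURES | \
                                     PHY_100BT_FEATURES | \
                                     PHY_DEFAULT_FEATURES)
    
    #define PHY_GBIT_FEATURES       (PHY_BASIC_FEATURES | \
                                     PHY_1000BT_FEATURES)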
    
    Referenced commit expanded PHY_GBIT_FEATURES, silently removing
    PHY_DEFAULT_FEATURES. The removed part can be re-added by using
    the new PHY_GBIT_FEATURES definition.
    It is not clear to me why nobody seems to have noticed this issue.
    
    I stumbled across this when checking what it takes to make
    phy_10_100_features_array et al private to phylib.
    
    Fixes: d0939c26c53a ("net: ethernet: xgbe: expand PHY_GBIT_FEAUTRES")
    Cc: stable@vger.kernel.org
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Link: https://patch.msgid.link/46521973-7738-4157-9f5e-0bb6f694acba@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: fix data-races around sk->sk_forward_alloc [+ + +]
Author: Wang Liang <wangliang74@huawei.com>
Date:   Thu Nov 7 10:34:05 2024 +0800

    net: fix data-races around sk->sk_forward_alloc
    
    commit 073d89808c065ac4c672c0a613a71b27a80691cb upstream.
    
    Syzkaller reported this warning:
     ------------[ cut here ]------------
     WARNING: CPU: 0 PID: 16 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x1c5/0x1e0
     Modules linked in:
     CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc5 #26
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
     RIP: 0010:inet_sock_destruct+0x1c5/0x1e0
     Code: 24 12 4c 89 e2 5b 48 c7 c7 98 ec bb 82 41 5c e9 d1 18 17 ff 4c 89 e6 5b 48 c7 c7 d0 ec bb 82 41 5c e9 bf 18 17 ff 0f 0b eb 83 <0f> 0b eb 97 0f 0b eb 87 0f 0b e9 68 ff ff ff 66 66 2e 0f 1f 84 00
     RSP: 0018:ffffc9000008bd90 EFLAGS: 00010206
     RAX: 0000000000000300 RBX: ffff88810b172a90 RCX: 0000000000000007
     RDX: 0000000000000002 RSI: 0000000000000300 RDI: ffff88810b172a00
     RBP: ffff88810b172a00 R08: ffff888104273c00 R09: 0000000000100007
     R10: 0000000000020000 R11: 0000000000000006 R12: ffff88810b172a00
     R13: 0000000000000004 R14: 0000000000000000 R15: ffff888237c31f78
     FS:  0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007ffc63fecac8 CR3: 000000000342e000 CR4: 00000000000006f0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      <TASK>
      ? __warn+0x88/0x130
      ? inet_sock_destruct+0x1c5/0x1e0
      ? report_bug+0x18e/0x1a0
      ? handle_bug+0x53/0x90
      ? exc_invalid_op+0x18/0x70
      ? asm_exc_invalid_op+0x1a/0x20
      ? inet_sock_destruct+0x1c5/0x1e0
      __sk_destruct+0x2a/0x200
      rcu_do_batch+0x1aa/0x530
      ? rcu_do_batch+0x13b/0x530
      rcu_core+0x159/0x2f0
      handle_softirqs+0xd3/0x2b0
      ? __pfx_smpboot_thread_fn+0x10/0x10
      run_ksoftirqd+0x25/0x30
      smpboot_thread_fn+0xdd/0x1d0
      kthread+0xd3/0x100
      ? __pfx_kthread+0x10/0x10
      ret_from_fork+0x34/0x50
      ? __pfx_kthread+0x10/0x10
      ret_from_fork_asm+0x1a/0x30
      </TASK>
     ---[ end trace 0000000000000000 ]---
    
    It's possible that two threads call tcp_v6_do_rcv()/sk_forward_alloc_add()
    concurrently when sk->sk_state == TCP_LISTEN with sk->sk_lock unlocked,
    which triggers a data-race around sk->sk_forward_alloc:
    tcp_v6_rcv
        tcp_v6_do_rcv
            skb_clone_and_charge_r
                sk_rmem_schedule
                    __sk_mem_schedule
                        sk_forward_alloc_add()
                skb_set_owner_r
                    sk_mem_charge
                        sk_forward_alloc_add()
            __kfree_skb
                skb_release_all
                    skb_release_head_state
                        sock_rfree
                            sk_mem_uncharge
                                sk_forward_alloc_add()
                                sk_mem_reclaim
                                    // set local var reclaimable
                                    __sk_mem_reclaim
                                        sk_forward_alloc_add()
    
    In this syzkaller testcase, two threads call
    tcp_v6_do_rcv() with skb->truesize=768, the sk_forward_alloc changes like
    this:
     (cpu 1)             | (cpu 2)             | sk_forward_alloc
     ...                 | ...                 | 0
     __sk_mem_schedule() |                     | +4096 = 4096
                         | __sk_mem_schedule() | +4096 = 8192
     sk_mem_charge()     |                     | -768  = 7424
                         | sk_mem_charge()     | -768  = 6656
     ...                 |    ...              |
     sk_mem_uncharge()   |                     | +768  = 7424
     reclaimable=7424    |                     |
                         | sk_mem_uncharge()   | +768  = 8192
                         | reclaimable=8192    |
     __sk_mem_reclaim()  |                     | -4096 = 4096
                         | __sk_mem_reclaim()  | -8192 = -4096 != 0
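    
    The arithmetic in the table can be replayed step by step. The sketch
    below is a single-threaded model of that interleaving, not kernel code:
    replay_race() is a hypothetical helper, and rounding each reclaim down to
    whole 4096-byte units is an assumption read off the table (a reclaimable
    value of 7424 releases 4096).

```c
#define MEM_QUANTUM 4096 /* allocation unit used in the table */
#define TRUESIZE 768     /* skb->truesize from the testcase */

/* Replay the interleaving and return the final sk_forward_alloc value. */
static int replay_race(void)
{
    int fwd = 0; /* models sk->sk_forward_alloc */
    int reclaimable_cpu1, reclaimable_cpu2;

    fwd += MEM_QUANTUM; /* cpu 1: __sk_mem_schedule() -> 4096 */
    fwd += MEM_QUANTUM; /* cpu 2: __sk_mem_schedule() -> 8192 */
    fwd -= TRUESIZE;    /* cpu 1: sk_mem_charge()     -> 7424 */
    fwd -= TRUESIZE;    /* cpu 2: sk_mem_charge()     -> 6656 */

    fwd += TRUESIZE;        /* cpu 1: sk_mem_uncharge() -> 7424 */
    reclaimable_cpu1 = fwd; /* cpu 1 snapshots 7424 */
    fwd += TRUESIZE;        /* cpu 2: sk_mem_uncharge() -> 8192 */
    reclaimable_cpu2 = fwd; /* cpu 2 snapshots 8192 */

    /* Each CPU reclaims whole units based on its own, now stale, snapshot. */
    fwd -= reclaimable_cpu1 / MEM_QUANTUM * MEM_QUANTUM; /* -4096 -> 4096 */
    fwd -= reclaimable_cpu2 / MEM_QUANTUM * MEM_QUANTUM; /* -8192 -> -4096 */

    return fwd;
}
```

    The model ends at -4096, matching the last row of the table: both CPUs
    reclaim memory that only one of them should have released.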
    
    skb_clone_and_charge_r() should not be called in tcp_v6_do_rcv() when
    sk->sk_state is TCP_LISTEN; it happens later in tcp_v6_syn_recv_sock().
    Fix the same issue in dccp_v6_do_rcv().
    
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
    Signed-off-by: Wang Liang <wangliang74@huawei.com>
    Link: https://patch.msgid.link/20241107023405.889239-1-wangliang74@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Alva Lan <alvalan9@foxmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: xilinx: axienet: Fix IRQ coalescing packet count overflow [+ + +]
Author: Sean Anderson <sean.anderson@linux.dev>
Date:   Mon Jan 13 11:30:00 2025 -0500

    net: xilinx: axienet: Fix IRQ coalescing packet count overflow
    
    [ Upstream commit c17ff476f53afb30f90bb3c2af77de069c81a622 ]
    
    If coalesce_count is greater than 255 it will not fit in the register and
    will overflow. This can be reproduced by running
    
        # ethtool -C ethX rx-frames 256
    
    which will result in a timeout of 0us instead. Fix this by checking for
    invalid values and reporting an error.
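    
    The range check can be sketched as follows. This is an illustration
    only: COALESCE_COUNT_MAX, truncate_to_field() and
    validate_coalesce_count() are hypothetical names, and the sketch models
    just the width of an 8-bit field, not the actual register layout of the
    hardware.

```c
#include <errno.h>
#include <stdint.h>

#define COALESCE_COUNT_MAX 255u /* largest value that fits in 8 bits */

/* What survives if a count is stored into an 8-bit field unchecked. */
static uint8_t truncate_to_field(uint32_t count)
{
    return (uint8_t)(count & 0xffu);
}

/* Reject values that do not fit instead of silently truncating them. */
static int validate_coalesce_count(uint32_t count)
{
    if (count > COALESCE_COUNT_MAX)
        return -EINVAL;
    return 0;
}
```

    In this model a request for 256 frames would be stored as 0 without the
    check, while validate_coalesce_count(256) returns -EINVAL.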
    
    Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
    Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
    Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
    Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
    Link: https://patch.msgid.link/20250113163001.2335235-1-sean.anderson@linux.dev
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nfp: bpf: prevent integer overflow in nfp_bpf_event_output() [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Mon Jan 13 09:18:39 2025 +0300

    nfp: bpf: prevent integer overflow in nfp_bpf_event_output()
    
    [ Upstream commit 16ebb6f5b6295c9688749862a39a4889c56227f8 ]
    
    The "sizeof(struct cmsg_bpf_event) + pkt_size + data_size" math could
    potentially have an integer wrapping bug on 32-bit systems.  Check for
    this and return an error.
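    
    A minimal sketch of such a check, using uint32_t to stand in for a
    32-bit size_t. HDR_SIZE and checked_event_size() are hypothetical; this
    shows the general technique of testing each addend against the remaining
    headroom before adding, and is not necessarily how the upstream patch
    does it.

```c
#include <errno.h>
#include <stdint.h>

#define HDR_SIZE 16u /* hypothetical stand-in for sizeof(struct cmsg_bpf_event) */

/*
 * Compute HDR_SIZE + pkt_size + data_size without wrapping.
 * Returns 0 and stores the sum in *total, or -EINVAL on overflow.
 */
static int checked_event_size(uint32_t pkt_size, uint32_t data_size,
                              uint32_t *total)
{
    if (pkt_size > UINT32_MAX - HDR_SIZE)
        return -EINVAL;
    if (data_size > UINT32_MAX - HDR_SIZE - pkt_size)
        return -EINVAL;

    *total = HDR_SIZE + pkt_size + data_size;
    return 0;
}
```

    Without the checks, large inputs would wrap around to a small total and
    a later allocation sized by that total would be too small for the copy.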
    
    Fixes: 9816dd35ecec ("nfp: bpf: perf event output helpers support")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://patch.msgid.link/6074805b-e78d-4b8a-bf05-e929b5377c28@stanley.mountain
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nfsd: add list_head nf_gc to struct nfsd_file [+ + +]
Author: Youzhong Yang <youzhong@gmail.com>
Date:   Wed Jul 10 10:40:35 2024 -0400

    nfsd: add list_head nf_gc to struct nfsd_file
    
    commit 8e6e2ffa6569a205f1805cbaeca143b556581da6 upstream.
    
    nfsd_file_put() in one thread can race with another thread doing
    garbage collection (running nfsd_file_gc() -> list_lru_walk() ->
    nfsd_file_lru_cb()):
    
      * In nfsd_file_put(), nf->nf_ref is 1, so it tries to do nfsd_file_lru_add().
      * nfsd_file_lru_add() returns true (with NFSD_FILE_REFERENCED bit set)
      * garbage collector kicks in, nfsd_file_lru_cb() clears REFERENCED bit and
        returns LRU_ROTATE.
      * garbage collector kicks in again, nfsd_file_lru_cb() now decrements nf->nf_ref
        to 0, runs nfsd_file_unhash(), removes it from the LRU and adds to the dispose
        list [list_lru_isolate_move(lru, &nf->nf_lru, head)]
      * nfsd_file_put() detects NFSD_FILE_HASHED bit is cleared, so it tries to remove
        the 'nf' from the LRU [if (!nfsd_file_lru_remove(nf))]. The 'nf' has been added
        to the 'dispose' list by nfsd_file_lru_cb(), so nfsd_file_lru_remove(nf) simply
        treats it as part of the LRU and removes it, which leads to its removal from
        the 'dispose' list.
      * At this moment, 'nf' is unhashed with its nf_ref being 0, and not on the LRU.
        nfsd_file_put() continues its execution [if (refcount_dec_and_test(&nf->nf_ref))],
        as nf->nf_ref is already 0, nf->nf_ref is set to REFCOUNT_SATURATED, and the 'nf'
        gets no chance of being freed.
    
    nfsd_file_put() can also race with nfsd_file_cond_queue():
      * In nfsd_file_put(), nf->nf_ref is 1, so it tries to do nfsd_file_lru_add().
      * nfsd_file_lru_add() sets REFERENCED bit and returns true.
      * Some userland application runs 'exportfs -f' or something like that, which triggers
        __nfsd_file_cache_purge() -> nfsd_file_cond_queue().
      * In nfsd_file_cond_queue(), it runs [if (!nfsd_file_unhash(nf))], unhash is done
        successfully.
      * nfsd_file_cond_queue() runs [if (!nfsd_file_get(nf))], now nf->nf_ref goes to 2.
      * nfsd_file_cond_queue() runs [if (nfsd_file_lru_remove(nf))], it succeeds.
      * nfsd_file_cond_queue() runs [if (refcount_sub_and_test(decrement, &nf->nf_ref))]
        (with "decrement" being 2), so the nf->nf_ref goes to 0, the 'nf' is added to the
        dispose list [list_add(&nf->nf_lru, dispose)]
      * nfsd_file_put() detects NFSD_FILE_HASHED bit is cleared, so it tries to remove
        the 'nf' from the LRU [if (!nfsd_file_lru_remove(nf))], although the 'nf' is not
        in the LRU, but it is linked in the 'dispose' list, nfsd_file_lru_remove() simply
        treats it as part of the LRU and removes it. This leads to its removal from
        the 'dispose' list!
      * Now nf->nf_ref is 0 and 'nf' is unhashed. nfsd_file_put() continues its
        execution and sets nf->nf_ref to REFCOUNT_SATURATED.
    
    As shown in the above analysis, using nf_lru for both the LRU list and the dispose list
    can cause leaks. This patch adds a new list_head nf_gc to struct nfsd_file, and uses
    it for the dispose list. This does not fix the nfsd_file leaking issue completely.
    
    Signed-off-by: Youzhong Yang <youzhong@gmail.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
nvmet: propagate npwg topology [+ + +]
Author: Luis Chamberlain <mcgrof@kernel.org>
Date:   Tue Dec 17 18:33:25 2024 -0800

    nvmet: propagate npwg topology
    
    [ Upstream commit b579d6fdc3a9149bb4d2b3133cc0767130ed13e6 ]
    
    Ensure we propagate npwg to the target as well instead
    of assuming it's the same logical blocks per physical block.
    
    This ensures that devices with large IUs have their information
    properly propagated on the target.
    
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
openvswitch: fix lockup on tx to unregistering netdev with carrier [+ + +]
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Thu Jan 9 13:21:24 2025 +0100

    openvswitch: fix lockup on tx to unregistering netdev with carrier
    
    [ Upstream commit 47e55e4b410f7d552e43011baa5be1aab4093990 ]
    
    The commit in the Fixes tag attempted to fix the issue in the following
    sequence of calls:
    
        do_output
        -> ovs_vport_send
           -> dev_queue_xmit
              -> __dev_queue_xmit
                 -> netdev_core_pick_tx
                    -> skb_tx_hash
    
    When device is unregistering, the 'dev->real_num_tx_queues' goes to
    zero and the 'while (unlikely(hash >= qcount))' loop inside the
    'skb_tx_hash' becomes infinite, locking up the core forever.
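    
    The failure mode of that loop is easy to reproduce in isolation. The
    sketch below is a user-space model, not the kernel function:
    reduce_hash() and MAX_STEPS are hypothetical, and the step cap exists
    only so the demonstration terminates where the real loop would not.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_STEPS 1000u /* safety cap so this sketch always terminates */

/*
 * Model of a 'while (hash >= qcount) hash -= qcount;' reduction.
 * Returns true and stores the reduced hash in *out if the loop finishes,
 * false if it would never finish (detected via the step cap).
 */
static bool reduce_hash(uint32_t hash, uint32_t qcount, uint32_t *out)
{
    uint32_t steps = 0;

    while (hash >= qcount) {
        if (++steps > MAX_STEPS)
            return false; /* qcount == 0: condition is always true */
        hash -= qcount;   /* subtracting zero makes no progress */
    }

    *out = hash;
    return true;
}
```

    With qcount of 4 and hash of 10 the loop ends at 2. With qcount of 0 the
    condition can never become false, which is the lockup described above.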
    
    But unfortunately, checking just the carrier status is not enough to
    fix the issue, because some devices may still be in unregistering
    state while reporting carrier status OK.
    
    One example of such a device is net/dummy.  It sets carrier ON
    on start, but it doesn't implement .ndo_stop to set the carrier off.
    And it makes sense, because dummy doesn't really have a carrier.
    Therefore, while this device is unregistering, it's still easy to hit
    the infinite loop in the skb_tx_hash() from the OVS datapath.  There
    might be other drivers that do the same, but dummy by itself is
    important for the OVS ecosystem, because it is frequently used as a
    packet sink for tcpdump while debugging OVS deployments.  And when the
    issue is hit, the only way to recover is to reboot.
    
    Fix that by also checking if the device is running.  The running
    state is handled by the net core during unregistering, so it covers
    unregistering case better, and we don't really need to send packets
    to devices that are not running anyway.
    
    While only checking the running state might be enough, the carrier
    check is preserved.  The running and the carrier states seem disjoint
    throughout the code and different drivers.  And other core functions
    like __dev_direct_xmit() check both before attempting to transmit
    a packet.  So, it seems safer to check both flags in OVS as well.
    
    Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
    Reported-by: Friedrich Weber <f.weber@proxmox.com>
    Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Tested-by: Friedrich Weber <f.weber@proxmox.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
pktgen: Avoid out-of-bounds access in get_imix_entries [+ + +]
Author: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Date:   Thu Jan 9 11:30:39 2025 +0300

    pktgen: Avoid out-of-bounds access in get_imix_entries
    
    [ Upstream commit 76201b5979768500bca362871db66d77cb4c225e ]
    
    Passing a sufficiently large number of imix entries leads to invalid access
    to the pkt_dev->imix_entries array because of the incorrect boundary check.
    
    UBSAN: array-index-out-of-bounds in net/core/pktgen.c:874:24
    index 20 is out of range for type 'imix_pkt [20]'
    CPU: 2 PID: 1210 Comm: bash Not tainted 6.10.0-rc1 #121
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
    Call Trace:
    <TASK>
    dump_stack_lvl lib/dump_stack.c:117
    __ubsan_handle_out_of_bounds lib/ubsan.c:429
    get_imix_entries net/core/pktgen.c:874
    pktgen_if_write net/core/pktgen.c:1063
    pde_write fs/proc/inode.c:334
    proc_reg_write fs/proc/inode.c:346
    vfs_write fs/read_write.c:593
    ksys_write fs/read_write.c:644
    do_syscall_64 arch/x86/entry/common.c:83
    entry_SYSCALL_64_after_hwframe arch/x86/entry/entry_64.S:130
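    
    A bounds check of the kind needed here can be sketched as follows. The
    names struct table, table_append() and MAX_ENTRIES are hypothetical and
    this is not the pktgen parser. The sketch checks the index before
    writing, so all 20 slots can be filled and the 21st append is rejected.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ENTRIES 20 /* same size as the 'imix_pkt [20]' array in the report */

struct entry {
    unsigned long size;
    unsigned long weight;
};

struct table {
    struct entry entries[MAX_ENTRIES];
    size_t n; /* number of valid entries */
};

/* Append one entry; refuse before touching memory if the table is full. */
static int table_append(struct table *t, unsigned long size,
                        unsigned long weight)
{
    if (t->n >= MAX_ENTRIES)
        return -E2BIG;

    t->entries[t->n].size = size;
    t->entries[t->n].weight = weight;
    t->n++;
    return 0;
}
```

    Checking after the write, or comparing with '>' instead of '>=', would
    let index 20 through, which is the access UBSAN flags above.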
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 52a62f8603f9 ("pktgen: Parse internet mix (imix) input")
    Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
    [ fp: allow to fill the array completely; minor changelog cleanup ]
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
pmdomain: imx8mp-blk-ctrl: add missing loop break condition [+ + +]
Author: Xiaolei Wang <xiaolei.wang@windriver.com>
Date:   Wed Jan 15 09:41:18 2025 +0800

    pmdomain: imx8mp-blk-ctrl: add missing loop break condition
    
    commit 726efa92e02b460811e8bc6990dd742f03b645ea upstream.
    
    Currently imx8mp_blk_ctrl_remove() will continue the for loop
    until an out-of-bounds exception occurs.
    
    pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : dev_pm_domain_detach+0x8/0x48
    lr : imx8mp_blk_ctrl_shutdown+0x58/0x90
    sp : ffffffc084f8bbf0
    x29: ffffffc084f8bbf0 x28: ffffff80daf32ac0 x27: 0000000000000000
    x26: ffffffc081658d78 x25: 0000000000000001 x24: ffffffc08201b028
    x23: ffffff80d0db9490 x22: ffffffc082340a78 x21: 00000000000005b0
    x20: ffffff80d19bc180 x19: 000000000000000a x18: ffffffffffffffff
    x17: ffffffc080a39e08 x16: ffffffc080a39c98 x15: 4f435f464f006c72
    x14: 0000000000000004 x13: ffffff80d0172110 x12: 0000000000000000
    x11: ffffff80d0537740 x10: ffffff80d05376c0 x9 : ffffffc0808ed2d8
    x8 : ffffffc084f8bab0 x7 : 0000000000000000 x6 : 0000000000000000
    x5 : ffffff80d19b9420 x4 : fffffffe03466e60 x3 : 0000000080800077
    x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000000000
    Call trace:
     dev_pm_domain_detach+0x8/0x48
     platform_shutdown+0x2c/0x48
     device_shutdown+0x158/0x268
     kernel_restart_prepare+0x40/0x58
     kernel_kexec+0x58/0xe8
     __do_sys_reboot+0x198/0x258
     __arm64_sys_reboot+0x2c/0x40
     invoke_syscall+0x5c/0x138
     el0_svc_common.constprop.0+0x48/0xf0
     do_el0_svc+0x24/0x38
     el0_svc+0x38/0xc8
     el0t_64_sync_handler+0x120/0x130
     el0t_64_sync+0x190/0x198
    Code: 8128c2d0 ffffffc0 aa1e03e9 d503201f
    
    Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl")
    Cc: stable@vger.kernel.org
    Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
    Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
    Reviewed-by: Fabio Estevam <festevam@gmail.com>
    Reviewed-by: Frank Li <Frank.Li@nxp.com>
    Link: https://lore.kernel.org/r/20250115014118.4086729-1-xiaolei.wang@windriver.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() [+ + +]
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Tue Jan 7 17:27:17 2025 +0100

    poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
    
    [ Upstream commit cacd9ae4bf801ff4125d8961bb9a3ba955e51680 ]
    
    As the comment above waitqueue_active() explains, it can only be used
    if both waker and waiter have mb()'s that pair with each other. However
    __pollwait() is broken in this respect.
    
    This is not pipe-specific, but let's look at pipe_poll() for example:
    
            poll_wait(...); // -> __pollwait() -> add_wait_queue()
    
            LOAD(pipe->head);
            LOAD(pipe->tail);
    
    In theory these LOAD()'s can leak into the critical section inside
    add_wait_queue() and can happen before list_add(entry, wq_head), in this
    case pipe_poll() can race with wakeup_pipe_readers/writers which do
    
            smp_mb();
            if (waitqueue_active(wq_head))
                    wake_up_interruptible(wq_head);
    
    There are more __pollwait()-like functions (grep init_poll_funcptr), and
    it seems that at least ep_ptable_queue_proc() has the same problem, so the
    patch adds smp_mb() into poll_wait().
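    
    The pairing requirement can be sketched in user space with C11 atomics.
    This is only an illustrative model, not the kernel code: waiter_queued,
    data_ready, model_poll() and model_wakeup() are hypothetical names, and
    the seq_cst fences stand in for smp_mb(). With a full barrier on both
    sides, at least one side is guaranteed to observe the other's store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the wait-queue list and the pipe state. */
static atomic_bool waiter_queued;   /* models list_add(entry, wq_head) */
static atomic_bool data_ready;      /* models "pipe has data" */

/* Waiter side: returns true if data is visible, false if it would sleep. */
static bool model_poll(void)
{
    atomic_store_explicit(&waiter_queued, true, memory_order_relaxed);
    /* Models the added barrier: the enqueue is ordered before the loads. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&data_ready, memory_order_relaxed);
}

/* Waker side: returns true if it sees a waiter and would wake it up. */
static bool model_wakeup(void)
{
    atomic_store_explicit(&data_ready, true, memory_order_relaxed);
    /* Models the smp_mb() before waitqueue_active(). */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&waiter_queued, memory_order_relaxed);
}
```

    Without the fence in model_poll(), the load of data_ready could be
    satisfied before the store to waiter_queued becomes visible, so both
    sides could miss each other and the waiter would sleep with data pending.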
    
    Link: https://lore.kernel.org/all/20250102163320.GA17691@redhat.com/
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Link: https://lore.kernel.org/r/20250107162717.GA18922@redhat.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
RDMA/rxe: Fix the qp flush warnings in req [+ + +]
Author: Zhu Yanjun <yanjun.zhu@linux.dev>
Date:   Fri Oct 25 17:20:36 2024 +0200

    RDMA/rxe: Fix the qp flush warnings in req
    
    commit ea4c990fa9e19ffef0648e40c566b94ba5ab31be upstream.
    
    When the qp is in error state, the status of WQEs in the queue should be
    set to error. Otherwise, the following warning appears:
    
    [  920.617269] WARNING: CPU: 1 PID: 21 at drivers/infiniband/sw/rxe/rxe_comp.c:756 rxe_completer+0x989/0xcc0 [rdma_rxe]
    [  920.617744] Modules linked in: rnbd_client(O) rtrs_client(O) rtrs_core(O) rdma_ucm rdma_cm iw_cm ib_cm crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core loop brd null_blk ipv6
    [  920.618516] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G           O       6.1.113-storage+ #65
    [  920.618986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
    [  920.619396] RIP: 0010:rxe_completer+0x989/0xcc0 [rdma_rxe]
    [  920.619658] Code: 0f b6 84 24 3a 02 00 00 41 89 84 24 44 04 00 00 e9 2a f7 ff ff 39 ca bb 03 00 00 00 b8 0e 00 00 00 48 0f 45 d8 e9 15 f7 ff ff <0f> 0b e9 cb f8 ff ff 41 bf f5 ff ff ff e9 08 f8 ff ff 49 8d bc 24
    [  920.620482] RSP: 0018:ffff97b7c00bbc38 EFLAGS: 00010246
    [  920.620817] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000008
    [  920.621183] RDX: ffff960dc396ebc0 RSI: 0000000000005400 RDI: ffff960dc4e2fbac
    [  920.621548] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffffffac406450
    [  920.621884] R10: ffffffffac4060c0 R11: 0000000000000001 R12: ffff960dc4e2f800
    [  920.622254] R13: ffff960dc4e2f928 R14: ffff97b7c029c580 R15: 0000000000000000
    [  920.622609] FS:  0000000000000000(0000) GS:ffff960ef7d00000(0000) knlGS:0000000000000000
    [  920.622979] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  920.623245] CR2: 00007fa056965e90 CR3: 00000001107f1000 CR4: 00000000000006e0
    [  920.623680] Call Trace:
    [  920.623815]  <TASK>
    [  920.623933]  ? __warn+0x79/0xc0
    [  920.624116]  ? rxe_completer+0x989/0xcc0 [rdma_rxe]
    [  920.624356]  ? report_bug+0xfb/0x150
    [  920.624594]  ? handle_bug+0x3c/0x60
    [  920.624796]  ? exc_invalid_op+0x14/0x70
    [  920.624976]  ? asm_exc_invalid_op+0x16/0x20
    [  920.625203]  ? rxe_completer+0x989/0xcc0 [rdma_rxe]
    [  920.625474]  ? rxe_completer+0x329/0xcc0 [rdma_rxe]
    [  920.625749]  rxe_do_task+0x80/0x110 [rdma_rxe]
    [  920.626037]  rxe_requester+0x625/0xde0 [rdma_rxe]
    [  920.626310]  ? rxe_cq_post+0xe2/0x180 [rdma_rxe]
    [  920.626583]  ? do_complete+0x18d/0x220 [rdma_rxe]
    [  920.626812]  ? rxe_completer+0x1a3/0xcc0 [rdma_rxe]
    [  920.627050]  rxe_do_task+0x80/0x110 [rdma_rxe]
    [  920.627285]  tasklet_action_common.constprop.0+0xa4/0x120
    [  920.627522]  handle_softirqs+0xc2/0x250
    [  920.627728]  ? sort_range+0x20/0x20
    [  920.627942]  run_ksoftirqd+0x1f/0x30
    [  920.628158]  smpboot_thread_fn+0xc7/0x1b0
    [  920.628334]  kthread+0xd6/0x100
    [  920.628504]  ? kthread_complete_and_exit+0x20/0x20
    [  920.628709]  ret_from_fork+0x1f/0x30
    [  920.628892]  </TASK>
    
    Fixes: ae720bdb703b ("RDMA/rxe: Generate error completion for error requester QP state")
    Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
    Link: https://patch.msgid.link/20241025152036.121417-1-yanjun.zhu@linux.dev
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Bin Lan <lanbincn@qq.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "drm/amdgpu: rework resume handling for display (v2)" [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Jan 21 14:14:58 2025 +0100

    Revert "drm/amdgpu: rework resume handling for display (v2)"
    
    This reverts commit c807ab3a861f656bc39471a20a16b36632ac6b04 which is
    commit 73dae652dcac776296890da215ee7dec357a1032 upstream.
    
    The original patch 73dae652dcac (drm/amdgpu: rework resume handling for
    display (v2)), was only targeted at kernels 6.11 and newer.  It did not
    apply cleanly to 6.12 so I backported it and the backport landed as
    99a02eab8251 ("drm/amdgpu: rework resume handling for display (v2)"),
    however there was a bug in the backport that was subsequently fixed in
    063d380ca28e ("drm/amdgpu: fix backport of commit 73dae652dcac").  None
    of this was intended for kernels older than 6.11, however the original
    backport eventually landed in 6.6, 6.1, and 5.15.
    
    Please revert the change from kernels 6.6, 6.1, and 5.15.
    
    Link: https://lore.kernel.org/r/BL1PR12MB5144D5363FCE6F2FD3502534F7E72@BL1PR12MB5144.namprd12.prod.outlook.com
    Link: https://lore.kernel.org/r/BL1PR12MB51449ADCFBF2314431F8BCFDF7132@BL1PR12MB5144.namprd12.prod.outlook.com
    Reported-by: Salvatore Bonaccorso <carnil@debian.org>
    Reported-by: Christian König <christian.koenig@amd.com>
    Reported-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "mtd: spi-nor: core: replace dummy buswidth from addr to data" [+ + +]
Author: Pratyush Yadav <pratyush@kernel.org>
Date:   Wed Jan 15 13:41:56 2025 +0000

    Revert "mtd: spi-nor: core: replace dummy buswidth from addr to data"
    
    [ Upstream commit d15638bf76ad47874ecb5dc386f0945fc0b2a875 ]
    
    This reverts commit 98d1fb94ce75f39febd456d6d3cbbe58b6678795.
    
    The commit uses data nbits instead of addr nbits for dummy phase. This
    causes a regression for all boards where spi-tx-bus-width is smaller
    than spi-rx-bus-width. It is a common pattern for boards to have
    spi-tx-bus-width == 1 and spi-rx-bus-width > 1. The regression causes
    all reads with a dummy phase to become unavailable for such boards,
    leading to a usually slower 0-dummy-cycle read being selected.
    
    Most controllers' supports_op hooks call spi_mem_default_supports_op().
    In spi_mem_default_supports_op(), spi_mem_check_buswidth() is called to
    check if the buswidths for the op can actually be supported by the
    board's wiring. This wiring information comes from (among other things)
    the spi-{tx,rx}-bus-width DT properties. Based on these properties,
    SPI_TX_* or SPI_RX_* flags are set by of_spi_parse_dt().
    spi_mem_check_buswidth() then uses these flags to make the decision
    whether an op can be supported by the board's wiring (in a way,
    indirectly checking against spi-{rx,tx}-bus-width).
    
    Now the tricky bit here is that spi_mem_check_buswidth() does:
    
            if (op->dummy.nbytes &&
                spi_check_buswidth_req(mem, op->dummy.buswidth, true))
                    return false;
    
    The true argument to spi_check_buswidth_req() means the op is treated as
    a TX op. For a board that has say 1-bit TX and 4-bit RX, a 4-bit dummy
    TX is considered as unsupported, and the op gets rejected.
    
    The commit being reverted uses the data buswidth for dummy buswidth. So
    for reads, the RX buswidth gets used for the dummy phase, uncovering
    this issue. In reality, a dummy phase is neither RX nor TX. As the name
    suggests, these are just dummy cycles that send or receive no data, and
    thus don't really need to have any buswidth at all.
    
    Ideally, dummy phases should not be checked against the board's wiring
    capabilities at all, and should only be sanity-checked for having a sane
    buswidth value. Since we are now at rc7 and such a change might
    introduce many unexpected bugs, revert the commit for now. It can be
    sent out later along with the spi_mem_check_buswidth() fix.
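    
    The effect of treating the dummy phase as TX can be modeled with a small
    stand-alone sketch. All names here (model_wiring, model_width_ok,
    model_read_supported) are hypothetical, and the plain width comparison is
    a simplification of the flag-based check the kernel performs.

```c
#include <stdbool.h>

/* Hypothetical wiring description, loosely modeled on the
 * spi-tx-bus-width / spi-rx-bus-width DT properties. */
struct model_wiring {
    unsigned int tx_width;
    unsigned int rx_width;
};

/* True if a phase of the given width fits the wiring in that direction. */
static bool model_width_ok(const struct model_wiring *w,
                           unsigned int width, bool tx)
{
    return width <= (tx ? w->tx_width : w->rx_width);
}

/* Simplified read-op check: the dummy phase is validated as if it were TX. */
static bool model_read_supported(const struct model_wiring *w,
                                 unsigned int addr_width,
                                 unsigned int dummy_width,
                                 unsigned int data_width)
{
    if (!model_width_ok(w, addr_width, true))
        return false;
    if (!model_width_ok(w, dummy_width, true))   /* the tricky bit */
        return false;
    return model_width_ok(w, data_width, false);
}
```

    For a board modeled as { .tx_width = 1, .rx_width = 4 }, a 1-1-4 read is
    accepted when the dummy phase uses the address width (1) and rejected
    when it uses the data width (4).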
    
    Fixes: 98d1fb94ce75 ("mtd: spi-nor: core: replace dummy buswidth from addr to data")
    Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
    Closes: https://lore.kernel.org/linux-mtd/3342163.44csPzL39Z@steina-w/
    Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
    Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Revert "PCI: Use preserve_config in place of pci_flags" [+ + +]
Author: Terry Tritton <terry.tritton@linaro.org>
Date:   Fri Jan 17 15:16:39 2025 +0000

    Revert "PCI: Use preserve_config in place of pci_flags"
    
    This reverts commit f858b0fab28d8bc2d0f0e8cd4afc3216f347cfcc which is
    commit 7246a4520b4bf1494d7d030166a11b5226f6d508 upstream.
    
    This patch causes a regression in cuttlefish/crosvm boot on arm64.
    
    The patch was part of a series that, when applied as a whole, does not
    cause a regression, but this patch was backported to the 6.1 branch by itself.
    
    The other patches do not apply cleanly to the 6.1 branch.
    
    Signed-off-by: Terry Tritton <terry.tritton@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "regmap: detach regmap from dev on regmap_exit" [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Jan 21 14:24:18 2025 +0100

    Revert "regmap: detach regmap from dev on regmap_exit"
    
    This reverts commit 48dc44f3c1afa29390cb2fbc8badad1b1111cea4 which is
    commit 3061e170381af96d1e66799d34264e6414d428a7 upstream.
    
    It was backported incorrectly, a fixed version will be applied later.
    
    Cc: Cosmin Tanislav <demonsingur@gmail.com>
    Cc: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20250115033244.2540522-1-tzungbi@kernel.org
    Reported-by: Tzung-Bi Shih <tzungbi@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
scsi: sg: Fix slab-use-after-free read in sg_release() [+ + +]
Author: Suraj Sonawane <surajsonawane0215@gmail.com>
Date:   Wed Nov 20 18:29:44 2024 +0530

    scsi: sg: Fix slab-use-after-free read in sg_release()
    
    commit f10593ad9bc36921f623361c9e3dd96bd52d85ee upstream.
    
    Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN:
    
    BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30
    kernel/locking/lockdep.c:5838
    __mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912
    sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407
    
    In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is
    called before releasing the open_rel_lock mutex. The kref_put() call may
    decrement the reference count of sfp to zero, triggering its cleanup
    through sg_remove_sfp(). This cleanup includes scheduling deferred work
    via sg_remove_sfp_usercontext(), which ultimately frees sfp.
    
    After kref_put(), sg_release() continues to unlock open_rel_lock and may
    reference sfp or sdp. If sfp has already been freed, this results in a
    slab-use-after-free error.
    
    Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the
    open_rel_lock mutex. This ensures:
    
     - No references to sfp or sdp occur after the reference count is
       decremented.
    
     - Cleanup functions such as sg_remove_sfp() and
       sg_remove_sfp_usercontext() can safely execute without impacting the
       mutex handling in sg_release().
    
    The fix has been tested and validated by syzbot. This patch closes the
    bug reported at the following syzkaller link and ensures proper
    sequencing of resource cleanup and mutex operations, eliminating the
    risk of use-after-free errors in sg_release().
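    
    The ordering rule can be illustrated with a user-space sketch. It is a
    simplified model under stated assumptions, not the sg driver code:
    model_dev, model_fd, model_put() and model_release() are hypothetical
    names, and the plain integer refcount is only safe single-threaded.

```c
#include <pthread.h>
#include <stdlib.h>

struct model_dev {                  /* stands in for sdp */
    pthread_mutex_t open_rel_lock;
    int released;                   /* counts completed releases */
};

struct model_fd {                   /* stands in for sfp */
    struct model_dev *parent;
    int refcount;
};

/* Drop one reference; frees the object when the count reaches zero. */
static void model_put(struct model_fd *sfp)
{
    if (--sfp->refcount == 0)
        free(sfp);
}

static void model_release(struct model_fd *sfp)
{
    struct model_dev *sdp = sfp->parent;

    pthread_mutex_lock(&sdp->open_rel_lock);
    sdp->released++;
    pthread_mutex_unlock(&sdp->open_rel_lock);

    /* Last use of sfp: the put comes after the unlock, so nothing that
     * follows can touch memory the put may have freed. */
    model_put(sfp);
}
```

    Calling model_put() before the unlock would reproduce the shape of the
    bug: the unlock would then run against state that may already be gone.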
    
    Reported-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b
    Tested-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com
    Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling")
    Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com>
    Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Alva Lan <alvalan9@foxmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers [+ + +]
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date:   Thu Dec 19 22:20:42 2024 +0530

    scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers
    
    [ Upstream commit bb9850704c043e48c86cc9df90ee102e8a338229 ]
    
    Otherwise, the default levels will override the levels set by the host
    controller drivers.
    
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-2-63c4b95a70b9@linaro.org
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests: mptcp: avoid spurious errors on disconnect [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Mon Jan 13 16:44:58 2025 +0100

    selftests: mptcp: avoid spurious errors on disconnect
    
    commit 218cc166321fb3cc8786677ffe0d09a78778a910 upstream.
    
    The disconnect test-case generates spurious errors:
    
      INFO: disconnect
      INFO: extra options: -I 3 -i /tmp/tmp.r43niviyoI
      01 ns1 MPTCP -> ns1 (10.0.1.1:10000      ) MPTCP (duration 140ms) [FAIL]
      file received by server does not match (in, out):
      Unexpected revents: POLLERR/POLLNVAL(19)
      -rw-r--r-- 1 root root 10028676 Jan 10 10:47 /tmp/tmp.r43niviyoI.disconnect
      Trailing bytes are:
      ��\����R���!8��u2��5N%
      -rw------- 1 root root 9992290 Jan 10 10:47 /tmp/tmp.Os4UbnWbI1
      Trailing bytes are:
      ��\����R���!8��u2��5N%
      02 ns1 MPTCP -> ns1 (dead:beef:1::1:10001) MPTCP (duration 206ms) [ OK ]
      03 ns1 MPTCP -> ns1 (dead:beef:1::1:10002) TCP   (duration  31ms) [ OK ]
      04 ns1 TCP   -> ns1 (dead:beef:1::1:10003) MPTCP (duration  26ms) [ OK ]
      [FAIL] Tests of the full disconnection have failed
      Time: 2 seconds
    
    The root cause is actually in the user-space bits: the test program
    currently disconnects as soon as all the pending data has been spooled,
    generating a FASTCLOSE. If such an option reaches the peer before the
    latter has reached the closed status, the msk socket will report an
    error to the user-space, as per protocol specification, causing the
    above failure.
    
    Address the issue by explicitly waiting for all the relevant sockets to
    reach a closed status before performing the disconnect.
    
    Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-3-0d986ee7b1b6@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: tc-testing: reduce rshift value [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Jan 3 10:24:58 2025 -0800

    selftests: tc-testing: reduce rshift value
    
    [ Upstream commit e95274dfe86490ec2a5633035c24b2de6722841f ]
    
    After the previous change, rshift >= 32 is no longer allowed.
    Modify the test to use 31; the test doesn't seem to send
    any traffic, so the exact value shouldn't matter.
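    
    As a rough illustration of why such a limit is reasonable, shifting a
    32-bit value by 32 or more bits is undefined behavior in C. The sketch
    below assumes the shift applies to a 32-bit value; model_rshift_valid()
    and model_apply_rshift() are hypothetical helpers, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Accept only shift counts that are well defined for a 32-bit operand. */
static bool model_rshift_valid(unsigned int rshift)
{
    return rshift < 32;
}

/* Apply the shift, treating out-of-range counts as "everything shifted out". */
static uint32_t model_apply_rshift(uint32_t value, unsigned int rshift)
{
    return model_rshift_valid(rshift) ? value >> rshift : 0;
}
```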
    
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20250103182458.1213486-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
vsock/virtio: cancel close work in the destructor [+ + +]
Author: Stefano Garzarella <sgarzare@redhat.com>
Date:   Fri Jan 10 09:35:09 2025 +0100

    vsock/virtio: cancel close work in the destructor
    
    commit df137da9d6d166e87e40980e36eb8e0bc90483ef upstream.
    
    During virtio_transport_release() we can schedule a delayed work to
    perform the closing of the socket before destruction.
    
    The destructor is called either when the socket is really destroyed
    (reference counter to zero), or it can also be called when we are
    de-assigning the transport.
    
    In the former case, we are sure the delayed work has completed, because
    it holds a reference until it completes, so the destructor will
    definitely be called after the delayed work is finished.
    But in the latter case, the destructor is called by AF_VSOCK core, just
    after the release(), so there may still be delayed work scheduled.
    
    Refactor the code, moving the logic that deletes the close work,
    currently in do_close(), to a new function. Invoke it during destruction
    to make sure we don't leave any pending work.
    
    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Cc: stable@vger.kernel.org
    Reported-by: Hyunwoo Kim <v4bel@theori.io>
    Closes: https://lore.kernel.org/netdev/Z37Sh+utS+iV3+eb@v4bel-B760M-AORUS-ELITE-AX/
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
    Tested-by: Hyunwoo Kim <v4bel@theori.io>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vsock/virtio: discard packets if the transport changes [+ + +]
Author: Stefano Garzarella <sgarzare@redhat.com>
Date:   Fri Jan 10 09:35:07 2025 +0100

    vsock/virtio: discard packets if the transport changes
    
    commit 2cb7c756f605ec02ffe562fb26828e4bcc5fdfc1 upstream.
    
    If the socket has been de-assigned or assigned to another transport,
    we must discard any packets received because they are not expected
    and would cause issues when we access vsk->transport.
    
    A possible scenario is described by Hyunwoo Kim in the attached link,
    where after a first connect() interrupted by a signal, and a second
    connect() failed, we can find `vsk->transport` at NULL, leading to a
    NULL pointer dereference.
    
    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Cc: stable@vger.kernel.org
    Reported-by: Hyunwoo Kim <v4bel@theori.io>
    Reported-by: Wongi Lee <qwerty@theori.io>
    Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Reviewed-by: Hyunwoo Kim <v4bel@theori.io>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
vsock: prevent null-ptr-deref in vsock_*[has_data|has_space] [+ + +]
Author: Stefano Garzarella <sgarzare@redhat.com>
Date:   Fri Jan 10 09:35:11 2025 +0100

    vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]
    
    commit 91751e248256efc111e52e15115840c35d85abaf upstream.
    
    Recent reports have shown how we sometimes call vsock_*_has_data()
    when a vsock socket has been de-assigned from a transport (see attached
    links), but we shouldn't.
    
    Previous commits should have solved the real problems, but we may have
    more in the future, so to avoid null-ptr-deref, we can return 0
    (no space, no data available) but with a warning.
    
    This way the code should continue to run in a nearly consistent state
    and have a warning that allows us to debug future problems.
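    
    The defensive pattern described above can be sketched as follows. This
    is a user-space model with hypothetical names (model_transport,
    model_vsock, model_has_data, model_seven_bytes); fprintf() stands in for
    the kernel warning.

```c
#include <stddef.h>
#include <stdio.h>

struct model_transport {
    long (*has_data)(void);
};

struct model_vsock {
    const struct model_transport *transport;   /* NULL once de-assigned */
};

/* Example transport callback used for demonstration only. */
static long model_seven_bytes(void)
{
    return 7;
}

/* Report "no data" with a warning instead of dereferencing NULL. */
static long model_has_data(const struct model_vsock *vsk)
{
    if (vsk->transport == NULL) {
        fprintf(stderr, "warning: has_data called without a transport\n");
        return 0;
    }
    return vsk->transport->has_data();
}
```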
    
    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/netdev/Z2K%2FI4nlHdfMRTZC@v4bel-B760M-AORUS-ELITE-AX/
    Link: https://lore.kernel.org/netdev/5ca20d4c-1017-49c2-9516-f6f75fd331e9@rbox.co/
    Link: https://lore.kernel.org/netdev/677f84a8.050a0220.25a300.01b3.GAE@google.com/
    Co-developed-by: Hyunwoo Kim <v4bel@theori.io>
    Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
    Co-developed-by: Wongi Lee <qwerty@theori.io>
    Signed-off-by: Wongi Lee <qwerty@theori.io>
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
    Reviewed-by: Hyunwoo Kim <v4bel@theori.io>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vsock: reset socket state when de-assigning the transport [+ + +]
Author: Stefano Garzarella <sgarzare@redhat.com>
Date:   Fri Jan 10 09:35:10 2025 +0100

    vsock: reset socket state when de-assigning the transport
    
    commit a24009bc9be60242651a21702609381b5092459e upstream.
    
    Transport's release() and destruct() are called when de-assigning the
    vsock transport. These callbacks can touch some socket state like
    sock flags, sk_state, and peer_shutdown.
    
    Since we are reassigning the socket to a new transport during
    vsock_connect(), let's reset these fields to have a clean state with
    the new transport.
    
    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
wifi: ath10k: avoid NULL pointer error during sdio remove [+ + +]
Author: Kang Yang <quic_kangyang@quicinc.com>
Date:   Tue Oct 8 10:22:46 2024 +0800

    wifi: ath10k: avoid NULL pointer error during sdio remove
    
    commit 95c38953cb1ecf40399a676a1f85dfe2b5780a9a upstream.
    
    When running 'rmmod ath10k', ath10k_sdio_remove() will free sdio
    workqueue by destroy_workqueue(). But if CONFIG_INIT_ON_FREE_DEFAULT_ON
    is set to yes, kernel panic will happen:
    Call trace:
     destroy_workqueue+0x1c/0x258
     ath10k_sdio_remove+0x84/0x94
     sdio_bus_remove+0x50/0x16c
     device_release_driver_internal+0x188/0x25c
     device_driver_detach+0x20/0x2c
    
    This is because during 'rmmod ath10k', ath10k_sdio_remove() will call
    ath10k_core_destroy() before destroy_workqueue(). wiphy_dev_release()
    will finally be called in ath10k_core_destroy(). This function will free
    struct cfg80211_registered_device *rdev and all its members, including
    wiphy, dev and the pointer of sdio workqueue. Then the pointer of sdio
    workqueue will be set to NULL due to CONFIG_INIT_ON_FREE_DEFAULT_ON.
    
    After device release, destroy_workqueue() will use the NULL pointer and
    the kernel will panic.
    
    Call trace:
    ath10k_sdio_remove
      ->ath10k_core_unregister
        ……
        ->ath10k_core_stop
          ->ath10k_hif_stop
            ->ath10k_sdio_irq_disable
        ->ath10k_hif_power_down
          ->del_timer_sync(&ar_sdio->sleep_timer)
      ->ath10k_core_destroy
        ->ath10k_mac_destroy
          ->ieee80211_free_hw
            ->wiphy_free
        ……
              ->wiphy_dev_release
      ->destroy_workqueue
    
    destroy_workqueue() needs to be called before ath10k_core_destroy():
    free the workqueue buffer first, and then let ath10k_core_destroy() free
    the pointer to the workqueue. This order matches the error path order in
    ath10k_sdio_probe().
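    
    A minimal sketch of the corrected teardown order, using hypothetical
    user-space types (model_wq, model_ar) rather than the driver's real
    structures: the workqueue pointer lives inside the object that the core
    destroy step frees, so it has to be consumed first.

```c
#include <stdlib.h>

struct model_wq {
    int pending;
};

struct model_ar {
    struct model_wq *workqueue;   /* pointer stored inside the core object */
};

static int destroyed_queues;      /* bookkeeping for the demonstration */

static void model_destroy_workqueue(struct model_wq *wq)
{
    free(wq);
    destroyed_queues++;
}

static void model_core_destroy(struct model_ar *ar)
{
    free(ar);   /* ar->workqueue must not be read after this point */
}

static void model_remove(struct model_ar *ar)
{
    model_destroy_workqueue(ar->workqueue);   /* reads ar, so it goes first */
    model_core_destroy(ar);
}
```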
    
    No work will be queued on the sdio workqueue between the time it is
    destroyed and the time ath10k_core_destroy() is called. Based on the
    call stack above, the reasons are:
    Only ath10k_sdio_sleep_timer_handler(), ath10k_sdio_hif_tx_sg() and
    ath10k_sdio_irq_disable() will queue work on the sdio workqueue.
    The sleep timer will be deleted before ath10k_core_destroy() in
    ath10k_hif_power_down().
    ath10k_sdio_irq_disable() is only called in ath10k_hif_stop().
    ath10k_core_unregister() will call ath10k_hif_power_down() to stop the
    hif bus, so ath10k_sdio_hif_tx_sg() won't be called anymore.
    
    Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00189
    
    Signed-off-by: Kang Yang <quic_kangyang@quicinc.com>
    Tested-by: David Ruth <druth@chromium.org>
    Reviewed-by: David Ruth <druth@chromium.org>
    Link: https://patch.msgid.link/20241008022246.1010-1-quic_kangyang@quicinc.com
    Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
x86/asm: Make serialize() always_inline [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Wed Dec 18 11:09:18 2024 +0100

    x86/asm: Make serialize() always_inline
    
    [ Upstream commit ae02ae16b76160f0aeeae2c5fb9b15226d00a4ef ]
    
    In order to allow serialize() to be used from noinstr code, make it
    __always_inline.
    
    Fixes: 0ef8047b737d ("x86/static-call: provide a way to do very early static-call updates")
    Closes: https://lore.kernel.org/oe-kbuild-all/202412181756.aJvzih2K-lkp@intel.com/
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/20241218100918.22167-1-jgross@suse.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/xen: fix SLS mitigation in xen_hypercall_iret() [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Fri Jan 17 12:05:51 2025 +0100

    x86/xen: fix SLS mitigation in xen_hypercall_iret()
    
    The backport of upstream patch a2796dff62d6 ("x86/xen: don't do PV iret
    hypercall through hypercall page") failed to adapt the SLS mitigation
    config check from CONFIG_MITIGATION_SLS to CONFIG_SLS.
    
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
zram: fix potential UAF of zram table [+ + +]
Author: Kairui Song <kasong@tencent.com>
Date:   Tue Jan 7 14:54:46 2025 +0800

    zram: fix potential UAF of zram table
    
    commit 212fe1c0df4a150fb6298db2cfff267ceaba5402 upstream.
    
    If zram_meta_alloc() fails early, it frees the allocated zram->table
    without setting it to NULL, which can cause zram_meta_free() to access
    the freed table if the user resets a failed and uninitialized device.
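    
    The pattern can be shown with a small user-space model. The names
    (model_zram, model_meta_alloc, model_meta_free) and the fail_pool
    parameter are hypothetical and exist only to simulate the early failure;
    this is a sketch of the idea, not the zram driver code.

```c
#include <stdbool.h>
#include <stdlib.h>

struct model_zram {
    int *table;
    void *mem_pool;
};

/* fail_pool simulates the early failure after the table was allocated. */
static bool model_meta_alloc(struct model_zram *z, size_t n, bool fail_pool)
{
    z->table = calloc(n, sizeof(*z->table));
    if (!z->table)
        return false;

    z->mem_pool = fail_pool ? NULL : malloc(1);
    if (!z->mem_pool) {
        free(z->table);
        z->table = NULL;   /* otherwise a later reset sees a stale pointer */
        return false;
    }
    return true;
}

/* Reset path: relies on table == NULL meaning "nothing was set up". */
static void model_meta_free(struct model_zram *z)
{
    if (!z->table)
        return;
    free(z->mem_pool);
    free(z->table);
    z->mem_pool = NULL;
    z->table = NULL;
}
```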
    
    Link: https://lkml.kernel.org/r/20250107065446.86928-1-ryncsn@gmail.com
    Fixes: 74363ec674cb ("zram: fix uninitialized ZRAM not releasing backing device")
    Signed-off-by: Kairui Song <kasong@tencent.com>
    Reviewed-by:  Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>