Author: Murad Masimov <m.masimov@mt-integration.ru>
Date: Thu Jan 23 19:39:45 2025 +0300
acpi: nfit: fix narrowing conversion in acpi_nfit_ctl
commit 2ff0e408db36c21ed3fa5e3c1e0e687c82cf132f upstream.
Syzkaller has reported a warning in to_nfit_bus_uuid(): "only secondary
bus families can be translated". This warning is emitted if the argument
is equal to NVDIMM_BUS_FAMILY_NFIT == 0. Function acpi_nfit_ctl() first
verifies that a user-provided value call_pkg->nd_family of type u64 is
not equal to 0. Then the value is converted to int, and only after that
is compared to NVDIMM_BUS_FAMILY_MAX. This can lead to passing an invalid
argument to to_nfit_bus_uuid(), if call_pkg->nd_family is non-zero, while
the lower 32 bits are zero.
Furthermore, it is best to return EINVAL immediately upon seeing the
invalid user input. The WARNING is insufficient to prevent further
undefined behavior based on other invalid user input.
All checks of the input value should be applied to the original variable
call_pkg->nd_family.
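As an illustration only (not the kernel code), the ordering problem can be
sketched in userspace C. The names sketch_check_narrowed(),
sketch_check_original() and the value of SKETCH_BUS_FAMILY_MAX are
assumptions made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch; the real limit is defined in the kernel headers. */
#define SKETCH_BUS_FAMILY_MAX 1

/*
 * Old order: test for zero on the 64-bit value, narrow to int, then
 * range-check the narrowed copy. Returns 0 if the input is accepted.
 */
static int sketch_check_narrowed(uint64_t nd_family, int *family)
{
	assert(family != NULL);
	if (nd_family == 0)
		return -EINVAL;
	/*
	 * Out-of-range conversion is implementation-defined; common
	 * compilers keep the low 32 bits.
	 */
	*family = (int)nd_family;
	if (*family > SKETCH_BUS_FAMILY_MAX)
		return -EINVAL;
	return 0;
}

/* Fixed order: every check is applied to the original 64-bit value. */
static int sketch_check_original(uint64_t nd_family, int *family)
{
	assert(family != NULL);
	if (nd_family == 0 || nd_family > SKETCH_BUS_FAMILY_MAX)
		return -EINVAL;
	*family = (int)nd_family;
	return 0;
}
```

With an input such as 0x100000000, the first variant accepts the value and
hands on a family of 0, while the second rejects it with -EINVAL.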
[iweiny: update commit message]
Fixes: 6450ddbd5d8e ("ACPI: NFIT: Define runtime firmware activation commands")
Cc: stable@vger.kernel.org
Reported-by: syzbot+c80d8dc0d9fa81a3cd8c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c80d8dc0d9fa81a3cd8c
Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru>
Link: https://patch.msgid.link/20250123163945.251-1-m.masimov@mt-integration.ru
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Giovanni Gherdovich <ggherdovich@suse.cz>
Date: Fri Mar 28 15:30:39 2025 +0100
ACPI: processor: idle: Return an error if both P_LVL{2,3} idle states are invalid
[ Upstream commit 9e9b893404d43894d69a18dd2fc8fcf1c36abb7e ]
Prior to commit 496121c02127 ("ACPI: processor: idle: Allow probing on
platforms with one ACPI C-state"), the acpi_idle driver wouldn't load on
systems without a valid C-State at least as deep as C2.
The behavior was desirable for guests on hypervisors such as VMware
ESXi, which by default don't have the _CST ACPI method, and set the C2
and C3 latencies to 101 and 1001 microseconds respectively via the FADT,
to signify they're unsupported.
Since the above change though, these virtualized deployments end up
loading acpi_idle, and thus entering the default C1 C-State set by
acpi_processor_get_power_info_default(); this is undesirable for a
system that's communicating to the OS it doesn't want C-States (missing
_CST, and invalid C2/C3 in FADT).
Make acpi_processor_get_power_info_fadt() return -ENODEV in that case,
so that acpi_processor_get_cstate_info() exits early and doesn't set
pr->flags.power = 1.
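A minimal userspace sketch of that decision, assuming latency limits of 100
and 1000 microseconds (inferred from the 101/1001 values mentioned above).
The struct and function names are hypothetical and only model the logic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limits in microseconds, inferred from the 101/1001 values above. */
#define SKETCH_MAX_C2_LATENCY_US 100
#define SKETCH_MAX_C3_LATENCY_US 1000

struct sketch_fadt_states {
	bool c2_valid;
	bool c3_valid;
};

static int sketch_power_info_fadt(uint32_t c2_latency_us,
				  uint32_t c3_latency_us,
				  struct sketch_fadt_states *out)
{
	assert(out != NULL);
	out->c2_valid = c2_latency_us <= SKETCH_MAX_C2_LATENCY_US;
	out->c3_valid = c3_latency_us <= SKETCH_MAX_C3_LATENCY_US;

	/* Neither deeper state is usable: report that instead of success. */
	if (!out->c2_valid && !out->c3_valid)
		return -ENODEV;
	return 0;
}
```

A caller that sees -ENODEV can stop early instead of marking the processor
as having usable power states.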
Fixes: 496121c02127 ("ACPI: processor: idle: Allow probing on platforms with one ACPI C-state")
Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Link: https://patch.msgid.link/20250328143040.9348-1-ggherdovich@suse.cz
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Menzel <pmenzel@molgen.mpg.de>
Date: Tue Mar 18 17:09:02 2025 +0100
ACPI: resource: Skip IRQ override on ASUS Vivobook 14 X1404VAP
commit 2da31ea2a085cd189857f2db0f7b78d0162db87a upstream.
Like the ASUS Vivobook X1504VAP and Vivobook X1704VAP, the ASUS Vivobook 14
X1404VAP has its keyboard IRQ (1) described as ActiveLow in the DSDT, which
the kernel overrides to EdgeHigh breaking the keyboard.
$ sudo dmidecode
[…]
System Information
Manufacturer: ASUSTeK COMPUTER INC.
Product Name: ASUS Vivobook 14 X1404VAP_X1404VA
[…]
$ grep -A 30 PS2K dsdt.dsl | grep IRQ -A 1
IRQ (Level, ActiveLow, Exclusive, )
{1}
Add the X1404VAP to the irq1_level_low_skip_override[] quirk table to fix
this.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Anton Shyndin <mrcold.il@gmail.com>
Link: https://patch.msgid.link/20250318160903.77107-1-pmenzel@molgen.mpg.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hans de Goede <hdegoede@redhat.com>
Date: Tue Mar 25 22:04:50 2025 +0100
ACPI: x86: Extend Lenovo Yoga Tab 3 quirk with skip GPIO event-handlers
commit 2fa87c71d2adb4b82c105f9191e6120340feff00 upstream.
Depending on the secureboot signature on EFI\BOOT\BOOTX86.EFI the
Lenovo Yoga Tab 3 UEFI will switch its OSID ACPI variable between
1 (Windows) and 4 (Android(GMIN)).
In Windows mode a GPIO event handler gets installed for GPO1 pin 5,
causing Linux' x86-android-tablets code which deals with the general
brokenness of this device's ACPI tables to fail to probe with:
[ 17.853705] x86_android_tablets: error -16 getting GPIO INT33FF:01 5
[ 17.859623] x86_android_tablets x86_android_tablets: probe with driver
which renders sound, the touchscreen, charging-management,
battery-monitoring and more non-functional.
Add ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS to the existing quirks for this
device to fix this.
Reported-by: Agoston Lorincz <pipacsba@gmail.com>
Closes: https://lore.kernel.org/platform-driver-x86/CAMEzqD+DNXrAvUOHviB2O2bjtcbmo3xH=kunKr4nubuMLbb_0A@mail.gmail.com/
Cc: All applicable <stable@kernel.org>
Fixes: fe820db35275 ("ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Tab 3 Pro (YT3-X90F)")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/20250325210450.358506-1-hdegoede@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Simon Tatham <anakin@pobox.com>
Date: Thu Feb 20 08:14:44 2025 +0000
affs: don't write overlarge OFS data block size fields
[ Upstream commit 011ea742a25a77bac3d995f457886a67d178c6f0 ]
If a data sector on an OFS floppy contains a value > 0x1e8 (the
largest amount of data that fits in the sector after its header), then
an Amiga reading the file can return corrupt data, by taking the
overlarge size at its word and reading past the end of the buffer it
read the disk sector into!
The cause: when affs_write_end_ofs() writes data to an OFS filesystem,
the new size field for a data block was computed by adding the amount
of data currently being written (into the block) to the existing value
of the size field. This is correct if you're extending the file at the
end, but if you seek backwards in the file and overwrite _existing_
data, it can lead to the size field being larger than the maximum
legal value.
This commit changes the calculation so that it sets the size field to
the max of its previous size and the position within the block that we
just wrote up to.
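The two calculations can be compared in a small userspace sketch. The
function names are hypothetical; 0x1e8 is the payload limit quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Largest data payload of an OFS data block, per the text above. */
#define SKETCH_OFS_DATA_MAX 0x1e8u

/* Old calculation: only right when appending at the end of the data. */
static uint32_t sketch_size_old(uint32_t old_size, uint32_t written)
{
	return old_size + written;
}

/* New calculation: max of the previous size and the end of this write. */
static uint32_t sketch_size_new(uint32_t old_size, uint32_t offset,
				uint32_t written)
{
	uint32_t end = offset + written;

	assert(end <= SKETCH_OFS_DATA_MAX);
	return end > old_size ? end : old_size;
}
```

Overwriting 100 bytes at offset 0 of a full block gives 0x1e8 + 100 with
the old calculation, but stays at 0x1e8 with the new one.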
Signed-off-by: Simon Tatham <anakin@pobox.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Simon Tatham <anakin@pobox.com>
Date: Thu Feb 20 08:14:43 2025 +0000
affs: generate OFS sequence numbers starting at 1
[ Upstream commit e4cf8ec4de4e13f156c1d61977d282d90c221085 ]
If I write a file to an OFS floppy image, and try to read it back on
an emulated Amiga running Workbench 1.3, the Amiga reports a disk
error trying to read the file. (That is, it's unable to read it _at
all_, even to copy it to the NIL: device. It isn't a matter of getting
the wrong data and being unable to parse the file format.)
This is because the 'sequence number' field in the OFS data block
header is supposed to be based at 1, but affs writes it based at 0.
All three locations changed by this patch were setting the sequence
number to a variable 'bidx' which was previously obtained by dividing
a file position by bsize, so bidx will naturally use 0 for the first
block. Therefore all three should add 1 to that value before writing
it into the sequence number field.
With this change, the Amiga successfully reads the file.
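As a sketch of the off-by-one (sketch_ofs_sequence() is a hypothetical
helper, not a function in affs):

```c
#include <assert.h>
#include <stdint.h>

/* Sequence number to store in the OFS data block header. */
static uint32_t sketch_ofs_sequence(uint64_t file_pos, uint32_t bsize)
{
	uint32_t bidx;

	assert(bsize != 0);
	bidx = (uint32_t)(file_pos / bsize);	/* 0 for the first block */
	return bidx + 1;			/* on-disk field starts at 1 */
}
```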
For data block reference: https://wiki.osdev.org/FFS_(Amiga)
Signed-off-by: Simon Tatham <anakin@pobox.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Navon John Lukose <navonjohnlukose@gmail.com>
Date: Sat Mar 8 03:03:19 2025 +0530
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion x360 14-dy1xxx
[ Upstream commit b11a74ac4f545626d0dc95a8ca8c41df90532bf3 ]
Add a fixup to enable the mute LED on HP Pavilion x360 Convertible
14-dy1xxx with ALC295 codec. The appropriate coefficient index and bits
were identified through a brute-force method, as detailed in
https://bbs.archlinux.org/viewtopic.php?pid=2079504#p2079504.
Signed-off-by: Navon John Lukose <navonjohnlukose@gmail.com>
Link: https://patch.msgid.link/20250307213319.35507-1-navonjohnlukose@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:49 2025 +0000
ALSA: hda/realtek: Add support for ASUS B3405 and B3605 Laptops using CS35L41 HDA
[ Upstream commit 7ab61d0a9a35e32497bcf2233310fec79ee3338f ]
Add support for ASUS B3405CCA / P3405CCA, B3605CCA / P3605CCA,
B3405CCA, B3605CCA.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with SPI
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-6-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:50 2025 +0000
ALSA: hda/realtek: Add support for ASUS B5405 and B5605 Laptops using CS35L41 HDA
[ Upstream commit c86dd79a7c338fff9bebb9503857e07db9845eca ]
Add support for ASUS B5605CCA and B5405CCA.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with SPI
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-7-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:47 2025 +0000
ALSA: hda/realtek: Add support for ASUS ROG Strix G614 Laptops using CS35L41 HDA
[ Upstream commit 9120b2b4ad0dad2f6bbb6bcacd0456f806fda62d ]
Add support for ASUS G614PH/PM/PP and G614FH/FM/FP.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-4-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:45 2025 +0000
ALSA: hda/realtek: Add support for ASUS ROG Strix G814 Laptop using CS35L41 HDA
[ Upstream commit f2c11231b57b5163bf16cdfd65271d53d61dd996 ]
Add support for ASUS G814PH/PM/PP and G814FH/FM/FP.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C.
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-2-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:46 2025 +0000
ALSA: hda/realtek: Add support for ASUS ROG Strix GA603 Laptops using CS35L41 HDA
[ Upstream commit 16dc157346dd4404b02b42e73b88604be3652039 ]
Add support for ASUS GA603KP, GA603KM and GA603KH.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-3-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:51 2025 +0000
ALSA: hda/realtek: Add support for ASUS Zenbook UM3406KA Laptops using CS35L41 HDA
[ Upstream commit 8463d2adbe1901247937fcdfe4b525130f6db10b ]
Laptop uses 2 CS35L41 Amps with HDA, using External boost with I2C
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-8-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Binding <sbinding@opensource.cirrus.com>
Date: Wed Mar 5 17:06:48 2025 +0000
ALSA: hda/realtek: Add support for various ASUS Laptops using CS35L41 HDA
[ Upstream commit 859a11917001424776e1cca02b762efcabb4044e ]
Add support for ASUS B3405CVA, B5405CVA, B5605CVA, B3605CVA.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with SPI
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250305170714.755794-5-sbinding@opensource.cirrus.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Sat Mar 15 15:30:19 2025 +0100
ALSA: hda/realtek: Always honor no_shutup_pins
[ Upstream commit 5a0c72c1da3cbc0cd4940a95d1be2830104c6edf ]
The workaround for Dell machines to skip the pin-shutup for mic pins
introduced alc_headset_mic_no_shutup(), which is called instead of the
generic snd_hda_shutup_pins() for certain codecs. The problem is that
the call is made unconditionally, even if spec->no_shutup_pins is set.
This seems to cause problems on other platforms like Lenovo.
This patch corrects the behavior: the driver always honors the
spec->no_shutup_pins flag and skips alc_headset_mic_no_shutup() if
it's set.
Fixes: dad3197da7a3 ("ALSA: hda/realtek - Fixup headphone noise via runtime suspend")
Reported-and-tested-by: Oleg Gorobets <oleg.goro@gmail.com>
Link: https://patch.msgid.link/20250315143020.27184-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antheas Kapenekakis <lkml@antheas.dev>
Date: Thu Feb 27 18:51:07 2025 +0100
ALSA: hda/realtek: Fix Asus Z13 2025 audio
[ Upstream commit 12784ca33b62fd327631749e6a0cd2a10110a56c ]
Use the basic quirk for this type of amplifier. Sound now works in
speakers, headphones, and microphone, whereas none worked before.
Tested-by: Kyle Gospodnetich <me@kylegospodneti.ch>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Link: https://patch.msgid.link/20250227175107.33432-3-lkml@antheas.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Wed Mar 26 16:22:01 2025 +0100
ALSA: hda/realtek: Fix built-in mic breakage on ASUS VivoBook X515JA
[ Upstream commit 84c3c08f5a6c2e2209428b76156bcaf349c3a62d ]
ASUS VivoBook X515JA with PCI SSID 1043:14f2 hits the same mic pin
assignment issue as other VivoBook models, and requires the same
workaround: applying the ALC256_FIXUP_ASUS_MIC_NO_PRESENCE
quirk.
Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219902
Link: https://patch.msgid.link/20250326152205.26733-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Wed Apr 2 09:42:07 2025 +0200
ALSA: hda/realtek: Fix built-in mic on another ASUS VivoBook model
[ Upstream commit 8983dc1b66c0e1928a263b8af0bb06f6cb9229c4 ]
There is another VivoBook model whose built-in mic got broken recently
by the fix of the pin sort. Apply the correct quirk
ALC256_FIXUP_ASUS_MIC_NO_PRESENCE to this model to address the
regression, too.
Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort")
Closes: https://lore.kernel.org/Z95s5T6OXFPjRnKf@eldamar.lan
Link: https://patch.msgid.link/20250402074208.7347-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Bárta <daniel.barta@trustlab.cz>
Date: Thu Feb 27 17:12:55 2025 +0100
ALSA: hda: Fix speakers on ASUS EXPERTBOOK P5405CSA 1.0
[ Upstream commit f479ecc5ef15ed8d774968c1a8726a49420f11a0 ]
After some digging around I have found that this laptop has Cirrus's smart
amplifiers connected to the SPI bus (spi1-CSC3551:00-cs35l41-hda).
To get them correctly detected and working I had to modify patch_realtek.c
with ASUS EXPERTBOOK P5405CSA 1.0 SystemID (0x1043, 0x1f63) and add
corresponding hda_quirk (ALC245_FIXUP_CS35L41_SPI_2).
Signed-off-by: Daniel Bárta <daniel.barta@trustlab.cz>
Link: https://patch.msgid.link/20250227161256.18061-2-daniel.barta@trustlab.cz
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Fri Mar 21 18:26:52 2025 +0100
ALSA: timer: Don't take register_mutex with copy_from/to_user()
[ Upstream commit 3424c8f53bc63c87712a7fc22dc13d0cc85fb0d6 ]
The infamous mmap_lock taken in copy_from/to_user() can often be
problematic when it's called inside another mutex, as it might lead
to deadlocks.
In the case of ALSA timer code, the bad pattern is with
guard(mutex)(&register_mutex) that covers copy_from/to_user() -- which
was mistakenly introduced at converting to guard(), and it had been
carefully worked around in the past.
This patch fixes those pieces simply by moving copy_from/to_user() out
of the register mutex lock again.
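The resulting ordering can be modelled in userspace. Everything below is a
stand-in invented for this sketch: struct sketch_mutex replaces the real
mutex, and sketch_copy_in() replaces copy_from_user() and simply asserts
that the lock is not held when it runs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock that only records whether it is held. */
struct sketch_mutex {
	bool held;
};

static struct sketch_mutex sketch_register_lock;
static int sketch_registered_value;

static void sketch_lock(struct sketch_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void sketch_unlock(struct sketch_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Stand-in for copy_from_user(): must not run under the register lock. */
static int sketch_copy_in(void *dst, const void *src, size_t len)
{
	assert(!sketch_register_lock.held);
	memcpy(dst, src, len);
	return 0;
}

static int sketch_ioctl_set(const int *user_ptr)
{
	int val;

	/* 1. Copy the argument while no lock is held. */
	if (sketch_copy_in(&val, user_ptr, sizeof(val)))
		return -EFAULT;

	/* 2. Only then take the lock for the short critical section. */
	sketch_lock(&sketch_register_lock);
	sketch_registered_value = val;
	sketch_unlock(&sketch_register_lock);
	return 0;
}
```

The copy works on a local variable, so the critical section never touches
user memory.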
Fixes: 3923de04c817 ("ALSA: pcm: oss: Use guard() for setup")
Reported-by: syzbot+2b96f44164236dda0f3b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/67dd86c8.050a0220.25ae54.0059.GAE@google.com
Link: https://patch.msgid.link/20250321172653.14310-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Anshuman Khandual <anshuman.khandual@arm.com>
Date: Wed Feb 26 17:54:01 2025 +0530
arch/powerpc: drop GENERIC_PTDUMP from mpc885_ads_defconfig
[ Upstream commit 2c5e6ac2db64ace51f66a9f3b3b3ab9553d748e8 ]
GENERIC_PTDUMP gets selected on powerpc explicitly and hence can be
dropped from mpc885_ads_defconfig. Replace it with CONFIG_PTDUMP_DEBUGFS
instead.
Link: https://lkml.kernel.org/r/20250226122404.1927473-3-anshuman.khandual@arm.com
Fixes: e084728393a5 ("powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Henry Martin <bsdhenrymartin@gmail.com>
Date: Wed Apr 2 21:50:36 2025 +0800
arcnet: Add NULL check in com20020pci_probe()
[ Upstream commit fda8c491db2a90ff3e6fbbae58e495b4ddddeca3 ]
devm_kasprintf() returns NULL when memory allocation fails. Currently,
com20020pci_probe() does not check for this case, which results in a
NULL pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue and ensure
no resources are left allocated.
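A userspace sketch of the pattern, with sketch_name_alloc() standing in for
devm_kasprintf(). All names, the struct layout and the sketch_fail_alloc
test hook are assumptions of this sketch, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Test hook for this sketch: force the name allocation to fail. */
static bool sketch_fail_alloc;

/* Stand-in for devm_kasprintf(): returns NULL when allocation fails. */
static char *sketch_name_alloc(const char *prefix, int index)
{
	int len = snprintf(NULL, 0, "%s-%d", prefix, index);
	char *name;

	if (len < 0 || sketch_fail_alloc)
		return NULL;
	name = malloc((size_t)len + 1);
	if (name)
		snprintf(name, (size_t)len + 1, "%s-%d", prefix, index);
	return name;
}

struct sketch_card {
	void *buf;	/* resource acquired before the name */
	char *name;
};

static int sketch_probe(struct sketch_card *card, int index)
{
	assert(card != NULL);
	card->buf = malloc(16);
	if (!card->buf)
		return -ENOMEM;

	card->name = sketch_name_alloc("card", index);
	if (!card->name) {
		/*
		 * Undo the earlier allocation and stop before any
		 * code can dereference the NULL name.
		 */
		free(card->buf);
		card->buf = NULL;
		return -ENOMEM;
	}
	return 0;
}
```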
Fixes: 6b17a597fc2f ("arcnet: restoring support for multiple Sohard Arcnet cards")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Link: https://patch.msgid.link/20250402135036.44697-1-bsdhenrymartin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Angelos Oikonomopoulos <angelos@igalia.com>
Date: Tue Apr 1 10:51:50 2025 +0200
arm64: Don't call NULL in do_compat_alignment_fixup()
commit c28f31deeacda307acfee2f18c0ad904e5123aac upstream.
do_alignment_t32_to_handler() only fixes up alignment faults for
specific instructions; it returns NULL otherwise (e.g. LDREX). When
that's the case, signal to the caller that it needs to proceed with the
regular alignment fault handling (i.e. SIGBUS). Without this patch, the
kernel panics:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000086000006
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
user pgtable: 4k pages, 48-bit VAs, pgdp=00000800164aa000
[0000000000000000] pgd=0800081fdbd22003, p4d=0800081fdbd22003, pud=08000815d51c6003, pmd=0000000000000000
Internal error: Oops: 0000000086000006 [#1] SMP
Modules linked in: cfg80211 rfkill xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter veth nvme_fa>
libcrc32c crc32c_generic raid0 multipath linear dm_mod dax raid1 md_mod xhci_pci nvme xhci_hcd nvme_core t10_pi usbcore igb crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c>
CPU: 2 PID: 3932954 Comm: WPEWebProcess Not tainted 6.1.0-31-arm64 #1 Debian 6.1.128-1
Hardware name: GIGABYTE MP32-AR1-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : 0x0
lr : do_compat_alignment_fixup+0xd8/0x3dc
sp : ffff80000f973dd0
x29: ffff80000f973dd0 x28: ffff081b42526180 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
x23: 0000000000000004 x22: 0000000000000000 x21: 0000000000000001
x20: 00000000e8551f00 x19: ffff80000f973eb0 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffaebc949bc488
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : 0000000000400000 x4 : 0000fffffffffffe x3 : 0000000000000000
x2 : ffff80000f973eb0 x1 : 00000000e8551f00 x0 : 0000000000000001
Call trace:
0x0
do_alignment_fault+0x40/0x50
do_mem_abort+0x4c/0xa0
el0_da+0x48/0xf0
el0t_32_sync_handler+0x110/0x140
el0t_32_sync+0x190/0x194
Code: bad PC value
---[ end trace 0000000000000000 ]---
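The missing check can be sketched in plain C. The opcode test, the function
names and the return convention (0 = fixed up, 1 = fall back to the regular
alignment fault handling) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*sketch_fixup_fn)(uint32_t instr);

/* Pretend fixup: 0 means the access was emulated. */
static int sketch_fixup_multiword(uint32_t instr)
{
	/* Only ever reached for the one class the decoder accepts. */
	assert((instr >> 24) == 0xE8u);
	return 0;
}

/* Hypothetical decoder: handles one made-up opcode class, else NULL. */
static sketch_fixup_fn sketch_to_handler(uint32_t instr)
{
	if ((instr >> 24) == 0xE8u)
		return sketch_fixup_multiword;
	return NULL;
}

/* Returns 0 if fixed up, 1 if the caller must take the regular path. */
static int sketch_alignment_fixup(uint32_t instr)
{
	sketch_fixup_fn handler = sketch_to_handler(instr);

	if (!handler)
		return 1;	/* never call through a NULL pointer */
	return handler(instr);
}
```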
Signed-off-by: Angelos Oikonomopoulos <angelos@igalia.com>
Fixes: 3fc24ef32d3b ("arm64: compat: Implement misalignment fixups for multiword loads")
Cc: <stable@vger.kernel.org> # 6.1.x
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20250401085150.148313-1-angelos@igalia.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nathan Chancellor <nathan@kernel.org>
Date: Thu Mar 20 22:33:49 2025 +0100
ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE
commit e7607f7d6d81af71dcc5171278aadccc94d277cd upstream.
ld.lld prior to 21.0.0 does not support using the KEEP keyword within an
overlay description, which may be needed to avoid discarding necessary
sections within an overlay with '--gc-sections', which can be enabled
for the kernel via CONFIG_LD_DEAD_CODE_DATA_ELIMINATION.
Disallow CONFIG_LD_DEAD_CODE_DATA_ELIMINATION without support for KEEP
within OVERLAY and introduce a macro, OVERLAY_KEEP, that can be used to
conditionally add KEEP when it is properly supported to avoid breaking
old versions of ld.lld.
Cc: stable@vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/381599f1fe973afad3094e55ec99b1620dba7d8c
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
[nathan: Fix conflict in init/Kconfig due to lack of RUSTC symbols]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Christian Eggers <ceggers@arri.de>
Date: Thu Mar 20 22:33:51 2025 +0100
ARM: 9444/1: add KEEP() keyword to ARM_VECTORS
commit c3d944a367c0d9e4e125c7006e52f352e75776dc upstream.
Without this, the vectors are removed if LD_DEAD_CODE_DATA_ELIMINATION
is enabled. At startup, the CPU (silently) hangs in the undefined
instruction exception as soon as the first timer interrupt arrives.
On my setup, the system also boots fine without the 2nd and 3rd KEEP()
statements, so I cannot tell whether these are actually required.
[nathan: Use OVERLAY_KEEP() to avoid breaking old ld.lld versions]
Cc: stable@vger.kernel.org
Fixes: ed0f94102251 ("ARM: 9404/1: arm32: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Venkata Prasad Potturu <venkataprasad.potturu@amd.com>
Date: Tue Mar 11 00:02:01 2025 +0530
ASoC: amd: acp: Fix for enabling DMIC on acp platforms via _DSD entry
[ Upstream commit 02e1cf7a352a3ba5f768849f2b4fcaaaa19f89e3 ]
Add condition check to register ACP PDM sound card by reading
_WOV acpi entry.
Fixes: 09068d624c49 ("ASoC: amd: acp: fix for acp platform device creation failure")
Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com>
Link: https://patch.msgid.link/20250310183201.11979-15-venkataprasad.potturu@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Sat Mar 22 08:45:49 2025 +0100
ASoC: codecs: rt5665: Fix some error handling paths in rt5665_probe()
[ Upstream commit 1ebd4944266e86a7ce274f197847f5a6399651e8 ]
Should an error occur after a successful regulator_bulk_enable() call,
regulator_bulk_disable() should be called, as already done in the remove
function.
Instead of adding an error handling path in the probe, switch from
devm_regulator_bulk_get() to devm_regulator_bulk_get_enable() and
simplify the remove function and some other places accordingly.
Finally, add a missing const when defining rt5665_supply_names to please
checkpatch and constify a few bytes.
Fixes: 33ada14a26c8 ("ASoC: add rt5665 codec driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/e3c2aa1b2fdfa646752d94f4af968630c0d58248.1742629525.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexey Klimov <alexey.klimov@linaro.org>
Date: Fri Feb 21 04:40:24 2025 +0000
ASoC: codecs: wsa884x: report temps to hwmon in millidegree of Celsius
[ Upstream commit d776f016d24816f15033169dcd081f077b6c10f4 ]
Temperatures are reported in degrees Celsius, but hwmon expects
values in millidegrees Celsius. Userspace tools observe values
close to zero and report them as "Not available" or as incorrect values
like 0C or 1C. Add a simple conversion to fix that.
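The unit mismatch in numbers, as a sketch with hypothetical helper names:

```c
#include <assert.h>

/* hwmon temperature attributes are expressed in millidegrees Celsius. */
static long sketch_to_hwmon(long celsius)
{
	assert(celsius >= -273);
	return celsius * 1000;
}

/* What a userspace tool displays: the raw hwmon value divided by 1000. */
static double sketch_displayed_celsius(long hwmon_value)
{
	return (double)hwmon_value / 1000.0;
}
```

A raw value of 39 is displayed as 0.039 degrees, which rounds to the +0.0
seen before the change; after conversion, 39000 is displayed as 39.0.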
Before the change:
wsa884x-virtual-0
Adapter: Virtual device
temp1: +0.0°C
--
wsa884x-virtual-0
Adapter: Virtual device
temp1: +0.0°C
Also reported as N/A before first amplifier power on.
After this change and initial wsa884x power on:
wsa884x-virtual-0
Adapter: Virtual device
temp1: +39.0°C
--
wsa884x-virtual-0
Adapter: Virtual device
temp1: +37.0°C
Tested on sm8550 only.
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://patch.msgid.link/20250221044024.1207921-1-alexey.klimov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru>
Date: Tue Mar 4 16:56:37 2025 +0500
ASoC: cs35l41: check the return value from spi_setup()
[ Upstream commit ad5a0970f86d82e39ebd06d45a1f7aa48a1316f8 ]
Currently the return value from spi_setup() is not checked for a failure.
It is unlikely it will ever fail in this particular case but it is still
better to add this check for the sake of completeness and correctness. This
is cheap since it is performed once when the device is being probed.
Handle spi_setup() return value.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Fixes: 872fc0b6bde8 ("ASoC: cs35l41: Set the max SPI speed for the whole device")
Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru>
Link: https://patch.msgid.link/20250304115643.2748-1-v.shevtsov@mt-integration.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Henry Martin <bsdhenrymartin@gmail.com>
Date: Tue Apr 1 22:25:10 2025 +0800
ASoC: imx-card: Add NULL check in imx_card_probe()
[ Upstream commit 93d34608fd162f725172e780b1c60cc93a920719 ]
devm_kasprintf() returns NULL when memory allocation fails. Currently,
imx_card_probe() does not check for this case, which results in a NULL
pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue.
Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250401142510.29900-1-bsdhenrymartin@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bard Liao <yung-chuan.liao@linux.intel.com>
Date: Wed Mar 5 21:41:13 2025 +0800
ASoC: rt1320: set wake_capable = 0 explicitly
[ Upstream commit 927e6bec5cf3624665b0a2e9f64a1d32f3d22cdd ]
"generic_new_peripheral_assigned: invalid dev_num 1, wake supported 1"
is reported by our internal CI test.
Rt1320's wake feature is not used in Linux and that's why it is not in
the wake_capable_list[] list in intel_auxdevice.c.
However, BIOS may set it as wake-capable. Overwrite wake_capable to 0
in the codec driver to align with wake_capable_list[].
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Acked-by: Shuming Fan <shumingf@realtek.com>
Link: https://patch.msgid.link/20250305134113.201326-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jayesh Choudhary <j-choudhary@ti.com>
Date: Tue Mar 18 17:05:24 2025 +0530
ASoC: ti: j721e-evm: Fix clock configuration for ti,j7200-cpb-audio compatible
[ Upstream commit 45ff65e30deb919604e68faed156ad96ce7474d9 ]
For 'ti,j7200-cpb-audio' compatible, there is support for only one PLL for
48k. For 11025, 22050, 44100 and 88200 sampling rates, due to absence of
J721E_CLK_PARENT_44100, we get EINVAL while running any audio application.
Add support for these rates by using the 48k parent clock and adjusting
the clock for these rates later in j721e_configure_refclk.
Fixes: 6748d0559059 ("ASoC: ti: Add custom machine driver for j721e EVM (CPB and IVI)")
Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com>
Link: https://patch.msgid.link/20250318113524.57100-1-j-choudhary@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Thu Feb 20 08:48:42 2025 +0100
auxdisplay: MAX6959 should select BITREVERSE
[ Upstream commit fce85f3da08b76c1b052f53a9f6f9c40a8a10660 ]
If CONFIG_BITREVERSE is not enabled:
max6959.c:(.text+0x92): undefined reference to `byte_rev_table'
Fixes: a9bcd02fa42217c7 ("auxdisplay: Add driver for MAX695x 7-segment LED controllers")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/202502161703.3Vr4M7qg-lkp@intel.com/
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Mon Feb 24 14:36:25 2025 +0200
auxdisplay: panel: Fix an API misuse in panel.c
[ Upstream commit 72e1c440c848624ad4cfac93d69d8a999a20355b ]
A variable allocated by charlcd_alloc() should be released
by charlcd_free(). Replace kfree() with charlcd_free() to
fix the API misuse.
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 718e05ed92ec ("auxdisplay: Introduce hd44780_common.[ch]")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiayuan Chen <mrpre@163.com>
Date: Fri Feb 14 17:18:21 2025 +0800
bpf: Fix array bounds error with may_goto
[ Upstream commit 6ebc5030e0c5a698f1dd9a6684cddf6ccaed64a0 ]
may_goto uses an additional 8 bytes on the stack, which causes the
interpreters[] array to go out of bounds when calculating index by
stack_size.
1. If a BPF program is rewritten, re-evaluate the stack size. For non-JIT
cases, reject loading directly.
2. For non-JIT cases, calculating interpreters[idx] may still cause
out-of-bounds array access, and just warn about it.
3. For jit_requested cases, the execution of bpf_func also needs to be
warned. So move the definition of function __bpf_prog_ret0_warn out of
the macro definition CONFIG_BPF_JIT_ALWAYS_ON.
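The out-of-bounds index can be illustrated with a minimal userspace sketch.
It assumes the interpreter table has 16 entries, one per 32-byte stack step
up to 512 bytes; the names interp_index() and interp_index_valid() are
hypothetical and do not appear in the kernel.

```c
#include <stdint.h>

/* Assumed for illustration: 16 interpreters, one per 32-byte stack step. */
#define STACK_STEP      32u
#define NR_INTERPRETERS 16u

/* Round the stack depth up to the next step and turn it into a table index. */
static uint32_t interp_index(uint32_t stack_depth)
{
	if (stack_depth == 0)
		stack_depth = 1;	/* treat an empty stack as the smallest slot */
	return (stack_depth + STACK_STEP - 1) / STACK_STEP - 1;
}

/* Returns 1 when the index is usable, 0 when it would run past the table. */
static int interp_index_valid(uint32_t stack_depth)
{
	return interp_index(stack_depth) < NR_INTERPRETERS;
}
```

Under these assumptions a 512-byte stack maps to the last slot, while the
extra 8 bytes push the index one past the end of the table.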
Reported-by: syzbot+d2a2c639d03ac200a4f1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/0000000000000f823606139faa5d@google.com/
Fixes: 011832b97b311 ("bpf: Introduce may_goto instruction")
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-2-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hou Tao <houtao1@huawei.com>
Date: Thu Feb 20 12:22:59 2025 +0800
bpf: Use preempt_count() directly in bpf_send_signal_common()
[ Upstream commit b4a8b5bba712a711d8ca1f7d04646db63f9c88f5 ]
bpf_send_signal_common() uses preemptible() to check whether or not the
current context is preemptible. If it is preemptible, it will use
irq_work to send the signal asynchronously instead of trying to hold a
spin-lock, because spin-lock is sleepable under PREEMPT_RT.
However, preemptible() depends on CONFIG_PREEMPT_COUNT. When
CONFIG_PREEMPT_COUNT is turned off (e.g., CONFIG_PREEMPT_VOLUNTARY=y),
!preemptible() will be evaluated as 1 and bpf_send_signal_common() will
use irq_work unconditionally.
Fix it by unfolding "!preemptible()" and using "preempt_count() != 0 ||
irqs_disabled()" instead.
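A minimal userspace sketch of the difference, assuming that preemptible()
collapses to a constant 0 when the preempt counter is compiled out. The
struct exec_ctx and both helpers are hypothetical stand-ins, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the kernel would inspect. */
struct exec_ctx {
	int preempt_count;	/* non-zero while preemption is disabled */
	bool irqs_disabled;
};

/* Old check without the preempt counter: preemptible() is constant 0. */
static bool must_defer_old(const struct exec_ctx *c)
{
	bool preemptible = false;	/* what the macro is assumed to expand to */

	(void)c;
	return !preemptible;		/* always true: irq_work every time */
}

/* New check: look at the actual state instead of the macro. */
static bool must_defer_new(const struct exec_ctx *c)
{
	return c->preempt_count != 0 || c->irqs_disabled;
}
```

With a context where both fields are zero, the old form still asks for
deferral while the new form does not.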
Fixes: 87c544108b61 ("bpf: Send signals asynchronously if !preemptible")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20250220042259.1583319-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Mon Mar 10 15:33:53 2025 +0100
can: statistics: use atomic access in hot path
[ Upstream commit 80b5f90158d1364cbd80ad82852a757fc0692bf2 ]
In can_send() and can_receive() CAN messages and CAN filter matches are
counted to be visible in the CAN procfs files.
KCSAN detected a data race within can_send() when two CAN frames have
been generated by a timer event writing to the same CAN netdevice at the
same time. Use atomic operations to access the statistics in the hot path
to fix the KCSAN complaint.
Reported-by: syzbot+78ce4489b812515d5e4d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67cd717d.050a0220.e1a89.0006.GAE@google.com
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20250310143353.3242-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abel Wu <wuyun.abel@bytedance.com>
Date: Sun Feb 9 14:13:11 2025 +0800
cgroup/rstat: Fix forceidle time in cpu.stat
[ Upstream commit c4af66a95aa3bc1d4f607ebd4eea524fb58946e3 ]
The commit b824766504e4 ("cgroup/rstat: add force idle show helper")
retrieves forceidle_time outside cgroup_rstat_lock for non-root cgroups
which can be potentially inconsistent with other stats.
Rather than reverting that commit, fix it in a way that retains the
effort of cleaning up the ifdef-messes.
Fixes: b824766504e4 ("cgroup/rstat: add force idle show helper")
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joshua Hahn <joshua.hahn6@gmail.com>
Date: Wed Oct 2 11:47:16 2024 -0700
cgroup/rstat: Tracking cgroup-level niced CPU time
[ Upstream commit aefa398d93d5db7c555be78a605ff015357f127d ]
Cgroup-level CPU statistics currently include time spent on
user/system processes, but do not include niced CPU time (despite
already being tracked). This patch exposes niced CPU time to the
userspace, allowing users to get a better understanding of their
hardware limits and can facilitate more informed workload distribution.
A new field 'ntime' is added to struct cgroup_base_stat as opposed to
struct task_cputime to minimize footprint.
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Stable-dep-of: c4af66a95aa3 ("cgroup/rstat: Fix forceidle time in cpu.stat")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Wed Feb 12 09:37:57 2025 +0900
cifs: fix incorrect validation for num_aces field of smb_acl
[ Upstream commit aa2a739a75ab6f24ef72fb3fdb9192c081eacf06 ]
parse_dacl() validates num_aces before allocating the ace array:
if (num_aces > ULONG_MAX / sizeof(struct smb_ace *))
This validation is incorrect because it still allows an array of up to
ULONG_MAX bytes to be created. smb_acl has a ->size field from which the
actual number of aces that fit in the response buffer can be calculated.
Use it to reject an invalid num_aces.
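A size-based check of this kind might look like the sketch below. The header
and minimum ACE sizes are made-up values for illustration, and
num_aces_valid() is a hypothetical helper rather than the function used in
the patch.

```c
#include <stdint.h>

/* Hypothetical sizes for illustration; the real structures differ. */
#define ACL_HDR_SIZE  8u	/* assumed size of the ACL header */
#define MIN_ACE_SIZE 16u	/* assumed smallest possible ACE */

/* Accept num_aces only if that many minimal ACEs fit in the declared size. */
static int num_aces_valid(uint32_t acl_size, uint32_t num_aces)
{
	if (acl_size < ACL_HDR_SIZE)
		return 0;
	return num_aces <= (acl_size - ACL_HDR_SIZE) / MIN_ACE_SIZE;
}
```

Bounding the count by the declared ACL size keeps the allocation
proportional to data that can actually be present in the response.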
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jerome Brunet <jbrunet@baylibre.com>
Date: Fri Dec 13 11:03:23 2024 +0100
clk: amlogic: g12a: fix mmc A peripheral clock
[ Upstream commit 0079e77c08de692cb20b38e408365c830a44b1ef ]
The bit index of the peripheral clock for mmc A is wrong.
This was probably not a problem for mmc A as the peripheral is likely left
enabled by the bootloader.
No issue has been reported so far but it could be a problem, most likely
some form of conflict between the ethernet and mmc A clock, breaking
ethernet on init.
Use the value provided by the documentation for mmc A before this
becomes an actual problem.
Fixes: 085a4ea93d54 ("clk: meson: g12a: add peripheral clock controller")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20241213-amlogic-clk-g12a-mmca-fix-v1-1-5af421f58b64@baylibre.com
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jerome Brunet <jbrunet@baylibre.com>
Date: Fri Dec 13 15:30:17 2024 +0100
clk: amlogic: g12b: fix cluster A parent data
[ Upstream commit 8995f8f108c3ac5ad52b12a6cfbbc7b3b32e9a58 ]
Several clocks used by both g12a and g12b use the g12a cpu A clock hw
pointer as clock parent. This is incorrect on g12b since the parents of
cluster A cpu clock are different. Also the hw clock provided as parent to
these children is not even a registered clock on g12b.
Fix the problem by reverting to the global namespace and letting CCF pick
the appropriate one, as is already done for other clocks, such as
cpu_clk_trace_div.
Fixes: 25e682a02d91 ("clk: meson: g12a: migrate to the new parent description method")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20241213-amlogic-clk-g12a-cpua-parent-fix-v1-1-d8c0f41865fe@baylibre.com
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jerome Brunet <jbrunet@baylibre.com>
Date: Fri Dec 20 11:25:36 2024 +0100
clk: amlogic: gxbb: drop incorrect flag on 32k clock
[ Upstream commit f38f7fe4830c5cb4eac138249225f119e7939965 ]
gxbb_32k_clk_div sets CLK_DIVIDER_ROUND_CLOSEST in the init_data flag which
is incorrect. This field is not where the divider flags belong.
Thankfully, CLK_DIVIDER_ROUND_CLOSEST maps to bit 4 which is an unused
clock flag, so there is no unintended consequence to this error.
Effectively, the clock has been used without CLK_DIVIDER_ROUND_CLOSEST
so far, so just drop it.
Fixes: 14c735c8e308 ("clk: meson-gxbb: Add EE 32K Clock for CEC")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20241220-amlogic-clk-gxbb-32k-fixes-v1-1-baca56ecf2db@baylibre.com
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jerome Brunet <jbrunet@baylibre.com>
Date: Fri Dec 20 11:25:37 2024 +0100
clk: amlogic: gxbb: drop non existing 32k clock parent
[ Upstream commit 7915d7d5407c026fa9343befb4d3343f7a345f97 ]
The 32k clock references a parent 'cts_slow_oscin' with a fixme note saying
that this clock should be provided by AO controller.
The HW probably has this clock but it does not exist at the moment in
any controller implementation. Furthermore, referencing clock by the global
name should be avoided whenever possible.
There is no reason to keep this hack around, at least for now.
Fixes: 14c735c8e308 ("clk: meson-gxbb: Add EE 32K Clock for CEC")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20241220-amlogic-clk-gxbb-32k-fixes-v1-2-baca56ecf2db@baylibre.com
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Laurentiu Mihalcea <laurentiu.mihalcea@nxp.com>
Date: Wed Feb 26 11:45:11 2025 -0500
clk: clk-imx8mp-audiomix: fix dsp/ocram_a clock parents
[ Upstream commit 91be7d27099dedf813b80702e4ca117d1fb38ce6 ]
The DSP and OCRAM_A modules from AUDIOMIX are clocked by
AUDIO_AXI_CLK_ROOT, not AUDIO_AHB_CLK_ROOT. Update the clock data
accordingly.
Fixes: 6cd95f7b151c ("clk: imx: imx8mp: Add audiomix block control")
Signed-off-by: Laurentiu Mihalcea <laurentiu.mihalcea@nxp.com>
Reviewed-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250226164513.33822-3-laurentiumihalcea111@gmail.com
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Lypak <vladimir.lypak@gmail.com>
Date: Sat Mar 15 16:26:18 2025 +0100
clk: qcom: gcc-msm8953: fix stuck venus0_core0 clock
[ Upstream commit cdc59600bccf2cb4c483645438a97d4ec55f326b ]
This clock can't be enabled with VENUS_CORE0 GDSC turned off. But that
GDSC is under HW control so it can be turned off at any moment.
Instead of checking the dependent clock we can just vote for it to
enable later when GDSC gets turned on.
Fixes: 9bb6cfc3c77e6 ("clk: qcom: Add Global Clock Controller driver for MSM8953")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
Link: https://lore.kernel.org/r/20250315-clock-fix-v1-2-2efdc4920dda@mainlining.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Neil Armstrong <neil.armstrong@linaro.org>
Date: Wed Mar 5 20:00:29 2025 +0100
clk: qcom: gcc-sm8650: Do not turn off USB GDSCs during gdsc_disable()
[ Upstream commit 8b75c2973997e66fd897b7e87b5ba2f3d683e94b ]
With PWRSTS_OFF_ON, USB GDSCs are turned off during gdsc_disable(). This
can happen during scenarios such as system suspend and breaks the resume
of USB controller from suspend.
So use PWRSTS_RET_ON to tell the GDSC driver not to turn off the GDSCs
during gdsc_disable() and allow the hardware to transition the GDSCs to
retention when the parent domain enters low power state during system
suspend.
Fixes: c58225b7e3d7 ("clk: qcom: add the SM8650 Global Clock Controller driver, part 1")
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250305-topic-sm8650-upstream-fix-usb-suspend-v1-1-649036ab0557@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Sat Jan 11 17:54:18 2025 +0100
clk: qcom: gcc-x1e80100: Unregister GCC_GPU_CFG_AHB_CLK/GCC_DISP_XO_CLK
[ Upstream commit b60521eff227ef459e03879cbea2b2bd85a8d7af ]
The GPU clock is required for CPU access to GPUSS registers. It was
previously decided (on this and many more platforms) that the added
overhead/hassle introduced by keeping track of it would not bring much
measurable improvement in the power department.
The display clock is basically the same story over again.
Now, we're past that discussion and this commit is not trying to change
that. Instead, the clocks are both force-enabled in .probe *and*
registered with the common clock framework, resulting in them being
toggled off after ignore_unused.
Unregister said clocks to fix breakage when clk_ignore_unused is absent
(as it should be).
Fixes: 161b7c401f4b ("clk: qcom: Add Global Clock controller (GCC) driver for X1E80100")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250111-topic-x1e_fixups-v1-1-77dc39237c12@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Barnabás Czémán <barnabas.czeman@mainlining.org>
Date: Sat Mar 15 16:26:17 2025 +0100
clk: qcom: mmcc-sdm660: fix stuck video_subcore0 clock
[ Upstream commit 000cbe3896c56bf5c625e286ff096533a6b27657 ]
This clock can't be enabled with VENUS_CORE0 GDSC turned off. But that
GDSC is under HW control so it can be turned off at any moment.
Instead of checking the dependent clock we can just vote for it to
enable later when GDSC gets turned on.
Fixes: 5db3ae8b33de6 ("clk: qcom: Add SDM660 Multimedia Clock Controller (MMCC) driver")
Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
Link: https://lore.kernel.org/r/20250315-clock-fix-v1-1-2efdc4920dda@mainlining.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Date: Wed Jan 15 16:20:58 2025 +0200
clk: renesas: r8a08g045: Check the source of the CPU PLL settings
[ Upstream commit dc0f16c1b76293ac942a783e960abfd19e95fdf5 ]
On the RZ/G3S SoC, the CPU PLL settings can be set and retrieved through
the CPG_PLL1_CLK1 and CPG_PLL1_CLK2 registers. However, these settings
are applied only when CPG_PLL1_SETTING.SEL_PLL1 is set to 0.
Otherwise, the CPU PLL operates at the default frequency of 1.1 GHz.
Hence add support to the PLL driver for returning the 1.1 GHz frequency
when the CPU PLL is configured with the default frequency.
Fixes: 01eabef547e6 ("clk: renesas: rzg2l: Add support for RZ/G3S PLL")
Fixes: de60a3ebe410 ("clk: renesas: Add minimal boot support for RZ/G3S SoC")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250115142059.1833063-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Geis <pgwipeout@gmail.com>
Date: Wed Jan 15 01:26:22 2025 +0000
clk: rockchip: rk3328: fix wrong clk_ref_usb3otg parent
[ Upstream commit a9e60f1ffe1ca57d6af6a2573e2f950e76efbf5b ]
Correct the clk_ref_usb3otg parent to fix clock control for the usb3
controller on rk3328. Verified against the rk3328 trm, the rk3228h trm,
and the rk3328 usb3 phy clock map.
Fixes: fe3511ad8a1c ("clk: rockchip: add clock controller for rk3328")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20250115012628.1035928-2-pgwipeout@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Will McVicker <willmcvicker@google.com>
Date: Wed Feb 12 10:32:52 2025 -0800
clk: samsung: Fix UBSAN panic in samsung_clk_init()
[ Upstream commit d19d7345a7bcdb083b65568a11b11adffe0687af ]
With UBSAN_ARRAY_BOUNDS=y, I'm hitting the below panic due to
dereferencing `ctx->clk_data.hws` before setting
`ctx->clk_data.num = nr_clks`. Move that up to fix the crash.
UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP
<snip>
Call trace:
samsung_clk_init+0x110/0x124 (P)
samsung_clk_init+0x48/0x124 (L)
samsung_cmu_register_one+0x3c/0xa0
exynos_arm64_register_cmu+0x54/0x64
__gs101_cmu_top_of_clk_init_declare+0x28/0x60
...
Fixes: e620a1e061c4 ("drivers/clk: convert VL struct to struct_size")
Signed-off-by: Will McVicker <willmcvicker@google.com>
Link: https://lore.kernel.org/r/20250212183253.509771-1-willmcvicker@google.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 31 21:26:45 2025 -0700
context_tracking: Always inline ct_{nmi,irq}_{enter,exit}()
[ Upstream commit 9ac50f7311dc8b39e355582f14c1e82da47a8196 ]
Thanks to CONFIG_DEBUG_SECTION_MISMATCH, empty functions can be
generated out of line. These can be called from noinstr code, so make
sure they're always inlined.
Fixes the following warnings:
vmlinux.o: warning: objtool: irqentry_nmi_enter+0xa2: call to ct_nmi_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_nmi_exit+0x16: call to ct_nmi_exit() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_exit+0x78: call to ct_irq_exit() leaves .noinstr.text section
Fixes: 6f0e6c1598b1 ("context_tracking: Take IRQ eqs entrypoints over RCU")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/8509bce3f536bcd4ae7af3a2cf6930d48c5e631a.1743481539.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/d1eca076-fdde-484a-b33e-70e0d167c36d@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuanfang Zhang <quic_yuanfang@quicinc.com>
Date: Thu Jan 16 17:04:20 2025 +0800
coresight-etm4x: add isb() before reading the TRCSTATR
[ Upstream commit 4ff6039ffb79a4a8a44b63810a8a2f2b43264856 ]
As recommended by section 4.3.7 ("Synchronization when using system
instructions to program the trace unit") of ARM IHI 0064H.b, the
self-hosted trace analyzer must perform a Context synchronization
event between writing to the TRCPRGCTLR and reading the TRCSTATR.
Additionally, add an ISB between each read of TRCSTATR in
coresight_timeout() when using system instructions to program the
trace unit.
Fixes: 1ab3bb9df5e3 ("coresight: etm4x: Add necessary synchronization for sysreg access")
Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250116-etm_sync-v4-1-39f2b05e9514@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Date: Thu Jan 9 21:53:48 2025 +0000
coresight: catu: Fix number of pages while using 64k pages
[ Upstream commit 0e14e062f5ff98aa15264dfa87c5f5e924028561 ]
Trying to record a trace on kernel with 64k pages resulted in -ENOMEM.
This happens due to a bug in calculating the number of table pages, which
returns zero. Fix the issue by rounding up.
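The effect of truncating versus rounding division can be sketched as follows.
The amount of buffer covered by one table page is passed in as a parameter
because the real value is not stated here; all helper names are hypothetical.

```c
#include <stddef.h>

/* Round-up division, the same idea as the kernel's DIV_ROUND_UP(). */
static size_t div_round_up(size_t n, size_t d)
{
	return (n + d - 1) / d;
}

/* Truncating form: a buffer smaller than one page's coverage yields 0. */
static size_t table_pages_truncating(size_t buf_size, size_t bytes_per_tpage)
{
	return buf_size / bytes_per_tpage;
}

/* Rounded form: any non-empty buffer needs at least one table page. */
static size_t table_pages_rounded(size_t buf_size, size_t bytes_per_tpage)
{
	return div_round_up(buf_size, bytes_per_tpage);
}
```

For example, with an assumed 16 MiB of coverage per table page, a 1 MiB
buffer gives zero pages when truncating and one page when rounding up.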
$ perf record --kcore -e cs_etm/@tmc_etr55,cycacc,branch_broadcast/k --per-thread taskset --cpu-list 1 dd if=/dev/zero of=/dev/null
failed to mmap with 12 (Cannot allocate memory)
Fixes: 8ed536b1e283 ("coresight: catu: Add support for scatter gather tables")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250109215348.5483-1-ilkka@os.amperecomputing.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jie Zhan <zhanjie9@hisilicon.com>
Date: Thu Feb 13 11:55:10 2025 +0800
cpufreq: governor: Fix negative 'idle_time' handling in dbs_update()
[ Upstream commit 3698dd6b139dc37b35a9ad83d9330c1f99666c02 ]
We observed an issue where the CPU frequency can't be raised under a 100%
CPU load when NOHZ is off and the 'conservative' governor is selected.
'idle_time' can be negative if it's obtained from get_cpu_idle_time_jiffy()
when NOHZ is off. This was found and explained in commit 9485e4ca0b48
("cpufreq: governor: Fix handling of special cases in dbs_update()").
However, commit 7592019634f8 ("cpufreq: governors: Fix long idle detection
logic in load calculation") introduced a comparison between 'idle_time' and
'sampling_rate' to detect a long idle interval. While 'idle_time' is
converted to int before comparison, it's actually promoted to unsigned
again when compared with an unsigned 'sampling_rate'. Hence, this leads to
wrong idle interval detection when it's in fact 100% busy and sets
policy_dbs->idle_periods to a very large value. 'conservative' adjusts the
frequency to minimum because of the large 'idle_periods', such that the
frequency can't raise up. 'Ondemand' doesn't use policy_dbs->idle_periods
so it fortunately avoids the issue.
Correct negative 'idle_time' to 0 before any use of it in dbs_update().
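The promotion problem can be reproduced in plain C. This is a sketch only:
the factor of 2 in the comparison is an assumption, and both helpers are
hypothetical. The explicit cast spells out what the implicit conversion does
when an int is compared with an unsigned int.

```c
#include <stdbool.h>

/* Buggy form: the signed value is converted to unsigned for the comparison. */
static bool long_idle_buggy(int idle_time, unsigned int sampling_rate)
{
	/* A negative idle_time becomes a huge unsigned value here. */
	return (unsigned int)idle_time > 2 * sampling_rate;
}

/* Fixed form: clamp negative values to 0 before any use. */
static bool long_idle_fixed(int idle_time, unsigned int sampling_rate)
{
	if (idle_time < 0)
		idle_time = 0;
	return (unsigned int)idle_time > 2 * sampling_rate;
}
```

A slightly negative idle_time is reported as a long idle interval by the
first helper and as no idle time at all by the second.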
Fixes: 7592019634f8 ("cpufreq: governors: Fix long idle detection logic in load calculation")
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Link: https://patch.msgid.link/20250213035510.2402076-1-zhanjie9@hisilicon.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: zuoqian <zuoqian113@gmail.com>
Date: Sat Jan 25 08:49:49 2025 +0000
cpufreq: scpi: compare kHz instead of Hz
[ Upstream commit 4742da9774a416908ef8e3916164192c15c0e2d1 ]
The CPU rate from clk_get_rate() may not be divisible by 1000
(e.g., 133333333). But the rate calculated from frequency(kHz) is
always divisible by 1000 (e.g., 133333000).
Comparing the rate causes a warning during CPU scaling:
"cpufreq: __target_index: Failed to change cpu frequency: -5".
When we choose to compare kHz here, the issue does not occur.
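Using the example values above, the two comparisons can be sketched as
follows; the helper names are hypothetical.

```c
/* Compare in Hz: fails when the clock rate has a sub-kHz remainder. */
static int rates_match_hz(unsigned long clk_rate_hz, unsigned long freq_khz)
{
	return clk_rate_hz == freq_khz * 1000;
}

/* Compare in kHz: the sub-kHz remainder is dropped before comparing. */
static int rates_match_khz(unsigned long clk_rate_hz, unsigned long freq_khz)
{
	return clk_rate_hz / 1000 == freq_khz;
}
```

With a clock rate of 133333333 Hz and a table frequency of 133333 kHz, the
Hz comparison sees 133333333 against 133333000 and fails, while the kHz
comparison sees 133333 on both sides.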
Fixes: 343a8d17fa8d ("cpufreq: scpi: remove arm_big_little dependency")
Signed-off-by: zuoqian <zuoqian113@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aaron Kling <webgeek1234@gmail.com>
Date: Wed Feb 26 12:51:59 2025 -0600
cpufreq: tegra194: Allow building for Tegra234
[ Upstream commit 4a1e3bf61fc78ad100018adb573355303915dca3 ]
Support was added for Tegra234 in the referenced commit, but the Kconfig
was not updated to allow building for the arch.
Fixes: 273bc890a2a8 ("cpufreq: tegra194: Add support for Tegra234")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri Feb 14 10:31:25 2025 +0800
crypto: api - Fix larval relookup type and mask
[ Upstream commit 7505436e2925d89a13706a295a6734d6cabb4b43 ]
When the lookup is retried after instance construction, it uses
the type and mask from the larval, which may not match the values
used by the caller. For example, if the caller is requesting for
a !NEEDS_FALLBACK algorithm, it may end up getting an algorithm
that needs fallbacks.
Fix this by making the caller supply the type/mask and using that
for the lookup.
Reported-by: Coiby Xu <coxu@redhat.com>
Fixes: 96ad59552059 ("crypto: api - Remove instance larval fulfilment")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Mon Feb 17 13:55:55 2025 +0100
crypto: bpf - Add MODULE_DESCRIPTION for skcipher
[ Upstream commit f307c87ea06c64b87fcd3221a682cd713cde51e9 ]
All modules should have a description. Building with extra warnings
enabled prints this for the bpf_crypto_skcipher module:
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/bpf_crypto_skcipher.o
Add a description line.
Fixes: fda4f71282b2 ("bpf: crypto: add skcipher to bpf crypto")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wenkai Lin <linwenkai6@hisilicon.com>
Date: Wed Feb 5 11:56:26 2025 +0800
crypto: hisilicon/sec2 - fix for aead auth key length
[ Upstream commit 1b284ffc30b02808a0de698667cbcf5ce5f9144e ]
According to the HMAC RFC, the authentication key
can be 0 bytes, and the hardware can handle this
scenario. Therefore, remove the incorrect validation
for this case.
Fixes: 2f072d75d1ab ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wenkai Lin <linwenkai6@hisilicon.com>
Date: Wed Feb 5 11:56:27 2025 +0800
crypto: hisilicon/sec2 - fix for aead authsize alignment
[ Upstream commit a49cc71e219040d771a8c1254879984f98192811 ]
The hardware only supports authentication sizes
that are 4-byte aligned. Therefore, the driver
switches to software computation when the
authentication size is not 4-byte aligned.
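A minimal sketch of such an alignment test, with a hypothetical helper name
that is not taken from the driver:

```c
/* Returns 1 when authsize is not a multiple of 4 and software must be used. */
static int needs_soft_fallback(unsigned int authsize)
{
	return (authsize & 3u) != 0;
}
```

A 16-byte tag would stay on the hardware path, whereas a 13-byte tag would
be handled in software.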
Fixes: 2f072d75d1ab ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wenkai Lin <linwenkai6@hisilicon.com>
Date: Wed Feb 5 11:56:28 2025 +0800
crypto: hisilicon/sec2 - fix for sec spec check
[ Upstream commit f4f353cb7ae9bb43e34943edb693532a39118eca ]
During encryption and decryption, user requests
must be checked first. If a request uses
specifications that the hardware does not support,
software computation is used to process it.
Fixes: 2f072d75d1ab ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu Feb 27 18:14:55 2025 +0800
crypto: iaa - Test the correct request flag
[ Upstream commit fc4bd01d9ff592f620c499686245c093440db0e8 ]
Test the correct flags for the MAY_SLEEP bit.
Fixes: 2ec6761df889 ("crypto: iaa - Add support for deflate-iaa compression algorithm")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat Mar 15 16:50:42 2025 +0800
crypto: nx - Fix uninitialised hv_nxc on error
[ Upstream commit 9b00eb923f3e60ca76cbc8b31123716f3a87ac6a ]
The compiler correctly warns that hv_nxc may be used uninitialised
as that will occur when NX-GZIP is unavailable.
Fix it by rearranging the code and delay setting caps_feat until
the final query succeeds.
Fixes: b4ba22114c78 ("crypto/nx: Get NX capabilities for GZIP coprocessor type")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date: Wed Mar 5 00:02:39 2025 +0100
crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD
[ Upstream commit 1e4d73d06c98f5a1af4f7591cf7c2c4eee5b94fa ]
The following build warning has been reported:
arch/powerpc/crypto/ghashp8-ppc.o: warning: objtool: .text+0x22c: unannotated intra-function call
This happens due to commit bb7f054f4de2 ("objtool/powerpc: Add support
for decoding all types of uncond branches")
Disassembly of arch/powerpc/crypto/ghashp8-ppc.o shows:
arch/powerpc/crypto/ghashp8-ppc.o: file format elf64-powerpcle
Disassembly of section .text:
0000000000000140 <gcm_ghash_p8>:
140: f8 ff 00 3c lis r0,-8
...
20c: 20 00 80 4e blr
210: 00 00 00 00 .long 0x0
214: 00 0c 14 00 .long 0x140c00
218: 00 00 04 00 .long 0x40000
21c: 00 00 00 00 .long 0x0
220: 47 48 41 53 rlwimi. r1,r26,9,1,3
224: 48 20 66 6f xoris r6,r27,8264
228: 72 20 50 6f xoris r16,r26,8306
22c: 77 65 72 49 bla 1726574 <gcm_ghash_p8+0x1726434> <==
...
It corresponds to the following code in ghashp8-ppc.S:
_GLOBAL(gcm_ghash_p8)
lis 0,0xfff8
...
blr
.long 0
.byte 0,12,0x14,0,0,0,4,0
.long 0
.size gcm_ghash_p8,.-gcm_ghash_p8
.byte 71,72,65,83,72,32,102,111,114,32,80,111,119,101,114,73,83,65,32,50,46,48,55,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
.align 2
.align 2
In fact this is raw data that is after the function end and that is
not text so shouldn't be disassembled as text. But ghashp8-ppc.S is
generated by a perl script and should have been marked as
OBJECT_FILES_NON_STANDARD.
Now that 'bla' is understood as a call instruction, that raw data
is mis-interpreted as an intra-function call.
Mark ghashp8-ppc.o as OBJECT_FILES_NON_STANDARD to avoid this
warning.
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/8c4c3fc2-2bd7-4148-af68-2f504d6119e0@linux.ibm.com
Fixes: 109303336a0c ("crypto: vmx - Move to arch/powerpc/crypto")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-By: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/7aa7eb73fe6bc95ac210510e22394ca0ae227b69.1741128786.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bairavi Alagappan <bairavix.alagappan@intel.com>
Date: Fri Mar 14 15:09:31 2025 +0000
crypto: qat - remove access to parity register for QAT GEN4
[ Upstream commit 92c6a707d82f0629debf1c21dd87717776d96af2 ]
The firmware already handles parity errors reported by the accelerators
by clearing them through the corresponding SSMSOFTERRORPARITY register.
To ensure consistent behavior and prevent race conditions between the
driver and firmware, remove the logic that checks the SSMSOFTERRORPARITY
registers.
Additionally, change the return type of the function
adf_handle_rf_parr_err() to void, as it consistently returns false.
Parity errors are recoverable and do not necessitate a device reset.
Fixes: 895f7d532c84 ("crypto: qat - add handling of errors from ERRSOU2 for QAT GEN4")
Signed-off-by: Bairavi Alagappan <bairavix.alagappan@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bairavi Alagappan <bairavix.alagappan@intel.com>
Date: Fri Mar 14 13:14:29 2025 +0000
crypto: qat - set parity error mask for qat_420xx
[ Upstream commit f9555d18084985c80a91baa4fdb7d205b401a754 ]
The field parerr_wat_wcp_mask in the structure adf_dev_err_mask enables
the detection and reporting of parity errors for the wireless cipher and
wireless authentication accelerators.
Set the parerr_wat_wcp_mask field, which was inadvertently omitted
during the initial enablement of the qat_420xx driver, to ensure that
parity errors are enabled for those accelerators.
In addition, fix the string used to report such errors that was
inadvertently set to "ath_cph" (authentication and cipher).
Fixes: fcf60f4bcf54 ("crypto: qat - add support for 420xx devices")
Signed-off-by: Bairavi Alagappan <bairavix.alagappan@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:04 2025 +0530
crypto: tegra - check return value for hash do_one_req
[ Upstream commit dcf8b7e49b86738296c77fb58c123dd2d74a22a7 ]
Initialize and check the return value in hash *do_one_req() functions
and exit the function if there is an error. This fixes the
'uninitialized variable' warnings reported by testbots.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202412071747.flPux4oB-lkp@intel.com/
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:07 2025 +0530
crypto: tegra - Fix CMAC intermediate result handling
[ Upstream commit ce390d6c2675d2e24d798169a1a0e3cdbc076907 ]
Saving and restoring the intermediate results is needed whenever a
context switch is caused by another ongoing request on the same engine,
not only to support import/export functionality.
Hence, save and restore the intermediate result for every non-first task.
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:08 2025 +0530
crypto: tegra - Set IV to NULL explicitly for AES ECB
[ Upstream commit bde558220866e74f19450e16d9a2472b488dfedf ]
The variable req->iv may hold stale values or point to a zero-sized
buffer by default, and may end up being used during
encryption/decryption. This in turn may corrupt the results or break the
operation. Set the req->iv variable to NULL explicitly for algorithms
like AES-ECB where the IV is not used.
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:10 2025 +0530
crypto: tegra - Use HMAC fallback when keyslots are full
[ Upstream commit f80a2e2e77bedd0aa645a60f89b4f581c70accda ]
The intermediate results for HMAC are stored in the allocated keyslot by
the hardware. Dynamic allocation of a keyslot during an operation is hence
not possible. As the number of keyslots is limited in the hardware,
fall back to the HMAC software implementation if no keyslot is available.
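The fallback decision can be pictured with a small userspace sketch. It is only an illustration of the idea, not the driver code: the pool size, the type names and both functions are hypothetical, and the assumption is that a keyslot is claimed up front, before any hashing starts.

```c
#include <stdbool.h>

#define NUM_KEYSLOTS 2	/* hypothetical; real hardware has its own limit */

enum hmac_backend { HMAC_HARDWARE, HMAC_SOFTWARE };

struct keyslot_pool {
	bool in_use[NUM_KEYSLOTS];
};

/* Return a free slot index, or -1 when every slot is taken. */
static int keyslot_alloc(struct keyslot_pool *pool)
{
	for (int i = 0; i < NUM_KEYSLOTS; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/*
 * Claim a slot before the operation begins. If none is left, report
 * that the software implementation has to be used instead.
 */
static enum hmac_backend hmac_pick_backend(struct keyslot_pool *pool, int *slot)
{
	*slot = keyslot_alloc(pool);
	return *slot < 0 ? HMAC_SOFTWARE : HMAC_HARDWARE;
}
```

With a pool of two slots, the first two callers get the hardware path and the third is routed to software.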
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:01 2025 +0530
crypto: tegra - Use separate buffer for setkey
[ Upstream commit bcfc8fc53f3acb3213fb9d28675244aa4ce208e0 ]
The buffer which sends the commands to host1x was shared for all tasks
in the engine. This causes a problem with the setkey() function as it
gets called asynchronously to the crypto engine queue. Modifying the same
cmdbuf in setkey() will corrupt the ongoing host1x task and in turn
break the encryption/decryption operation. Hence use a separate cmdbuf
for setkey().
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Fri Feb 28 15:17:19 2025 +0800
dmaengine: fsl-edma: cleanup chan after dma_async_device_unregister
[ Upstream commit c9c59da76ce9cb3f215b66eb3708cda1134a5206 ]
A kernel dump occurs during module testing:
sysfs: cannot create duplicate filename
/devices/platform/soc@0/44000000.bus/44000000.dma-controller/dma/dma0chan0
__dma_async_device_channel_register+0x128/0x19c
dma_async_device_register+0x150/0x454
fsl_edma_probe+0x6cc/0x8a0
platform_probe+0x68/0xc8
fsl_edma_cleanup_vchan will unlink vchan.chan.device_node, while
dma_async_device_unregister needs the link to do
__dma_async_device_channel_unregister. So move fsl_edma_cleanup_vchan
after dma_async_device_unregister to make sure each channel can be freed.
Fixes: 6f93b93b2a1b ("dmaengine: fsl-edma: kill the tasklets upon exit")
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250228071720.3780479-1-peng.fan@oss.nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Fri Feb 28 15:17:20 2025 +0800
dmaengine: fsl-edma: free irq correctly in remove path
[ Upstream commit fa70c4c3c580c239a0f9e83a14770ab026e8d820 ]
Add fsl_edma->txirq/errirq checks to avoid the warning below, because
there is no errirq on the i.MX9 platform. Otherwise there will be a
kernel dump:
WARNING: CPU: 0 PID: 11 at kernel/irq/devres.c:144 devm_free_irq+0x74/0x80
Modules linked in:
CPU: 0 UID: 0 PID: 11 Comm: kworker/u8:0 Not tainted 6.12.0-rc7#18
Hardware name: NXP i.MX93 11X11 EVK board (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : devm_free_irq+0x74/0x80
lr : devm_free_irq+0x48/0x80
Call trace:
devm_free_irq+0x74/0x80 (P)
devm_free_irq+0x48/0x80 (L)
fsl_edma_remove+0xc4/0xc8
platform_remove+0x28/0x44
device_remove+0x4c/0x80
Fixes: 44eb827264de ("dmaengine: fsl-edma: request per-channel IRQ only when channel is allocated")
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250228071720.3780479-2-peng.fan@oss.nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Date: Wed Feb 5 10:06:38 2025 -0300
drm/amd/display: avoid NPD when ASIC does not support DMUB
[ Upstream commit 42d9d7bed270247f134190ba0cb05bbd072f58c2 ]
ctx->dmub_srv will be NULL if the ASIC does not support DMUB, which is
tested in dm_dmub_sw_init.
However, it will be dereferenced in dmub_hw_lock_mgr_cmd if
should_use_dmub_lock returns true.
This has been the case since dmub support has been added for PSR1.
Fix this by checking for dmub_srv in should_use_dmub_lock.
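The shape of such a guard can be sketched in plain C. This is a simplified illustration under stated assumptions, not the actual display code: the two structs and the function are hypothetical stand-ins, and the only point is that the predicate answers "no" before anything dereferences a NULL dmub_srv.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the real structures. */
struct dmub_srv_sketch {
	int placeholder;
};

struct dc_ctx_sketch {
	struct dmub_srv_sketch *dmub_srv;	/* NULL when the ASIC has no DMUB */
};

/*
 * Report whether the DMUB hardware lock should be used. Returning
 * false early means callers never reach code that dereferences a
 * NULL dmub_srv pointer.
 */
static bool should_use_dmub_lock_sketch(const struct dc_ctx_sketch *ctx,
					bool psr_enabled)
{
	if (!ctx->dmub_srv)
		return false;

	return psr_enabled;
}
```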
[ 37.440832] BUG: kernel NULL pointer dereference, address: 0000000000000058
[ 37.447808] #PF: supervisor read access in kernel mode
[ 37.452959] #PF: error_code(0x0000) - not-present page
[ 37.458112] PGD 0 P4D 0
[ 37.460662] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 37.465553] CPU: 2 UID: 1000 PID: 1745 Comm: DrmThread Not tainted 6.14.0-rc1-00003-gd62e938120f0 #23 99720e1cb1e0fc4773b8513150932a07de3c6e88
[ 37.478324] Hardware name: Google Morphius/Morphius, BIOS Google_Morphius.13434.858.0 10/26/2023
[ 37.487103] RIP: 0010:dmub_hw_lock_mgr_cmd+0x77/0xb0
[ 37.492074] Code: 44 24 0e 00 00 00 00 48 c7 04 24 45 00 00 0c 40 88 74 24 0d 0f b6 02 88 44 24 0c 8b 01 89 44 24 08 85 f6 75 05 c6 44 24 0e 01 <48> 8b 7f 58 48 89 e6 ba 01 00 00 00 e8 08 3c 2a 00 65 48 8b 04 5
[ 37.510822] RSP: 0018:ffff969442853300 EFLAGS: 00010202
[ 37.516052] RAX: 0000000000000000 RBX: ffff92db03000000 RCX: ffff969442853358
[ 37.523185] RDX: ffff969442853368 RSI: 0000000000000001 RDI: 0000000000000000
[ 37.530322] RBP: 0000000000000001 R08: 00000000000004a7 R09: 00000000000004a5
[ 37.537453] R10: 0000000000000476 R11: 0000000000000062 R12: ffff92db0ade8000
[ 37.544589] R13: ffff92da01180ae0 R14: ffff92da011802a8 R15: ffff92db03000000
[ 37.551725] FS: 0000784a9cdfc6c0(0000) GS:ffff92db2af00000(0000) knlGS:0000000000000000
[ 37.559814] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.565562] CR2: 0000000000000058 CR3: 0000000112b1c000 CR4: 00000000003506f0
[ 37.572697] Call Trace:
[ 37.575152] <TASK>
[ 37.577258] ? __die_body+0x66/0xb0
[ 37.580756] ? page_fault_oops+0x3e7/0x4a0
[ 37.584861] ? exc_page_fault+0x3e/0xe0
[ 37.588706] ? exc_page_fault+0x5c/0xe0
[ 37.592550] ? asm_exc_page_fault+0x22/0x30
[ 37.596742] ? dmub_hw_lock_mgr_cmd+0x77/0xb0
[ 37.601107] dcn10_cursor_lock+0x1e1/0x240
[ 37.605211] program_cursor_attributes+0x81/0x190
[ 37.609923] commit_planes_for_stream+0x998/0x1ef0
[ 37.614722] update_planes_and_stream_v2+0x41e/0x5c0
[ 37.619703] dc_update_planes_and_stream+0x78/0x140
[ 37.624588] amdgpu_dm_atomic_commit_tail+0x4362/0x49f0
[ 37.629832] ? srso_return_thunk+0x5/0x5f
[ 37.633847] ? mark_held_locks+0x6d/0xd0
[ 37.637774] ? _raw_spin_unlock_irq+0x24/0x50
[ 37.642135] ? srso_return_thunk+0x5/0x5f
[ 37.646148] ? lockdep_hardirqs_on+0x95/0x150
[ 37.650510] ? srso_return_thunk+0x5/0x5f
[ 37.654522] ? _raw_spin_unlock_irq+0x2f/0x50
[ 37.658883] ? srso_return_thunk+0x5/0x5f
[ 37.662897] ? wait_for_common+0x186/0x1c0
[ 37.666998] ? srso_return_thunk+0x5/0x5f
[ 37.671009] ? drm_crtc_next_vblank_start+0xc3/0x170
[ 37.675983] commit_tail+0xf5/0x1c0
[ 37.679478] drm_atomic_helper_commit+0x2a2/0x2b0
[ 37.684186] drm_atomic_commit+0xd6/0x100
[ 37.688199] ? __cfi___drm_printfn_info+0x10/0x10
[ 37.692911] drm_atomic_helper_update_plane+0xe5/0x130
[ 37.698054] drm_mode_cursor_common+0x501/0x670
[ 37.702600] ? __cfi_drm_mode_cursor_ioctl+0x10/0x10
[ 37.707572] drm_mode_cursor_ioctl+0x48/0x70
[ 37.711851] drm_ioctl_kernel+0xf2/0x150
[ 37.715781] drm_ioctl+0x363/0x590
[ 37.719189] ? __cfi_drm_mode_cursor_ioctl+0x10/0x10
[ 37.724165] amdgpu_drm_ioctl+0x41/0x80
[ 37.728013] __se_sys_ioctl+0x7f/0xd0
[ 37.731685] do_syscall_64+0x87/0x100
[ 37.735355] ? vma_end_read+0x12/0xe0
[ 37.739024] ? srso_return_thunk+0x5/0x5f
[ 37.743041] ? find_held_lock+0x47/0xf0
[ 37.746884] ? vma_end_read+0x12/0xe0
[ 37.750552] ? srso_return_thunk+0x5/0x5f
[ 37.754565] ? lock_release+0x1c4/0x2e0
[ 37.758406] ? vma_end_read+0x12/0xe0
[ 37.762079] ? exc_page_fault+0x84/0xe0
[ 37.765921] ? srso_return_thunk+0x5/0x5f
[ 37.769938] ? lockdep_hardirqs_on+0x95/0x150
[ 37.774303] ? srso_return_thunk+0x5/0x5f
[ 37.778317] ? exc_page_fault+0x84/0xe0
[ 37.782163] entry_SYSCALL_64_after_hwframe+0x55/0x5d
[ 37.787218] RIP: 0033:0x784aa5ec3059
[ 37.790803] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1d 48 8b 45 c8 64 48 2b 04 25 28 00 0
[ 37.809553] RSP: 002b:0000784a9cdf90e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 37.817121] RAX: ffffffffffffffda RBX: 0000784a9cdf917c RCX: 0000784aa5ec3059
[ 37.824256] RDX: 0000784a9cdf917c RSI: 00000000c01c64a3 RDI: 0000000000000020
[ 37.831391] RBP: 0000784a9cdf9130 R08: 0000000000000100 R09: 0000000000ff0000
[ 37.838525] R10: 0000000000000000 R11: 0000000000000246 R12: 0000025c01606ed0
[ 37.845657] R13: 0000025c00030200 R14: 00000000c01c64a3 R15: 0000000000000020
[ 37.852799] </TASK>
[ 37.854992] Modules linked in:
[ 37.864546] gsmi: Log Shutdown Reason 0x03
[ 37.868656] CR2: 0000000000000058
[ 37.871979] ---[ end trace 0000000000000000 ]---
[ 37.880976] RIP: 0010:dmub_hw_lock_mgr_cmd+0x77/0xb0
[ 37.885954] Code: 44 24 0e 00 00 00 00 48 c7 04 24 45 00 00 0c 40 88 74 24 0d 0f b6 02 88 44 24 0c 8b 01 89 44 24 08 85 f6 75 05 c6 44 24 0e 01 <48> 8b 7f 58 48 89 e6 ba 01 00 00 00 e8 08 3c 2a 00 65 48 8b 04 5
[ 37.904703] RSP: 0018:ffff969442853300 EFLAGS: 00010202
[ 37.909933] RAX: 0000000000000000 RBX: ffff92db03000000 RCX: ffff969442853358
[ 37.917068] RDX: ffff969442853368 RSI: 0000000000000001 RDI: 0000000000000000
[ 37.924201] RBP: 0000000000000001 R08: 00000000000004a7 R09: 00000000000004a5
[ 37.931336] R10: 0000000000000476 R11: 0000000000000062 R12: ffff92db0ade8000
[ 37.938469] R13: ffff92da01180ae0 R14: ffff92da011802a8 R15: ffff92db03000000
[ 37.945602] FS: 0000784a9cdfc6c0(0000) GS:ffff92db2af00000(0000) knlGS:0000000000000000
[ 37.953689] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.959435] CR2: 0000000000000058 CR3: 0000000112b1c000 CR4: 00000000003506f0
[ 37.966570] Kernel panic - not syncing: Fatal exception
[ 37.971901] Kernel Offset: 0x30200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 37.982840] gsmi: Log Shutdown Reason 0x02
Fixes: b5c764d6ed55 ("drm/amd/display: Use HW lock mgr for PSR1")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Cc: Sun peng Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aurabindo Pillai <aurabindo.pillai@amd.com>
Date: Fri Feb 21 09:45:12 2025 -0500
drm/amd/display: fix an indent issue in DML21
[ Upstream commit a1addcf8499a566496847f1e36e1cf0b4ad72a26 ]
Remove an extraneous tab and newline in dml2_core_dcn4.c that were
reported by the bot.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502211920.txUfwtSj-lkp@intel.com/
Fixes: 70839da6360 ("drm/amd/display: Add new DCN401 sources")
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru>
Date: Thu Feb 27 01:28:51 2025 +0500
drm/amd/display: fix type mismatch in CalculateDynamicMetadataParameters()
[ Upstream commit c3c584c18c90a024a54716229809ba36424f9660 ]
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently the function declares several
arguments as signed long, but it is called with unsigned int and int
values. On systems where long is 32 bits, an unsigned int argument that is
greater than INT_MAX may be passed as a negative value.
Fix this by changing the types of these arguments from long to unsigned
int and int, respectively. This also aligns the function's definition with
similar functions in other dcn* drivers.
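The effect can be reproduced on any host with a small sketch. It is an illustration only: int32_t is used as a stand-in for long on a 32-bit target, the function names are hypothetical, and the wrap-around of the out-of-range conversion is implementation-defined (GCC and Clang wrap modulo 2^32).

```c
#include <stdint.h>

/* Assumption: int32_t models "long" on a target where long is 32 bits. */
typedef int32_t long32;

/* Old-style parameter: a signed 32-bit long. Returns the value seen. */
static int64_t value_seen_old(long32 v)
{
	return v;
}

/* New-style parameter: unsigned int, so large values stay positive. */
static int64_t value_seen_new(uint32_t v)
{
	return v;
}
```

Passing 3000000000u through the old-style parameter yields a negative number on such compilers, while the new-style parameter preserves it.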
Found by Linux Verification Center (linuxtesting.org) with Svace.
Fixes: 6725a88f88a7 ("drm/amd/display: Add DCN3 DML")
Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mario Limonciello <mario.limonciello@amd.com>
Date: Thu Mar 6 12:51:24 2025 -0600
drm/amd: Keep display off while going into S4
[ Upstream commit 4afacc9948e1f8fdbca401d259ae65ad93d298c0 ]
When userspace invokes S4 the flow is:
1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system
Then on resume amdgpu_pmops_restore() is called.
The problem with this flow is that amdgpu_pmops_thaw() calls
amdgpu_device_resume(), which resumes all of the GPU.
This includes turning the display hardware back on and discovering
connectors again.
Having the display turn back on at this point is unexpected.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.
Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bfdc8dc0a1a7fdd9ab61e69907ae71a6fd3d91)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Wed Mar 26 09:35:02 2025 -0400
drm/amdgpu/gfx11: fix num_mec
[ Upstream commit 4161050d47e1b083a7e1b0b875c9907e1a6f1f1f ]
GC11 only has 1 mec.
Fixes: 3d879e81f0f9 ("drm/amdgpu: add init support for GFX11 (v2)")
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Thu Mar 20 12:09:11 2025 -0400
drm/amdgpu/gfx12: fix num_mec
[ Upstream commit dce8bd9137b88735dd0efc4e2693213d98c15913 ]
GC12 only has 1 mec.
Fixes: 52cb80c12e8a ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Wed Feb 12 16:31:43 2025 -0500
drm/amdgpu/umsch: fix ucode check
[ Upstream commit c917e39cbdcd9fff421184db6cc461cc58d52c17 ]
Return an error if the IP version doesn't match; otherwise
we end up passing a NULL string to amdgpu_ucode_request.
We should never hit this in practice today since we only
enable the umsch code on the supported IP versions, but
add a check to be safe.
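A minimal sketch of this kind of defensive check follows. Everything in it is hypothetical (the version constant, the firmware file name and the function itself); it only shows the pattern of returning an error from the default branch instead of leaving the name pointer NULL for a later call.

```c
#include <errno.h>
#include <stddef.h>

#define EXAMPLE_IP_VERSION_4_0_0 0x040000u	/* hypothetical encoding */

/*
 * Select a firmware name for a known IP version. Unknown versions are
 * rejected here so that no caller goes on to request a NULL name.
 */
static int pick_fw_name_sketch(unsigned int ip_version, const char **fw_name)
{
	switch (ip_version) {
	case EXAMPLE_IP_VERSION_4_0_0:
		*fw_name = "example_fw_4_0_0.bin";	/* hypothetical file name */
		return 0;
	default:
		*fw_name = NULL;
		return -EINVAL;
	}
}
```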
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502130406.iWQ0eBug-lkp@intel.com/
Fixes: 020620424b27 ("drm/amd: Use a constant format string for amdgpu_ucode_request")
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yang Wang <kevinyang.wang@amd.com>
Date: Wed Feb 5 15:46:42 2025 +0800
drm/amdgpu: refine smu send msg debug log format
[ Upstream commit 8c6631234557515a7567c6251505a98e9793c8a6 ]
Remove unnecessary line breaks.
[ 51.280860] amdgpu 0000:24:00.0: amdgpu: smu send message: GetEnabledSmuFeaturesHigh(13) param: 0x00000000, resp: 0x00000001, readval: 0x00003763
Fixes: 0cd2bc06de72 ("drm/amd/pm: enable amdgpu smu send message log")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date: Mon Feb 24 13:46:32 2025 +0530
drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'
[ Upstream commit fddc45026311c05a5355fd34b9dc0a1d7eaef4a2 ]
This commit addresses a circular locking dependency in the
svm_range_cpu_invalidate_pagetables function. The function previously
held a lock while determining whether to perform an unmap or eviction
operation, which could lead to deadlocks.
Fixes the below:
[ 223.418794] ======================================================
[ 223.418820] WARNING: possible circular locking dependency detected
[ 223.418845] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE
[ 223.418869] ------------------------------------------------------
[ 223.418889] kfdtest/3939 is trying to acquire lock:
[ 223.418906] ffff8957552eae38 (&dqm->lock_hidden){+.+.}-{3:3}, at: evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.419302]
but task is already holding lock:
[ 223.419303] ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu]
[ 223.419447] Console: switching to colour dummy device 80x25
[ 223.419477] [IGT] amd_basic: executing
[ 223.419599]
which lock already depends on the new lock.
[ 223.419611]
the existing dependency chain (in reverse order) is:
[ 223.419621]
-> #2 (&prange->lock){+.+.}-{3:3}:
[ 223.419636] __mutex_lock+0x85/0xe20
[ 223.419647] mutex_lock_nested+0x1b/0x30
[ 223.419656] svm_range_validate_and_map+0x2f1/0x15b0 [amdgpu]
[ 223.419954] svm_range_set_attr+0xe8c/0x1710 [amdgpu]
[ 223.420236] svm_ioctl+0x46/0x50 [amdgpu]
[ 223.420503] kfd_ioctl_svm+0x50/0x90 [amdgpu]
[ 223.420763] kfd_ioctl+0x409/0x6d0 [amdgpu]
[ 223.421024] __x64_sys_ioctl+0x95/0xd0
[ 223.421036] x64_sys_call+0x1205/0x20d0
[ 223.421047] do_syscall_64+0x87/0x140
[ 223.421056] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 223.421068]
-> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 223.421084] __ww_mutex_lock.constprop.0+0xab/0x1560
[ 223.421095] ww_mutex_lock+0x2b/0x90
[ 223.421103] amdgpu_amdkfd_alloc_gtt_mem+0xcc/0x2b0 [amdgpu]
[ 223.421361] add_queue_mes+0x3bc/0x440 [amdgpu]
[ 223.421623] unhalt_cpsch+0x1ae/0x240 [amdgpu]
[ 223.421888] kgd2kfd_start_sched+0x5e/0xd0 [amdgpu]
[ 223.422148] amdgpu_amdkfd_start_sched+0x3d/0x50 [amdgpu]
[ 223.422414] amdgpu_gfx_enforce_isolation_handler+0x132/0x270 [amdgpu]
[ 223.422662] process_one_work+0x21e/0x680
[ 223.422673] worker_thread+0x190/0x330
[ 223.422682] kthread+0xe7/0x120
[ 223.422690] ret_from_fork+0x3c/0x60
[ 223.422699] ret_from_fork_asm+0x1a/0x30
[ 223.422708]
-> #0 (&dqm->lock_hidden){+.+.}-{3:3}:
[ 223.422723] __lock_acquire+0x16f4/0x2810
[ 223.422734] lock_acquire+0xd1/0x300
[ 223.422742] __mutex_lock+0x85/0xe20
[ 223.422751] mutex_lock_nested+0x1b/0x30
[ 223.422760] evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.423025] kfd_process_evict_queues+0x8a/0x1d0 [amdgpu]
[ 223.423285] kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu]
[ 223.423540] svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu]
[ 223.423807] __mmu_notifier_invalidate_range_start+0x1f5/0x250
[ 223.423819] copy_page_range+0x1e94/0x1ea0
[ 223.423829] copy_process+0x172f/0x2ad0
[ 223.423839] kernel_clone+0x9c/0x3f0
[ 223.423847] __do_sys_clone+0x66/0x90
[ 223.423856] __x64_sys_clone+0x25/0x30
[ 223.423864] x64_sys_call+0x1d7c/0x20d0
[ 223.423872] do_syscall_64+0x87/0x140
[ 223.423880] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 223.423891]
other info that might help us debug this:
[ 223.423903] Chain exists of:
&dqm->lock_hidden --> reservation_ww_class_mutex --> &prange->lock
[ 223.423926] Possible unsafe locking scenario:
[ 223.423935]        CPU0                    CPU1
[ 223.423942]        ----                    ----
[ 223.423949]   lock(&prange->lock);
[ 223.423958]                                lock(reservation_ww_class_mutex);
[ 223.423970]                                lock(&prange->lock);
[ 223.423981]   lock(&dqm->lock_hidden);
[ 223.423990]
*** DEADLOCK ***
[ 223.423999] 5 locks held by kfdtest/3939:
[ 223.424006] #0: ffffffffb82b4fc0 (dup_mmap_sem){.+.+}-{0:0}, at: copy_process+0x1387/0x2ad0
[ 223.424026] #1: ffff89575eda81b0 (&mm->mmap_lock){++++}-{3:3}, at: copy_process+0x13a8/0x2ad0
[ 223.424046] #2: ffff89575edaf3b0 (&mm->mmap_lock/1){+.+.}-{3:3}, at: copy_process+0x13e4/0x2ad0
[ 223.424066] #3: ffffffffb82e76e0 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: copy_page_range+0x1cea/0x1ea0
[ 223.424088] #4: ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu]
[ 223.424365]
stack backtrace:
[ 223.424374] CPU: 0 UID: 0 PID: 3939 Comm: kfdtest Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14
[ 223.424392] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 223.424401] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO WIFI/X570 AORUS PRO WIFI, BIOS F36a 02/16/2022
[ 223.424416] Call Trace:
[ 223.424423] <TASK>
[ 223.424430] dump_stack_lvl+0x9b/0xf0
[ 223.424441] dump_stack+0x10/0x20
[ 223.424449] print_circular_bug+0x275/0x350
[ 223.424460] check_noncircular+0x157/0x170
[ 223.424469] ? __bfs+0xfd/0x2c0
[ 223.424481] __lock_acquire+0x16f4/0x2810
[ 223.424490] ? srso_return_thunk+0x5/0x5f
[ 223.424505] lock_acquire+0xd1/0x300
[ 223.424514] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.424783] __mutex_lock+0x85/0xe20
[ 223.424792] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.425058] ? srso_return_thunk+0x5/0x5f
[ 223.425067] ? mark_held_locks+0x54/0x90
[ 223.425076] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.425339] ? srso_return_thunk+0x5/0x5f
[ 223.425350] mutex_lock_nested+0x1b/0x30
[ 223.425358] ? mutex_lock_nested+0x1b/0x30
[ 223.425367] evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.425631] kfd_process_evict_queues+0x8a/0x1d0 [amdgpu]
[ 223.425893] kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu]
[ 223.426156] svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu]
[ 223.426423] ? srso_return_thunk+0x5/0x5f
[ 223.426436] __mmu_notifier_invalidate_range_start+0x1f5/0x250
[ 223.426450] copy_page_range+0x1e94/0x1ea0
[ 223.426461] ? srso_return_thunk+0x5/0x5f
[ 223.426474] ? srso_return_thunk+0x5/0x5f
[ 223.426484] ? lock_acquire+0xd1/0x300
[ 223.426494] ? copy_process+0x1718/0x2ad0
[ 223.426502] ? srso_return_thunk+0x5/0x5f
[ 223.426510] ? sched_clock_noinstr+0x9/0x10
[ 223.426519] ? local_clock_noinstr+0xe/0xc0
[ 223.426528] ? copy_process+0x1718/0x2ad0
[ 223.426537] ? srso_return_thunk+0x5/0x5f
[ 223.426550] copy_process+0x172f/0x2ad0
[ 223.426569] kernel_clone+0x9c/0x3f0
[ 223.426577] ? __schedule+0x4c9/0x1b00
[ 223.426586] ? srso_return_thunk+0x5/0x5f
[ 223.426594] ? sched_clock_noinstr+0x9/0x10
[ 223.426602] ? srso_return_thunk+0x5/0x5f
[ 223.426610] ? local_clock_noinstr+0xe/0xc0
[ 223.426619] ? schedule+0x107/0x1a0
[ 223.426629] __do_sys_clone+0x66/0x90
[ 223.426643] __x64_sys_clone+0x25/0x30
[ 223.426652] x64_sys_call+0x1d7c/0x20d0
[ 223.426661] do_syscall_64+0x87/0x140
[ 223.426671] ? srso_return_thunk+0x5/0x5f
[ 223.426679] ? common_nsleep+0x44/0x50
[ 223.426690] ? srso_return_thunk+0x5/0x5f
[ 223.426698] ? trace_hardirqs_off+0x52/0xd0
[ 223.426709] ? srso_return_thunk+0x5/0x5f
[ 223.426717] ? syscall_exit_to_user_mode+0xcc/0x200
[ 223.426727] ? srso_return_thunk+0x5/0x5f
[ 223.426736] ? do_syscall_64+0x93/0x140
[ 223.426748] ? srso_return_thunk+0x5/0x5f
[ 223.426756] ? up_write+0x1c/0x1e0
[ 223.426765] ? srso_return_thunk+0x5/0x5f
[ 223.426775] ? srso_return_thunk+0x5/0x5f
[ 223.426783] ? trace_hardirqs_off+0x52/0xd0
[ 223.426792] ? srso_return_thunk+0x5/0x5f
[ 223.426800] ? syscall_exit_to_user_mode+0xcc/0x200
[ 223.426810] ? srso_return_thunk+0x5/0x5f
[ 223.426818] ? do_syscall_64+0x93/0x140
[ 223.426826] ? syscall_exit_to_user_mode+0xcc/0x200
[ 223.426836] ? srso_return_thunk+0x5/0x5f
[ 223.426844] ? do_syscall_64+0x93/0x140
[ 223.426853] ? srso_return_thunk+0x5/0x5f
[ 223.426861] ? irqentry_exit+0x6b/0x90
[ 223.426869] ? srso_return_thunk+0x5/0x5f
[ 223.426877] ? exc_page_fault+0xa7/0x2c0
[ 223.426888] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 223.426898] RIP: 0033:0x7f46758eab57
[ 223.426906] Code: ba 04 00 f3 0f 1e fa 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 41 89 c0 85 c0 75 2c 64 48 8b 04 25 10 00
[ 223.426930] RSP: 002b:00007fff5c3e5188 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 223.426943] RAX: ffffffffffffffda RBX: 00007f4675f8c040 RCX: 00007f46758eab57
[ 223.426954] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 223.426965] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 223.426975] R10: 00007f4675e81a50 R11: 0000000000000246 R12: 0000000000000001
[ 223.426986] R13: 00007fff5c3e5470 R14: 00007fff5c3e53e0 R15: 00007fff5c3e5410
[ 223.427004] </TASK>
v2: To resolve this issue, the allocation of the process context buffer
(`proc_ctx_bo`) has been moved from the `add_queue_mes` function to the
`pqm_create_queue` function. This change ensures that the buffer is
allocated only when the first queue for a process is created and only if
the Micro Engine Scheduler (MES) is enabled. (Felix)
v3: Fix typo s/Memory Execution Scheduler (MES)/Micro Engine Scheduler
in commit message. (Lijo)
Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd")
Cc: Jesse Zhang <jesse.zhang@amd.com>
Cc: Yunxiang Li <Yunxiang.Li@amd.com>
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hermes Wu <Hermes.wu@ite.com.tw>
Date: Tue Jan 21 15:01:51 2025 +0800
drm/bridge: it6505: fix HDCP V match check is not performed correctly
[ Upstream commit a5072fc77fb9e38fa9fd883642c83c3720049159 ]
Fix a typo where the V compare incorrectly compares av[] with av[]
itself, which can result in HDCP failure.
The V compare loop is expected to iterate 5 times, comparing the V
array from av[0][] to av[4][].
It should check that the loop counter has reached the end ("i == 5")
before returning true.
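The intended loop shape can be sketched as follows. This is an illustration under assumptions, not the driver function: the array dimensions (5 words of 4 bytes), the name bv for the second operand and the function name are all chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define V_WORDS    5	/* av[0][] .. av[4][] */
#define V_WORD_LEN 4	/* assumed bytes per word */

/*
 * Compare two V arrays word by word. The loop stops at the first
 * mismatch, so the arrays match only if the counter ran to the end.
 */
static bool v_matches_sketch(const uint8_t av[V_WORDS][V_WORD_LEN],
			     const uint8_t bv[V_WORDS][V_WORD_LEN])
{
	int i;

	for (i = 0; i < V_WORDS; i++) {
		/* Compare av against bv, never av against itself. */
		if (memcmp(av[i], bv[i], V_WORD_LEN) != 0)
			break;
	}

	return i == V_WORDS;
}
```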
Fixes: 0989c02c7a5c ("drm/bridge: it6505: fix HDCP CTS compare V matching")
Signed-off-by: Hermes Wu <Hermes.wu@ite.com.tw>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Robert Foss <rfoss@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250121-fix-hdcp-v-comp-v4-1-185f45c728dc@ite.com.tw
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Tue Dec 10 15:18:46 2024 +0100
drm/bridge: ti-sn65dsi86: Fix multiple instances
[ Upstream commit 574f5ee2c85a00a579549d50e9fc9c6c072ee4c4 ]
Each bridge instance creates up to four auxiliary devices with different
names. However, their IDs are always zero, causing duplicate filename
errors when a system has multiple bridges:
sysfs: cannot create duplicate filename '/bus/auxiliary/devices/ti_sn65dsi86.gpio.0'
Fix this by using a unique instance ID per bridge instance. The
instance ID is derived from the I2C adapter number and the bridge's I2C
address, to support multiple instances on the same bus.
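One way to derive such an ID is to pack both numbers into a single integer. The sketch below is an assumption-laden illustration rather than the driver's code: the function name is hypothetical, and the 10-bit field width is chosen here only because it is wide enough for a 10-bit I2C address.

```c
#include <stdint.h>

/*
 * Build an ID that differs whenever either the adapter number or the
 * I2C address differs. The low 10 bits hold the address, the bits
 * above them hold the adapter number (assumed layout).
 */
static uint32_t bridge_instance_id_sketch(unsigned int adapter_nr,
					  uint16_t i2c_addr)
{
	return ((uint32_t)adapter_nr << 10) | (i2c_addr & 0x3ffu);
}
```

Two bridges at address 0x2c on adapters 1 and 2 then get IDs 0x42c and 0x82c, and two bridges on the same adapter differ in the low bits.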
Fixes: bf73537f411b ("drm/bridge: ti-sn65dsi86: Break GPIO and MIPI-to-eDP bridge into sub-drivers")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/7a68a0e3f927e26edca6040067fb653eb06efb79.1733840089.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wayne Lin <Wayne.Lin@amd.com>
Date: Mon Jan 13 17:10:59 2025 +0800
drm/dp_mst: Fix drm RAD print
[ Upstream commit 6bbce873a9c97cb12f5455c497be279ac58e707f ]
[Why]
The RAD of sideband message printed today is incorrect.
For RAD stored within MST branch
- If MST branch LCT is 1, its RAD array is untouched and remains 0.
- If MST branch LCT is larger than 1, use nibble to store the up facing
port number in cascaded sequence as illustrated below:
u8 RAD[0] = (LCT_2_UFP << 4) | LCT_3_UFP
RAD[1] = (LCT_4_UFP << 4) | LCT_5_UFP
...
drm_dp_mst_rad_to_str() wrongly uses BIT_MASK(4) to fetch the port
number of one nibble.
[How]
Adjust the code by:
- RAD array items are valuable only for LCT >= 1.
- Use 0xF as the mask to replace BIT_MASK(4)
V2:
- Document how RAD is constructed (Imre)
V3:
- Adjust the comment for rad[] so kdoc formats it properly (Lyude)
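Why BIT_MASK(4) is the wrong mask can be checked with a short userspace sketch. The macro below mirrors the kernel's definition as an assumption for this example, and the helper names are hypothetical. BIT_MASK(4) selects the single bit 0x10, whereas a nibble needs the four-bit mask 0xF.

```c
#include <limits.h>
#include <stdint.h>

/* Local copies modelled on the kernel macros (assumed equivalent). */
#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)
#define BIT_MASK(nr)  (1UL << ((nr) % BITS_PER_LONG))

/* Buggy extraction: keeps only bit 4 instead of the low nibble. */
static uint8_t low_nibble_buggy(uint8_t rad_byte)
{
	return rad_byte & BIT_MASK(4);
}

/* Corrected extraction of the low nibble. */
static uint8_t low_nibble_fixed(uint8_t rad_byte)
{
	return rad_byte & 0xF;
}

/* The high nibble is obtained with a plain shift. */
static uint8_t high_nibble(uint8_t rad_byte)
{
	return rad_byte >> 4;
}
```

For a RAD byte of 0x23 (ports 2 and 3), the buggy mask yields 0 for the low nibble, while the 0xF mask yields 3.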
Fixes: 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing + selftests")
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250113091100.3314533-2-Wayne.Lin@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Douglas Anderson <dianders@chromium.org>
Date: Thu Jan 16 09:42:50 2025 -0800
drm/mediatek: dp: drm_err => dev_err in HPD path to avoid NULL ptr
[ Upstream commit 106a6de46cf4887d535018185ec528ce822d6d84 ]
The function mtk_dp_wait_hpd_asserted() may be called before the
`mtk_dp->drm_dev` pointer is assigned in mtk_dp_bridge_attach().
Specifically it can be called via this callpath:
- mtk_edp_wait_hpd_asserted
- [panel probe]
- dp_aux_ep_probe
Using "drm" level prints anywhere in this callpath causes a NULL
pointer dereference. Change the error message directly in
mtk_dp_wait_hpd_asserted() to dev_err() to avoid this. Also change the
error messages in mtk_dp_parse_capabilities(), which is called by
mtk_dp_wait_hpd_asserted().
While touching these prints, also add the error code to them to make
future debugging easier.
Fixes: 7eacba9a083b ("drm/mediatek: dp: Add .wait_hpd_asserted() for AUX bus")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20250116094249.1.I29b0b621abb613ddc70ab4996426a3909e1aa75f@changeid/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Wed Jan 8 12:35:57 2025 +0300
drm/mediatek: dsi: fix error codes in mtk_dsi_host_transfer()
[ Upstream commit dcb166ee43c3d594e7b73a24f6e8cf5663eeff2c ]
There is a type bug in the return statement:
return ret < 0 ? ret : recv_cnt;
The issue is that ret is an int, recv_cnt is a u32 and the function
returns ssize_t, which is a signed long. The way that the type promotion
works is that the negative error codes are first cast to u32 and then
to signed long. The error codes end up being positive instead of
negative and the callers treat them as success.
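The promotion can be reproduced outside the kernel. The sketch below is
an illustration under stated assumptions: int64_t stands in for a 64-bit
ssize_t, int is assumed to be 32 bits wide, and transfer_buggy() and
transfer_fixed() are hypothetical names. Splitting the ternary into two
return statements is one way to keep the error code negative.

```c
#include <assert.h>
#include <stdint.h>

/* The demonstration assumes a 32-bit int, as on all Linux targets. */
static_assert(sizeof(int) == sizeof(uint32_t), "int must be 32 bits");

/* int64_t stands in for a 64-bit ssize_t (assumption for portability). */
static int64_t transfer_buggy(int ret, uint32_t recv_cnt)
{
	/*
	 * Both arms of ?: are brought to a common type first. int and
	 * uint32_t meet at uint32_t, so a negative ret becomes a large
	 * unsigned value before it is widened to the return type.
	 */
	return ret < 0 ? ret : recv_cnt;
}

static int64_t transfer_fixed(int ret, uint32_t recv_cnt)
{
	if (ret < 0)
		return ret;	/* sign-extended, stays negative */

	return recv_cnt;
}
```

With ret == -22 (-EINVAL), transfer_buggy() returns 4294967274, which a
caller testing for "< 0" treats as success; transfer_fixed() returns -22.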
Fixes: 81cc7e51c4f1 ("drm/mediatek: Allow commands to be sent during video mode")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202412210801.iADw0oIH-lkp@intel.com/
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/b754a408-4f39-4e37-b52d-7706c132e27f@stanley.mountain/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason-JH Lin <jason-jh.lin@mediatek.com>
Date: Mon Feb 24 13:12:21 2025 +0800
drm/mediatek: Fix config_updating flag never false when no mbox channel
[ Upstream commit 4ba973c8bad04d59fd4efa62512f4d9cee131714 ]
When CONFIG_MTK_CMDQ is enabled, if the display is controlled by the CPU
while other hardware is controlled by the GCE, the display will encounter
a mbox request channel failure.
However, it will still enter the CONFIG_MTK_CMDQ statement, causing the
config_updating flag to never be set to false. As a result, no page flip
event is sent back to user space, and the screen does not update.
Fixes: da03801ad08f ("drm/mediatek: Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb()")
Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20250224051301.3538484-1-jason-jh.lin@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Date: Mon Feb 17 16:48:12 2025 +0100
drm/mediatek: mtk_hdmi: Fix typo for aud_sampe_size member
[ Upstream commit 72fcb88e7bbc053ed4fc74cebb0315b98a0f20c3 ]
Rename member aud_sampe_size of struct hdmi_audio_param to
aud_sample_size to fix a typo and enhance readability.
This commit brings no functional changes.
Fixes: 8f83f26891e1 ("drm/mediatek: Add HDMI support")
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250217154836.108895-20-angelogioacchino.delregno@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Date: Mon Feb 17 16:48:10 2025 +0100
drm/mediatek: mtk_hdmi: Unregister audio platform device on failure
[ Upstream commit 0be123cafc06eed0fd1227166a66e786434b0c50 ]
The probe function of this driver may fail after registering the
audio platform device: in that case, the state is not getting
cleaned up, leaving this device registered.
To make matters worse, should the probe function of this driver
return a probe deferral for N times, we're registering up to N
audio platform devices and, again, never freeing them up.
To fix this, add a pointer to the audio platform device in the
mtk_hdmi structure, and add a devm action to unregister it upon
driver removal or probe failure.
Fixes: 8f83f26891e1 ("drm/mediatek: Add HDMI support")
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250217154836.108895-18-angelogioacchino.delregno@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rob Clark <robdclark@chromium.org>
Date: Fri Feb 28 13:31:24 2025 -0800
drm/msm/a6xx: Fix a6xx indexed-regs in devcoreduump
[ Upstream commit 06dd5d86c6aef1c7609ca3a5ffa4097e475e2213 ]
Somehow, possibly as a result of a rebase gone badly, the code path for
pre-a650 a6xx devices lost the setting of nr_indexed_regs, resulting in
values being snapshotted but omitted from the devcoredump.
Fixes: e997ae5f45ca ("drm/msm/a6xx: Mostly implement A7xx gpu_state")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/640289/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <lumag@kernel.org>
Date: Thu Jan 23 14:43:33 2025 +0200
drm/msm/dpu: don't use active in atomic_check()
[ Upstream commit 25b4614843bcc56ba150f7c99905125a019e656c ]
The driver isn't supposed to consult crtc_state->active/active_changed for
resource allocation. Instead all resources should be allocated if
crtc_state->enabled is set. Stop consulting active / active_changed in
order to determine whether the hardware resources should be
(re)allocated.
Fixes: ccc862b957c6 ("drm/msm/dpu: Fix reservation failures in modeset")
Reported-by: Simona Vetter <simona.vetter@ffwll.ch>
Closes: https://lore.kernel.org/dri-devel/ZtW_S0j5AEr4g0QW@phenom.ffwll.local/
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/633393/
Link: https://lore.kernel.org/r/20250123-drm-dirty-modeset-v2-1-bbfd3a6cd1a4@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date: Wed Jan 29 12:55:04 2025 +0100
drm/msm/dsi/phy: Program clock inverters in correct register
[ Upstream commit baf49072877726616c7f5943a6b45eb86bfeca0a ]
Since SM8250 all downstream sources program clock inverters in
PLL_CLOCK_INVERTERS_1 register and leave the PLL_CLOCK_INVERTERS as
reset value (0x0). The most recent Hardware Programming Guide for 3 nm,
4 nm, 5 nm and 7 nm PHYs also mentions PLL_CLOCK_INVERTERS_1.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Fixes: 1ef7c99d145c ("drm/msm/dsi: add support for 7nm DSI PHY/PLL")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reported-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/634489/
Link: https://lore.kernel.org/r/20250129115504.40080-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marijn Suijten <marijn.suijten@somainline.org>
Date: Mon Feb 17 12:17:42 2025 +0100
drm/msm/dsi: Set PHY usescase (and mode) before registering DSI host
[ Upstream commit 660c396c98c061f9696bebacc178b74072e80054 ]
Ordering issues here cause an uninitialized (default STANDALONE)
usecase to be programmed (which appears to be a MUX) in some cases
when msm_dsi_host_register() is called, causing the slave PLL in
bonded-DSI mode to source from a clock parent (dsi1vco) that is off.
This should seemingly not be a problem, as the actual dispcc clocks from
DSI1 that are muxed in the clock tree of DSI0 are way further down, but
this bit still seems to have an effect on them somehow and causes the
right side of the panel controlled by DSI1 to not function.
In an ideal world this code is refactored to no longer have such
error-prone calls "across subsystems", and instead model the "PLL src"
register field as a regular mux so that changing the clock parents
programmatically or in DTS via `assigned-clock-parents` has the
desired effect.
But for the avid reader, the clocks that we *are* muxing into DSI0's
tree are way further down, so if this bit turns out to be a simple mux
between dsiXvco and out_div, that shouldn't have any effect as this
whole tree is off anyway.
Fixes: 57bf43389337 ("drm/msm/dsi: Pass down use case to PHY")
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Patchwork: https://patchwork.freedesktop.org/patch/637650/
Link: https://lore.kernel.org/r/20250217-drm-msm-initial-dualpipe-dsc-fixes-v3-2-913100d6103f@somainline.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marijn Suijten <marijn.suijten@somainline.org>
Date: Mon Feb 17 12:17:41 2025 +0100
drm/msm/dsi: Use existing per-interface slice count in DSC timing
[ Upstream commit 14ad809ceb66d0874cbe4bd5ca9edf0de8d9ad96 ]
When configuring the timing of DSI hosts (interfaces) in
dsi_timing_setup() all values written to registers are taking
bonded-mode into account by dividing the original mode width by 2
(half the data is sent over each of the two DSI hosts), but the full
width instead of the interface width is passed as hdisplay parameter to
dsi_update_dsc_timing().
Currently only msm_dsc_get_slices_per_intf() is called within
dsi_update_dsc_timing() with the `hdisplay` argument which clearly
documents that it wants the width of a single interface (which, again,
in bonded DSI mode is half the total width of the mode) resulting in all
subsequent values to be completely off.
However, as soon as we start to pass the halved hdisplay
into dsi_update_dsc_timing() we might as well discard
msm_dsc_get_slices_per_intf() since the value it calculates is already
available in dsc->slice_count which is per-interface by the current
design of MSM DPU/DSI implementations and their use of the DRM DSC
helpers.
Fixes: 08802f515c3c ("drm/msm/dsi: Add support for DSC configuration")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Patchwork: https://patchwork.freedesktop.org/patch/637648/
Link: https://lore.kernel.org/r/20250217-drm-msm-initial-dualpipe-dsc-fixes-v3-1-913100d6103f@somainline.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Keeping <jkeeping@inmusicbrands.com>
Date: Mon Feb 17 12:04:28 2025 +0000
drm/panel: ilitek-ili9882t: fix GPIO name in error message
[ Upstream commit 4ce2c7e201c265df1c62a9190a98a98803208b8f ]
This driver uses the enable-gpios property and it is confusing that the
error message refers to reset-gpios. Use the correct name when the
enable GPIO is not found.
Fixes: e2450d32e5fb5 ("drm/panel: ili9882t: Break out as separate driver")
Signed-off-by: John Keeping <jkeeping@inmusicbrands.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250217120428.3779197-1-jkeeping@inmusicbrands.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ashley Smith <ashley.smith@collabora.com>
Date: Mon Mar 3 18:04:32 2025 +0000
drm/panthor: Update CS_STATUS_ defines to correct values
[ Upstream commit c82734fbdc50dc9e568e8686622eaa4498acb81e ]
Values for CS_STATUS_BLOCKED_REASON_ are documented in the G610 "Odin"
GPU specification (CS_STATUS_BLOCKED_REASON register).
This change updates the defines to the correct values.
Fixes: 2718d91816ee ("drm/panthor: Add the FW logical block")
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250303180444.3768993-1-ashley.smith@collabora.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Keeping <jkeeping@inmusicbrands.com>
Date: Wed Jan 15 11:01:38 2025 +0000
drm/ssd130x: ensure ssd132x pitch is correct
[ Upstream commit 229adcffdb54b13332d2afd2dc5d203418d50908 ]
The bounding rectangle is adjusted to ensure it aligns to
SSD132X_SEGMENT_WIDTH, which may adjust the pitch. Calculate the pitch
after aligning the left and right edge.
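The ordering issue can be sketched as below. This is a simplified model,
not the driver code: struct rect_model, aligned_pitch() and the segment
width of 2 pixels are assumptions made for illustration, and the pitch
is expressed in pixels of the clip rectangle.

```c
#include <assert.h>
#include <stddef.h>

#define SEGMENT_WIDTH 2u	/* assumed value for illustration */

struct rect_model {		/* hypothetical stand-in for the clip rect */
	unsigned int x1;	/* left edge, inclusive */
	unsigned int x2;	/* right edge, exclusive */
};

/* Align both edges first, then derive the pitch from the aligned width. */
static unsigned int aligned_pitch(struct rect_model *r)
{
	assert(r != NULL);

	r->x1 = r->x1 / SEGMENT_WIDTH * SEGMENT_WIDTH;			/* round down */
	r->x2 = (r->x2 + SEGMENT_WIDTH - 1) / SEGMENT_WIDTH * SEGMENT_WIDTH;	/* round up */

	return r->x2 - r->x1;
}
```

For a rectangle spanning x = 3..6 the unaligned width is 3, but after
alignment it spans 2..6 and the pitch is 4; computing the pitch before
aligning would use the stale value.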
Fixes: fdd591e00a9c ("drm/ssd130x: Add support for the SSD132x OLED controller family")
Signed-off-by: John Keeping <jkeeping@inmusicbrands.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250115110139.1672488-3-jkeeping@inmusicbrands.com
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Keeping <jkeeping@inmusicbrands.com>
Date: Wed Jan 15 11:01:37 2025 +0000
drm/ssd130x: fix ssd132x encoding
[ Upstream commit 1e14484677c8e87548f5f0d4eb8800e408004404 ]
The ssd132x buffer is encoded one pixel per nibble, with two pixels in
each byte. When encoding an 8-bit greyscale input, take the top 4-bits
as the value and ensure the two pixels are distinct and do not overwrite
each other.
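A minimal sketch of such an encoding follows. The helper pack_row() is
hypothetical, and the nibble order (first pixel in the low nibble,
second pixel in the high nibble) is an assumption for illustration; the
point is only that each pixel contributes its top 4 bits to its own
nibble.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack `n` 8-bit greyscale pixels into n / 2 bytes, one pixel per nibble.
 * Nibble order is assumed: pixel i goes low, pixel i + 1 goes high.
 */
static void pack_row(const uint8_t *grey, size_t n, uint8_t *out)
{
	assert(n % 2 == 0);

	for (size_t i = 0; i < n; i += 2)
		out[i / 2] = (uint8_t)((grey[i + 1] & 0xF0) | (grey[i] >> 4));
}
```

Because one pixel is masked into the high nibble and the other is
shifted into the low nibble, neither value can overwrite the other.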
Fixes: fdd591e00a9c ("drm/ssd130x: Add support for the SSD132x OLED controller family")
Signed-off-by: John Keeping <jkeeping@inmusicbrands.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250115110139.1672488-2-jkeeping@inmusicbrands.com
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Javier Martinez Canillas <javierm@redhat.com>
Date: Tue Dec 31 12:44:58 2024 +0100
drm/ssd130x: Set SPI .id_table to prevent an SPI core warning
[ Upstream commit 5d40d4fae6f2fb789f48207a9d4772bbee970b5c ]
The only reason for the ssd130x-spi driver to have an spi_device_id table
is that the SPI core always reports an "spi:" MODALIAS, even when the SPI
device has been registered via a Device Tree Blob.
Without spi_device_id table information in the module's metadata, module
autoloading would not work because there won't be an alias that matches
the MODALIAS reported by the SPI core.
This spi_device_id table is not needed for device matching though, since
the of_device_id table is always used in this case. For this reason, the
struct spi_driver .id_table member is currently not set in the SPI driver.
Because the spi_device_id table is always required for module autoloading,
the SPI core checks during driver registration that both an of_device_id
table and a spi_device_id table are present and that they contain the same
entries for all the SPI devices.
Not setting the .id_table member in the driver then confuses the core and
leads to the following warning when the ssd130x-spi driver is registered:
[ 41.091198] SPI driver ssd130x-spi has no spi_device_id for sinowealth,sh1106
[ 41.098614] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1305
[ 41.105862] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1306
[ 41.113062] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1307
[ 41.120247] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1309
[ 41.127449] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1322
[ 41.134627] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1325
[ 41.141784] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1327
[ 41.149021] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1331
To prevent the warning, set the .id_table even though it's not necessary.
Since the check is done even for built-in drivers, drop the condition to
only define the ID table when the driver is built as a module. Finally,
rename the variable to use the "_spi_id" convention used for ID tables.
Fixes: 74373977d2ca ("drm/solomon: Add SSD130x OLED displays SPI support")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241231114516.2063201-1-javierm@redhat.com
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: José Expósito <jose.exposito89@gmail.com>
Date: Wed Feb 12 09:49:12 2025 +0100
drm/vkms: Fix use after free and double free on init error
[ Upstream commit ed15511a773df86205bda66c37193569575ae828 ]
If the driver initialization fails, the vkms_exit() function might
access an uninitialized or freed default_config pointer and it might
double free it.
Fix both possible errors by initializing default_config only when the
driver initialization succeeded.
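The pattern can be modelled in plain C as follows. This is a sketch, not
the vkms code: struct config_model, model_init() and model_exit() are
hypothetical, and the NULL check in the exit path is an assumption about
how the exit function copes with a failed init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct config_model { int dummy; };	/* hypothetical stand-in */

static struct config_model *default_config;

static int model_init(bool device_create_fails)
{
	struct config_model *config;

	assert(default_config == NULL);	/* init runs once per load */

	config = calloc(1, sizeof(*config));
	if (!config)
		return -1;

	if (device_create_fails) {
		free(config);	/* freed exactly once; the global never saw it */
		return -1;
	}

	/* Publish the pointer only after everything succeeded. */
	default_config = config;
	return 0;
}

static void model_exit(void)
{
	if (!default_config)	/* nothing to tear down after a failed init */
		return;

	free(default_config);
	default_config = NULL;
}
```

Since the global is assigned only on the success path, the exit path can
neither dereference a freed pointer nor free it a second time.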
Reported-by: Louis Chauvet <louis.chauvet@bootlin.com>
Closes: https://lore.kernel.org/all/Z5uDHcCmAwiTsGte@louis-chauvet-laptop/
Fixes: 2df7af93fdad ("drm/vkms: Add vkms_config type")
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Thomas Zimmermann <tzimmremann@suse.de>
Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250212084912.3196-1-jose.exposito89@gmail.com
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Date: Wed Jan 15 11:03:39 2025 +0200
drm: xlnx: zynqmp: Fix max dma segment size
[ Upstream commit 28b529a98525123acd37372a04d21e87ec2edcf7 ]
Fix "mapping sg segment longer than device claims to support" warning by
setting the max segment size.
Fixes: d76271d22694 ("drm: xlnx: DRM/KMS driver for Xilinx ZynqMP DisplayPort Subsystem")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Tested-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250115-xilinx-formats-v2-10-160327ca652a@ideasonboard.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maud Spierings <maudspierings@gocontroll.com>
Date: Wed Feb 26 15:19:13 2025 +0100
dt-bindings: vendor-prefixes: add GOcontroll
[ Upstream commit 5f0d2de417166698c8eba433b696037ce04730da ]
GOcontroll produces embedded linux systems and IO modules to use in
these systems, add its prefix.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Maud Spierings <maudspierings@gocontroll.com>
Link: https://patch.msgid.link/20250226-initial_display-v2-2-23fafa130817@gocontroll.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Tue Feb 25 17:44:22 2025 +0100
dummycon: fix default rows/cols
[ Upstream commit beefaba1978c04ea2950d34236f58fe6cf6a7f58 ]
dummycon fails to build on ARM/footbridge when the VGA console is
disabled, since I got the dependencies slightly wrong in a previous
patch:
drivers/video/console/dummycon.c: In function 'dummycon_init':
drivers/video/console/dummycon.c:27:25: error: 'CONFIG_DUMMY_CONSOLE_COLUMNS' undeclared (first use in this function); did you mean 'CONFIG_DUMMY_CONSOLE'?
27 | #define DUMMY_COLUMNS CONFIG_DUMMY_CONSOLE_COLUMNS
drivers/video/console/dummycon.c:28:25: error: 'CONFIG_DUMMY_CONSOLE_ROWS' undeclared (first use in this function); did you mean 'CONFIG_DUMMY_CONSOLE'?
28 | #define DUMMY_ROWS CONFIG_DUMMY_CONSOLE_ROWS
This only showed up after many thousand randconfig builds on Arm, and
doesn't matter in practice, but should still be fixed. Address it by
using the default row/columns on footbridge after all in that corner
case.
Fixes: 4293b0925149 ("dummycon: limit Arm console size hack to footbridge")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409151512.LML1slol-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vitaly Lifshits <vitaly.lifshits@intel.com>
Date: Thu Mar 13 16:05:56 2025 +0200
e1000e: change k1 configuration on MTP and later platforms
[ Upstream commit efaaf344bc2917cbfa5997633bc18a05d3aed27f ]
Starting from Meteor Lake, the Kumeran interface between the integrated
MAC and the I219 PHY works at a different frequency. This causes sporadic
MDI errors when accessing the PHY, and in rare circumstances could lead
to packet corruption.
To overcome this, introduce minor changes to the Kumeran idle
state (K1) parameters during device initialization. Hardware reset
reverts this configuration, therefore it needs to be applied in a few
places.
Fixes: cc23f4f0b6b9 ("e1000e: Add support for Meteor Lake")
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Date: Mon Mar 10 09:14:02 2025 +0800
EDAC/ie31200: Fix the DIMM size mask for several SoCs
[ Upstream commit 3427befbbca6b19fe0e37f91d66ce5221de70bf1 ]
The DIMM size mask for {Sky, Kaby, Coffee} Lake is not bits{7:0},
but bits{5:0}. Fix it.
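The effect of the wider mask can be shown with a small sketch. The
helper field_mask() is a hypothetical userspace analogue of the kernel's
GENMASK(), and the register value used below is made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask covering bits h..l inclusive (userspace GENMASK analogue). */
static uint32_t field_mask(unsigned int h, unsigned int l)
{
	assert(h < 32 && l <= h);
	return (~0u >> (31 - h)) & (~0u << l);
}

/* Extract a size field located at bits h..0 of a register value. */
static uint32_t dimm_size_field(uint32_t reg, unsigned int h)
{
	return reg & field_mask(h, 0);
}
```

For a made-up register value of 0xC8, the bits{7:0} mask yields 200
while the bits{5:0} mask yields 8, so unrelated upper bits would inflate
the reported size.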
Fixes: 953dee9bbd24 ("EDAC, ie31200_edac: Add Skylake support")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-3-qiuxu.zhuo@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Date: Mon Mar 10 09:14:03 2025 +0800
EDAC/ie31200: Fix the error path order of ie31200_init()
[ Upstream commit 231e341036d9988447e3b3345cf741a98139199e ]
The error path order of ie31200_init() is incorrect, fix it.
Fixes: 709ed1bcef12 ("EDAC/ie31200: Fallback if host bridge device is already initialized")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-4-qiuxu.zhuo@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Date: Mon Mar 10 09:14:01 2025 +0800
EDAC/ie31200: Fix the size of EDAC_MC_LAYER_CHIP_SELECT layer
[ Upstream commit d59d844e319d97682c8de29b88d2d60922a683b3 ]
The EDAC_MC_LAYER_CHIP_SELECT layer pertains to the rank, not the DIMM.
Fix its size to reflect the number of ranks instead of the number of DIMMs.
Also delete the unused macros IE31200_{DIMMS,RANKS}.
Fixes: 7ee40b897d18 ("ie31200_edac: Introduce the driver")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-2-qiuxu.zhuo@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Date: Fri Feb 14 08:27:28 2025 +0800
EDAC/{skx_common,i10nm}: Fix some missing error reports on Emerald Rapids
[ Upstream commit d9207cf7760f5f5599e9ff7eb0fedf56821a1d59 ]
When doing error injection to some memory DIMMs on certain Intel Emerald
Rapids servers, the i10nm_edac missed error reports for some memory DIMMs.
Certain BIOS configurations may hide some memory controllers, and the
i10nm_edac doesn't enumerate these hidden memory controllers. However, the
ADXL decodes memory errors using memory controller physical indices even
if there are hidden memory controllers. Therefore, the memory controller
physical indices reported by the ADXL may mismatch the logical indices
enumerated by the i10nm_edac, resulting in missed error reports for some
memory DIMMs.
Fix this issue by creating a mapping table from memory controller physical
indices (used by the ADXL) to logical indices (used by the i10nm_edac) and
using it to convert the physical indices to the logical indices during the
error handling process.
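Such a mapping table could look like the sketch below. It is a
simplified model, not the EDAC code: build_mc_map(), MAX_MC, MC_ABSENT
and the `present` array are all hypothetical names chosen for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_MC    8	/* assumed upper bound for illustration */
#define MC_ABSENT (-1)

/*
 * Fill phys_to_logical[] so that each visible memory controller gets the
 * next logical index and each hidden one is marked absent. Returns the
 * number of logical (enumerated) controllers.
 */
static int build_mc_map(const bool *present, int num_phys, int *phys_to_logical)
{
	int logical = 0;

	assert(num_phys >= 0 && num_phys <= MAX_MC);

	for (int phys = 0; phys < num_phys; phys++)
		phys_to_logical[phys] = present[phys] ? logical++ : MC_ABSENT;

	return logical;
}
```

With controllers 0, 2 and 3 visible and controller 1 hidden, a decoded
error on physical index 2 is looked up as logical index 1 rather than
being attributed to the wrong controller or dropped.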
Fixes: c545f5e41225 ("EDAC/i10nm: Skip the absent memory controllers")
Reported-by: Kevin Chang <kevin1.chang@intel.com>
Tested-by: Kevin Chang <kevin1.chang@intel.com>
Reported-by: Thomas Chen <Thomas.Chen@intel.com>
Tested-by: Thomas Chen <Thomas.Chen@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20250214002728.6287-1-qiuxu.zhuo@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oleg Nesterov <oleg@redhat.com>
Date: Mon Mar 24 17:00:03 2025 +0100
exec: fix the racy usage of fs_struct->in_exec
commit af7bb0d2ca459f15cb5ca604dab5d9af103643f0 upstream.
check_unsafe_exec() sets fs->in_exec under cred_guard_mutex, then execve()
paths clear fs->in_exec lockless. This is fine if exec succeeds, but if it
fails we have the following race:
T1 sets fs->in_exec = 1, fails, drops cred_guard_mutex
T2 sets fs->in_exec = 1
T1 clears fs->in_exec
T2 continues with fs->in_exec == 0
Change fs/exec.c to clear fs->in_exec with cred_guard_mutex held.
Reported-by: syzbot+1c486d0b62032c82a968@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67dc67f0.050a0220.25ae54.001f.GAE@google.com/
Cc: stable@vger.kernel.org
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250324160003.GA8878@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date: Sat Feb 8 17:16:58 2025 +0800
exfat: add a check for invalid data size
[ Upstream commit 13940cef95491472760ca261b6713692ece9b946 ]
Add a check for invalid data size to prevent a corrupted filesystem
from being further corrupted.
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date: Thu Mar 6 15:02:07 2025 +0800
exfat: fix missing shutdown check
[ Upstream commit 47e35366bc6fa3cf189a8305bce63992495f3efa ]
xfstests generic/730 test failed because after deleting the device
that still had dirty data, the file could still be read without
returning an error. The reason is the missing shutdown check in
->read_iter.
I also noticed that shutdown checks were missing from ->write_iter,
->splice_read, and ->mmap. This commit adds shutdown checks to all
of them.
Fixes: f761fcdd289d ("exfat: Implement sops->shutdown and ioctl")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sungjong Seo <sj1557.seo@samsung.com>
Date: Wed Mar 26 23:48:48 2025 +0900
exfat: fix potential wrong error return from get_block
commit 59c30e31425833385e6644ad33151420e37eabe1 upstream.
If there is no error, get_block() should return 0. However, when bh_read()
returns 1, get_block() also returns 1 in the same manner.
Set err to 0 if bh_read() reports no error.
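The normalization can be sketched with a stand-in read helper. Nothing
below is exfat code: get_block_model(), read_ok_positive() and
read_fails() are hypothetical, and the only property modelled is that
the helper returns a negative errno on failure and may return a positive
value on success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical read helper type: < 0 on error, 0 or 1 on success. */
typedef int (*read_fn)(void);

static int read_ok_positive(void) { return 1; }	/* success, non-zero */
static int read_fails(void) { return -EIO; }	/* failure */

/* Model of a get_block-style callback that must return 0 on success. */
static int get_block_model(read_fn read_block)
{
	int err;

	assert(read_block != NULL);

	err = read_block();
	if (err < 0)
		return err;	/* propagate real errors */

	return 0;		/* collapse every success value to 0 */
}
```

Without the final normalization, the positive success value would leak
to the caller, which treats any non-zero return as a failure.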
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Cc: stable@vger.kernel.org
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sungjong Seo <sj1557.seo@samsung.com>
Date: Fri Mar 21 15:34:42 2025 +0900
exfat: fix random stack corruption after get_block
commit 1bb7ff4204b6d4927e982cd256286c09ed4fd8ca upstream.
When get_block is called with a buffer_head allocated on the stack, such
as from do_mpage_readpage, stack corruption due to a buffer_head UAF may
occur in the following race:
     <CPU 0>                      <CPU 1>
mpage_read_folio
  <<bh on stack>>
  do_mpage_readpage
    exfat_get_block
      bh_read
        __bh_read
          get_bh(bh)
          submit_bh
          wait_on_buffer
                              ...
                              end_buffer_read_sync
                                __end_buffer_read_notouch
                                  unlock_buffer
          <<keep going>>
        ...
      ...
    ...
  ...
<<bh is not valid out of mpage_read_folio>>
   .
   .
another_function
  <<variable A on stack>>
                                put_bh(bh)
                                  atomic_dec(bh->b_count)
  * stack corruption here *
This patch returns -EAGAIN if a folio does not have buffers when bh_read
needs to be called. By doing this, the caller can fallback to functions
like block_read_full_folio(), create a buffer_head in the folio, and then
call get_block again.
Do not call bh_read() with an on-stack buffer_head.
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Cc: stable@vger.kernel.org
Tested-by: Yeongjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date: Mon Mar 17 10:53:10 2025 +0800
exfat: fix the infinite loop in exfat_find_last_cluster()
[ Upstream commit b0522303f67255926b946aa66885a0104d1b2980 ]
In exfat_find_last_cluster(), the cluster chain is traversed until
the EOF cluster. If the cluster chain includes a loop due to file
system corruption, the EOF cluster cannot be traversed, resulting
in an infinite loop.
If the number of clusters indicated by the file size is inconsistent
with the cluster chain length, exfat_find_last_cluster() will return
an error, so if this inconsistency is found, the traversal can be
aborted without traversing to the EOF cluster.
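A bounded traversal of this kind can be sketched over an in-memory FAT.
This is an illustration only: find_last_cluster_model(), EOF_CLUSTER and
the array-based FAT are hypothetical, and cluster numbers are plain
array indices here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EOF_CLUSTER 0xFFFFFFFFu	/* assumed end-of-chain marker */

/*
 * Walk the chain from `start`, visiting at most `num_clusters` entries.
 * Returns 0 and stores the last cluster on success, -1 if the chain is
 * shorter, longer, or loops, i.e. disagrees with the expected length.
 */
static int find_last_cluster_model(const uint32_t *fat, size_t fat_len,
				   uint32_t start, uint32_t num_clusters,
				   uint32_t *last)
{
	uint32_t clu = start, count = 0;

	assert(fat != NULL && last != NULL);

	if (num_clusters == 0)
		return -1;

	for (;;) {
		uint32_t next;

		if (clu >= fat_len)
			return -1;	/* corrupted link */

		count++;
		next = fat[clu];
		if (next == EOF_CLUSTER)
			break;
		if (count >= num_clusters)
			return -1;	/* longer than the size implies: stop */
		clu = next;
	}

	if (count != num_clusters)
		return -1;		/* chain shorter than expected */

	*last = clu;
	return 0;
}
```

Because the walk never visits more than num_clusters entries, a chain
that loops back on itself terminates with an error instead of spinning
forever.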
Reported-by: syzbot+f7d147e6db52b1e09dba@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f7d147e6db52b1e09dba
Tested-by: syzbot+f7d147e6db52b1e09dba@syzkaller.appspotmail.com
Fixes: 31023864e67a ("exfat: add fat entry operations")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Theodore Ts'o <tytso@mit.edu>
Date: Fri Mar 14 00:38:42 2025 -0400
ext4: don't over-report free space or inodes in statvfs
commit f87d3af7419307ae26e705a2b2db36140db367a2 upstream.
This fixes a bug analogous to the one fixed in xfs in commit
4b8d867ca6e2 ("xfs: don't over-report free space or inodes in
statvfs"), where statfs can report misleading / incorrect information
when project quota is enabled and the free space is less than the
remaining quota.
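The intended behaviour amounts to clamping, as in the sketch below. It
is a model under stated assumptions, not the ext4 code: reported_free()
is a hypothetical helper, and treating a quota limit of 0 as "no limit"
is an assumed convention.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Free blocks (or inodes) to report for a project: never more than the
 * filesystem really has, never more than the quota still allows.
 */
static uint64_t reported_free(uint64_t fs_free, uint64_t quota_limit,
			      uint64_t quota_used)
{
	uint64_t remaining, result;

	if (quota_limit == 0)	/* assumed convention: 0 means no limit */
		return fs_free;

	remaining = quota_limit > quota_used ? quota_limit - quota_used : 0;
	result = remaining < fs_free ? remaining : fs_free;

	assert(result <= fs_free);
	return result;
}
```

With 100 blocks free in the filesystem and 800 blocks of quota
remaining, the reported value is 100 rather than 800.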
This commit will resolve a test failure in generic/762 which tests for
this bug.
Cc: stable@kernel.org
Fixes: 689c958cbe6b ("ext4: add project quota support")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Acs, Jakub <acsjakub@amazon.de>
Date: Thu Mar 20 15:46:49 2025 +0000
ext4: fix OOB read when checking dotdot dir
commit d5e206778e96e8667d3bde695ad372c296dc9353 upstream.
Mounting a corrupted filesystem with a directory that contains a '.' dir
entry with rec_len == block size results in an out-of-bounds read (later
on, when the corrupted directory is removed).
ext4_empty_dir() assumes every ext4 directory contains at least '.'
and '..' as directory entries in the first data block. It first loads
the '.' dir entry, performs sanity checks by calling ext4_check_dir_entry()
and then uses its rec_len member to compute the location of '..' dir
entry (in ext4_next_entry). It assumes the '..' dir entry fits into the
same data block.
If the rec_len of '.' is precisely one block (4KB), it slips through the
sanity checks (it is considered the last directory entry in the data
block) and leaves "struct ext4_dir_entry_2 *de" pointing exactly past the
memory slot allocated to the data block. The following call to
ext4_check_dir_entry() on the new value of de then dereferences this
pointer, which results in an out-of-bounds memory access.
Fix this by extending __ext4_check_dir_entry() to check for '.' dir
entries that reach the end of data block. Make sure to ignore the phony
dir entries for checksum (by checking name_len for non-zero).
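A minimal user-space sketch of the added check, assuming a 4 KiB block. The
struct below is a simplified stand-in and not the on-disk ext4 layout, and
dot_reaches_block_end() is a hypothetical name rather than the kernel helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096u	/* assumed block size */

/* Simplified stand-in for struct ext4_dir_entry_2. */
struct dir_entry {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[4];
};

/*
 * Return true when the entry at byte offset 'off' is a '.' entry whose
 * record runs to the end of the block, leaving no room for '..'.
 * Entries with name_len == 0 never match, so placeholder entries such
 * as a checksum tail are not flagged.
 */
static bool dot_reaches_block_end(const struct dir_entry *de, size_t off)
{
	bool is_dot = de->name_len == 1 && de->name[0] == '.';

	return is_dot && off + de->rec_len == BLOCK_SIZE;
}

static void demo_dot_check(void)
{
	struct dir_entry dot = { .inode = 2, .rec_len = 12, .name_len = 1 };

	memcpy(dot.name, ".", 1);
	assert(!dot_reaches_block_end(&dot, 0));	/* normal layout */

	dot.rec_len = BLOCK_SIZE;			/* corrupted entry */
	assert(dot_reaches_block_end(&dot, 0));
}
```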
Note: This is reported by KASAN as use-after-free in case another
structure was recently freed from the slot past the bound, but it is
really an OOB read.
This issue was found by the syzkaller tool.
Call Trace:
[ 38.594108] BUG: KASAN: slab-use-after-free in __ext4_check_dir_entry+0x67e/0x710
[ 38.594649] Read of size 2 at addr ffff88802b41a004 by task syz-executor/5375
[ 38.595158]
[ 38.595288] CPU: 0 UID: 0 PID: 5375 Comm: syz-executor Not tainted 6.14.0-rc7 #1
[ 38.595298] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 38.595304] Call Trace:
[ 38.595308] <TASK>
[ 38.595311] dump_stack_lvl+0xa7/0xd0
[ 38.595325] print_address_description.constprop.0+0x2c/0x3f0
[ 38.595339] ? __ext4_check_dir_entry+0x67e/0x710
[ 38.595349] print_report+0xaa/0x250
[ 38.595359] ? __ext4_check_dir_entry+0x67e/0x710
[ 38.595368] ? kasan_addr_to_slab+0x9/0x90
[ 38.595378] kasan_report+0xab/0xe0
[ 38.595389] ? __ext4_check_dir_entry+0x67e/0x710
[ 38.595400] __ext4_check_dir_entry+0x67e/0x710
[ 38.595410] ext4_empty_dir+0x465/0x990
[ 38.595421] ? __pfx_ext4_empty_dir+0x10/0x10
[ 38.595432] ext4_rmdir.part.0+0x29a/0xd10
[ 38.595441] ? __dquot_initialize+0x2a7/0xbf0
[ 38.595455] ? __pfx_ext4_rmdir.part.0+0x10/0x10
[ 38.595464] ? __pfx___dquot_initialize+0x10/0x10
[ 38.595478] ? down_write+0xdb/0x140
[ 38.595487] ? __pfx_down_write+0x10/0x10
[ 38.595497] ext4_rmdir+0xee/0x140
[ 38.595506] vfs_rmdir+0x209/0x670
[ 38.595517] ? lookup_one_qstr_excl+0x3b/0x190
[ 38.595529] do_rmdir+0x363/0x3c0
[ 38.595537] ? __pfx_do_rmdir+0x10/0x10
[ 38.595544] ? strncpy_from_user+0x1ff/0x2e0
[ 38.595561] __x64_sys_unlinkat+0xf0/0x130
[ 38.595570] do_syscall_64+0x5b/0x180
[ 38.595583] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: ac27a0ec112a0 ("[PATCH] ext4: initial copy of files from ext3")
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: linux-ext4@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Mahmoud Adam <mngyadam@amazon.com>
Cc: stable@vger.kernel.org
Cc: security@kernel.org
Link: https://patch.msgid.link/b3ae36a6794c4a01944c7d70b403db5b@amazon.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Markus Elfring <elfring@users.sourceforge.net>
Date: Thu Apr 13 21:35:36 2023 +0200
fbdev: au1100fb: Move a variable assignment behind a null pointer check
[ Upstream commit 2df2c0caaecfd869b49e14f2b8df822397c5dd7f ]
The address of a data structure member was determined before
a corresponding null pointer check in the implementation of
the function “au1100fb_setmode”.
This issue was detected by using the Coccinelle software.
Fixes: 3b495f2bb749 ("Au1100 FB driver uplift for 2.6.")
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danila Chernetsov <listdansp@mail.ru>
Date: Wed Mar 19 01:30:11 2025 +0000
fbdev: sm501fb: Add some geometry checks.
[ Upstream commit aee50bd88ea5fde1ff4cc021385598f81a65830c ]
Added checks for xoffset, yoffset settings.
Incorrect settings of these parameters can lead to errors
in sm501fb_pan_ functions.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 5fc404e47bdf ("[PATCH] fb: SM501 framebuffer driver")
Signed-off-by: Danila Chernetsov <listdansp@mail.ru>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date: Sun Mar 23 17:05:29 2025 +0000
firmware: cs_dsp: Ensure cs_dsp_load[_coeff]() returns 0 on success
[ Upstream commit 2593f7e0dc93a898a84220b3fb180d86f1ca8c60 ]
Set ret = 0 on successful completion of the processing loop in
cs_dsp_load() and cs_dsp_load_coeff() to ensure that the function
returns 0 on success.
All normal firmware files will have at least one data block, and
processing this block will set ret == 0, from the result of either
regmap_raw_write() or cs_dsp_parse_coeff().
The kunit tests create a dummy firmware file that contains only the
header, without any data blocks. This gives cs_dsp a file to "load"
that will not cause any side-effects. As there aren't any data blocks,
the processing loop will not set ret == 0.
Originally there was a line after the processing loop:
ret = regmap_async_complete(regmap);
which would set ret == 0 before the function returned.
Commit fe08b7d5085a ("firmware: cs_dsp: Remove async regmap writes")
changed the regmap write to a normal sync write, so the call to
regmap_async_complete() wasn't necessary and was removed. It was
overlooked that the ret here wasn't only to check the result of
regmap_async_complete(), it also set the final return value of the
function.
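The control flow described above can be modelled as follows. This is a
sketch under assumptions: process_block() and load_blocks() are hypothetical
stand-ins, and the pessimistic initial value of ret is only an illustration
of how a zero-iteration loop can leave an error code behind.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-block handler: negative input stands for a bad block. */
static int process_block(int block)
{
	return block < 0 ? -EINVAL : 0;
}

static int load_blocks(const int *blocks, size_t n)
{
	int ret = -EINVAL;	/* left over from earlier validation steps */
	size_t i;

	for (i = 0; i < n; i++) {
		ret = process_block(blocks[i]);
		if (ret)
			return ret;
	}

	/* The loop completed without error, even if it ran zero times. */
	ret = 0;
	return ret;
}

static void demo_load_blocks(void)
{
	const int good[] = { 1, 2, 3 };
	const int bad[] = { 1, -1 };

	assert(load_blocks(good, 3) == 0);
	assert(load_blocks(bad, 2) == -EINVAL);
	assert(load_blocks(NULL, 0) == 0);	/* header-only file */
}
```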
Fixes: fe08b7d5085a ("firmware: cs_dsp: Remove async regmap writes")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20250323170529.197205-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian Schoenebeck <linux_oss@crudebyte.com>
Date: Thu Mar 13 13:59:32 2025 +0100
fs/9p: fix NULL pointer dereference on mkdir
[ Upstream commit 3f61ac7c65bdb26accb52f9db66313597e759821 ]
When a 9p tree was mounted with option 'posixacl' and the parent directory
had a default ACL set for its subdirectories, e.g.:
setfacl -m default:group:simpsons:rwx parentdir
then creating a subdirectory crashed the 9p client, as the v9fs_fid_add() call in
function v9fs_vfs_mkdir_dotl() sets the passed 'fid' pointer to NULL
(since dafbe689736) even though the subsequent v9fs_set_create_acl() call
expects a valid non-NULL 'fid' pointer:
[ 37.273191] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 37.322338] Call Trace:
[ 37.323043] <TASK>
[ 37.323621] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 37.324448] ? page_fault_oops (arch/x86/mm/fault.c:714)
[ 37.325532] ? search_module_extables (kernel/module/main.c:3733)
[ 37.326742] ? p9_client_walk (net/9p/client.c:1165) 9pnet
[ 37.328006] ? search_bpf_extables (kernel/bpf/core.c:804)
[ 37.329142] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:686 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
[ 37.330196] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:574)
[ 37.331330] ? p9_client_walk (net/9p/client.c:1165) 9pnet
[ 37.332562] ? v9fs_fid_xattr_get (fs/9p/xattr.c:30) 9p
[ 37.333824] v9fs_fid_xattr_set (fs/9p/fid.h:23 fs/9p/xattr.c:121) 9p
[ 37.335077] v9fs_set_acl (fs/9p/acl.c:276) 9p
[ 37.336112] v9fs_set_create_acl (fs/9p/acl.c:307) 9p
[ 37.337326] v9fs_vfs_mkdir_dotl (fs/9p/vfs_inode_dotl.c:411) 9p
[ 37.338590] vfs_mkdir (fs/namei.c:4313)
[ 37.339535] do_mkdirat (fs/namei.c:4336)
[ 37.340465] __x64_sys_mkdir (fs/namei.c:4354)
[ 37.341455] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 37.342447] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Fix this by simply swapping the sequence of these two calls in
v9fs_vfs_mkdir_dotl(), i.e. calling v9fs_set_create_acl() before
v9fs_fid_add().
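The ordering problem can be illustrated with a small model. All names below
are hypothetical; the only property taken from the description above is that
the "add" step clears the caller's pointer, so anything that needs the fid
has to run first.

```c
#include <assert.h>
#include <stddef.h>

struct fid {
	int id;
};

static struct fid *stored_fid;

/* Takes ownership of the fid and clears the caller's pointer. */
static void fid_add(struct fid **pfid)
{
	stored_fid = *pfid;
	*pfid = NULL;
}

/* Needs a valid fid; returns -1 here instead of dereferencing NULL. */
static int set_create_acl(const struct fid *fid)
{
	return fid ? 0 : -1;
}

static int mkdir_buggy_order(struct fid *fid)
{
	fid_add(&fid);
	return set_create_acl(fid);	/* fid is already NULL */
}

static int mkdir_fixed_order(struct fid *fid)
{
	int err = set_create_acl(fid);	/* use the fid first ... */

	fid_add(&fid);			/* ... then hand it over */
	return err;
}

static void demo_call_order(void)
{
	struct fid f = { .id = 1 };

	assert(mkdir_buggy_order(&f) == -1);
	assert(mkdir_fixed_order(&f) == 0);
	assert(stored_fid == &f);
}
```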
Fixes: dafbe689736f ("9p fid refcount: cleanup p9_fid_put calls")
Reported-by: syzbot+5b667f9a1fee4ba3775a@syzkaller.appspotmail.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Message-ID: <E1tsiI6-002iMG-Kh@kylie.crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Sun Feb 16 23:52:00 2025 +0300
fs/ntfs3: Fix a couple integer overflows on 32bit systems
[ Upstream commit 5ad414f4df2294b28836b5b7b69787659d6aa708 ]
On 32bit systems the "off + sizeof(struct NTFS_DE)" addition can
have an integer wrapping issue. Fix it by using size_add().
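To see why a saturating add helps, here is a user-space model. It is not the
kernel's size_add(): uint32_t stands in for a 32-bit size_t so the wrap is
visible on any host, __builtin_add_overflow() is a GCC/Clang builtin, and the
header size of 16 and the limit of 4096 are made-up values.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating add on a 32-bit size type: clamp instead of wrapping. */
static uint32_t sat_add_u32(uint32_t a, uint32_t b)
{
	uint32_t sum;

	if (__builtin_add_overflow(a, b, &sum))
		return UINT32_MAX;
	return sum;
}

static void demo_sat_add(void)
{
	uint32_t off = UINT32_MAX - 4;	/* untrusted offset from disk */
	uint32_t hdr = 16;		/* stand-in for a header size */
	uint32_t limit = 4096;

	/* Plain addition wraps to 11 and slips past the bounds check. */
	assert((uint32_t)(off + hdr) <= limit);
	/* The saturating result stays huge and is rejected. */
	assert(sat_add_u32(off, hdr) > limit);
}
```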
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Sun Feb 16 23:52:10 2025 +0300
fs/ntfs3: Prevent integer overflow in hdr_first_de()
[ Upstream commit 6bb81b94f7a9cba6bde9a905cef52a65317a8b04 ]
The "de_off" and "used" variables come from the disk so they both need to
be checked. The problem is that on 32bit systems if they're both greater
than UINT_MAX - 16 then the check does not work as intended because of an
integer overflow.
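The failure mode can be reproduced with 32-bit arithmetic. The sketch below
uses a different technique from the saturating add, namely comparing by
subtraction after an ordering check, purely to show an overflow-free form.
DE_SIZE and both function names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DE_SIZE 16u	/* assumed size of the entry header */

/* Naive form: de_off + DE_SIZE can wrap around to a small value. */
static bool first_entry_fits_naive(uint32_t de_off, uint32_t used)
{
	return (uint32_t)(de_off + DE_SIZE) <= used;
}

/* Overflow-free form: subtract only after de_off < used is known. */
static bool first_entry_fits(uint32_t de_off, uint32_t used)
{
	if (de_off >= used)
		return false;
	return used - de_off >= DE_SIZE;
}

static void demo_first_entry(void)
{
	uint32_t de_off = UINT32_MAX - 8;
	uint32_t used = UINT32_MAX - 4;

	/* The sum wraps to 7, so the naive check wrongly accepts. */
	assert(first_entry_fits_naive(de_off, used));
	/* Only 4 bytes remain, so the entry cannot fit. */
	assert(!first_entry_fits(de_off, used));
}
```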
Fixes: 60ce8dfde035 ("fs/ntfs3: Fix wrong if in hdr_first_de")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Date: Thu Jan 30 17:03:41 2025 +0300
fs/ntfs3: Update inode->i_mapping->a_ops on compression state
[ Upstream commit b432163ebd15a0fb74051949cb61456d6c55ccbd ]
Update inode->i_mapping->a_ops when the compression state changes to
ensure correct address space operations.
Clear ATTR_FLAG_SPARSED/FILE_ATTRIBUTE_SPARSE_FILE when enabling
compression to prevent flag conflicts.
v2:
Additionally, ensure that all dirty pages are flushed and concurrent access
to the page cache is blocked.
Fixes: 6b39bfaeec44 ("fs/ntfs3: Add support for the compression attribute")
Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bart Van Assche <bvanassche@acm.org>
Date: Wed Mar 19 14:02:22 2025 -0700
fs/procfs: fix the comment above proc_pid_wchan()
[ Upstream commit 6287fbad1cd91f0c25cdc3a580499060828a8f30 ]
proc_pid_wchan() used to report kernel addresses to user space but that is
no longer the case today. Bring the comment above proc_pid_wchan() in
sync with the implementation.
Link: https://lkml.kernel.org/r/20250319210222.1518771-1-bvanassche@acm.org
Fixes: b2f73922d119 ("fs/proc, core/debug: Don't expose absolute kernel addresses via wchan")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alistair Popple <apopple@nvidia.com>
Date: Fri Feb 28 14:30:56 2025 +1100
fuse: fix dax truncate/punch_hole fault path
[ Upstream commit 7851bf649d423edd7286b292739f2eefded3d35c ]
Patch series "fs/dax: Fix ZONE_DEVICE page reference counts", v9.
Device and FS DAX pages have always maintained their own page reference
counts without following the normal rules for page reference counting. In
particular pages are considered free when the refcount hits one rather
than zero and refcounts are not added when mapping the page.
Tracking this requires special PTE bits (PTE_DEVMAP) and a secondary
mechanism for allowing GUP to hold references on the page (see
get_dev_pagemap). However there doesn't seem to be any reason why FS DAX
pages need their own reference counting scheme.
By treating the refcounts on these pages the same way as normal pages we
can remove a lot of special checks. In particular pXd_trans_huge()
becomes the same as pXd_leaf(), although I haven't made that change here.
It also frees up a valuable SW define PTE bit on architectures that have
devmap PTE bits defined.
It also almost certainly allows further clean-up of the devmap managed
functions, but I have left that as a future improvement. It also enables
support for compound ZONE_DEVICE pages which is one of my primary
motivators for doing this work.
This patch (of 20):
FS DAX requires file systems to call into the DAX layout prior to
unlinking inodes to ensure there is no ongoing DMA or other remote access
to the direct mapped page. The fuse file system implements
fuse_dax_break_layouts() to do this which includes a comment indicating
that passing dmap_end == 0 leads to unmapping of the whole file.
However this is not true - passing dmap_end == 0 will not unmap anything
before dmap_start, and furthermore dax_layout_busy_page_range() will not
scan any of the range to see if there may be ongoing DMA access to the
range. Fix this by passing -1 for dmap_end to fuse_dax_break_layouts()
which will invalidate the entire file range to
dax_layout_busy_page_range().
Link: https://lkml.kernel.org/r/cover.8068ad144a7eea4a813670301f4d2a86a8e68ec4.1740713401.git-series.apopple@nvidia.com
Link: https://lkml.kernel.org/r/f09a34b6c40032022e4ddee6fadb7cc676f08867.1740713401.git-series.apopple@nvidia.com
Fixes: 6ae330cad6ef ("virtiofs: serialize truncate/punch_hole and dax fault path")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vitalii Mordan <mordan@ispras.ru>
Date: Fri Feb 14 18:46:32 2025 +0300
gpu: cdns-mhdp8546: fix call balance of mhdp->clk handling routines
[ Upstream commit f65727be3fa5f252c8d982d15023aab8255ded19 ]
If the clock mhdp->clk was not enabled in cdns_mhdp_probe(), it should not
be disabled in any path.
The return value of clk_prepare_enable() is not checked. If mhdp->clk was
not enabled, it may be disabled in the error path of cdns_mhdp_probe()
(e.g., if cdns_mhdp_load_firmware() fails) or in cdns_mhdp_remove() after
a successful cdns_mhdp_probe() call.
Use the devm_clk_get_enabled() helper function to ensure proper call
balance for mhdp->clk.
Found by Linux Verification Center (linuxtesting.org) with Klever.
Fixes: fb43aa0acdfd ("drm: bridge: Add support for Cadence MHDP8546 DPI/DP bridge")
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Robert Foss <rfoss@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250214154632.1907425-1-mordan@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wentao Liang <vulab@iscas.ac.cn>
Date: Mon Jan 20 22:05:47 2025 +0800
greybus: gb-beagleplay: Add error handling for gb_greybus_init
[ Upstream commit be382372d55d65b5c7e5a523793ca5e403f8c595 ]
Add error handling for the gb_greybus_init(bg) function call
during the firmware reflash process to maintain consistency
in error handling throughout the codebase. If initialization
fails, log an error and return FW_UPLOAD_ERR_RW_ERROR.
Fixes: 0cf7befa3ea2 ("greybus: gb-beagleplay: Add firmware upload API")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Ayush Singh <ayush@beagleboard.org>
Link: https://lore.kernel.org/r/20250120140547.1460-1-vulab@iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wentao Guan <guanwentao@uniontech.com>
Date: Fri Feb 14 19:04:18 2025 +0800
HID: i2c-hid: improve i2c_hid_get_report error message
[ Upstream commit 723aa55c08c9d1e0734e39a815fd41272eac8269 ]
The message "failed to set a report to ..." is printed in two places.
Use "get a report from" instead of "set a report to" for the get-report
case, so that people who know less about the module can tell where the
error happened.
Before:
i2c_hid_acpi i2c-FTSC1000:00: failed to set a report to device: -11
After:
i2c_hid_acpi i2c-FTSC1000:00: failed to get a report from device: -11
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiri Kosina <jikos@kernel.org>
Date: Wed Mar 12 09:08:22 2025 +0100
HID: remove superfluous (and wrong) Makefile entry for CONFIG_INTEL_ISH_FIRMWARE_DOWNLOADER
[ Upstream commit fe0fb58325e519008e2606a5aa2cff7ad23e212d ]
The line
obj-$(INTEL_ISH_FIRMWARE_DOWNLOADER) += intel-ish-hid/
in top-level HID Makefile is both superfluous (as CONFIG_INTEL_ISH_FIRMWARE_DOWNLOADER
depends on CONFIG_INTEL_ISH_HID, which contains intel-ish-hid/ already) and wrong (as it's
missing the CONFIG_ prefix).
Just remove it.
Fixes: 91b228107da3e ("HID: intel-ish-hid: ISH firmware loader client driver")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tasos Sahanidis <tasos@tasossah.com>
Date: Wed Mar 12 05:08:32 2025 +0200
hwmon: (nct6775-core) Fix out of bounds access for NCT679{8,9}
[ Upstream commit 815f80ad20b63830949a77c816e35395d5d55144 ]
pwm_num is set to 7 for these chips, but NCT6776_REG_PWM_MODE and
NCT6776_PWM_MODE_MASK only contain 6 values.
Fix this by adding another 0 to the end of each array.
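A sketch of how a lookup table can be kept in step with the channel count.
The table contents are placeholders and the names are hypothetical. Only the
count of 7 comes from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define PWM_NUM 7	/* from the text above: pwm_num is 7 on these chips */

/* Placeholder table with one slot per PWM channel; values are made up. */
static const uint16_t pwm_mode_reg[] = { 0x10, 0, 0, 0, 0, 0, 0 };

/* Fails the build if the table is ever shorter than the channel count. */
static_assert(ARRAY_SIZE(pwm_mode_reg) >= PWM_NUM,
	      "pwm_mode_reg must cover every PWM channel");

/* Bounds-checked accessor as a second line of defence at run time. */
static int pwm_mode_reg_get(size_t idx, uint16_t *reg)
{
	if (idx >= ARRAY_SIZE(pwm_mode_reg))
		return -EINVAL;
	*reg = pwm_mode_reg[idx];
	return 0;
}
```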
Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Link: https://lore.kernel.org/r/20250312030832.106475-1-tasos@tasossah.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stanley Chu <yschu@nuvoton.com>
Date: Tue Mar 18 13:36:04 2025 +0800
i3c: master: svc: Fix missing the IBI rules
[ Upstream commit 9cecad134d84d14dc72a0eea7a107691c3e5a837 ]
The code does not add IBI rules for devices with controller capability.
However, the secondary controller has the controller capability and works
at target mode when the device is probed. Therefore, add IBI rules for
such devices.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver")
Signed-off-by: Stanley Chu <yschu@nuvoton.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250318053606.3087121-2-yschu@nuvoton.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maher Sanalla <msanalla@nvidia.com>
Date: Thu Mar 13 16:20:17 2025 +0200
IB/mad: Check available slots before posting receive WRs
[ Upstream commit 37826f0a8c2f6b6add5179003b8597e32a445362 ]
The ib_post_receive_mads() function handles posting receive work
requests (WRs) to MAD QPs and is called in two cases:
1) When a MAD port is opened.
2) When a receive WQE is consumed upon receiving a new MAD.
However, if MADs arrive during the port open phase, a race condition
might cause an extra WR to be posted, exceeding the QP’s capacity.
This leads to failures such as:
infiniband mlx5_0: ib_post_recv failed: -12
infiniband mlx5_0: Couldn't post receive WRs
infiniband mlx5_0: Couldn't start port
infiniband mlx5_0: Couldn't open port 1
Fix this by checking the current receive count before posting a new WR.
If the QP’s receive queue is full, do not post additional WRs.
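The "check before posting" idea can be sketched with a single-threaded
model. The structure and function names are hypothetical, and the locking
that a real driver needs around the counter is deliberately left out.

```c
#include <assert.h>

struct recv_queue {
	unsigned int count;	/* WRs currently posted */
	unsigned int max;	/* capacity of the receive queue */
};

/* Post WRs only while there are free slots; return how many were posted. */
static unsigned int post_receive(struct recv_queue *q)
{
	unsigned int posted = 0;

	while (q->count < q->max) {
		q->count++;	/* stands in for posting one WR */
		posted++;
	}
	return posted;
}

static void demo_post_receive(void)
{
	struct recv_queue q = { .count = 3, .max = 4 };

	assert(post_receive(&q) == 1);	/* one free slot gets filled */
	assert(post_receive(&q) == 0);	/* full queue: nothing is posted */
	assert(q.count == q.max);
}
```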
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Link: https://patch.msgid.link/c4984ba3c3a98a5711a558bccefcad789587ecf1.1741875592.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joe Damato <jdamato@fastly.com>
Date: Fri Oct 4 10:54:07 2024 +0000
idpf: Don't hard code napi_struct size
commit 49717ef01ce1b6dbe4cd12bee0fc25e086c555df upstream.
The sizeof(struct napi_struct) can change. Don't hardcode the size to
400 bytes and instead use "sizeof(struct napi_struct)".
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20241004105407.73585-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[Yifei: Linux-6.12.y still hard-codes the size of napi_struct, so
adding a member makes the entire build fail]
Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Emil Tantilov <emil.s.tantilov@intel.com>
Date: Mon Mar 17 22:42:02 2025 -0700
idpf: fix adapter NULL pointer dereference on reboot
[ Upstream commit 4c9106f4906a85f6b13542d862e423bcdc118cc3 ]
With SRIOV enabled, idpf ends up calling into idpf_remove() twice.
First via idpf_shutdown() and then again when idpf_remove() calls into
sriov_disable(), because the VF devices use the idpf driver, hence the
same remove routine. When that happens, it is possible for the adapter
to be NULL from the first call to idpf_remove(), leading to a NULL
pointer dereference.
echo 1 > /sys/class/net/<netif>/device/sriov_numvfs
reboot
BUG: kernel NULL pointer dereference, address: 0000000000000020
...
RIP: 0010:idpf_remove+0x22/0x1f0 [idpf]
...
? idpf_remove+0x22/0x1f0 [idpf]
? idpf_remove+0x1e4/0x1f0 [idpf]
pci_device_remove+0x3f/0xb0
device_release_driver_internal+0x19f/0x200
pci_stop_bus_device+0x6d/0x90
pci_stop_and_remove_bus_device+0x12/0x20
pci_iov_remove_virtfn+0xbe/0x120
sriov_disable+0x34/0xe0
idpf_sriov_configure+0x58/0x140 [idpf]
idpf_remove+0x1b9/0x1f0 [idpf]
idpf_shutdown+0x12/0x30 [idpf]
pci_device_shutdown+0x35/0x60
device_shutdown+0x156/0x200
...
Replace the direct idpf_remove() call in idpf_shutdown() with
idpf_vc_core_deinit() and idpf_deinit_dflt_mbx(), which perform
the bulk of the cleanup, such as stopping the init task, freeing IRQs,
destroying the vports and freeing the mailbox. This avoids the calls to
sriov_disable() in addition to a small netdev cleanup, and destroying
workqueues, which don't seem to be required on shutdown.
Reported-by: Yuying Ma <yuma@redhat.com>
Fixes: e850efed5e15 ("idpf: add module register and probe functionality")
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Date: Mon Feb 17 14:01:28 2025 +0000
iio: accel: mma8452: Ensure error return on failure to match oversampling ratio
[ Upstream commit df330c808182a8beab5d0f84a6cbc9cff76c61fc ]
If a match was not found, then the write_raw() callback would return
the odr index, not an error. Return -EINVAL if this occurs.
To avoid similar issues in future, introduce j, a new indexing variable
rather than using ret for this purpose.
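The pattern of a dedicated loop index plus an explicit error on "not found"
can be sketched like this. The table values and the function name are
assumptions made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table of supported ratios. */
static const int ratios[] = { 2, 4, 8, 16 };

static int find_ratio_index(int val)
{
	size_t j;	/* dedicated index, so no status variable is reused */

	for (j = 0; j < ARRAY_SIZE(ratios); j++) {
		if (ratios[j] == val)
			return (int)j;
	}
	return -EINVAL;	/* no match: report an error, not a stale index */
}
```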
Fixes: 79de2ee469aa ("iio: accel: mma8452: claim direct mode during write raw")
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250217140135.896574-2-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Date: Mon Feb 17 14:01:33 2025 +0000
iio: accel: msa311: Fix failure to release runtime pm if direct mode claim fails.
[ Upstream commit 60a0cf2ebab92011055ab7db6553c0fc3c546938 ]
Reorder the claiming of direct mode and runtime pm calls to simplify
handling a little. For correct error handling, after the reorder
iio_device_release_direct_mode() must be called if an error occurs
in pm_runtime_resume_and_get().
Fixes: 1ca2cfbc0c33 ("iio: add MEMSensing MSA311 3-axis accelerometer driver")
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250217140135.896574-7-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Date: Mon Mar 3 12:47:00 2025 +0100
iio: adc: ad4130: Fix comparison of channel setups
[ Upstream commit 280acb19824663d55a3f4d09087c76fabe86fa3c ]
Checking the binary representation of two structs (of the same type)
for equality doesn't have the same semantic as comparing all members for
equality. The former might find a difference where the latter doesn't in
the presence of padding or when ambiguous types like float or bool are
involved. (Floats typically have different representations for single
values, like -0.0 vs +0.0, or 0.5 * 2² vs 0.25 * 2³. The type bool has
at least 8 bits and the raw values 1 and 2 (probably) both evaluate to
true, but memcmp finds a difference.)
When searching for a channel that already has the configuration we need,
the comparison by member is the one that is needed.
Convert the comparison accordingly to compare the members one after
another. Also add a static_assert guard to (somewhat) ensure that when
struct ad4130_setup_info is expanded, the comparison is adapted, too.
This issue is somewhat theoretic, but using memcmp() on a struct is a
bad pattern that is worth fixing.
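A small user-space illustration of member-wise comparison with a size guard.
struct setup and its members are hypothetical and do not mirror the driver's
structure; the guard compares against an anonymous struct with the same
members so that it holds on any ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical setup description with padding between its members. */
struct setup {
	bool bipolar;
	unsigned int gain;
	bool buffered;
};

/* Guard: adding a member changes the size and forces a review. */
static_assert(sizeof(struct setup) ==
	      sizeof(struct { bool bipolar; unsigned int gain; bool buffered; }),
	      "struct setup changed: update setup_equal()");

static bool setup_equal(const struct setup *a, const struct setup *b)
{
	return a->bipolar == b->bipolar &&
	       a->gain == b->gain &&
	       a->buffered == b->buffered;
}

static void demo_setup_compare(void)
{
	struct setup a, b;

	/* Fill the padding bytes differently on purpose. */
	memset(&a, 0x00, sizeof(a));
	memset(&b, 0xff, sizeof(b));

	a.bipolar = b.bipolar = true;
	a.gain = b.gain = 8;
	a.buffered = b.buffered = false;

	/* memcmp(&a, &b, sizeof(a)) may report a difference here. */
	assert(setup_equal(&a, &b));

	b.gain = 16;
	assert(!setup_equal(&a, &b));
}
```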
Fixes: 62094060cf3a ("iio: adc: ad4130: add AD4130 driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250303114659.1672695-12-u.kleine-koenig@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Date: Mon Mar 3 12:47:01 2025 +0100
iio: adc: ad7124: Fix comparison of channel configs
[ Upstream commit 05a5d874f7327b75e9bc4359618017e047cc129c ]
Checking the binary representation of two structs (of the same type)
for equality doesn't have the same semantic as comparing all members for
equality. The former might find a difference where the latter doesn't in
the presence of padding or when ambiguous types like float or bool are
involved. (Floats typically have different representations for single
values, like -0.0 vs +0.0, or 0.5 * 2² vs 0.25 * 2³. The type bool has
at least 8 bits and the raw values 1 and 2 (probably) both evaluate to
true, but memcmp finds a difference.)
When searching for a channel that already has the configuration we need,
the comparison by member is the one that is needed.
Convert the comparison accordingly to compare the members one after
another. Also add a static_assert guard to (somewhat) ensure that when
struct ad7124_channel_config::config_props is expanded, the comparison
is adapted, too.
This issue is somewhat theoretic, but using memcmp() on a struct is a
bad pattern that is worth fixing.
Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250303114659.1672695-13-u.kleine-koenig@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Date: Mon Mar 3 12:47:02 2025 +0100
iio: adc: ad7173: Fix comparison of channel configs
[ Upstream commit 7b6033ed5a9e1a369a9cf58018388ae4c5f17e41 ]
Checking the binary representation of two structs (of the same type)
for equality doesn't have the same semantic as comparing all members for
equality. The former might find a difference where the latter doesn't in
the presence of padding or when ambiguous types like float or bool are
involved. (Floats typically have different representations for single
values, like -0.0 vs +0.0, or 0.5 * 2² vs 0.25 * 2³. The type bool has
at least 8 bits and the raw values 1 and 2 (probably) both evaluate to
true, but memcmp finds a difference.)
When searching for a channel that already has the configuration we need,
the comparison by member is the one that is needed.
Convert the comparison accordingly to compare the members one after
another. Also add a static_assert guard to (somewhat) ensure that when
struct ad7173_channel_config::config_props is expanded, the comparison
is adapted, too.
This issue is somewhat theoretic, but using memcmp() on a struct is a
bad pattern that is worth fixing.
Fixes: 76a1e6a42802 ("iio: adc: ad7173: add AD7173 driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250303114659.1672695-14-u.kleine-koenig@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan Santos <Jonathan.Santos@analog.com>
Date: Thu Mar 6 18:00:43 2025 -0300
iio: adc: ad7768-1: set MOSI idle state to prevent accidental reset
[ Upstream commit 2416cec859299be04d021b4cf98eff814f345af7 ]
The datasheet recommends setting the MOSI idle state to high in order to
prevent accidental reset of the device when SCLK is free running.
This happens when the controller clocks out a 1 followed by 63 zeros
while the CS is held low.
Check if SPI controller supports SPI_MOSI_IDLE_HIGH flag and set it.
Fixes: a5f8c7da3dbe ("iio: adc: Add AD7768-1 ADC basic support")
Signed-off-by: Jonathan Santos <Jonathan.Santos@analog.com>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
Link: https://patch.msgid.link/c2a2b0f3d54829079763a5511359a1fa80516cfb.1741268122.git.Jonathan.Santos@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nuno Sá <nuno.sa@analog.com>
Date: Tue Feb 18 10:31:25 2025 +0000
iio: backend: make sure to NULL terminate stack buffer
[ Upstream commit 035b4989211dc1c8626e186d655ae8ca5141bb73 ]
Make sure to NULL terminate the buffer in
iio_backend_debugfs_write_reg() before passing it to sscanf(). It is a
stack variable, so we should not assume it will be zero-initialized.
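The same pattern in user-space C, as a sketch: copy a byte range that carries
no terminator into a stack buffer, terminate it explicitly, and only then
parse it. The function name, the buffer size and the "<reg> <val>" input
format are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse "<reg> <val>" (hex) from 'count' bytes at 'src'. The input is
 * not NUL-terminated, as is typical for data copied from user space.
 */
static int parse_reg_write(const char *src, size_t count,
			   unsigned int *reg, unsigned int *val)
{
	char buf[50];

	if (count >= sizeof(buf))
		return -1;

	memcpy(buf, src, count);
	buf[count] = '\0';	/* stack memory is not zeroed for us */

	return sscanf(buf, "%x %x", reg, val) == 2 ? 0 : -1;
}

static void demo_parse_reg_write(void)
{
	unsigned int reg = 0, val = 0;

	/* Only the first 9 bytes belong to the request. */
	assert(parse_reg_write("0x10 0x2fXYZ", 9, &reg, &val) == 0);
	assert(reg == 0x10);
	assert(val == 0x2f);
}
```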
Fixes: cdf01e0809a4 ("iio: backend: add debugFs interface")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250218-dev-iio-misc-v1-1-bf72b20a1eb8@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Karan Sanghavi <karansanghvi98@gmail.com>
Date: Thu Feb 20 17:34:36 2025 +0000
iio: light: Add check for array bounds in veml6075_read_int_time_ms
[ Upstream commit ee735aa33db16c1fb5ebccbaf84ad38f5583f3cc ]
The array contains only 5 elements, but the index calculated by
veml6075_read_int_time_index can range from 0 to 7,
which could lead to out-of-bounds access. The check prevents this issue.
Coverity Issue
CID 1574309: (#1 of 1): Out-of-bounds read (OVERRUN)
overrun-local: Overrunning array veml6075_it_ms of 5 4-byte
elements at element index 7 (byte offset 31) using
index int_index (which evaluates to 7)
This is hardening against potentially broken hardware. Good to have
but not necessary to backport.
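A minimal sketch of such a bounds check, assuming a 3-bit register field
and a five-entry lookup table. The table values and the function name
read_int_time_ms() are illustrative and not taken from the driver.

```c
#include <errno.h>
#include <stddef.h>

/* Five valid integration times in ms (illustrative values). */
static const int it_ms[] = { 50, 100, 200, 400, 800 };

/*
 * A 3-bit field can encode 0..7, but only 0..4 index into the table.
 * Reject everything else instead of reading past the end of the array.
 */
static int read_int_time_ms(unsigned int reg_field)
{
	unsigned int idx = reg_field & 0x7;

	if (idx >= sizeof(it_ms) / sizeof(it_ms[0]))
		return -EINVAL;
	return it_ms[idx];
}
```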
Fixes: 3b82f43238ae ("iio: light: add VEML6075 UVA and UVB light sensor driver")
Signed-off-by: Karan Sanghavi <karansanghvi98@gmail.com>
Reviewed-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/Z7dnrEpKQdRZ2qFU@Emma
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ido Schimmel <idosch@nvidia.com>
Date: Wed Apr 2 14:42:24 2025 +0300
ipv6: Do not consider link down nexthops in path selection
[ Upstream commit 8b8e0dd357165e0258d9f9cdab5366720ed2f619 ]
Nexthops whose link is down are not supposed to be considered during
path selection when the "ignore_routes_with_linkdown" sysctl is set.
This is done by assigning them a negative region boundary.
However, when comparing the computed hash (unsigned) with the region
boundary (signed), the negative region boundary is treated as unsigned,
resulting in incorrect nexthop selection.
Fix by treating the computed hash as signed. Note that the computed hash
is always in range of [0, 2^31 - 1].
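The effect of the mixed-sign comparison can be reproduced in a few lines
of standalone C. The two helper names are hypothetical, and the explicit
cast in the first one spells out what the implicit conversion does in the
buggy comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Buggy form: the signed boundary is converted to unsigned, so a
 * "disabled" boundary of -1 turns into UINT32_MAX and matches any hash.
 */
static bool selects_with_unsigned_compare(uint32_t hash, int32_t upper_bound)
{
	return hash <= (uint32_t)upper_bound;
}

/*
 * Fixed form: treat the hash as signed. This is safe because the hash is
 * always within [0, 2^31 - 1], so the conversion never changes its value.
 */
static bool selects_with_signed_compare(uint32_t hash, int32_t upper_bound)
{
	return (int32_t)hash <= upper_bound;
}
```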
Fixes: 3d709f69a3e7 ("ipv6: Use hash-threshold instead of modulo-N")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250402114224.293392-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fernando Fernandez Mancera <ffmancera@riseup.net>
Date: Wed Apr 2 14:17:51 2025 +0200
ipv6: fix omitted netlink attributes when using RTEXT_FILTER_SKIP_STATS
[ Upstream commit 7ac6ea4a3e0898db76aecccd68fb2c403eb7d24e ]
Using RTEXT_FILTER_SKIP_STATS is incorrectly skipping non-stats IPv6
netlink attributes on link dump. This causes issues in userspace tools,
e.g. iproute2 does not render the address generation mode as it should due
to the missing netlink attribute.
Move the filling of IFLA_INET6_STATS and IFLA_INET6_ICMP6STATS to a
helper function guarded by a flag check to avoid hitting the same
situation in the future.
Fixes: d5566fd72ec1 ("rtnetlink: RTEXT_FILTER_SKIP_STATS support to avoid dumping inet/inet6 stats")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250402121751.3108-1-ffmancera@riseup.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ido Schimmel <idosch@nvidia.com>
Date: Wed Apr 2 14:42:23 2025 +0300
ipv6: Start path selection from the first nexthop
[ Upstream commit 4d0ab3a6885e3e9040310a8d8f54503366083626 ]
Cited commit transitioned IPv6 path selection to use hash-threshold
instead of modulo-N. With hash-threshold, each nexthop is assigned a
region boundary in the multipath hash function's output space and a
nexthop is chosen if the calculated hash is smaller than the nexthop's
region boundary.
Hash-threshold does not work correctly if path selection does not start
with the first nexthop. For example, if fib6_select_path() is always
passed the last nexthop in the group, then it will always be chosen
because its region boundary covers the entire hash function's output
space.
Fix this by starting the selection process from the first nexthop and by
not considering nexthops for which rt6_score_route() provided a negative
score.
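A simplified model of hash-threshold selection shows why the starting
point matters. struct nexthop_region and select_path() are hypothetical
names, a negative boundary stands in for a nexthop that must be skipped,
and the comparison is written as "not greater than the boundary" as an
assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Each nexthop owns the hash range up to and including upper_bound. */
struct nexthop_region {
	int32_t upper_bound;	/* negative: nexthop must not be selected */
};

/*
 * Walk the group from index 'start' and return the first usable nexthop
 * whose region covers 'hash', or -1 if none does.
 */
static int select_path(const struct nexthop_region *nh, size_t n,
		       int32_t hash, size_t start)
{
	for (size_t i = start; i < n; i++) {
		if (nh[i].upper_bound < 0)
			continue;
		if (hash <= nh[i].upper_bound)
			return (int)i;
	}
	return -1;
}
```

With three equal regions, a small hash belongs to index 0, yet a walk
that starts at the last index always returns the last nexthop, because
its boundary covers the whole output space.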
Fixes: 3d709f69a3e7 ("ipv6: Use hash-threshold instead of modulo-N")
Reported-by: Stanislav Fomichev <stfomichev@gmail.com>
Closes: https://lore.kernel.org/netdev/Z9RIyKZDNoka53EO@mini-arch/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250402114224.293392-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qasim Ijaz <qasdev00@gmail.com>
Date: Tue Feb 11 19:59:00 2025 +0000
isofs: fix KMSAN uninit-value bug in do_isofs_readdir()
[ Upstream commit 81a82e8f33880793029cd6f8a766fb13b737e6a7 ]
In do_isofs_readdir(), when assigning the variable
"struct iso_directory_record *de", the b_data field of the buffer_head
is accessed and an offset is added to it. The size of b_data is 2048
and the offset is 2047, meaning
"de = (struct iso_directory_record *) (bh->b_data + offset);"
points at the final byte of the 2048-byte b_data block.
The first byte of the directory record (de_len) is then read and
found to be 31, meaning the directory record is 31 bytes long.
The directory record is defined by the structure:
struct iso_directory_record {
__u8 length; // 1 byte
__u8 ext_attr_length; // 1 byte
__u8 extent[8]; // 8 bytes
__u8 size[8]; // 8 bytes
__u8 date[7]; // 7 bytes
__u8 flags; // 1 byte
__u8 file_unit_size; // 1 byte
__u8 interleave; // 1 byte
__u8 volume_sequence_number[4]; // 4 bytes
__u8 name_len; // 1 byte
char name[]; // variable size
} __attribute__((packed));
The fixed portion of this structure occupies 33 bytes. Therefore, a
valid directory record must be at least 33 bytes long
(even without considering the variable-length name field).
Since de_len is only 31, it is insufficient to contain
the complete fixed header.
The code later hits the following sanity check that
compares de_len against the sum of de->name_len and
sizeof(struct iso_directory_record):
if (de_len < de->name_len[0] + sizeof(struct iso_directory_record)) {
...
}
Since the fixed portion of the structure is
33 bytes (up to and including name_len member),
a valid record should have de_len of at least 33 bytes;
here, however, de_len is too short, and the field de->name_len
(located at offset 32) is accessed even though it lies beyond
the available 31 bytes.
This access on the corrupted isofs data triggers a KMSAN uninitialized
memory warning. The fix is to first verify that de_len is at least
sizeof(struct iso_directory_record) before accessing any
fields like de->name_len.
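The ordering of the checks can be sketched in standalone C. struct
dir_record mirrors the layout quoted above, while record_is_valid() and
its "avail" parameter are hypothetical and only model a single contiguous
buffer, not the block-spanning logic of the real code.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the structure quoted above: 33 fixed bytes plus name. */
struct dir_record {
	uint8_t length;
	uint8_t ext_attr_length;
	uint8_t extent[8];
	uint8_t size[8];
	uint8_t date[7];
	uint8_t flags;
	uint8_t file_unit_size;
	uint8_t interleave;
	uint8_t volume_sequence_number[4];
	uint8_t name_len;
	char name[];
} __attribute__((packed));

/* Returns 1 if the record at 'buf' is consistent, 0 otherwise. */
static int record_is_valid(const uint8_t *buf, size_t avail)
{
	const struct dir_record *de = (const struct dir_record *)buf;
	size_t de_len;

	if (avail < 1)
		return 0;
	de_len = de->length;

	/* First make sure the fixed header fits inside the record ... */
	if (de_len < sizeof(struct dir_record))
		return 0;
	if (de_len > avail)
		return 0;

	/* ... and only then read name_len, which lives at offset 32. */
	if (de_len < de->name_len + sizeof(struct dir_record))
		return 0;
	return 1;
}
```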
Reported-by: syzbot <syzbot+812641c6c3d7586a1613@syzkaller.appspotmail.com>
Tested-by: syzbot <syzbot+812641c6c3d7586a1613@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=812641c6c3d7586a1613
Fixes: 2deb1acc653c ("isofs: fix access to unallocated memory when reading corrupted filesystem")
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250211195900.42406-1-qasdev00@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Roman Smirnov <r.smirnov@omp.ru>
Date: Wed Feb 26 11:25:22 2025 +0300
jfs: add index corruption check to DT_GETPAGE()
commit a8dfb2168906944ea61acfc87846b816eeab882d upstream.
If the file system is corrupted, the header.stblindex variable
may become greater than 127. Because of this, an array access out
of bounds may occur:
------------[ cut here ]------------
UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dtree.c:3096:10
index 237 is out of range for type 'struct dtslot[128]'
CPU: 0 UID: 0 PID: 5822 Comm: syz-executor740 Not tainted 6.13.0-rc4-syzkaller-00110-g4099a71718b0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429
dtReadFirst+0x622/0xc50 fs/jfs/jfs_dtree.c:3096
dtReadNext fs/jfs/jfs_dtree.c:3147 [inline]
jfs_readdir+0x9aa/0x3c50 fs/jfs/jfs_dtree.c:2862
wrap_directory_iterator+0x91/0xd0 fs/readdir.c:65
iterate_dir+0x571/0x800 fs/readdir.c:108
__do_sys_getdents64 fs/readdir.c:403 [inline]
__se_sys_getdents64+0x1e2/0x4b0 fs/readdir.c:389
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
---[ end trace ]---
Add a stblindex check for corruption.
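As a rough illustration of such a check, assuming a page with 128 slots
as in the UBSAN report above. The names DT_SLOTS and check_stblindex()
are made up for this sketch and are not the kernel's.

```c
#include <errno.h>
#include <stdint.h>

/* Matches the 'struct dtslot[128]' array size from the report above. */
#define DT_SLOTS 128

/*
 * Validate an on-disk slot index before it is used to index the slot
 * array. A corrupted value such as 237 is rejected with -EIO.
 */
static int check_stblindex(uint8_t stblindex)
{
	if (stblindex >= DT_SLOTS)
		return -EIO;
	return 0;
}
```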
Reported-by: syzbot <syzbot+9120834fc227768625ba@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=9120834fc227768625ba
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Qasim Ijaz <qasdev00@gmail.com>
Date: Thu Feb 13 21:05:53 2025 +0000
jfs: fix slab-out-of-bounds read in ea_get()
commit fdf480da5837c23b146c4743c18de97202fcab37 upstream.
During the "size_check" label in ea_get(), the code checks if the extended
attribute list (xattr) size matches ea_size. If not, it logs
"ea_get: invalid extended attribute" and calls print_hex_dump().
Here, EALIST_SIZE(ea_buf->xattr) returns 4110417968, which exceeds
INT_MAX (2,147,483,647). Then ea_size is clamped:
int size = clamp_t(int, ea_size, 0, EALIST_SIZE(ea_buf->xattr));
Although clamp_t aims to bound ea_size between 0 and 4110417968, the upper
limit is treated as an int, causing an overflow above 2^31 - 1. This leads
"size" to wrap around and become negative (-184549328).
The "size" is then passed to print_hex_dump() (where it is called "len")
as type size_t (an unsigned type). Inside print_hex_dump() it is stored
in "int remaining", which is assigned to "int linelen", which is in turn
passed to hex_dump_to_buffer(). In print_hex_dump() the for loop
iterates from 0 to len-1, where len is 18446744073525002176, calling
hex_dump_to_buffer() on each iteration:
for (i = 0; i < len; i += rowsize) {
linelen = min(remaining, rowsize);
remaining -= rowsize;
hex_dump_to_buffer(ptr + i, linelen, rowsize, groupsize,
linebuf, sizeof(linebuf), ascii);
...
}
The expected stopping condition (i < len) is effectively broken
since len is corrupted and very large. As a result, the "ptr+i" passed
to hex_dump_to_buffer() moves ever closer to the end of the actual
bounds of "ptr", until an out-of-bounds access eventually happens in
hex_dump_to_buffer() in the following for loop:
for (j = 0; j < len; j++) {
if (linebuflen < lx + 2)
goto overflow2;
ch = ptr[j];
...
}
To fix this we should validate "EALIST_SIZE(ea_buf->xattr)"
before it is utilised.
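The wrap-around and one possible form of the validation can be modelled
in userspace. clamp_as_int() is a stand-in for clamp_t(int, ...) and
dump_len() is a hypothetical helper, so this is a sketch of the idea
rather than the actual patch. The unsigned-to-signed conversions rely on
the wrapping behaviour of GCC and Clang.

```c
#include <stdint.h>

/*
 * Stand-in for clamp_t(int, val, lo, hi): all three operands are first
 * converted to a 32-bit signed type, then val is clamped to [lo, hi].
 */
static int32_t clamp_as_int(uint32_t val, uint32_t lo, uint32_t hi)
{
	int32_t v = (int32_t)val;
	int32_t l = (int32_t)lo;
	int32_t h = (int32_t)hi;	/* 4110417968 becomes -184549328 */

	if (v < l)
		v = l;
	if (v > h)
		v = h;
	return v;
}

/*
 * Hypothetical guarded variant: refuse to compute a dump length when the
 * list size cannot be represented as a non-negative int.
 */
static int32_t dump_len(uint32_t ea_size, uint32_t list_size)
{
	if (list_size > (uint32_t)INT32_MAX)
		return -1;
	return clamp_as_int(ea_size, 0, list_size);
}
```

With the values from this report, the unguarded clamp returns
-184549328, while the guarded variant rejects the size outright.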
Reported-by: syzbot <syzbot+4e6e7e4279d046613bc5@syzkaller.appspotmail.com>
Tested-by: syzbot <syzbot+4e6e7e4279d046613bc5@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=4e6e7e4279d046613bc5
Fixes: d9f9d96136cb ("jfs: xattr: check invalid xattr size more strictly")
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alexandru Gagniuc <alexandru.gagniuc@hp.com>
Date: Fri Mar 14 13:10:53 2025 +0000
kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
[ Upstream commit 62604063621fb075c7966286bdddcb057d883fa8 ]
In ThinPro, we use the convention <upstream_ver>+hp<patchlevel> for
the kernel package. This does not have a dash in the name or version.
This is built by editing ".version" before a build, and setting
EXTRAVERSION="+hp" and KDEB_PKGVERSION make variables:
echo 68 > .version
make -j<n> EXTRAVERSION="+hp" bindeb-pkg KDEB_PKGVERSION=6.12.2+hp69
.deb name: linux-image-6.12.2+hp_6.12.2+hp69_amd64.deb
Since commit 7d4f07d5cb71 ("kbuild: deb-pkg: squash
scripts/package/deb-build-option to debian/rules"), this no longer
works. The deb build logic changed, even though the commit message
implies that the logic should be unmodified.
Before, KBUILD_BUILD_VERSION was not set if the KDEB_PKGVERSION did
not contain a dash. After the change KBUILD_BUILD_VERSION is always
set to KDEB_PKGVERSION. Since this determines UTS_VERSION, the uname
output looks off:
(now) uname -a: version 6.12.2+hp ... #6.12.2+hp69
(expected) uname -a: version 6.12.2+hp ... #69
Update the debian/rules logic to restore the original behavior.
Fixes: 7d4f07d5cb71 ("kbuild: deb-pkg: squash scripts/package/deb-build-option to debian/rules")
Signed-off-by: Alexandru Gagniuc <alexandru.gagniuc@hp.com>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Hildenbrand <david@redhat.com>
Date: Mon Feb 10 20:37:50 2025 +0100
kernel/events/uprobes: handle device-exclusive entries correctly in __replace_page()
[ Upstream commit 096cbb80ab3fd85a9035ec17a1312c2a7db8bc8c ]
Ever since commit b756a3b5e7ea ("mm: device exclusive memory access") we
can return with a device-exclusive entry from page_vma_mapped_walk().
__replace_page() is not prepared for that, so teach it about these PFN
swap PTEs. Note that device-private entries are so far not applicable on
that path, because GUP would never have returned such folios (conversion
to device-private happens by page migration, not in-place conversion of
the PTE).
There is a race between GUP and us locking the folio to look it up using
page_vma_mapped_walk(), so this is likely a fix (unless something else
could prevent that race, but it doesn't look like it). pte_pfn() on
something that is not a present pte could give us garbage, and we'd
wrongly mess up the mapcount because it was already adjusted by calling
folio_remove_rmap_pte() when making the entry device-exclusive.
Link: https://lkml.kernel.org/r/20250210193801.781278-9-david@redhat.com
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Lyude <lyude@redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yanteng Si <si.yanteng@linux.dev>
Cc: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Fri Jan 31 17:08:24 2025 +0530
kexec: initialize ELF lowest address to ULONG_MAX
[ Upstream commit 9986fb5164c8b21f6439cfd45ba36d8cc80c9710 ]
Patch series "powerpc/crash: use generic crashkernel reservation", v3.
Commit 0ab97169aa05 ("crash_core: add generic function to do reservation")
added a generic function to reserve crashkernel memory. So let's use the
same function on powerpc and remove the architecture-specific code that
essentially does the same thing.
The generic crashkernel reservation also provides a way to split the
crashkernel reservation into high and low memory reservations, which can
be enabled for powerpc in the future.
Additionally move powerpc to use generic APIs to locate memory hole for
kexec segments while loading kdump kernel.
This patch (of 7):
kexec_elf_load() loads an ELF executable and sets the address of the
lowest PT_LOAD section to the address held by the lowest_load_addr
function argument.
To determine the lowest PT_LOAD address, a local variable lowest_addr
(type unsigned long) is initialized to UINT_MAX. After loading each
PT_LOAD, its address is compared to lowest_addr. If a loaded PT_LOAD
address is lower, lowest_addr is updated. However, setting lowest_addr to
UINT_MAX won't work when the kernel image is loaded above 4G, as the
returned lowest PT_LOAD address would be invalid. This is resolved by
initializing lowest_addr to ULONG_MAX instead.
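The difference between the two initial values is easy to demonstrate on
a platform where unsigned long is 64 bits wide. lowest_pt_load() is a
hypothetical helper that takes the initial value as a parameter so both
cases can be compared.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return the lowest address in 'addrs', starting the search from 'init'.
 * If 'init' is smaller than every address, it is returned unchanged,
 * which is exactly the failure mode described above.
 */
static unsigned long lowest_pt_load(const unsigned long *addrs, size_t n,
				    unsigned long init)
{
	unsigned long lowest = init;

	for (size_t i = 0; i < n; i++) {
		if (addrs[i] < lowest)
			lowest = addrs[i];
	}
	return lowest;
}
```

For segments loaded at 0x180000000 and 0x140000000, seeding with UINT_MAX
yields 0xffffffff, which is not the address of any segment, whereas
seeding with ULONG_MAX yields 0x140000000.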
This issue was discovered while implementing crashkernel high/low
reservation on the PowerPC architecture.
Link: https://lkml.kernel.org/r/20250131113830.925179-1-sourabhjain@linux.ibm.com
Link: https://lkml.kernel.org/r/20250131113830.925179-2-sourabhjain@linux.ibm.com
Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()")
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Norbert Szetei <norbert@doyensec.com>
Date: Sat Mar 15 12:19:28 2025 +0900
ksmbd: add bounds check for create lease context
commit bab703ed8472aa9d109c5f8c1863921533363dae upstream.
Add missing bounds check for create lease context.
Cc: stable@vger.kernel.org
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Fri Mar 14 18:21:47 2025 +0900
ksmbd: add bounds check for durable handle context
commit 542027e123fc0bfd61dd59e21ae0ee4ef2101b29 upstream.
Add missing bounds check for durable handle context.
Cc: stable@vger.kernel.org
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Mon Mar 24 20:19:20 2025 +0900
ksmbd: fix multichannel connection failure
[ Upstream commit c1883049aa9b2b7dffd3a68c5fc67fa92c174bd9 ]
ksmbd checks whether the session of the second channel is in the session
list of the first connection. If it is in the session list, the
multichannel connection should not be allowed.
Fixes: b95629435b84 ("ksmbd: fix racy issue from session lookup and expire")
Reported-by: Sean Heelan <seanheelan@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Wed Apr 2 09:11:23 2025 +0900
ksmbd: fix null pointer dereference in alloc_preauth_hash()
commit c8b5b7c5da7d0c31c9b7190b4a7bba5281fc4780 upstream.
The client sends a malformed smb2 negotiate request and ksmbd returns an
error response. Subsequently, the client can send smb2 session setup even
though conn->preauth_info is not allocated.
This patch adds a KSMBD_SESS_NEED_SETUP connection status to ignore
session setup requests if the smb2 negotiate phase is not complete.
Cc: stable@vger.kernel.org
Tested-by: Steve French <stfrench@microsoft.com>
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-26505
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Norbert Szetei <norbert@doyensec.com>
Date: Sat Mar 29 06:58:15 2025 +0000
ksmbd: fix overflow in dacloffset bounds check
commit beff0bc9d69bc8e733f9bca28e2d3df5b3e10e42 upstream.
The dacloffset field was originally typed as int and used in an
unchecked addition, which could overflow and bypass the existing
bounds check in both smb_check_perm_dacl() and smb_inherit_dacl().
This could result in out-of-bounds memory access and a kernel crash
when dereferencing the DACL pointer.
This patch converts dacloffset to unsigned int and uses
check_add_overflow() to validate access to the DACL.
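In userspace the same idea can be sketched with the GCC/Clang builtin
__builtin_add_overflow(), used here as an assumption in place of the
kernel's check_add_overflow(). dacl_in_bounds() and its parameters are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true only if the range [offset, offset + need) lies inside a
 * buffer of 'total' bytes. The addition is checked, so a huge offset
 * cannot wrap around to a small, seemingly valid end position.
 */
static bool dacl_in_bounds(uint32_t offset, uint32_t need, uint32_t total)
{
	uint32_t end;

	if (__builtin_add_overflow(offset, need, &end))
		return false;
	return end <= total;
}
```

An unchecked "offset + need <= total" would accept offset 0xfffffff8 with
need 16, because the sum wraps to 8.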
Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Tue Mar 25 00:00:24 2025 +0900
ksmbd: fix r_count dec/increment mismatch
[ Upstream commit ddb7ea36ba7129c2ed107e2186591128618864e1 ]
r_count is only increased when there is an oplock break wait,
so r_count inc/decrement are not paired. This can cause r_count
to become negative, which can lead to a problem where the ksmbd
thread does not terminate.
Fixes: 3aa660c05924 ("ksmbd: prevent connection release during oplock break notification")
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Thu Mar 27 21:22:51 2025 +0900
ksmbd: fix session use-after-free in multichannel connection
commit fa4cdb8cbca7d6cb6aa13e4d8d83d1103f6345db upstream.
There is a race condition between session setup and
ksmbd_sessions_deregister. The session can be freed before the connection
is added to the channel list of the session.
This patch checks the reference count of the session before freeing it.
Cc: stable@vger.kernel.org
Reported-by: Sean Heelan <seanheelan@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Sat Mar 22 09:20:19 2025 +0900
ksmbd: fix use-after-free in ksmbd_sessions_deregister()
commit 15a9605f8d69dc85005b1a00c31a050b8625e1aa upstream.
In multichannel mode, a UAF issue can occur in session_deregister
when the second channel sets up a session through the connection of
the first channel. A session that is freed through the global session
table can be accessed again through ->sessions of the connection.
Cc: stable@vger.kernel.org
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Miaoqian Lin <linmq006@gmail.com>
Date: Tue Mar 18 20:12:34 2025 +0800
ksmbd: use aead_request_free to match aead_request_alloc
[ Upstream commit 6171063e9d046ffa46f51579b2ca4a43caef581a ]
Use aead_request_free() instead of kfree() to properly free memory
allocated by aead_request_alloc(). This ensures sensitive crypto data
is zeroed before being freed.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Norbert Szetei <norbert@doyensec.com>
Date: Sat Mar 29 16:06:01 2025 +0000
ksmbd: validate zero num_subauth before sub_auth is accessed
commit bf21e29d78cd2c2371023953d9c82dfef82ebb36 upstream.
Accessing psid->sub_auth[psid->num_subauth - 1] without checking
if num_subauth is non-zero leads to an out-of-bounds read.
This patch adds a validation step to ensure num_subauth != 0
before sub_auth is accessed.
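A small standalone model of the validation, assuming a SID-like structure
with a fixed maximum of 15 sub-authorities. struct sid_model,
MAX_SUBAUTHS and last_subauth() are illustrative names, and the upper
bound check is an addition of this sketch.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_SUBAUTHS 15	/* assumed maximum for this sketch */

struct sid_model {
	uint8_t num_subauth;
	uint32_t sub_auth[MAX_SUBAUTHS];
};

/*
 * Fetch the last sub-authority. With num_subauth == 0 the index
 * num_subauth - 1 would be -1, so the count is validated first.
 */
static int last_subauth(const struct sid_model *s, uint32_t *out)
{
	if (s->num_subauth == 0 || s->num_subauth > MAX_SUBAUTHS)
		return -EINVAL;

	*out = s->sub_auth[s->num_subauth - 1];
	return 0;
}
```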
Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kees Cook <kees@kernel.org>
Date: Tue Mar 4 14:56:11 2025 -0800
kunit/stackinit: Use fill byte different from Clang i386 pattern
[ Upstream commit d985e4399adffb58e10b38dbb5479ef29d53cde6 ]
The byte initialization values used with -ftrivial-auto-var-init=pattern
(CONFIG_INIT_STACK_ALL_PATTERN=y) depend on the compiler, architecture,
and byte position relative to struct member types. On i386 with Clang,
this includes the 0xFF value, which means it looks like nothing changes
between the leaf byte filling pass and the expected "stack wiping"
pass of the stackinit test.
Use the byte fill value of 0x99 instead, fixing the test for i386 Clang
builds.
Reported-by: ernsteiswuerfel
Closes: https://github.com/ClangBuiltLinux/linux/issues/2071
Fixes: 8c30d32b1a32 ("lib/test_stackinit: Handle Clang auto-initialization pattern")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250304225606.work.030-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sean Christopherson <seanjc@google.com>
Date: Wed Feb 26 17:25:35 2025 -0800
KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error
commit d26638bfcdfc5c8c4e085dc3f5976a0443abab3c upstream.
If KVM rejects an AP Creation event, leave the target vCPU state as-is.
Nothing in the GHCB suggests the hypervisor is *allowed* to muck with vCPU
state on failure, let alone required to do so. Furthermore, kicking only
in the !ON_INIT case leads to divergent behavior, and even the "kick" case
is non-deterministic.
E.g. if an ON_INIT request fails, the guest can successfully retry if the
fixed AP Creation request is made prior to sending INIT. And if a !ON_INIT
fails, the guest can successfully retry if the fixed AP Creation request is
handled before the target vCPU processes KVM's
KVM_REQ_UPDATE_PROTECTED_GUEST_STATE.
Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Link: https://lore.kernel.org/r/20250227012541.3234589-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 6 21:29:22 2025 +0100
KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected
commit 74c1807f6c4feddb3c3cb1056c54531d4adbaea6 upstream.
KVM_CAP_SYNC_REGS does not make sense for VMs with protected guest state,
since the register values cannot actually be written. Return 0
when using the VM-level KVM_CHECK_EXTENSION ioctl, and accordingly
return -EINVAL from KVM_RUN if the valid/dirty fields are nonzero.
However, on exit from KVM_RUN userspace could have placed a nonzero
value into kvm_run->kvm_valid_regs, so check guest_state_protected
again and skip store_regs() in that case.
Cc: stable@vger.kernel.org
Fixes: 517987e3fb19 ("KVM: x86: add fields to struct kvm_arch for CoCo features")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20250306202923.646075-1-pbonzini@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Remi Pommarel <repk@triplefau.lt>
Date: Thu Feb 20 12:23:17 2025 +0100
leds: Fix LED_OFF brightness race
[ Upstream commit 2c70953b6f535f7698ccbf22c1f5ba26cb6c2816 ]
While commit fa15d8c69238 ("leds: Fix set_brightness_delayed() race")
successfully forces led_set_brightness() to be called with LED_OFF at
least once when switching from blinking to LED on state so that
hw-blinking can be disabled, another race remains. Indeed, in a
led_set_brightness(LED_OFF) followed by led_set_brightness(any)
scenario, the following CPU scheduling can happen:
    CPU0                                     CPU1
    ----                                     ----
 set_brightness_delayed() {
   test_and_clear_bit(BRIGHTNESS_OFF)
                                         led_set_brightness(LED_OFF) {
                                           set_bit(BRIGHTNESS_OFF)
                                           queue_work()
                                         }
                                         led_set_brightness(any) {
                                           set_bit(BRIGHTNESS)
                                           queue_work() //already queued
                                         }
   test_and_clear_bit(BRIGHTNESS)
     /* LED set with brightness any */
 }
 /* From previous CPU1 queue_work() */
 set_brightness_delayed() {
   test_and_clear_bit(BRIGHTNESS_OFF)
     /* LED turned off */
   test_and_clear_bit(BRIGHTNESS)
     /* Clear from previous run, LED remains off */
In that case the led_set_brightness(LED_OFF)/led_set_brightness(any)
sequence will be effectively executed in reverse order and the LED will
remain off.
With the introduction of commit 32360bf6a5d4 ("leds: Introduce ordered
workqueue for LEDs events instead of system_wq") the race is easier to
trigger as sysfs brightness configuration does not wait for
set_brightness_delayed() work to finish (flush_work() removal).
Use delayed_set_value to optionally re-configure brightness after a
LED_OFF. That way a LED state could be configured more than once but the
final state will always be as expected. Ensure that delayed_set_value
modification is seen before set_bit() using smp_mb__before_atomic().
Fixes: fa15d8c69238 ("leds: Fix set_brightness_delayed() race")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/19c81177059dab7b656c42063958011a8e4d1a66.1740050412.git.repk@triplefau.lt
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Date: Tue Jan 14 19:42:04 2025 +0530
lib: 842: Improve error handling in sw842_compress()
[ Upstream commit af324dc0e2b558678aec42260cce38be16cc77ca ]
The static code analysis tool "Coverity Scan" pointed the following
implementation details out for further development considerations:
CID 1309755: Unused value
In sw842_compress: A value assigned to a variable is never used. (CWE-563)
returned_value: Assigning value from add_repeat_template(p, repeat_count)
to ret here, but that stored value is overwritten before it can be used.
Conclusion:
Add error handling for the return value from an add_repeat_template()
call.
Fixes: 2da572c959dd ("lib: add software 842 compression/decompression")
Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andrii Nakryiko <andrii@kernel.org>
Date: Wed Feb 19 16:28:21 2025 -0800
libbpf: Fix hypothetical STT_SECTION extern NULL deref case
[ Upstream commit e0525cd72b5979d8089fe524a071ea93fd011dc9 ]
Fix theoretical NULL dereference in linker when resolving *extern*
STT_SECTION symbol against not-yet-existing ELF section. Not sure if
it's possible in practice for valid ELF object files (this would require
embedded assembly manipulations, at which point BTF will be missing),
but fix the s/dst_sym/dst_sec/ typo guarding this condition anyways.
Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250220002821.834400-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Thu Apr 10 14:39:41 2025 +0200
Linux 6.12.23
Link: https://lore.kernel.org/r/20250408104845.675475678@linuxfoundation.org
Tested-by: Markus Reichelt <lkt+2023@mareichelt.com>
Link: https://lore.kernel.org/r/20250408154121.378213016@linuxfoundation.org
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: SeongJae Park <sj@kernel.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250409115859.721906906@linuxfoundation.org
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon Nov 4 14:39:10 2024 +0100
lockdep/mm: Fix might_fault() lockdep check of current->mm->mmap_lock
[ Upstream commit a1b65f3f7c6f7f0a08a7dba8be458c6415236487 ]
Turns out that this commit, about 10 years ago:
9ec23531fd48 ("sched/preempt, mm/fault: Trigger might_sleep() in might_fault() with disabled pagefaults")
... accidentally (and unnecessarily) put the lockdep part of
__might_fault() under CONFIG_DEBUG_ATOMIC_SLEEP=y.
This is potentially notable because large distributions such as
Ubuntu are running with !CONFIG_DEBUG_ATOMIC_SLEEP.
Restore the debug check.
[ mingo: Update changelog. ]
Fixes: 9ec23531fd48 ("sched/preempt, mm/fault: Trigger might_sleep() in might_fault() with disabled pagefaults")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20241104135517.536628371@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed Feb 12 11:36:18 2025 +0100
lockdep: Don't disable interrupts on RT in disable_irq_nosync_lockdep.*()
[ Upstream commit 87886b32d669abc11c7be95ef44099215e4f5788 ]
disable_irq_nosync_lockdep() disables interrupts with lockdep enabled to
avoid false positive reports by lockdep that a certain lock has not been
acquired with disabled interrupts. The user of these macros expects that
a lock can be acquired without disabling interrupts because the IRQ line
triggering the interrupt is disabled.
This triggers a warning on PREEMPT_RT because after
disable_irq_nosync_lockdep.*() the following spinlock_t now is acquired
with disabled interrupts.
On PREEMPT_RT there is no difference between spin_lock() and
spin_lock_irq() so avoiding disabling interrupts in this case works for
the two remaining callers as of today.
Don't disable interrupts on PREEMPT_RT in disable_irq_nosync_lockdep.*().
Closes: https://lore.kernel.org/760e34f9-6034-40e0-82a5-ee9becd24438@roeck-us.net
Fixes: e8106b941ceab ("[PATCH] lockdep: core, add enable/disable_irq_irqsave/irqrestore() APIs")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250212103619.2560503-2-bigeasy@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Waiman Long <longman@redhat.com>
Date: Fri Mar 7 15:26:52 2025 -0800
locking/semaphore: Use wake_q to wake up processes outside lock critical section
[ Upstream commit 85b2b9c16d053364e2004883140538e73b333cdb ]
A circular lock dependency splat has been seen involving down_trylock():
======================================================
WARNING: possible circular locking dependency detected
6.12.0-41.el10.s390x+debug
------------------------------------------------------
dd/32479 is trying to acquire lock:
0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90
but task is already holding lock:
000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0
the existing dependency chain (in reverse order) is:
-> #4 (&zone->lock){-.-.}-{2:2}:
-> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
-> #2 (&rq->__lock){-.-.}-{2:2}:
-> #1 (&p->pi_lock){-.-.}-{2:2}:
-> #0 ((console_sem).lock){-.-.}-{2:2}:
The console_sem -> pi_lock dependency is due to calling try_to_wake_up()
while holding the console_sem raw_spinlock. This dependency can be broken
by using wake_q to do the wakeup instead of calling try_to_wake_up()
under the console_sem lock. This will also make the semaphore's
raw_spinlock become a terminal lock without taking any further locks
underneath it.
The hrtimer_bases.lock is a raw_spinlock while zone->lock is a
spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via
the debug_objects_fill_pool() helper function in the debugobjects code.
-> #4 (&zone->lock){-.-.}-{2:2}:
__lock_acquire+0xe86/0x1cc0
lock_acquire.part.0+0x258/0x630
lock_acquire+0xb8/0xe0
_raw_spin_lock_irqsave+0xb4/0x120
rmqueue_bulk+0xac/0x8f0
__rmqueue_pcplist+0x580/0x830
rmqueue_pcplist+0xfc/0x470
rmqueue.isra.0+0xdec/0x11b0
get_page_from_freelist+0x2ee/0xeb0
__alloc_pages_noprof+0x2c2/0x520
alloc_pages_mpol_noprof+0x1fc/0x4d0
alloc_pages_noprof+0x8c/0xe0
allocate_slab+0x320/0x460
___slab_alloc+0xa58/0x12b0
__slab_alloc.isra.0+0x42/0x60
kmem_cache_alloc_noprof+0x304/0x350
fill_pool+0xf6/0x450
debug_object_activate+0xfe/0x360
enqueue_hrtimer+0x34/0x190
__run_hrtimer+0x3c8/0x4c0
__hrtimer_run_queues+0x1b2/0x260
hrtimer_interrupt+0x316/0x760
do_IRQ+0x9a/0xe0
do_irq_async+0xf6/0x160
Normally a raw_spinlock to spinlock dependency is not legitimate
and will be warned if CONFIG_PROVE_RAW_LOCK_NESTING is enabled,
but debug_objects_fill_pool() is an exception as it explicitly
allows this dependency for non-PREEMPT_RT kernel without causing
PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency is
legitimate and not a bug.
Anyway, the semaphore is the only locking primitive left that still
uses try_to_wake_up() to do wakeups inside the critical section; all the
other locking primitives have been migrated to use wake_q to do wakeups
outside of the critical section. It is also possible that there are
other circular locking dependencies involving printk/console_sem or
other existing/new semaphores lurking somewhere which may show up in
the future. Let's just do the migration to wake_q now to avoid headaches
like this.
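The deferred-wakeup idea can be sketched as a single-threaded model in plain C. This is not the kernel's semaphore code: struct sem_model, struct wake_list and wake_task() are hypothetical stand-ins, and the lock is reduced to a flag so that the model can check that no wakeup runs while it is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

struct waiter {
	bool woken;
};

struct sem_model {
	bool lock_held;				/* models the semaphore's raw spinlock */
	int count;
	struct waiter *waiters[MAX_WAITERS];	/* FIFO of sleeping tasks */
	size_t nr_waiters;
};

/* Hypothetical stand-in for a wake queue: tasks to wake after unlock. */
struct wake_list {
	struct waiter *tasks[MAX_WAITERS];
	size_t nr;
};

/* Stand-in for the real wakeup; it must never run with the lock held. */
static void wake_task(const struct sem_model *sem, struct waiter *w)
{
	assert(!sem->lock_held);
	w->woken = true;
}

void sem_up(struct sem_model *sem)
{
	struct wake_list wq = { .nr = 0 };
	size_t i;

	sem->lock_held = true;			/* lock */
	if (sem->nr_waiters == 0) {
		sem->count++;
	} else {
		/* Only record the first waiter here; do not wake it yet. */
		wq.tasks[wq.nr++] = sem->waiters[0];
		for (i = 1; i < sem->nr_waiters; i++)
			sem->waiters[i - 1] = sem->waiters[i];
		sem->nr_waiters--;
	}
	sem->lock_held = false;			/* unlock */

	/* The wakeups happen here, outside the critical section. */
	for (i = 0; i < wq.nr; i++)
		wake_task(sem, wq.tasks[i]);
}
```

Because the wakeup runs only after the lock is dropped, nothing that the wakeup path acquires can end up nested under the semaphore's lock in this model.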
Reported-by: syzbot+ed801a886dfdbfe7136d@syzkaller.appspotmail.com
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-3-boqun.feng@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hengqi Chen <hengqi.chen@gmail.com>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: BPF: Don't override subprog's return value
commit 60f3caff1492e5b8616b9578c4bedb5c0a88ed14 upstream.
The verifier test `calls: div by 0 in subprog` triggers a panic at the
ld.bu instruction. The ld.bu insn is trying to load a byte from the memory
address returned by the subprog. The subprog actually sets the correct
address in the a5 register (dedicated register for BPF return values).
But at commit 73c359d1d356 ("LoongArch: BPF: Sign-extend return values")
we also sign extended a5 to the a0 register (return value in LoongArch).
For function call insn, we later propagate the a0 register back to a5
register. This is right for native calls but wrong for bpf2bpf calls
which expect zero-extended return value in a5 register. So only move a0
to a5 for native calls (i.e. non-BPF_PSEUDO_CALL).
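The clobbering can be modelled in plain C. This is only a sketch: the variables a0 and a5 stand in for the registers named above, and sign_extend_ret() and value_after_call() are hypothetical helpers, not functions from the JIT.

```c
#include <stdint.h>

/* Model of the epilogue: a0 receives the low 32 bits of a5, sign-extended. */
int64_t sign_extend_ret(uint64_t a5)
{
	return (int64_t)(int32_t)(uint32_t)a5;
}

/*
 * Model of the call site. subprog_a5 is the 64-bit value the subprog left
 * in a5. If the caller moves a0 back into a5, the upper 32 bits are
 * replaced by copies of bit 31.
 */
uint64_t value_after_call(uint64_t subprog_a5, int move_a0_to_a5)
{
	uint64_t a5 = subprog_a5;
	int64_t a0 = sign_extend_ret(a5);

	if (move_a0_to_a5)
		a5 = (uint64_t)a0;
	return a5;
}
```

With move_a0_to_a5 set, a 64-bit address such as 0xffff800012345678 comes back as 0x12345678 in this model, which is the kind of bogus address a subsequent load would then fault on.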
Cc: stable@vger.kernel.org
Fixes: 73c359d1d356 ("LoongArch: BPF: Sign-extend return values")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hengqi Chen <hengqi.chen@gmail.com>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: BPF: Fix off-by-one error in build_prologue()
commit 7e2586991e36663c9bc48c828b83eab180ad30a9 upstream.
Vincent reported that running BPF progs with tailcalls on LoongArch
causes kernel hard lockup. Debugging the issue shows that the JITed
image is missing a jirl instruction at the end of the epilogue.
There are two passes in JIT compiling: the first pass sets the flags and
the second pass generates JIT code based on those flags. With BPF progs
mixing bpf2bpf and tailcalls, build_prologue() generates N insns in the
first pass and then generates N+1 insns in the second pass. This makes
epilogue_offset off by one and we will jump to some unexpected insn and
cause lockup. Fix this by inserting a nop insn.
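The pass mismatch can be reproduced with a toy emitter. This is a sketch under stated assumptions: struct jit_ctx, the has_tail_call flag and the instruction counts (3 and 10) are made up for illustration and do not mirror the real build_prologue().

```c
#include <stdbool.h>

struct jit_ctx {
	bool has_tail_call;	/* hypothetical flag, learned while scanning the body */
	int idx;		/* number of instructions emitted so far */
};

static void emit_prologue(struct jit_ctx *ctx, bool pad_with_nop)
{
	ctx->idx += 3;			/* fixed part of the prologue */
	if (ctx->has_tail_call)
		ctx->idx += 1;		/* extra insn, only once the flag is set */
	else if (pad_with_nop)
		ctx->idx += 1;		/* nop keeps the length equal in both passes */
}

static void emit_body(struct jit_ctx *ctx)
{
	ctx->has_tail_call = true;	/* the flag becomes known only here */
	ctx->idx += 10;
}

/* Run one pass and return the offset at which the epilogue would start. */
static int run_pass(struct jit_ctx *ctx, bool pad_with_nop)
{
	ctx->idx = 0;
	emit_prologue(ctx, pad_with_nop);
	emit_body(ctx);
	return ctx->idx;
}

/* Returns true when both passes agree on the epilogue offset. */
bool passes_agree(bool pad_with_nop)
{
	struct jit_ctx ctx = { .has_tail_call = false, .idx = 0 };
	int first = run_pass(&ctx, pad_with_nop);
	int second = run_pass(&ctx, pad_with_nop);

	return first == second;
}
```

Without the padding, the first pass counts N instructions and the second N+1, so any offset computed in the first pass is stale by one instruction in the second.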
Cc: stable@vger.kernel.org
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls")
Reported-by: Vincent Li <vincent.mc.li@gmail.com>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Closes: https://lore.kernel.org/loongarch/CAK3+h2w6WESdBN3UCr3WKHByD7D6Q_Ve1EDAjotVrnx6Or_c8g@mail.gmail.com/
Closes: https://lore.kernel.org/bpf/CAK3+h2woEjG_N=-XzqEGaAeCmgu2eTCUc7p6bP4u8Q+DFHm-7g@mail.gmail.com/
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hengqi Chen <hengqi.chen@gmail.com>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: BPF: Use move_addr() for BPF_PSEUDO_FUNC
commit 52266f1015a8b5aabec7d127f83d105f702b388e upstream.
Vincent reported that running XDP synproxy program on LoongArch results
in the following error:
JIT doesn't support bpf-to-bpf calls
With dmesg:
multi-func JIT bug 1391 != 1390
The root cause is that the verifier will refill the imm with the correct
addresses of bpf_calls for BPF_PSEUDO_FUNC instructions and then run
the last pass of JIT. So we generate different JIT code for the same
instruction in two passes (one for placeholder and the other for the
real address). Let's use move_addr() instead.
See commit 64f50f6575721ef0 ("LoongArch, bpf: Use 4 instructions for
function address in JIT") for a similar fix.
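Why a placeholder and a real address can produce different code sizes is easy to see in a small model. The thresholds below are hypothetical and are not the actual LoongArch immediate-encoding rules; only the fixed count of four instructions comes from the commit referenced above.

```c
#include <stdint.h>

/*
 * Variable-length model: smaller immediates need fewer instructions.
 * The bit-width thresholds are illustrative assumptions.
 */
int insns_for_imm_variable(uint64_t imm)
{
	if (imm < (1ULL << 12))
		return 1;
	if (imm < (1ULL << 32))
		return 2;
	if (imm < (1ULL << 52))
		return 3;
	return 4;
}

/* Fixed-length model: always four instructions, whatever the value. */
int insns_for_addr_fixed(uint64_t addr)
{
	(void)addr;
	return 4;
}
```

With the variable-length model a zero placeholder and a kernel address yield different instruction counts, so the image size changes between passes; the fixed-length model gives the same count for both.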
Cc: stable@vger.kernel.org
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls")
Reported-by: Vincent Li <vincent.mc.li@gmail.com>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Closes: https://lore.kernel.org/loongarch/CAK3+h2yfM9FTNiXvEQBkvtuoJrvzmN4c_NZsFXqEk4Cj1tsBNA@mail.gmail.com/T/#u
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Miaoqian Lin <linmq006@gmail.com>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: Fix device node refcount leak in fdt_cpu_clk_init()
[ Upstream commit 2e3bc71e4f394ecf8f499d21923cf556b4bfa1e7 ]
Add missing of_node_put() to properly handle the reference count of the
device node obtained from of_get_cpu_node().
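The get/put pairing can be illustrated with a toy reference counter. The names node_get(), node_put() and read_cpu_clock() are hypothetical stand-ins for illustration; they are not the OF API or the code of fdt_cpu_clk_init().

```c
struct node {
	int refcount;
};

/* Stand-in for a lookup that returns the node with an extra reference. */
struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Stand-in for dropping that reference again. */
void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Model of a function that looks up a node, reads something from it and
 * returns. With balanced == 0 the reference taken by node_get() leaks.
 */
int read_cpu_clock(struct node *cpu, int balanced)
{
	struct node *np = node_get(cpu);

	if (!np)
		return -1;

	/* ... read properties from np here ... */

	if (balanced)
		node_put(np);
	return 0;
}
```

Each unbalanced call leaves the count one higher than before, so the node can never be released in this model.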
Fixes: 44a01f1f726a ("LoongArch: Parsing CPU-related information from DTS")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: Fix help text of CMDLINE_EXTEND in Kconfig
[ Upstream commit be216cbc1ddf99a51915414ce147311c0dfd50a2 ]
It is the built-in command line appended to the bootloader command line,
not the bootloader command line appended to the built-in command line.
Fixes: fa96b57c1490 ("LoongArch: Add build infrastructure")
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: Increase ARCH_DMA_MINALIGN up to 16
commit 4103cfe9dcb88010ae4911d3ff417457d1b6a720 upstream.
ARCH_DMA_MINALIGN is 1 by default, but some LoongArch-specific devices
(such as APBDMA) require 16 bytes alignment. When the data buffer length
is too small, the hardware may make an error writing cacheline. Thus, it
is dangerous to allocate a small memory buffer for DMA. It's always safe
to define ARCH_DMA_MINALIGN as L1_CACHE_BYTES but unnecessary (kmalloc()
needs small memory objects). Therefore, just increase it to 16.
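As a rough illustration of what a 16-byte minimum alignment means for small buffers, the sketch below rounds a requested length up to the next multiple of 16. DMA_MINALIGN_MODEL and dma_safe_size() are hypothetical names; this is not how kmalloc() is implemented.

```c
#include <stddef.h>

#define DMA_MINALIGN_MODEL 16u	/* illustrative value taken from the text above */

/* Round len up to the next multiple of the minimum DMA alignment. */
size_t dma_safe_size(size_t len)
{
	size_t mask = (size_t)DMA_MINALIGN_MODEL - 1;

	return (len + mask) & ~mask;
}
```

In this model a 1-byte request occupies a full 16-byte unit, so a device that always writes 16 bytes cannot spill into a neighbouring object.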
Cc: stable@vger.kernel.org
Tested-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: Increase MAX_IO_PICS up to 8
commit ec105cadff5d8c0a029a3dc1084cae46cf3f799d upstream.
Beginning with Loongson-3C6000, the number of PCI hosts can be as many as
8 for multi-chip machines, and this number should be the same for I/O
interrupt controllers. To support these machines we also increase the
MAX_IO_PICS up to 8.
Cc: stable@vger.kernel.org
Tested-by: Mingcong Bai <baimingcong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuli Wang <wangyuli@uniontech.com>
Date: Sun Mar 30 16:31:09 2025 +0800
LoongArch: Rework the arch_kgdb_breakpoint() implementation
[ Upstream commit 29c92a41c6d2879c1f62220fe4758dce191bb38f ]
The arch_kgdb_breakpoint() function defines the kgdb_breakinst symbol
using inline assembly.
1. There's a potential issue where the compiler might inline
arch_kgdb_breakpoint(), which would then define the kgdb_breakinst
symbol multiple times, leading to a linker error.
To prevent this, declare arch_kgdb_breakpoint() as noinline.
Fix the following error with LLVM-19 *only* when LTO_CLANG_FULL:
LD vmlinux.o
ld.lld-19: error: ld-temp.o <inline asm>:3:1: symbol 'kgdb_breakinst' is already defined
kgdb_breakinst: break 2
^
2. Remove "nop" in the inline assembly because it's meaningless for
LoongArch here.
3. Add "STACK_FRAME_NON_STANDARD" for arch_kgdb_breakpoint() to avoid
the objtool warning.
Fixes: e14dd076964e ("LoongArch: Add basic KGDB & KDB support")
Tested-by: Binbin Zhou <zhoubinbin@loongson.cn>
Co-developed-by: Winston Wen <wentao@uniontech.com>
Signed-off-by: Winston Wen <wentao@uniontech.com>
Co-developed-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Tue Feb 25 17:44:23 2025 +0100
mdacon: rework dependency list
[ Upstream commit 5bbcc7645f4b244ffb5ac6563fbe9d3d42194447 ]
mdacon has roughly the same dependencies as vgacon but expresses them
as a negative list instead of a positive list, with the only practical
difference being PowerPC/CHRP, which uses vga16fb instead of vgacon.
The CONFIG_MDA_CONSOLE description advises to only turn it on when vgacon
is also used because MDA/Hercules-only systems should be using vgacon
instead, so just change the list to enforce that directly for simplicity.
The probing was broken from 2002 to 2008, this improves on the fix
that was added then: If vgacon is a loadable module, then mdacon
cannot be built-in now, and the list of systems that support vgacon
is carried over.
Fixes: 0b9cf3aa6b1e ("mdacon messing up default vc's - set default to vc13-16 again")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Robin Murphy <robin.murphy@arm.com>
Date: Mon Oct 28 17:58:36 2024 +0000
media: omap3isp: Handle ARM dma_iommu_mapping
commit 6bc076eec6f85f778f33a8242b438e1bd9fcdd59 upstream.
It's no longer practical for the OMAP IOMMU driver to trick
arm_setup_iommu_dma_ops() into ignoring its presence, so let's use the
same tactic as other IOMMU API users on 32-bit ARM and explicitly kick
the arch code's dma_iommu_mapping out of the way to avoid problems.
Fixes: 4720287c7bf7 ("iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops()")
Cc: stable@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Sicelo A. Mhlongo <absicsz@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date: Thu Dec 5 11:06:21 2024 +0900
media: platform: allegro-dvt: unregister v4l2_device on the error path
[ Upstream commit c2b96a6818159fba8a3bcc38262da9e77f9b3ec7 ]
In allegro_probe(), the v4l2 device is not unregistered in the error
path, which results in a memory leak. Fix it by calling
v4l2_device_unregister() before returning error.
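The usual shape of such a fix is a goto-based unwind label. The sketch below is a model only: struct dev_model, v4l2_register_model(), v4l2_unregister_model() and probe_model() are hypothetical and do not reproduce allegro_probe().

```c
#include <stdbool.h>

struct dev_model {
	bool v4l2_registered;
};

int v4l2_register_model(struct dev_model *d)
{
	d->v4l2_registered = true;
	return 0;
}

void v4l2_unregister_model(struct dev_model *d)
{
	d->v4l2_registered = false;
}

/* Probe that undoes the registration when a later step fails. */
int probe_model(struct dev_model *d, bool later_step_fails)
{
	int ret;

	ret = v4l2_register_model(d);
	if (ret)
		return ret;

	if (later_step_fails) {
		ret = -1;
		goto err_unregister;
	}

	return 0;

err_unregister:
	v4l2_unregister_model(d);
	return ret;
}
```

Every failure after the registration step jumps to the same label, so the registration is undone on each error return in this model.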
Fixes: d74d4e2359ec ("media: allegro: move driver out of staging")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Reviewed-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Murad Masimov <m.masimov@mt-integration.ru>
Date: Mon Jan 13 13:51:30 2025 +0300
media: streamzap: fix race between device disconnection and urb callback
commit f656cfbc7a293a039d6a0c7100e1c846845148c1 upstream.
Syzkaller has reported a general protection fault at function
ir_raw_event_store_with_filter(). This crash is caused by a NULL pointer
dereference of dev->raw pointer, even though it is checked for NULL in
the same function, which means there is a race condition. It occurs due
to the incorrect order of actions in the streamzap_disconnect() function:
rc_unregister_device() is called before usb_kill_urb(). The dev->raw
pointer is freed and set to NULL in rc_unregister_device(), and only
after that usb_kill_urb() waits for in-progress requests to finish.
If rc_unregister_device() is called while streamzap_callback() handler is
not finished, this can lead to accessing freed resources. Thus
rc_unregister_device() should be called after usb_kill_urb().
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 8e9e60640067 ("V4L/DVB: staging/lirc: port lirc_streamzap to ir-core")
Cc: stable@vger.kernel.org
Reported-by: syzbot+34008406ee9a31b13c73@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=34008406ee9a31b13c73
Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Date: Mon Jan 20 09:10:52 2025 +0100
media: verisilicon: HEVC: Initialize start_bit field
[ Upstream commit 7fcb42b3835e90ef18d68555934cf72adaf58402 ]
The HEVC driver needs to set the start_bit field explicitly to avoid
causing corrupted frames when the VP9 decoder is used in parallel. The
reason for this problem is that the VP9 and the HEVC decoder share this
register.
Fixes: cb5dd5a0fa51 ("media: hantro: Introduce G2/HEVC decoder")
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Tested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Date: Sun Mar 2 17:58:25 2025 +0300
media: vimc: skip .s_stream() for stopped entities
commit 36cef585e2a31e4ddf33a004b0584a7a572246de upstream.
Syzbot reported [1] a warning prompted by a check in call_s_stream()
that checks whether .s_stream() operation is warranted for unstarted
or stopped subdevs.
Add a simple fix in vimc_streamer_pipeline_terminate() ensuring that
entities skip a call to .s_stream() unless they have been previously
properly started.
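A minimal model of the guard, assuming a per-entity "streaming" state: struct entity_model, entity_stop() and pipeline_terminate() are hypothetical names and do not mirror the vimc data structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct entity_model {
	bool streaming;		/* set once the entity was started successfully */
	int stop_calls;		/* how often the stop operation was invoked */
};

/* Stand-in for calling the stop side of .s_stream() on one entity. */
void entity_stop(struct entity_model *e)
{
	e->stop_calls++;
	e->streaming = false;
}

/* Walk the pipeline backwards and stop only entities that were started. */
void pipeline_terminate(struct entity_model *ents, size_t n)
{
	while (n > 0) {
		struct entity_model *e = &ents[--n];

		if (!e->streaming)
			continue;	/* never started: skip the stop call */
		entity_stop(e);
	}
}
```

When pipeline init fails halfway, only the entities that were already streaming receive a stop call, which is the condition the warning checks for.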
[1] Syzbot report:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5933 at drivers/media/v4l2-core/v4l2-subdev.c:460 call_s_stream+0x2df/0x350 drivers/media/v4l2-core/v4l2-subdev.c:460
Modules linked in:
CPU: 0 UID: 0 PID: 5933 Comm: syz-executor330 Not tainted 6.13.0-rc2-syzkaller-00362-g2d8308bf5b67 #0
...
Call Trace:
<TASK>
vimc_streamer_pipeline_terminate+0x218/0x320 drivers/media/test-drivers/vimc/vimc-streamer.c:62
vimc_streamer_pipeline_init drivers/media/test-drivers/vimc/vimc-streamer.c:101 [inline]
vimc_streamer_s_stream+0x650/0x9a0 drivers/media/test-drivers/vimc/vimc-streamer.c:203
vimc_capture_start_streaming+0xa1/0x130 drivers/media/test-drivers/vimc/vimc-capture.c:256
vb2_start_streaming+0x15f/0x5a0 drivers/media/common/videobuf2/videobuf2-core.c:1789
vb2_core_streamon+0x2a7/0x450 drivers/media/common/videobuf2/videobuf2-core.c:2348
vb2_streamon drivers/media/common/videobuf2/videobuf2-v4l2.c:875 [inline]
vb2_ioctl_streamon+0xf4/0x170 drivers/media/common/videobuf2/videobuf2-v4l2.c:1118
__video_do_ioctl+0xaf0/0xf00 drivers/media/v4l2-core/v4l2-ioctl.c:3122
video_usercopy+0x4d2/0x1620 drivers/media/v4l2-core/v4l2-ioctl.c:3463
v4l2_ioctl+0x1ba/0x250 drivers/media/v4l2-core/v4l2-dev.c:366
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl fs/ioctl.c:892 [inline]
__x64_sys_ioctl+0x190/0x200 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f2b85c01b19
...
Reported-by: syzbot+5bcd7c809d365e14c4df@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5bcd7c809d365e14c4df
Fixes: adc589d2a208 ("media: vimc: Add vimc-streamer for stream control")
Cc: stable@vger.kernel.org
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Roger Quadros <rogerq@kernel.org>
Date: Mon Mar 10 15:15:14 2025 +0100
memory: omap-gpmc: drop no compatible check
[ Upstream commit edcccc6892f65eff5fd3027a13976131dc7fd733 ]
We are no longer depending on legacy device trees so
drop the no compatible check for NAND and OneNAND
nodes.
Suggested-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250114-omap-gpmc-drop-no-compatible-check-v1-1-262c8d549732@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Date: Wed Jan 15 09:12:06 2025 -0800
mfd: sm501: Switch to BIT() to mitigate integer overflows
[ Upstream commit 2d8cb9ffe18c2f1e5bd07a19cbce85b26c1d0cf0 ]
If offset ends up being high enough, the right-hand expression in
functions like sm501_gpio_set(), shifted left by that number of bits, may
not fit in the int type.
Just in case, fix that by using BIT() both as an option safe from
overflow issues and to make this step look similar to other gpio
drivers.
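For reference, BIT() expands to an unsigned long shift, so bit 31 is representable where a plain int shift is not. The sketch below redefines BIT() locally for a userspace build; gpio_mask() is a hypothetical helper, not a function from the sm501 driver.

```c
/* Local userspace equivalent of the kernel's BIT() macro. */
#define BIT(nr) (1UL << (nr))

/*
 * Mask for a GPIO offset. Written as "1 << offset", the shift is done in
 * int and offset 31 does not fit; with BIT() the shift is done in
 * unsigned long.
 */
unsigned long gpio_mask(unsigned int offset)
{
	return BIT(offset);
}
```

gpio_mask(31) evaluates to 0x80000000 with a well-defined unsigned shift.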
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: f61be273d369 ("sm501: add gpiolib support")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://lore.kernel.org/r/20250115171206.20308-1-n.zhandarovich@fintech.ru
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Hildenbrand <david@redhat.com>
Date: Mon Feb 10 20:37:43 2025 +0100
mm/gup: reject FOLL_SPLIT_PMD with hugetlb VMAs
commit 8977752c8056a6a094a279004a49722da15bace3 upstream.
Patch series "mm: fixes for device-exclusive entries (hmm)", v2.
Discussing the PageTail() call in make_device_exclusive_range() with
Willy, I recently discovered [1] that device-exclusive handling does not
properly work with THP, making the hmm-tests selftests fail if THPs are
enabled on the system.
Looking into more details, I found that hugetlb is not properly fenced,
and I realized that something that was bugging me for longer -- how
device-exclusive entries interact with mapcounts -- completely breaks
migration/swapout/split/hwpoison handling of these folios while they have
device-exclusive PTEs.
The program below can be used to allocate 1 GiB worth of pages and make
them device-exclusive on a kernel with CONFIG_TEST_HMM.
Once they are device-exclusive, these folios cannot get swapped out
(/proc/$pid/smaps_rollup will always indicate 1 GiB RSS no matter how much
one forces memory reclaim), and when having a memory block onlined to
ZONE_MOVABLE, trying to offline it will loop forever and complain about
failed migration of a page that should be movable.
# echo offline > /sys/devices/system/memory/memory136/state
# echo online_movable > /sys/devices/system/memory/memory136/state
# ./hmm-swap &
... wait until everything is device-exclusive
# echo offline > /sys/devices/system/memory/memory136/state
[ 285.193431][T14882] page: refcount:2 mapcount:0 mapping:0000000000000000
index:0x7f20671f7 pfn:0x442b6a
[ 285.196618][T14882] memcg:ffff888179298000
[ 285.198085][T14882] anon flags: 0x5fff0000002091c(referenced|uptodate|
dirty|active|owner_2|swapbacked|node=1|zone=3|lastcpupid=0x7ff)
[ 285.201734][T14882] raw: ...
[ 285.204464][T14882] raw: ...
[ 285.207196][T14882] page dumped because: migration failure
[ 285.209072][T14882] page_owner tracks the page as allocated
[ 285.210915][T14882] page last allocated via order 0, migratetype
Movable, gfp_mask 0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO),
id 14926, tgid 14926 (hmm-swap), ts 254506295376, free_ts 227402023774
[ 285.216765][T14882] post_alloc_hook+0x197/0x1b0
[ 285.218874][T14882] get_page_from_freelist+0x76e/0x3280
[ 285.220864][T14882] __alloc_frozen_pages_noprof+0x38e/0x2740
[ 285.223302][T14882] alloc_pages_mpol+0x1fc/0x540
[ 285.225130][T14882] folio_alloc_mpol_noprof+0x36/0x340
[ 285.227222][T14882] vma_alloc_folio_noprof+0xee/0x1a0
[ 285.229074][T14882] __handle_mm_fault+0x2b38/0x56a0
[ 285.230822][T14882] handle_mm_fault+0x368/0x9f0
...
This series fixes all issues I found so far. There is no easy way to fix
without a bigger rework/cleanup. I have a bunch of cleanups on top (some
previously sent, some the result of the discussion in v1) that I will send
out separately once this has landed and I get to it.
I wish we could just use some special present PROT_NONE PTEs instead of
these (non-present, non-none) fake-swap entries; but that just results in
the same problem we keep having (lack of spare PTE bits), and staring at
other similar fake-swap entries, that ship has sailed.
With this series, make_device_exclusive() doesn't actually belong in
mm/rmap.c anymore, but I'll leave moving that for another day.
I only tested this series with the hmm-tests selftests due to lack of HW,
so I'd appreciate some testing, especially if the interaction between two
GPUs wanting a device-exclusive entry works as expected.
<program>
#include <stdio.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/ioctl.h>
#define HMM_DMIRROR_EXCLUSIVE _IOWR('H', 0x05, struct hmm_dmirror_cmd)
struct hmm_dmirror_cmd {
	__u64 addr;
	__u64 ptr;
	__u64 npages;
	__u64 cpages;
	__u64 faults;
};
const size_t size = 1 * 1024 * 1024 * 1024ul;
const size_t chunk_size = 2 * 1024 * 1024ul;
int main(void)
{
	struct hmm_dmirror_cmd cmd;
	size_t cur_size;
	int fd, ret;
	char *addr, *mirror;
	fd = open("/dev/hmm_dmirror1", O_RDWR, 0);
	if (fd < 0) {
		perror("open failed\n");
		exit(1);
	}
	addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap failed\n");
		exit(1);
	}
	madvise(addr, size, MADV_NOHUGEPAGE);
	memset(addr, 1, size);
	mirror = malloc(chunk_size);
	for (cur_size = 0; cur_size < size; cur_size += chunk_size) {
		cmd.addr = (uintptr_t)addr + cur_size;
		cmd.ptr = (uintptr_t)mirror;
		cmd.npages = chunk_size / getpagesize();
		ret = ioctl(fd, HMM_DMIRROR_EXCLUSIVE, &cmd);
		if (ret) {
			perror("ioctl failed\n");
			exit(1);
		}
	}
	pause();
	return 0;
}
</program>
[1] https://lkml.kernel.org/r/25e02685-4f1d-47fa-be5b-01ff85bb0ce2@redhat.com
This patch (of 17):
We only have two FOLL_SPLIT_PMD users. While uprobe refuses hugetlb
early, make_device_exclusive_range() can end up getting called on hugetlb
VMAs.
Right now, this means that with a PMD-sized hugetlb page, we can end up
calling split_huge_pmd(), because pmd_trans_huge() also succeeds with
hugetlb PMDs.
For example, using a modified hmm-test selftest one can trigger:
[ 207.017134][T14945] ------------[ cut here ]------------
[ 207.018614][T14945] kernel BUG at mm/page_table_check.c:87!
[ 207.019716][T14945] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 207.021072][T14945] CPU: 3 UID: 0 PID: ...
[ 207.023036][T14945] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
[ 207.024834][T14945] RIP: 0010:page_table_check_clear.part.0+0x488/0x510
[ 207.026128][T14945] Code: ...
[ 207.029965][T14945] RSP: 0018:ffffc9000cb8f348 EFLAGS: 00010293
[ 207.031139][T14945] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: ffffffff8249a0cd
[ 207.032649][T14945] RDX: ffff88811e883c80 RSI: ffffffff8249a357 RDI: ffff88811e883c80
[ 207.034183][T14945] RBP: ffff888105c0a050 R08: 0000000000000005 R09: 0000000000000000
[ 207.035688][T14945] R10: 00000000ffffffff R11: 0000000000000003 R12: 0000000000000001
[ 207.037203][T14945] R13: 0000000000000200 R14: 0000000000000001 R15: dffffc0000000000
[ 207.038711][T14945] FS: 00007f2783275740(0000) GS:ffff8881f4980000(0000) knlGS:0000000000000000
[ 207.040407][T14945] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 207.041660][T14945] CR2: 00007f2782c00000 CR3: 0000000132356000 CR4: 0000000000750ef0
[ 207.043196][T14945] PKRU: 55555554
[ 207.043880][T14945] Call Trace:
[ 207.044506][T14945] <TASK>
[ 207.045086][T14945] ? __die+0x51/0x92
[ 207.045864][T14945] ? die+0x29/0x50
[ 207.046596][T14945] ? do_trap+0x250/0x320
[ 207.047430][T14945] ? do_error_trap+0xe7/0x220
[ 207.048346][T14945] ? page_table_check_clear.part.0+0x488/0x510
[ 207.049535][T14945] ? handle_invalid_op+0x34/0x40
[ 207.050494][T14945] ? page_table_check_clear.part.0+0x488/0x510
[ 207.051681][T14945] ? exc_invalid_op+0x2e/0x50
[ 207.052589][T14945] ? asm_exc_invalid_op+0x1a/0x20
[ 207.053596][T14945] ? page_table_check_clear.part.0+0x1fd/0x510
[ 207.054790][T14945] ? page_table_check_clear.part.0+0x487/0x510
[ 207.055993][T14945] ? page_table_check_clear.part.0+0x488/0x510
[ 207.057195][T14945] ? page_table_check_clear.part.0+0x487/0x510
[ 207.058384][T14945] __page_table_check_pmd_clear+0x34b/0x5a0
[ 207.059524][T14945] ? __pfx___page_table_check_pmd_clear+0x10/0x10
[ 207.060775][T14945] ? __pfx___mutex_unlock_slowpath+0x10/0x10
[ 207.061940][T14945] ? __pfx___lock_acquire+0x10/0x10
[ 207.062967][T14945] pmdp_huge_clear_flush+0x279/0x360
[ 207.064024][T14945] split_huge_pmd_locked+0x82b/0x3750
...
Before commit 9cb28da54643 ("mm/gup: handle hugetlb in the generic
follow_page_mask code"), we would have ignored the flag; instead, let's
simply refuse the combination completely in check_vma_flags(): the caller
is likely not prepared to handle any hugetlb folios.
We'll teach make_device_exclusive_range() separately to ignore any hugetlb
folios as a future-proof safety net.
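The shape of such a check can be sketched as follows. This is a model, not the code in mm/gup.c: FOLL_SPLIT_PMD_MODEL, struct vma_model and check_vma_flags_model() are hypothetical, and the -EOPNOTSUPP return value is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

#define FOLL_SPLIT_PMD_MODEL 0x1u	/* illustrative flag value */

struct vma_model {
	bool is_hugetlb;
};

/*
 * Refuse the flag/VMA combination up front instead of letting the page
 * table walk reach a PMD split on a hugetlb mapping.
 */
int check_vma_flags_model(const struct vma_model *vma, unsigned int gup_flags)
{
	if ((gup_flags & FOLL_SPLIT_PMD_MODEL) && vma->is_hugetlb)
		return -EOPNOTSUPP;	/* assumed error code */
	return 0;
}
```

Rejecting the combination at flag-check time means the caller sees an error before any page table entry is touched.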
Link: https://lkml.kernel.org/r/20250210193801.781278-1-david@redhat.com
Link: https://lkml.kernel.org/r/20250210193801.781278-2-david@redhat.com
Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Lyude <lyude@redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yanteng Si <si.yanteng@linux.dev>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Barry Song <v-songbaohua@oppo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yosry Ahmed <yosry.ahmed@linux.dev>
Date: Wed Feb 26 18:56:25 2025 +0000
mm: zswap: fix crypto_free_acomp() deadlock in zswap_cpu_comp_dead()
commit c11bcbc0a517acf69282c8225059b2a8ac5fe628 upstream.
Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
(through crypto_exit_scomp_ops_async()).
On the other hand, crypto_alloc_acomp_node() holds the scomp_lock (through
crypto_scomp_init_tfm()), and then allocates memory. If the allocation
results in reclaim, we may attempt to hold the per-CPU acomp_ctx mutex.
The above dependencies can cause an ABBA deadlock. For example in the
following scenario:
(1) Task A running on CPU #1:
crypto_alloc_acomp_node()
Holds scomp_lock
Enters reclaim
Reads per_cpu_ptr(pool->acomp_ctx, 1)
(2) Task A is descheduled
(3) CPU #1 goes offline
zswap_cpu_comp_dead(CPU #1)
Holds per_cpu_ptr(pool->acomp_ctx, 1)
Calls crypto_free_acomp()
Waits for scomp_lock
(4) Task A running on CPU #2:
Waits for per_cpu_ptr(pool->acomp_ctx, 1) // Read on CPU #1
DEADLOCK
Since there is no requirement to call crypto_free_acomp() with the per-CPU
acomp_ctx mutex held in zswap_cpu_comp_dead(), move it after the mutex is
unlocked. Also move the acomp_request_free() and kfree() calls for
consistency and to avoid any potential subtle locking dependencies in the
future.
With this, only setting acomp_ctx fields to NULL occurs with the mutex
held. This is similar to how zswap_cpu_comp_prepare() only initializes
acomp_ctx fields with the mutex held, after performing all allocations
before holding the mutex.
Opportunistically, move the NULL check on acomp_ctx so that it takes place
before the mutex dereference.
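The reordering described above can be sketched in user space as follows.
This is only an illustration under stated assumptions: fake_mutex, fake_ctx,
release() and ctx_teardown() are hypothetical stand-ins, not the zswap or
crypto API, and the "lock" is a plain flag so the sketch can record whether
any free ran while it was held.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU acomp_ctx mutex. */
struct fake_mutex {
	bool held;
};

/* Hypothetical stand-in for the per-CPU acomp_ctx. */
struct fake_ctx {
	struct fake_mutex lock;
	void *req;     /* stands in for the acomp request */
	void *buffer;  /* stands in for the scratch buffer */
};

/* Counts frees that ran while the lock was held; should stay 0. */
static int frees_under_lock;

static void release(struct fake_ctx *c, void *p)
{
	if (c->lock.held)
		frees_under_lock++;
	free(p);
}

static void ctx_teardown(struct fake_ctx *c)
{
	void *req, *buffer;

	/* The NULL check comes before the lock is dereferenced. */
	if (!c)
		return;

	c->lock.held = true;    /* where mutex_lock() would be */
	req = c->req;
	buffer = c->buffer;
	c->req = NULL;          /* only pointer resets happen under the lock */
	c->buffer = NULL;
	c->lock.held = false;   /* where mutex_unlock() would be */

	/* Freeing may take other locks, so it runs after unlocking. */
	release(c, req);
	release(c, buffer);
}
```

Detaching the pointers under the lock and freeing them afterwards keeps the
critical section free of calls that might acquire a second lock.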
Link: https://lkml.kernel.org/r/20250226185625.2672936-1-yosry.ahmed@linux.dev
Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67bcea51.050a0220.bbfd1.0096.GAE@google.com/
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Tested-by: Nhat Pham <nphamcs@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Chris Murphy <lists@colorremedies.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Miaoqian Lin <linmq006@gmail.com>
Date: Tue Mar 18 22:02:25 2025 +0800
mmc: omap: Fix memory leak in mmc_omap_new_slot
commit 3834a759afb817e23a7a2f09c2c9911b0ce5c588 upstream.
Add err_free_host label to properly pair mmc_alloc_host() with
mmc_free_host() in GPIO error paths. The allocated host memory was
leaked when GPIO lookups failed.
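A minimal user-space sketch of the error-label pattern, assuming malloc()
and free() in place of mmc_alloc_host() and mmc_free_host(). The helpers
host_alloc(), host_free() and new_slot(), and the lookup_fails parameter,
are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Number of hosts currently allocated; lets a caller check for leaks. */
static int live_hosts;

static void *host_alloc(void)
{
	void *h = malloc(64);

	if (h)
		live_hosts++;
	return h;
}

static void host_free(void *h)
{
	if (h) {
		free(h);
		live_hosts--;
	}
}

/* lookup_fails simulates a failing GPIO lookup after the allocation. */
static int new_slot(int lookup_fails, void **out)
{
	void *host;
	int err;

	host = host_alloc();
	if (!host)
		return -ENOMEM;

	if (lookup_fails) {
		err = -ENODEV;
		goto err_free_host;  /* a bare return here would leak host */
	}

	*out = host;
	return 0;

err_free_host:
	host_free(host);
	return err;
}
```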
Fixes: e519f0bb64ef ("ARM/mmc: Convert old mmci-omap to GPIO descriptors")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318140226.19650-1-linmq006@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ulf Hansson <ulf.hansson@linaro.org>
Date: Wed Mar 12 13:17:12 2025 +0100
mmc: sdhci-omap: Disable MMC_CAP_AGGRESSIVE_PM for eMMC/SD
commit 49d162635151d0dd04935070d7cf67137ab863aa upstream.
We have received reports that cards can become corrupted when the
aggressive PM support is enabled. Let's make a partial revert of the
change that enabled the feature.
Reported-by: David Owens <daowens01@gmail.com>
Reported-by: Romain Naour <romain.naour@smile.fr>
Reported-by: Robert Nelson <robertcnelson@gmail.com>
Tested-by: Robert Nelson <robertcnelson@gmail.com>
Fixes: 3edf588e7fe0 ("mmc: sdhci-omap: Allow SDIO card power off and enable aggressive PM")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20250312121712.1168007-1-ulf.hansson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Karel Balej <balejk@matfyz.cz>
Date: Mon Mar 10 15:07:04 2025 +0100
mmc: sdhci-pxav3: set NEED_RSP_BUSY capability
commit a41fcca4b342811b473bbaa4b44f1d34d87fcce6 upstream.
Set the MMC_CAP_NEED_RSP_BUSY capability for the sdhci-pxav3 host to
prevent conversion of R1B responses to R1. Without this, the eMMC card
in the samsung,coreprimevelte smartphone using the Marvell PXA1908 SoC
with this mmc host doesn't probe with the ETIMEDOUT error originating in
__mmc_poll_for_busy.
Note that the other issues reported for this phone and host, namely
floods of "Tuning failed, falling back to fixed sampling clock" dmesg
messages for the eMMC and unstable SDIO are not mitigated by this
change.
Link: https://lore.kernel.org/r/20200310153340.5593-1-ulf.hansson@linaro.org/
Link: https://lore.kernel.org/r/D7204PWIGQGI.1FRFQPPIEE2P9@matfyz.cz/
Link: https://lore.kernel.org/r/20250115-pxa1908-lkml-v14-0-847d24f3665a@skole.hr/
Cc: stable@vger.kernel.org
Signed-off-by: Karel Balej <balejk@matfyz.cz>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Duje Mihanović <duje.mihanovic@skole.hr>
Link: https://lore.kernel.org/r/20250310140707.23459-1-balejk@matfyz.cz
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lama Kayal <lkayal@nvidia.com>
Date: Sun Mar 23 14:28:26 2025 +0200
net/mlx5e: SHAMPO, Make reserved size independent of page size
[ Upstream commit fab05835688526f9de123d1e98e4d1f838da4e22 ]
When hw-gro is enabled, the maximum number of header entries that are
needed per wqe (hd_per_wqe) is calculated based on the size of the
reservations among other parameters.
Miscalculating the size of the reservations leads to hd_per_wqe being
calculated as 0, particularly with large page sizes such as on aarch64.
This prevents the SHAMPO header from being correctly initialized in the
device, ultimately causing the following CQE error, which indicates a
violation of PD.
mlx5_core 0000:00:08.0 eth2: ERR CQE on RQ: 0x1180
mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x510, ci 0x0, qn 0x1180, opcode 0xe, syndrome 0x4, vendor syndrome 0x32
00000000: 00 00 00 00 04 4a 00 00 00 00 00 00 20 00 93 32
00000010: 55 00 00 00 fb cc 00 00 00 00 00 00 07 18 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a
00000030: 00 00 00 9a 93 00 32 04 00 00 00 00 00 00 da e1
Use the correct formula for calculating the size of the reservations:
it should not depend on the page size. Instead, use the correct
multiple of MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE.
Fixes: e5ca8fb08ab2 ("net/mlx5e: Add control path for SHAMPO feature")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1742732906-166564-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antoine Tenart <atenart@kernel.org>
Date: Wed Mar 26 18:36:32 2025 +0100
net: decrease cached dst counters in dst_release
[ Upstream commit 3a0a3ff6593d670af2451ec363ccb7b18aec0c0a ]
Upstream fix ac888d58869b ("net: do not delay dst_entries_add() in
dst_release()") moved decrementing the dst count from dst_destroy to
dst_release to avoid accessing already freed data in case of netns
dismantle. However in case CONFIG_DST_CACHE is enabled and OvS+tunnels
are used, this fix is incomplete as the same issue will be seen for
cached dsts:
Unable to handle kernel paging request at virtual address ffff5aabf6b5c000
Call trace:
percpu_counter_add_batch+0x3c/0x160 (P)
dst_release+0xec/0x108
dst_cache_destroy+0x68/0xd8
dst_destroy+0x13c/0x168
dst_destroy_rcu+0x1c/0xb0
rcu_do_batch+0x18c/0x7d0
rcu_core+0x174/0x378
rcu_core_si+0x18/0x30
Fix this by invalidating the cache, and thus decrementing cached dst
counters, in dst_release too.
Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://patch.msgid.link/20250326173634.31096-1-atenart@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Taehee Yoo <ap420073@gmail.com>
Date: Sun Mar 9 13:42:18 2025 +0000
net: devmem: do not WARN conditionally after netdev_rx_queue_restart()
[ Upstream commit a70f891e0fa0435379ad4950e156a15a4ef88b4d ]
When a devmem socket is closed, net_devmem_unbind_dmabuf() calls
netdev_rx_queue_restart() to reset the queue. The callback may return
-ENETDOWN if the interface is down: the queues are already freed in
that case, so no queue reset is needed.
So, it should not warn if the return value is -ENETDOWN.
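The resulting condition can be sketched as a small predicate. The helper
name restart_error_is_unexpected() is hypothetical; the patch itself may
express the check differently.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether the result of a queue restart deserves a warning.
 * 0 is success; -ENETDOWN means the interface is down and its queues
 * are already freed, so there was nothing to reset.
 */
static bool restart_error_is_unexpected(int err)
{
	return err != 0 && err != -ENETDOWN;
}
```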
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250309134219.91670-8-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
Date: Tue Apr 1 15:56:37 2025 +0200
net: dsa: mv88e6xxx: propperly shutdown PPU re-enable timer on destroy
[ Upstream commit a58d882841a0750da3c482cd3d82432b1c7edb77 ]
The mv88e6xxx has an internal PPU that polls PHY state. If we want to
access the internal PHYs, we need to disable the PPU first. Because
that is a slow operation, a 10ms timer is used to re-enable it,
canceled with every access, so bulk operations effectively only
disable it once and re-enable it some 10ms after the last access.
If a PHY is accessed and then the mv88e6xxx module is removed before
the 10ms are up, the PPU re-enable ends up accessing a dangling pointer.
This especially affects probing during bootup. The MDIO bus and PHY
registration may succeed, but registration with the DSA framework
may fail later on (e.g. because the CPU port depends on another,
very slow device that isn't done probing yet, returning -EPROBE_DEFER).
In this case, probe() fails, but the MDIO subsystem may already have
accessed the MDIO bus or PHYs, arming the timer.
This is fixed as follows:
- If probe fails after mv88e6xxx_phy_init(), make sure we also call
mv88e6xxx_phy_destroy() before returning
- In mv88e6xxx_remove(), make sure we do the teardown in the correct
order, calling mv88e6xxx_phy_destroy() after unregistering the
switch device.
- In mv88e6xxx_phy_destroy(), destroy both the timer and the work item
that the timer might schedule, synchronously waiting in case one of
the callbacks already fired and destroying the timer first, before
waiting for the work item.
- Access to the PPU is guarded by a mutex. The worker acquires it
with mutex_trylock() and does not proceed with the expensive shutdown
if that fails. We grab the mutex in mv88e6xxx_phy_destroy() to make
sure the slow PPU shutdown is already done, or will not even start,
by the time we wait for the work item.
Fixes: 2e5f032095ff ("dsa: add support for the Marvell 88E6131 switch chip")
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20250401135705.92760-1-david.oberhollenzer@sigma-star.at
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Thu Feb 27 16:45:34 2025 -0800
net: dsa: rtl8366rb: don't prompt users for LED control
[ Upstream commit c34424eb3be4c01db831428c0d7d483701ae820f ]
Make NET_DSA_REALTEK_RTL8366RB_LEDS a hidden symbol.
It seems very unlikely that a user would want to intentionally
disable it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250228004534.3428681-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lin Ma <linma@zju.edu.cn>
Date: Thu Apr 3 00:56:32 2025 +0800
net: fix geneve_opt length integer overflow
[ Upstream commit b27055a08ad4b415dcf15b63034f9cb236f7fb40 ]
struct geneve_opt uses a 5-bit length field for each option, which
means every variable-size option must be smaller than 128 bytes.
However, none of the current related Netlink policies enforce this
length condition, so an attacker can use an option of exactly 128
bytes to *fake* a zero-length option, confuse the parsing logic, and
achieve a heap out-of-bounds read.
One example crash log is shown below:
[ 3.905425] ==================================================================
[ 3.905925] BUG: KASAN: slab-out-of-bounds in nla_put+0xa9/0xe0
[ 3.906255] Read of size 124 at addr ffff888005f291cc by task poc/177
[ 3.906646]
[ 3.906775] CPU: 0 PID: 177 Comm: poc-oob-read Not tainted 6.1.132 #1
[ 3.907131] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 3.907784] Call Trace:
[ 3.907925] <TASK>
[ 3.908048] dump_stack_lvl+0x44/0x5c
[ 3.908258] print_report+0x184/0x4be
[ 3.909151] kasan_report+0xc5/0x100
[ 3.909539] kasan_check_range+0xf3/0x1a0
[ 3.909794] memcpy+0x1f/0x60
[ 3.909968] nla_put+0xa9/0xe0
[ 3.910147] tunnel_key_dump+0x945/0xba0
[ 3.911536] tcf_action_dump_1+0x1c1/0x340
[ 3.912436] tcf_action_dump+0x101/0x180
[ 3.912689] tcf_exts_dump+0x164/0x1e0
[ 3.912905] fw_dump+0x18b/0x2d0
[ 3.913483] tcf_fill_node+0x2ee/0x460
[ 3.914778] tfilter_notify+0xf4/0x180
[ 3.915208] tc_new_tfilter+0xd51/0x10d0
[ 3.918615] rtnetlink_rcv_msg+0x4a2/0x560
[ 3.919118] netlink_rcv_skb+0xcd/0x200
[ 3.919787] netlink_unicast+0x395/0x530
[ 3.921032] netlink_sendmsg+0x3d0/0x6d0
[ 3.921987] __sock_sendmsg+0x99/0xa0
[ 3.922220] __sys_sendto+0x1b7/0x240
[ 3.922682] __x64_sys_sendto+0x72/0x90
[ 3.922906] do_syscall_64+0x5e/0x90
[ 3.923814] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 3.924122] RIP: 0033:0x7e83eab84407
[ 3.924331] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf
[ 3.925330] RSP: 002b:00007ffff505e370 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 3.925752] RAX: ffffffffffffffda RBX: 00007e83eaafa740 RCX: 00007e83eab84407
[ 3.926173] RDX: 00000000000001a8 RSI: 00007ffff505e3c0 RDI: 0000000000000003
[ 3.926587] RBP: 00007ffff505f460 R08: 00007e83eace1000 R09: 000000000000000c
[ 3.926977] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffff505f3c0
[ 3.927367] R13: 00007ffff505f5c8 R14: 00007e83ead1b000 R15: 00005d4fbbe6dcb8
Fix these issues by enforcing the correct length condition in related
policies.
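The arithmetic behind the overflow can be sketched as follows, relying on
the Geneve option length being expressed in 4-byte multiples. The helper
names are hypothetical, and opt_len_is_valid() only illustrates the
constraint; it is not the Netlink policy expression used by the patch.

```c
#include <stdint.h>

/* The length field is 5 bits wide and counts 4-byte units. */
#define OPT_LEN_BITS 5
#define OPT_LEN_MASK ((1u << OPT_LEN_BITS) - 1)

/* Value that ends up in the 5-bit field for a given data size. */
static uint8_t stored_len_field(unsigned int data_bytes)
{
	return (uint8_t)((data_bytes / 4) & OPT_LEN_MASK);
}

/* Size in bytes that a parser reads back from the field. */
static unsigned int decoded_len_bytes(uint8_t field)
{
	return (unsigned int)field * 4;
}

/* A size is representable only if it is a multiple of 4 up to 124. */
static int opt_len_is_valid(unsigned int data_bytes)
{
	return data_bytes % 4 == 0 && data_bytes / 4 <= OPT_LEN_MASK;
}
```

A 128-byte option needs the value 32, which does not fit in 5 bits: the
stored field wraps to 0, so later code that trusts the field sees a
zero-length option while 128 bytes of data are actually present.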
Fixes: 925d844696d9 ("netfilter: nft_tunnel: add support for geneve opts")
Fixes: 4ece47787077 ("lwtunnel: add options setting and dumping for geneve")
Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key")
Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Link: https://patch.msgid.link/20250402165632.6958-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dave Marquardt <davemarq@linux.ibm.com>
Date: Wed Apr 2 10:44:03 2025 -0500
net: ibmveth: make veth_pool_store stop hanging
[ Upstream commit 053f3ff67d7feefc75797863f3d84b47ad47086f ]
v2:
- Created a single error handling unlock and exit in veth_pool_store
- Greatly expanded commit message with previous explanatory-only text
Summary: Use rtnl_mutex to synchronize veth_pool_store with itself,
ibmveth_close and ibmveth_open, preventing multiple calls in a row to
napi_disable.
Background: Two (or more) threads could call veth_pool_store through
writing to /sys/devices/vio/30000002/pool*/*. You can do this easily
with a little shell script. This causes a hang.
I configured LOCKDEP, compiled ibmveth.c with DEBUG, and built a new
kernel. I ran this test again and saw:
Setting pool0/active to 0
Setting pool1/active to 1
[ 73.911067][ T4365] ibmveth 30000002 eth0: close starting
Setting pool1/active to 1
Setting pool1/active to 0
[ 73.911367][ T4366] ibmveth 30000002 eth0: close starting
[ 73.916056][ T4365] ibmveth 30000002 eth0: close complete
[ 73.916064][ T4365] ibmveth 30000002 eth0: open starting
[ 110.808564][ T712] systemd-journald[712]: Sent WATCHDOG=1 notification.
[ 230.808495][ T712] systemd-journald[712]: Sent WATCHDOG=1 notification.
[ 243.683786][ T123] INFO: task stress.sh:4365 blocked for more than 122 seconds.
[ 243.683827][ T123] Not tainted 6.14.0-01103-g2df0c02dab82-dirty #8
[ 243.683833][ T123] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.683838][ T123] task:stress.sh state:D stack:28096 pid:4365 tgid:4365 ppid:4364 task_flags:0x400040 flags:0x00042000
[ 243.683852][ T123] Call Trace:
[ 243.683857][ T123] [c00000000c38f690] [0000000000000001] 0x1 (unreliable)
[ 243.683868][ T123] [c00000000c38f840] [c00000000001f908] __switch_to+0x318/0x4e0
[ 243.683878][ T123] [c00000000c38f8a0] [c000000001549a70] __schedule+0x500/0x12a0
[ 243.683888][ T123] [c00000000c38f9a0] [c00000000154a878] schedule+0x68/0x210
[ 243.683896][ T123] [c00000000c38f9d0] [c00000000154ac80] schedule_preempt_disabled+0x30/0x50
[ 243.683904][ T123] [c00000000c38fa00] [c00000000154dbb0] __mutex_lock+0x730/0x10f0
[ 243.683913][ T123] [c00000000c38fb10] [c000000001154d40] napi_enable+0x30/0x60
[ 243.683921][ T123] [c00000000c38fb40] [c000000000f4ae94] ibmveth_open+0x68/0x5dc
[ 243.683928][ T123] [c00000000c38fbe0] [c000000000f4aa20] veth_pool_store+0x220/0x270
[ 243.683936][ T123] [c00000000c38fc70] [c000000000826278] sysfs_kf_write+0x68/0xb0
[ 243.683944][ T123] [c00000000c38fcb0] [c0000000008240b8] kernfs_fop_write_iter+0x198/0x2d0
[ 243.683951][ T123] [c00000000c38fd00] [c00000000071b9ac] vfs_write+0x34c/0x650
[ 243.683958][ T123] [c00000000c38fdc0] [c00000000071bea8] ksys_write+0x88/0x150
[ 243.683966][ T123] [c00000000c38fe10] [c0000000000317f4] system_call_exception+0x124/0x340
[ 243.683973][ T123] [c00000000c38fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
...
[ 243.684087][ T123] Showing all locks held in the system:
[ 243.684095][ T123] 1 lock held by khungtaskd/123:
[ 243.684099][ T123] #0: c00000000278e370 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x50/0x248
[ 243.684114][ T123] 4 locks held by stress.sh/4365:
[ 243.684119][ T123] #0: c00000003a4cd3f8 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x88/0x150
[ 243.684132][ T123] #1: c000000041aea888 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x154/0x2d0
[ 243.684143][ T123] #2: c0000000366fb9a8 (kn->active#64){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x160/0x2d0
[ 243.684155][ T123] #3: c000000035ff4cb8 (&dev->lock){+.+.}-{3:3}, at: napi_enable+0x30/0x60
[ 243.684166][ T123] 5 locks held by stress.sh/4366:
[ 243.684170][ T123] #0: c00000003a4cd3f8 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x88/0x150
[ 243.684183][ T123] #1: c00000000aee2288 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x154/0x2d0
[ 243.684194][ T123] #2: c0000000366f4ba8 (kn->active#64){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x160/0x2d0
[ 243.684205][ T123] #3: c000000035ff4cb8 (&dev->lock){+.+.}-{3:3}, at: napi_disable+0x30/0x60
[ 243.684216][ T123] #4: c0000003ff9bbf18 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0x138/0x12a0
From the ibmveth debug, two threads are calling veth_pool_store, which
calls ibmveth_close and ibmveth_open. Here's the sequence:
T4365             T4366
----------------- ----------------- ---------
veth_pool_store   veth_pool_store
ibmveth_close
                  ibmveth_close
napi_disable
                  napi_disable
ibmveth_open
napi_enable                         <- HANG
ibmveth_close calls napi_disable at the top and ibmveth_open calls
napi_enable at the top.
https://docs.kernel.org/networking/napi.html says:
The control APIs are not idempotent. Control API calls are safe
against concurrent use of datapath APIs but an incorrect sequence of
control API calls may result in crashes, deadlocks, or race
conditions. For example, calling napi_disable() multiple times in a
row will deadlock.
In the normal open and close paths, rtnl_mutex is acquired to prevent
other callers. This is missing from veth_pool_store. Using rtnl_mutex in
veth_pool_store fixes these hangs.
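The effect of serializing the close/open pair can be modelled with a toy
state machine. Everything here is hypothetical (fake_napi and its helpers
are not the NAPI API); in particular, the real napi_disable() blocks when
called twice in a row, whereas this model merely counts such calls.

```c
#include <stdbool.h>

struct fake_napi {
	bool enabled;
	int double_disables;  /* disable calls made while already disabled */
};

static void fake_napi_disable(struct fake_napi *n)
{
	if (!n->enabled)
		n->double_disables++;  /* the real call would hang here */
	n->enabled = false;
}

static void fake_napi_enable(struct fake_napi *n)
{
	n->enabled = true;
}

/*
 * Close followed by open as one indivisible unit, which is what
 * holding a lock around the whole sequence guarantees.
 */
static void pool_store_serialized(struct fake_napi *n)
{
	fake_napi_disable(n);  /* close path */
	fake_napi_enable(n);   /* open path */
}
```

When two callers run pool_store_serialized() one after the other, each
disable is followed by its own enable. Without serialization, the two
disable calls can land back to back, which is the hang shown above.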
Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Fixes: 860f242eb534 ("[PATCH] ibmveth change buffer pools dynamically")
Reviewed-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250402154403.386744-1-davemarq@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tobias Waldekranz <tobias@waldekranz.com>
Date: Tue Apr 1 08:58:04 2025 +0200
net: mvpp2: Prevent parser TCAM memory corruption
[ Upstream commit 96844075226b49af25a69a1d084b648ec2d9b08d ]
Protect the parser TCAM/SRAM memory, and the cached (shadow) SRAM
information, from concurrent modifications.
Both the TCAM and SRAM tables are indirectly accessed by configuring
an index register that selects the row to read or write to. This means
that operations must be atomic in order to, e.g., avoid spreading
writes across multiple rows. Since the shadow SRAM array is used to
find free rows in the hardware table, it must also be protected in
order to avoid TOCTOU errors where multiple cores allocate the same
row.
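A sketch of why the index-select and data-write steps must form one atomic
operation. The fake_table type and its helpers are hypothetical, and a C11
atomic_flag spinlock stands in for whatever lock the driver uses.

```c
#include <stdatomic.h>
#include <stdint.h>

#define ROWS 4

/* Toy model of an indirectly accessed table: pick a row, then write. */
struct fake_table {
	unsigned int index;   /* models the index register */
	uint32_t rows[ROWS];  /* models the table contents */
};

static atomic_flag table_lock = ATOMIC_FLAG_INIT;

static void select_row(struct fake_table *t, unsigned int row)
{
	t->index = row % ROWS;
}

static void write_selected(struct fake_table *t, uint32_t val)
{
	t->rows[t->index] = val;
}

/* Select and write under one lock so no other writer can interleave. */
static void write_row(struct fake_table *t, unsigned int row, uint32_t val)
{
	while (atomic_flag_test_and_set(&table_lock))
		;  /* spin until the lock is free */
	select_row(t, row);
	write_selected(t, val);
	atomic_flag_clear(&table_lock);
}
```

If two writers interleave the unlocked steps (select 0, select 1, write A,
write B), both values land in row 1 and row 0 is never written. Going
through write_row() rules that interleaving out.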
This issue was detected in a situation where `mvpp2_set_rx_mode()` ran
concurrently on two CPUs. In this particular case the
MVPP2_PE_MAC_UC_PROMISCUOUS entry was corrupted, causing the
classifier unit to drop all incoming unicast - indicated by the
`rx_classifier_drops` counter.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250401065855.3113635-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jim Liu <jim.t90615@gmail.com>
Date: Thu Mar 27 14:29:42 2025 +0800
net: phy: broadcom: Correct BCM5221 PHY model detection
[ Upstream commit 4f1eaabb4b66a1f7473f584e14e15b2ac19dfaf3 ]
Apply the correct detection condition to the entire 5221 family of PHYs.
Fixes: 3abbd0699b67 ("net: phy: broadcom: add support for BCM5221 phy")
Signed-off-by: Jim Liu <jim.t90615@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sat Mar 29 15:25:35 2025 -0700
net_sched: skbprio: Remove overly strict queue assertions
[ Upstream commit ce8fe975fd99b49c29c42e50f2441ba53112b2e8 ]
In the current implementation, skbprio enqueue/dequeue contains an assertion
that fails under certain conditions when SKBPRIO is used as a child qdisc under
TBF with specific parameters. The failure occurs because TBF sometimes peeks at
packets in the child qdisc without actually dequeuing them when tokens are
unavailable.
This peek operation creates a discrepancy between the parent and child qdisc
queue length counters. When TBF later receives a high-priority packet,
SKBPRIO's queue length may show a different value than what's reflected in its
internal priority queue tracking, triggering the assertion.
The fix removes these overly strict assertions in SKBPRIO; they are not
necessary at all.
Reported-by: syzbot+a3422a19b05ea96bee18@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a3422a19b05ea96bee18
Fixes: aea5f654e6b7 ("net/sched: add skbprio scheduler")
Cc: Nishanth Devarajan <ndev2021@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20250329222536.696204-2-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Tue Apr 1 14:36:47 2025 +0200
netfilter: nf_tables: don't unregister hook when table is dormant
[ Upstream commit 688c15017d5cd5aac882400782e7213d40dc3556 ]
When nf_tables_updchain encounters an error, hook registration needs to
be rolled back.
This should only be done if the hook has been registered, which won't
happen when the table is flagged as dormant (inactive).
Just move the assignment into the registration block.
Reported-by: syzbot+53ed3a6440173ddbf499@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=53ed3a6440173ddbf499
Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Fri Mar 21 23:24:20 2025 +0100
netfilter: nft_set_hash: GC reaps elements with conncount for dynamic sets only
[ Upstream commit 9d74da1177c800eb3d51c13f9821b7b0683845a5 ]
conncount has its own GC handler which determines when to reap stale
elements; this is convenient for dynamic sets. However, it also reaps
elements of non-dynamic sets with static configurations coming from
the control plane.
Always run the connlimit GC handler, but honor its feedback to reap an
element only if the set is dynamic.
Fixes: 290180e2448c ("netfilter: nf_tables: add connlimit support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lin Ma <linma@zju.edu.cn>
Date: Thu Apr 3 01:00:26 2025 +0800
netfilter: nft_tunnel: fix geneve_opt type confusion addition
[ Upstream commit 1b755d8eb1ace3870789d48fbd94f386ad6e30be ]
When handling multiple NFTA_TUNNEL_KEY_OPTS_GENEVE attributes, the
parsing logic should place every geneve_opt structure one by one
compactly. Hence, when deciding the next geneve_opt position, the
pointer addition should be in units of char *.
However, the current implementation erroneously does type conversion
before the addition, which will lead to heap out-of-bounds write.
[ 6.989857] ==================================================================
[ 6.990293] BUG: KASAN: slab-out-of-bounds in nft_tunnel_obj_init+0x977/0xa70
[ 6.990725] Write of size 124 at addr ffff888005f18974 by task poc/178
[ 6.991162]
[ 6.991259] CPU: 0 PID: 178 Comm: poc-oob-write Not tainted 6.1.132 #1
[ 6.991655] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 6.992281] Call Trace:
[ 6.992423] <TASK>
[ 6.992586] dump_stack_lvl+0x44/0x5c
[ 6.992801] print_report+0x184/0x4be
[ 6.993790] kasan_report+0xc5/0x100
[ 6.994252] kasan_check_range+0xf3/0x1a0
[ 6.994486] memcpy+0x38/0x60
[ 6.994692] nft_tunnel_obj_init+0x977/0xa70
[ 6.995677] nft_obj_init+0x10c/0x1b0
[ 6.995891] nf_tables_newobj+0x585/0x950
[ 6.996922] nfnetlink_rcv_batch+0xdf9/0x1020
[ 6.998997] nfnetlink_rcv+0x1df/0x220
[ 6.999537] netlink_unicast+0x395/0x530
[ 7.000771] netlink_sendmsg+0x3d0/0x6d0
[ 7.001462] __sock_sendmsg+0x99/0xa0
[ 7.001707] ____sys_sendmsg+0x409/0x450
[ 7.002391] ___sys_sendmsg+0xfd/0x170
[ 7.003145] __sys_sendmsg+0xea/0x170
[ 7.004359] do_syscall_64+0x5e/0x90
[ 7.005817] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 7.006127] RIP: 0033:0x7ec756d4e407
[ 7.006339] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf
[ 7.007364] RSP: 002b:00007ffed5d46760 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[ 7.007827] RAX: ffffffffffffffda RBX: 00007ec756cc4740 RCX: 00007ec756d4e407
[ 7.008223] RDX: 0000000000000000 RSI: 00007ffed5d467f0 RDI: 0000000000000003
[ 7.008620] RBP: 00007ffed5d468a0 R08: 0000000000000000 R09: 0000000000000000
[ 7.009039] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[ 7.009429] R13: 00007ffed5d478b0 R14: 00007ec756ee5000 R15: 00005cbd4e655cb8
Fix this bug with correct pointer addition and conversion in parse
and dump code.
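The difference between casting before and after the addition can be shown
with a small stand-alone sketch. struct opt_hdr is a hypothetical 4-byte
header standing in for struct geneve_opt, and the function names are made
up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte option header, standing in for struct geneve_opt. */
struct opt_hdr {
	uint16_t opt_class;
	uint8_t type;
	uint8_t length;
};

/*
 * Buggy form: the cast binds first, so the addition advances by
 * used_bytes * sizeof(struct opt_hdr) bytes instead of used_bytes.
 */
static struct opt_hdr *next_opt_buggy(uint8_t *data, size_t used_bytes)
{
	return (struct opt_hdr *)data + used_bytes;
}

/* Fixed form: add in byte units first, then convert the result. */
static struct opt_hdr *next_opt_fixed(uint8_t *data, size_t used_bytes)
{
	return (struct opt_hdr *)(data + used_bytes);
}
```

With 8 bytes already used, the fixed form points 8 bytes into the buffer,
while the buggy form points 8 * sizeof(struct opt_hdr) bytes in, which is
how the write ends up past the end of the allocation.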
Fixes: 925d844696d9 ("netfilter: nft_tunnel: add support for geneve opts")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Fri Mar 14 16:41:59 2025 +0000
netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int
[ Upstream commit 07c574eb53d4cc9aa7b985bc8bfcb302e5dc4694 ]
Fix netfs_unbuffered_read() to return an ssize_t rather than an int as
netfs_wait_for_read() returns ssize_t and this gets implicitly truncated.
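A sketch of the range problem, assuming a 64-bit ssize_t as on 64-bit
Linux; int64_t stands in for it here, and fits_in_int() is a hypothetical
helper used only to show which values an int return type cannot carry.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a 64-bit byte count survives being returned through an int. */
static bool fits_in_int(int64_t v)
{
	return v >= INT_MIN && v <= INT_MAX;
}
```

Small counts and negative error codes pass through an int unchanged, but a
byte count above INT_MAX does not, which is why the return type has to
match the ssize_t of the callee.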
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250314164201.1993231-5-dhowells@redhat.com
Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Viacheslav Dubeyko <slava@dubeyko.com>
cc: Alex Markuze <amarkuze@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Debin Zhu <mowenroot@163.com>
Date: Tue Apr 1 20:40:18 2025 +0800
netlabel: Fix NULL pointer exception caused by CALIPSO on IPv4 sockets
[ Upstream commit 078aabd567de3d63d37d7673f714e309d369e6e2 ]
When calling netlbl_conn_setattr(), addr->sa_family is used
to determine the function behavior. If sk is an IPv4 socket,
but the connect function is called with an IPv6 address,
the function calipso_sock_setattr() is triggered.
Inside this function, the following code is executed:
sk_fullsock(__sk) ? inet_sk(__sk)->pinet6 : NULL;
Since sk is an IPv4 socket, pinet6 is NULL, leading to a
null pointer dereference.
This patch fixes the issue by checking if inet6_sk(sk)
returns a NULL pointer before accessing pinet6.
Signed-off-by: Debin Zhu <mowenroot@163.com>
Signed-off-by: Bitao Ouyang <1985755126@qq.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Fixes: ceba1832b1b2 ("calipso: Set the calipso socket label to match the secattr.")
Link: https://patch.msgid.link/20250401124018.4763-1-mowenroot@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Wed Apr 2 14:02:40 2025 +0300
nfs: Add missing release on error in nfs_lock_and_join_requests()
[ Upstream commit 8e5419d6542fdf2dca9a0acdef2b8255f0e4ba69 ]
Call nfs_release_request() on this error path before returning.
Fixes: c3f2235782c3 ("nfs: fold nfs_folio_find_and_lock_request into nfs_lock_and_join_requests")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/3aaaa3d5-1c8a-41e4-98c7-717801ddd171@stanley.mountain
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: NeilBrown <neilb@suse.de>
Date: Wed Dec 4 13:53:09 2024 +1100
NFS: fix open_owner_id_maxsz and related fields.
[ Upstream commit 43502f6e8d1e767d6736ea0676cc784025cf6eeb ]
A recent change increased the size of an NFSv4 open owner, but didn't
increase the corresponding max_sz defines. This is not known to have
caused failure, but should be fixed.
This patch also fixes some related _maxsz fields that are wrong.
Note that the XXX_owner_id_maxsz values now are only the size of the id
and do NOT include the len field that will always precede the id in xdr
encoding. I think this is clearer.
Reported-by: David Disseldorp <ddiss@suse.com>
Fixes: d98f72272500 ("nfs: simplify and guarantee owner uniqueness.")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue Mar 25 17:58:50 2025 -0400
NFS: Shut down the nfs_client only after all the superblocks
[ Upstream commit 2d3e998a0bc7fe26a724f87a8ce217848040520e ]
The nfs_client manages state for all the superblocks in the
"cl_superblocks" list, so it must not be shut down until all of them are
gone.
Fixes: 7d3e26a054c8 ("NFS: Cancel all existing RPC tasks when shutdown")
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jeff Layton <jlayton@kernel.org>
Date: Thu Feb 13 09:08:29 2025 -0500
nfsd: allow SC_STATUS_FREEABLE when searching via nfs4_lookup_stateid()
commit d1bc15b147d35b4cb7ca99a9a7d79d41ca342c13 upstream.
The pynfs DELEG8 test fails when run against nfsd. It acquires a
delegation and then lets the lease time out. It then tries to use the
deleg stateid and expects to see NFS4ERR_DELEG_REVOKED, but it gets
NFS4ERR_BAD_STATEID instead.
When a delegation is revoked, it's initially marked with
SC_STATUS_REVOKED or SC_STATUS_ADMIN_REVOKED. Later, it's marked
with the SC_STATUS_FREEABLE flag, which denotes that it is waiting for
a FREE_STATEID call.
nfs4_lookup_stateid() accepts a statusmask that includes the status
flags that a found stateid is allowed to have. Currently, that mask
never includes SC_STATUS_FREEABLE, which means that revoked delegations
are (almost) never found.
Add SC_STATUS_FREEABLE to the always-allowed status flags, and remove it
from nfsd4_delegreturn() since it's now always implied.
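
The status-mask check described above can be pictured with a small user-space model. The flag values and the helper `lookup_allows()` are hypothetical; the real SC_STATUS_* constants and the lookup code may differ.

```c
#include <stdbool.h>

/* Hypothetical flag values; the real SC_STATUS_* constants may differ. */
#define ST_CLOSED_MODEL   0x1u
#define ST_REVOKED_MODEL  0x2u
#define ST_FREEABLE_MODEL 0x8u

/*
 * A stateid is considered found only if every status bit it carries
 * is present in the caller's allowed mask.
 */
static bool lookup_allows(unsigned int stid_status, unsigned int statusmask)
{
    statusmask |= ST_FREEABLE_MODEL;  /* modelled as always allowed */
    return (stid_status & ~statusmask) == 0;
}
```

In this model, a revoked delegation that has also been marked freeable is still found by a caller that only asked for revoked stateids.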
Fixes: 8dd91e8d31fe ("nfsd: fix race between laundromat and free_stateid")
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Olga Kornievskaia <okorniev@redhat.com>
Date: Fri Jan 17 11:32:58 2025 -0500
nfsd: fix management of listener transports
commit d093c90892607be505e801469d6674459e69ab89 upstream.
Currently, when no active threads are running, a root user using the nfsdctl
command can remove a particular listener from the list of previously
added ones and then start the server by increasing the number of threads.
This leads to the following problem:
[ 158.835354] refcount_t: addition on 0; use-after-free.
[ 158.835603] WARNING: CPU: 2 PID: 9145 at lib/refcount.c:25 refcount_warn_saturate+0x160/0x1a0
[ 158.836017] Modules linked in: rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd auth_rpcgss nfs_acl lockd grace overlay isofs uinput snd_seq_dummy snd_hrtimer nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables qrtr sunrpc vfat fat uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_hda_codec_generic mc e1000e snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore sg loop dm_multipath dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs libcrc32c crct10dif_ce ghash_ce vmwgfx sha2_ce sha256_arm64 sr_mod sha1_ce cdrom nvme drm_client_lib drm_ttm_helper ttm nvme_core drm_kms_helper nvme_auth drm fuse
[ 158.840093] CPU: 2 UID: 0 PID: 9145 Comm: nfsd Kdump: loaded Tainted: G B W 6.13.0-rc6+ #7
[ 158.840624] Tainted: [B]=BAD_PAGE, [W]=WARN
[ 158.840802] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.24006586.BA64.2406042154 06/04/2024
[ 158.841220] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 158.841563] pc : refcount_warn_saturate+0x160/0x1a0
[ 158.841780] lr : refcount_warn_saturate+0x160/0x1a0
[ 158.842000] sp : ffff800089be7d80
[ 158.842147] x29: ffff800089be7d80 x28: ffff00008e68c148 x27: ffff00008e68c148
[ 158.842492] x26: ffff0002e3b5c000 x25: ffff600011cd1829 x24: ffff00008653c010
[ 158.842832] x23: ffff00008653c000 x22: 1fffe00011cd1829 x21: ffff00008653c028
[ 158.843175] x20: 0000000000000002 x19: ffff00008653c010 x18: 0000000000000000
[ 158.843505] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 158.843836] x14: 0000000000000000 x13: 0000000000000001 x12: ffff600050a26493
[ 158.844143] x11: 1fffe00050a26492 x10: ffff600050a26492 x9 : dfff800000000000
[ 158.844475] x8 : 00009fffaf5d9b6e x7 : ffff000285132493 x6 : 0000000000000001
[ 158.844823] x5 : ffff000285132490 x4 : ffff600050a26493 x3 : ffff8000805e72bc
[ 158.845174] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000098588000
[ 158.845528] Call trace:
[ 158.845658] refcount_warn_saturate+0x160/0x1a0 (P)
[ 158.845894] svc_recv+0x58c/0x680 [sunrpc]
[ 158.846183] nfsd+0x1fc/0x348 [nfsd]
[ 158.846390] kthread+0x274/0x2f8
[ 158.846546] ret_from_fork+0x10/0x20
[ 158.846714] ---[ end trace 0000000000000000 ]---
nfsd_nl_listener_set_doit() would manipulate the server's sv_permsocks
list of transports and close the specified listener, but the other
list of transports (the server's sp_xprts list) would not be changed, leading
to the problem above.
Instead, determine whether nfsdctl is trying to remove a listener, in
which case delete all the existing listener transports and re-create
all but the removed ones.
Fixes: 16a471177496 ("NFSD: add listener-{set,get} netlink command")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Sun Jan 26 16:50:18 2025 -0500
NFSD: Never return NFS4ERR_FILE_OPEN when removing a directory
commit 370345b4bd184a49ac68d6591801e5e3605b355a upstream.
RFC 8881 Section 18.25.4 paragraph 5 tells us that the server
should return NFS4ERR_FILE_OPEN only if the target object is an
opened file. This suggests that returning this status when removing
a directory will confuse NFS clients.
This is a version-specific issue; nfsd_proc_remove/rmdir() and
nfsd3_proc_remove/rmdir() already return nfserr_access as
appropriate.
Unfortunately there is no quick way for nfsd4_remove() to determine
whether the target object is a file or not, so the check is done in
nfsd_unlink() for now.
Reported-by: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 466e16f0920f ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Sun Jan 26 16:50:17 2025 -0500
NFSD: nfsd_unlink() clobbers non-zero status returned from fh_fill_pre_attrs()
commit d7d8e3169b56e7696559a2427c922c0d55debcec upstream.
If fh_fill_pre_attrs() returns a non-zero status, the error flow
takes it through out_unlock, which then overwrites the returned
status code with
err = nfserrno(host_err);
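
The control-flow problem above can be reproduced in a few lines. This is a user-space sketch with hypothetical names (`enum status_model`, `map_host_err()`, `unlink_buggy()`, `unlink_fixed()`); the repaired variant is one possible shape and is not claimed to be what the patch does.

```c
/* Hypothetical status codes standing in for nfserr_* values. */
enum status_model { ST_OK = 0, ST_PRE_ATTR_FAILED, ST_IO };

/* Models nfserrno(): translate an errno-style value to a status. */
static enum status_model map_host_err(int host_err)
{
    return host_err ? ST_IO : ST_OK;
}

/* Buggy shape: the shared exit label always recomputes the status. */
static enum status_model unlink_buggy(enum status_model pre_attr, int host_err)
{
    enum status_model err = pre_attr;

    if (err != ST_OK)
        goto out_unlock;
    /* ... the unlink itself would run here and set host_err ... */
out_unlock:
    err = map_host_err(host_err);   /* clobbers the earlier failure */
    return err;
}

/* One possible repair: translate host_err only when no status is set. */
static enum status_model unlink_fixed(enum status_model pre_attr, int host_err)
{
    enum status_model err = pre_attr;

    if (err != ST_OK)
        goto out_unlock;
    err = map_host_err(host_err);
out_unlock:
    return err;
}
```

In the buggy variant a failed pre-op step with host_err still at zero is reported as success, which is the clobbering the commit message describes.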
Fixes: a332018a91c4 ("nfsd: handle failure to collect pre/post-op attrs more sanely")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Li Lingfeng <lilingfeng3@huawei.com>
Date: Thu Feb 13 22:42:20 2025 +0800
nfsd: put dl_stid if fail to queue dl_recall
commit 230ca758453c63bd38e4d9f4a21db698f7abada8 upstream.
Before calling nfsd4_run_cb to queue dl_recall to the callback_wq, we
increment the reference count of dl_stid.
We expect that after the corresponding work_struct is processed, the
reference count of dl_stid will be decremented through the callback
function nfsd4_cb_recall_release.
However, if the call to nfsd4_run_cb fails, the incremented reference
count of dl_stid will not be decremented correspondingly, leading to the
following nfs4_stid leak:
unreferenced object 0xffff88812067b578 (size 344):
comm "nfsd", pid 2761, jiffies 4295044002 (age 5541.241s)
hex dump (first 32 bytes):
01 00 00 00 6b 6b 6b 6b b8 02 c0 e2 81 88 ff ff ....kkkk........
00 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 ad 4e ad de .kkkkkkk.....N..
backtrace:
kmem_cache_alloc+0x4b9/0x700
nfsd4_process_open1+0x34/0x300
nfsd4_open+0x2d1/0x9d0
nfsd4_proc_compound+0x7a2/0xe30
nfsd_dispatch+0x241/0x3e0
svc_process_common+0x5d3/0xcc0
svc_process+0x2a3/0x320
nfsd+0x180/0x2e0
kthread+0x199/0x1d0
ret_from_fork+0x30/0x50
ret_from_fork_asm+0x1b/0x30
unreferenced object 0xffff8881499f4d28 (size 368):
comm "nfsd", pid 2761, jiffies 4295044005 (age 5541.239s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 30 4d 9f 49 81 88 ff ff ........0M.I....
30 4d 9f 49 81 88 ff ff 20 00 00 00 01 00 00 00 0M.I.... .......
backtrace:
kmem_cache_alloc+0x4b9/0x700
nfs4_alloc_stid+0x29/0x210
alloc_init_deleg+0x92/0x2e0
nfs4_set_delegation+0x284/0xc00
nfs4_open_delegation+0x216/0x3f0
nfsd4_process_open2+0x2b3/0xee0
nfsd4_open+0x770/0x9d0
nfsd4_proc_compound+0x7a2/0xe30
nfsd_dispatch+0x241/0x3e0
svc_process_common+0x5d3/0xcc0
svc_process+0x2a3/0x320
nfsd+0x180/0x2e0
kthread+0x199/0x1d0
ret_from_fork+0x30/0x50
ret_from_fork_asm+0x1b/0x30
Fix it by checking the result of nfsd4_run_cb and calling nfs4_put_stid if
queueing dl_recall fails.
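
The reference-count imbalance can be sketched as follows. All names here (`struct stid_model`, `stid_get()`, `stid_put()`, `queue_recall_model()`, `recall_model()`) are hypothetical and only model the get/queue/put pattern described above.

```c
#include <stdbool.h>

struct stid_model { int refcount; };

static void stid_get(struct stid_model *s) { s->refcount++; }
static void stid_put(struct stid_model *s) { s->refcount--; }

/* Models queueing the work item; false means it could not be queued. */
static bool queue_recall_model(bool queue_ok)
{
    return queue_ok;
}

static void recall_model(struct stid_model *s, bool queue_ok)
{
    stid_get(s);                 /* reference meant for the callback */
    if (!queue_recall_model(queue_ok))
        stid_put(s);             /* callback never runs: drop it here */
}
```

When queueing succeeds the extra reference stays in place, on the assumption that the release callback drops it later.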
Cc: stable@vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Tue Jan 14 17:09:24 2025 -0500
NFSD: Skip sending CB_RECALL_ANY when the backchannel isn't up
commit 8a388c1fabeb6606e16467b23242416c0dbeffad upstream.
NFSD sends CB_RECALL_ANY to clients when the server is low on
memory or that client has a large number of delegations outstanding.
We've seen cases where NFSD attempts to send CB_RECALL_ANY requests
to disconnected clients, and gets confused. These calls never go
anywhere if a backchannel transport to the target client isn't
available. Before the server can send any backchannel operation, the
client has to connect first and then do a BIND_CONN_TO_SESSION.
This patch doesn't address the root cause of the confusion, but
there's no need to queue up these optional operations if they can't
go anywhere.
Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue Feb 18 18:37:51 2025 -0500
NFSv4: Avoid unnecessary scans of filesystems for delayed delegations
[ Upstream commit e767b59e29b8327d25edde65efc743f479f30d0a ]
The amount of looping through the list of delegations is occasionally
leading to soft lockups. If the state manager was asked to manage the
delayed return of delegations, then only scan those filesystems
containing delegations that were marked as being delayed.
Fixes: be20037725d1 ("NFSv4: Fix delegation return in cases where we have to retry")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue Feb 18 19:03:21 2025 -0500
NFSv4: Avoid unnecessary scans of filesystems for expired delegations
[ Upstream commit f163aa81a799e2d46d7f8f0b42a0e7770eaa0d06 ]
The amount of looping through the list of delegations is occasionally
leading to soft lockups. If the state manager was asked to reap the
expired delegations, it should scan only those filesystems that hold
delegations that need to be reaped.
Fixes: 7f156ef0bf45 ("NFSv4: Clean up nfs_delegation_reap_expired()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue Feb 18 18:14:26 2025 -0500
NFSv4: Avoid unnecessary scans of filesystems for returning delegations
[ Upstream commit 35a566a24e58f1b5f89737edf60b77de58719ed0 ]
The amount of looping through the list of delegations is occasionally
leading to soft lockups. If the state manager was asked to return
delegations asynchronously, it should only scan those filesystems that
hold delegations that need to be returned.
Fixes: af3b61bf6131 ("NFSv4: Clean up nfs_client_return_marked_delegations()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue Feb 18 16:50:30 2025 -0500
NFSv4: Don't trigger uneccessary scans for return-on-close delegations
[ Upstream commit 47acca884f714f41d95dc654f802845544554784 ]
The amount of looping through the list of delegations is occasionally
leading to soft lockups. Avoid at least some loops by not requiring the
NFSv4 state manager to scan for delegations that are marked for
return-on-close. Instead, either mark them for immediate return (if
possible) or else leave it up to nfs4_inode_return_delegation_on_close()
to return them once the file is closed by the application.
Fixes: b757144fd77c ("NFSv4: Be less aggressive about returning delegations for open files")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikita Shubin <n.shubin@yadro.com>
Date: Thu Jun 6 11:15:19 2024 +0300
ntb: intel: Fix using link status DB's
[ Upstream commit 8144e9c8f30fb23bb736a5d24d5c9d46965563c4 ]
Make sure we are not using DB's which were remapped for link status.
Fixes: f6e51c354b60 ("ntb: intel: split out the gen3 code")
Signed-off-by: Nikita Shubin <n.shubin@yadro.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yajun Deng <yajun.deng@linux.dev>
Date: Wed Aug 16 16:33:05 2023 +0800
ntb_hw_switchtec: Fix shift-out-of-bounds in switchtec_ntb_mw_set_trans
[ Upstream commit de203da734fae00e75be50220ba5391e7beecdf9 ]
The kernel API ntb_mw_clear_trans() passes 0 for both addr and
size, which makes xlate_pos negative.
[ 23.734156] switchtec switchtec0: MW 0: part 0 addr 0x0000000000000000 size 0x0000000000000000
[ 23.734158] ================================================================================
[ 23.734172] UBSAN: shift-out-of-bounds in drivers/ntb/hw/mscc/ntb_hw_switchtec.c:293:7
[ 23.734418] shift exponent -1 is negative
Ensure xlate_pos is non-negative before passing it to BIT.
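
A minimal user-space sketch of the guard, assuming the position comes from a floor-log2 helper that yields -1 for a zero size. `floor_log2()` and `aligned_to_pow2()` are hypothetical names, not the driver's functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Floor of log2(v), or -1 when v == 0 (hypothetical helper). */
static int floor_log2(uint64_t v)
{
    int pos = -1;

    while (v) {
        v >>= 1;
        pos++;
    }
    return pos;
}

/* True when addr is a multiple of 2^pos; never shifts by a negative count. */
static bool aligned_to_pow2(uint64_t addr, int pos)
{
    if (pos < 0)
        return true;  /* nothing to check for a zero-sized window */
    return (addr & ((UINT64_C(1) << pos) - 1)) == 0;
}
```

Checking the sign first keeps the shift well defined for the addr = 0, size = 0 case that clearing a window produces.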
Fixes: 1e2fd202f859 ("ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()")
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Markus Elfring <elfring@users.sourceforge.net>
Date: Mon Sep 23 10:38:11 2024 +0200
ntb_perf: Delete duplicate dmaengine_unmap_put() call in perf_copy_chunk()
commit 4279e72cab31dd3eb8c89591eb9d2affa90ab6aa upstream.
The function call “dmaengine_unmap_put(unmap)” was used in an if branch.
The same call was immediately triggered by a subsequent goto statement.
Thus avoid such a call repetition.
This issue was detected by using the Coccinelle software.
Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Cc: stable@vger.kernel.org
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Icenowy Zheng <uwu@icenowy.me>
Date: Thu Feb 13 01:04:43 2025 +0800
nvme-pci: clean up CMBMSC when registering CMB fails
[ Upstream commit 6a3572e10f740acd48e2713ef37e92186a3ce5e8 ]
CMB decoding should be disabled when the CMB block isn't successfully
registered with the P2P DMA subsystem.
Clean up the CMBMSC register in this error handling codepath to disable
CMB decoding (and CMBLOC/CMBSZ registers).
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Thu Mar 6 14:25:57 2025 -0800
nvme-pci: fix stuck reset on concurrent DPC and HP
[ Upstream commit 3f674e7b670b7b7d9261935820e4eba3c059f835 ]
The PCIe error handling has the nvme driver quiesce the device, attempt
to restart it, then wait for that restart to complete.
A PCIe DPC event also toggles the PCIe link. If the slot doesn't have
out-of-band presence detection, this will trigger a pciehp
re-enumeration.
The error handling that calls nvme_error_resume is holding the device
lock while this happens. This lock blocks pciehp's request to disconnect
the driver from proceeding.
Meanwhile the nvme's reset can't make forward progress because its
device isn't there anymore with outstanding IO, and the timeout handler
won't do anything to fix it because the device is undergoing error
handling.
End result: deadlocked.
Fix this by having the timeout handler short cut the disabling for a
disconnected PCIe device. The downside is that we're relying on an IO
timeout to clean up this mess, which could be a minute by default.
Tested-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Icenowy Zheng <uwu@icenowy.me>
Date: Thu Feb 13 01:04:44 2025 +0800
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
[ Upstream commit 56cf7ef0d490b28fad8f8629fc135c5ab7c9f54e ]
The PCI P2PDMA code will register the CMB block to the memory
hot-plugging subsystem, which has an alignment requirement. Memory
blocks that do not satisfy this alignment requirement (usually 2MB) will
lead to a WARNING from memory hotplugging.
Verify the CMB block's address and size against the alignment and only
try to send CMB blocks compatible with it to prevent this warning.
Tested on Intel DC D4502 SSD, which has a 512K CMB block that is too
small for memory hotplugging (thus PCI P2PDMA).
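
The alignment test can be sketched like this. The 2MB constant follows the "usually 2MB" remark above and is an assumption here; `HOTPLUG_ALIGN_MODEL` and `cmb_usable_for_p2p()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed alignment; the text above says the requirement is usually 2MB. */
#define HOTPLUG_ALIGN_MODEL (UINT64_C(2) << 20)

/* Accept a block only if both its start and its size are aligned. */
static bool cmb_usable_for_p2p(uint64_t addr, uint64_t size)
{
    const uint64_t mask = HOTPLUG_ALIGN_MODEL - 1;

    return size != 0 && (addr & mask) == 0 && (size & mask) == 0;
}
```

Under this model a 512K block is rejected regardless of where it starts, which matches the device mentioned above.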
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sagi Grimberg <sagi@grimberg.me>
Date: Thu Feb 20 13:18:30 2025 +0200
nvme-tcp: fix possible UAF in nvme_tcp_poll
[ Upstream commit 8c1624b63a7d24142a2bbc3a5ee7e95f004ea36e ]
nvme_tcp_poll() may race with the send path error handler because
it may complete the request while it is actively being polled for
completion, resulting in a UAF panic [1]:
We should make sure to stop polling when we see an error when
trying to read from the socket. Hence make sure to propagate the
error so that the block layer breaks the polling cycle.
[1]:
--
[35665.692310] nvme nvme2: failed to send request -13
[35665.702265] nvme nvme2: unsupported pdu type (3)
[35665.702272] BUG: kernel NULL pointer dereference, address: 0000000000000000
[35665.702542] nvme nvme2: queue 1 receive failed: -22
[35665.703209] #PF: supervisor write access in kernel mode
[35665.703213] #PF: error_code(0x0002) - not-present page
[35665.703214] PGD 8000003801cce067 P4D 8000003801cce067 PUD 37e6f79067 PMD 0
[35665.703220] Oops: 0002 [#1] SMP PTI
[35665.703658] nvme nvme2: starting error recovery
[35665.705809] Hardware name: Inspur aaabbb/YZMB-00882-104, BIOS 4.1.26 09/22/2022
[35665.705812] Workqueue: kblockd blk_mq_requeue_work
[35665.709172] RIP: 0010:_raw_spin_lock+0xc/0x30
[35665.715788] Call Trace:
[35665.716201] <TASK>
[35665.716613] ? show_trace_log_lvl+0x1c1/0x2d9
[35665.717049] ? show_trace_log_lvl+0x1c1/0x2d9
[35665.717457] ? blk_mq_request_bypass_insert+0x2c/0xb0
[35665.717950] ? __die_body.cold+0x8/0xd
[35665.718361] ? page_fault_oops+0xac/0x140
[35665.718749] ? blk_mq_start_request+0x30/0xf0
[35665.719144] ? nvme_tcp_queue_rq+0xc7/0x170 [nvme_tcp]
[35665.719547] ? exc_page_fault+0x62/0x130
[35665.719938] ? asm_exc_page_fault+0x22/0x30
[35665.720333] ? _raw_spin_lock+0xc/0x30
[35665.720723] blk_mq_request_bypass_insert+0x2c/0xb0
[35665.721101] blk_mq_requeue_work+0xa5/0x180
[35665.721451] process_one_work+0x1e8/0x390
[35665.721809] worker_thread+0x53/0x3d0
[35665.722159] ? process_one_work+0x390/0x390
[35665.722501] kthread+0x124/0x150
[35665.722849] ? set_kthread_struct+0x50/0x50
[35665.723182] ret_from_fork+0x1f/0x30
Reported-by: Zhang Guanghui <zhang.guanghui@cestc.cn>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Caleb Sander Mateos <csander@purestorage.com>
Date: Fri Mar 28 09:46:45 2025 -0600
nvme/ioctl: don't warn on vectorized uring_cmd with fixed buffer
[ Upstream commit eada75467fca0b016b9b22212637c07216135c20 ]
The vectorized io_uring NVMe passthru opcodes don't yet support fixed
buffers. But since userspace can trigger this condition based on the
io_uring SQE parameters, it shouldn't cause a kernel warning.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixes: 23fd22e55b76 ("nvme: wire up fixed buffer support for nvme passthrough")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 24 14:56:06 2025 -0700
objtool, media: dib8000: Prevent divide-by-zero in dib8000_set_dds()
[ Upstream commit e63d465f59011dede0a0f1d21718b59a64c3ff5c ]
If dib8000_set_dds()'s call to dib8000_read32() returns zero, the result
is a divide-by-zero. Prevent that from happening.
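
As a rough illustration of the guard, assuming the divisor comes from a register read that may return zero. `scaled_div()` is a hypothetical helper, not a function from the driver.

```c
#include <stdint.h>

/* Returns 0 and stores the quotient, or -1 when the divisor is zero. */
static int scaled_div(uint32_t numerator, uint32_t reg_value, uint32_t *out)
{
    if (reg_value == 0)
        return -1;      /* bail out before dividing */
    *out = numerator / reg_value;
    return 0;
}
```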
Fixes the following warning with an UBSAN kernel:
drivers/media/dvb-frontends/dib8000.o: warning: objtool: dib8000_tune() falls through to next function dib8096p_cfg_DibRx()
Fixes: 173a64cb3fcf ("[media] dib8000: enhancement")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/bd1d504d930ae3f073b1e071bcf62cae7708773c.1742852847.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/r/202503210602.fvH5DO1i-lkp@intel.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 24 14:56:05 2025 -0700
objtool, nvmet: Fix out-of-bounds stack access in nvmet_ctrl_state_show()
[ Upstream commit 107a23185d990e3df6638d9a84c835f963fe30a6 ]
The csts_state_names[] array only has six sparse entries, but the
iteration code in nvmet_ctrl_state_show() iterates seven, resulting in a
potential out-of-bounds stack read. Fix that.
Fixes the following warning with an UBSAN kernel:
vmlinux.o: warning: objtool: .text.nvmet_ctrl_state_show: unexpected end of section
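
The off-by-one described above is typical for sparse lookup tables. A sketch of a bounded loop follows; the table contents and the names `state_names_model` and `count_named_bits()` are made up for illustration.

```c
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Sparse table indexed by bit position; gaps are NULL. Names are made up. */
static const char *const state_names_model[] = {
    [0] = "ready",
    [1] = "failed",
    [4] = "paused",
    [5] = "shutdown",
};

/* Count named bits set in 'state', bounding the loop by the table size. */
static int count_named_bits(unsigned int state)
{
    int n = 0;

    for (size_t i = 0; i < ARRAY_LEN(state_names_model); i++) {
        if ((state & (1u << i)) && state_names_model[i])
            n++;
    }
    return n;
}
```

Deriving the bound from the array itself means the loop cannot run past the last designated entry if the table later changes.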
Fixes: 649fd41420a8 ("nvmet: add debugfs support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/f1f60858ee7a941863dc7f5506c540cb9f97b5f6.1742852847.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/oe-kbuild-all/202503171547.LlCTJLQL-lkp@intel.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 31 21:26:43 2025 -0700
objtool/loongarch: Add unwind hints in prepare_frametrace()
[ Upstream commit 7c977393b8277ed319e92e4b598b26598c9d30c0 ]
If 'regs' points to a local stack variable, prepare_frametrace() stores
all registers to the stack. This confuses objtool as it expects them to
be restored from the stack later.
The stores don't affect stack tracing, so use unwind hints to hide them
from objtool.
Fixes the following warnings:
arch/loongarch/kernel/traps.o: warning: objtool: show_stack+0xe0: stack state mismatch: reg1[22]=-1+0 reg2[22]=-2-160
arch/loongarch/kernel/traps.o: warning: objtool: show_stack+0xe0: stack state mismatch: reg1[23]=-1+0 reg2[23]=-2-152
Fixes: cb8a2ef0848c ("LoongArch: Add ORC stack unwinder support")
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/270cadd8040dda74db2307f23497bb68e65db98d.1743481539.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/oe-kbuild-all/202503280703.OARM8SrY-lkp@intel.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Thu Mar 27 22:04:21 2025 -0700
objtool: Fix segfault in ignore_unreachable_insn()
[ Upstream commit 69d41d6dafff0967565b971d950bd10443e4076c ]
Check 'prev_insn' before dereferencing it.
Fixes: bd841d6154f5 ("objtool: Fix CONFIG_UBSAN_TRAP unreachable warnings")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/5df4ff89c9e4b9e788b77b0531234ffa7ba03e9e.1743136205.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/d86b4cc6-0b97-4095-8793-a7384410b8ab@app.fastmail.com
Closes: https://lore.kernel.org/Z-V_rruKY0-36pqA@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Laight <david.laight.linux@gmail.com>
Date: Mon Mar 31 21:26:42 2025 -0700
objtool: Fix verbose disassembly if CROSS_COMPILE isn't set
[ Upstream commit e77956e4e5c11218e60a1fe8cdbccd02476f2e56 ]
In verbose mode, when printing the disassembly of affected functions, if
CROSS_COMPILE isn't set, the objdump command string gets prefixed with
"(null)".
Somehow this worked before. Maybe some versions of glibc return an
empty string instead of NULL. Fix it regardless.
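
A small sketch of the NULL-safe formatting, assuming the prefix is whatever an environment lookup returned. `build_objdump_cmd()` is a hypothetical helper and the command string is simplified.

```c
#include <stdio.h>

/*
 * Write "<prefix>objdump" into buf. A NULL prefix, as returned for an
 * unset environment variable, is treated as the empty string.
 */
static int build_objdump_cmd(char *buf, size_t len, const char *prefix)
{
    return snprintf(buf, len, "%sobjdump", prefix ? prefix : "");
}
```

A caller would pass the result of getenv("CROSS_COMPILE") as the prefix; passing NULL straight to a "%s" conversion is undefined behavior, and some C libraries print "(null)" in that case.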
[ jpoimboe: Rewrite commit log. ]
Fixes: ca653464dd097 ("objtool: Add verbose option for disassembling affected functions")
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250215142321.14081-1-david.laight.linux@gmail.com
Link: https://lore.kernel.org/r/b931a4786bc0127aa4c94e8b35ed617dcbd3d3da.1743481539.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasiliy Kovalev <kovalev@altlinux.org>
Date: Fri Feb 14 11:49:08 2025 +0300
ocfs2: validate l_tree_depth to avoid out-of-bounds access
[ Upstream commit a406aff8c05115119127c962cbbbbd202e1973ef ]
The l_tree_depth field is 16-bit (__le16), but the actual maximum depth is
limited to OCFS2_MAX_PATH_DEPTH.
Add a check to prevent out-of-bounds access if l_tree_depth has an invalid
value, which may occur when reading from a corrupted mounted disk [1].
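
The check can be sketched as below. The limit value and the strictness of the comparison are assumptions for illustration; `MAX_PATH_DEPTH_MODEL`, `struct path_model`, `tree_depth_valid()` and `path_slot()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for illustration; the real one is OCFS2_MAX_PATH_DEPTH. */
#define MAX_PATH_DEPTH_MODEL 5

struct path_model {
    int items[MAX_PATH_DEPTH_MODEL];
};

/* A 16-bit on-disk depth can hold values far above the array bound. */
static bool tree_depth_valid(uint16_t depth)
{
    return depth < MAX_PATH_DEPTH_MODEL;
}

/* Return the slot for 'depth', or NULL when the depth is out of range. */
static int *path_slot(struct path_model *p, uint16_t depth)
{
    return tree_depth_valid(depth) ? &p->items[depth] : NULL;
}
```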
Link: https://lkml.kernel.org/r/20250214084908.736528-1-kovalev@altlinux.org
Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
Reported-by: syzbot+66c146268dc88f4341fd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=66c146268dc88f4341fd [1]
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Kurt Hackel <kurt.hackel@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Vasiliy Kovalev <kovalev@altlinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geetha sowjanya <gakula@marvell.com>
Date: Thu Mar 27 14:44:41 2025 +0530
octeontx2-af: Fix mbox INTR handler when num VFs > 64
[ Upstream commit 0fdba88a211508984eb5df62008c29688692b134 ]
When the number of RVU VFs is greater than 64, the vfs value passed to the
"rvu_queue_work" function is incorrect. As a result, mbox workqueue entries
for VFs 0 to 63 never get added to the workqueue.
Fixes: 9bdc47a6e328 ("octeontx2-af: Mbox communication support btw AF and it's VFs")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250327091441.1284-1-gakula@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geetha sowjanya <gakula@marvell.com>
Date: Thu Mar 27 15:10:54 2025 +0530
octeontx2-af: Free NIX_AF_INT_VEC_GEN irq
[ Upstream commit 323d6db6dc7decb06f2545efb9496259ddacd4f4 ]
Due to the incorrect initial vector number in
rvu_nix_unregister_interrupts(), the NIX_AF_INT_VEC_GEN irq is not
getting freed. Fix the vector number to include the NIX_AF_INT_VEC_GEN
irq.
Fixes: 5ed66306eab6 ("octeontx2-af: Add devlink health reporters for NIX")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250327094054.2312-1-gakula@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zijun Hu <quic_zijuhu@quicinc.com>
Date: Tue Feb 25 21:58:06 2025 +0800
of: property: Increase NR_FWNODE_REFERENCE_ARGS
[ Upstream commit eb50844d728f11e87491f7c7af15a4a737f1159d ]
Currently, the following two macros have different values:
// The maximal argument count for firmware node reference
#define NR_FWNODE_REFERENCE_ARGS 8
// The maximal argument count for DT node reference
#define MAX_PHANDLE_ARGS 16
Directly assigning a DT node reference's argument count to a firmware
node reference's can therefore push the latter out of range.
drivers/of/property.c:of_fwnode_get_reference_args() does this direct
assignment, so the firmware argument count @args->nargs can end up out
of range, namely in [9, 16].
Fix this by increasing NR_FWNODE_REFERENCE_ARGS to 16 to meet the DT
requirement. Both macros will be aligned later to avoid such inconsistency.
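
The mismatch can be modelled with two fixed-size argument structures. Everything below is a hypothetical user-space sketch: the `_MODEL` macros, both structs and `copy_reference_args()` are not kernel definitions, and the compile-time check is an extra safeguard that the commit message does not mention.

```c
#include <stddef.h>

#define DT_MAX_ARGS_MODEL     16  /* mirrors MAX_PHANDLE_ARGS above */
#define FWNODE_MAX_ARGS_MODEL 16  /* value after the fix; 8 before it */

/* Fails the build if the destination can hold fewer arguments. */
_Static_assert(FWNODE_MAX_ARGS_MODEL >= DT_MAX_ARGS_MODEL,
               "fwnode args array smaller than DT args array");

struct dt_args_model {
    unsigned int nargs;
    unsigned int args[DT_MAX_ARGS_MODEL];
};

struct fwnode_args_model {
    unsigned int nargs;
    unsigned int args[FWNODE_MAX_ARGS_MODEL];
};

/* Copy arguments, refusing counts the destination cannot hold. */
static int copy_reference_args(struct fwnode_args_model *dst,
                               const struct dt_args_model *src)
{
    if (src->nargs > FWNODE_MAX_ARGS_MODEL)
        return -1;
    dst->nargs = src->nargs;
    for (unsigned int i = 0; i < src->nargs; i++)
        dst->args[i] = src->args[i];
    return 0;
}
```

With the destination limit set back to 8, the static assertion would stop the build, which is one way to keep the two limits from drifting apart.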
Fixes: 3e3119d3088f ("device property: Introduce fwnode_property_get_reference_args")
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20250225-fix_arg_count-v4-1-13cdc519eb31@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tushar Dave <tdave@nvidia.com>
Date: Thu Feb 6 19:03:38 2025 -0800
PCI/ACS: Fix 'pci=config_acs=' parameter
[ Upstream commit 9cf8a952d57b422d3ff8a9a0163f8adf694f4b2b ]
Commit 47c8846a49ba ("PCI: Extend ACS configurability") introduced bugs
that fail to configure ACS ctrl to the value specified by the kernel
parameter. Essentially there are two bugs:
1) When ACS is configured for multiple PCI devices using the 'config_acs'
kernel parameter, it results in the error "PCI: Can't parse ACS command
line parameter". This is due to a bug that doesn't preserve the ACS
mask, but instead overwrites the mask with value 0.
For example, using 'config_acs' to configure ACS ctrl for multiple BDFs
fails:
Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
PCI: Can't parse ACS command line parameter
pci 0020:02:00.0: ACS mask = 0x007f
pci 0020:02:00.0: ACS flags = 0x007b
pci 0020:02:00.0: Configured ACS to 0x007b
After this fix:
Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
pci 0020:02:00.0: ACS mask = 0x007f
pci 0020:02:00.0: ACS flags = 0x007b
pci 0020:02:00.0: ACS control = 0x005f
pci 0020:02:00.0: ACS fw_ctrl = 0x0053
pci 0020:02:00.0: Configured ACS to 0x007b
pci 0039:00:00.0: ACS mask = 0x0070
pci 0039:00:00.0: ACS flags = 0x0050
pci 0039:00:00.0: ACS control = 0x001d
pci 0039:00:00.0: ACS fw_ctrl = 0x0000
pci 0039:00:00.0: Configured ACS to 0x0050
2) In the bit manipulation logic, we copy the bit from the firmware
settings when the mask bit is 0.
For example, 'disable_acs_redir' fails to clear all three ACS P2P redir
bits due to the wrong bit fiddling:
Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
pci 0020:02:00.0: ACS mask = 0x002c
pci 0020:02:00.0: ACS flags = 0xffd3
pci 0020:02:00.0: Configured ACS to 0xfffb
pci 0030:02:00.0: ACS mask = 0x002c
pci 0030:02:00.0: ACS flags = 0xffd3
pci 0030:02:00.0: Configured ACS to 0xffdf
pci 0039:00:00.0: ACS mask = 0x002c
pci 0039:00:00.0: ACS flags = 0xffd3
pci 0039:00:00.0: Configured ACS to 0xffd3
After this fix:
Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
pci 0020:02:00.0: ACS mask = 0x002c
pci 0020:02:00.0: ACS flags = 0xffd3
pci 0020:02:00.0: ACS control = 0x007f
pci 0020:02:00.0: ACS fw_ctrl = 0x007b
pci 0020:02:00.0: Configured ACS to 0x0053
pci 0030:02:00.0: ACS mask = 0x002c
pci 0030:02:00.0: ACS flags = 0xffd3
pci 0030:02:00.0: ACS control = 0x005f
pci 0030:02:00.0: ACS fw_ctrl = 0x005f
pci 0030:02:00.0: Configured ACS to 0x0053
pci 0039:00:00.0: ACS mask = 0x002c
pci 0039:00:00.0: ACS flags = 0xffd3
pci 0039:00:00.0: ACS control = 0x001d
pci 0039:00:00.0: ACS fw_ctrl = 0x0000
pci 0039:00:00.0: Configured ACS to 0x0000
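The "After this fix" values logged above are consistent with a merge in which bits selected by the mask come from the flags and all other bits keep the firmware setting. The sketch below is an assumption inferred from those logged values, not a quote of the kernel code, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Merge the requested ACS bits into the firmware value: bits selected by
 * mask are taken from flags, all other bits keep the firmware setting.
 */
static uint16_t acs_merge_sketch(uint16_t fw_ctrl, uint16_t mask, uint16_t flags)
{
	return (uint16_t)((fw_ctrl & ~mask) | (flags & mask));
}
```

For example, fw_ctrl 0x007b with mask 0x002c and flags 0xffd3 yields 0x0053, which matches the value logged for 0020:02:00.0 in the 'disable_acs_redir' case.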
Link: https://lore.kernel.org/r/20250207030338.456887-1-tdave@nvidia.com
Fixes: 47c8846a49ba ("PCI: Extend ACS configurability")
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Stodden <daniel.stodden@gmail.com>
Date: Sun Dec 22 19:39:08 2024 -0800
PCI/ASPM: Fix link state exit during switch upstream function removal
[ Upstream commit cbf937dcadfd571a434f8074d057b32cd14fbea5 ]
Before 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to
avoid use-after-free"), we would free the ASPM link only after the last
function on the bus pertaining to the given link was removed.
That was too late. If function 0 was removed before a sibling function,
link->downstream would point to freed memory afterwards.
After the above change, we freed the ASPM parent link state upon any function
removal on the bus pertaining to a given link.
That is too early. If the link is to a PCIe switch with MFD on the upstream
port, then removing functions other than 0 first would free a link which
still remains parent_link to the remaining downstream ports.
The resulting GPFs are especially frequent during hot-unplug, because
pciehp removes devices on the link bus in reverse order.
On that switch, function 0 is the virtual P2P bridge to the internal bus.
Free exactly when function 0 is removed -- before the parent link is
obsolete, but after all subordinate links are gone.
Link: https://lore.kernel.org/r/e12898835f25234561c9d7de4435590d957b85d9.1734924854.git.dns@arista.com
Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free")
Signed-off-by: Daniel Stodden <dns@arista.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Feng Tang <feng.tang@linux.alibaba.com>
Date: Mon Mar 3 10:36:30 2025 +0800
PCI/portdrv: Only disable pciehp interrupts early when needed
[ Upstream commit 9d7db4db19827380e225914618c0c1bf435ed2f5 ]
Firmware developers reported that Linux issues two PCIe hotplug commands in
very short intervals on an ARM server, which doesn't comply with the PCIe
spec. According to PCIe r6.1, sec 6.7.3.2, if the Command Completed event
is supported, software must wait for a command to complete or wait at
least 1 second before sending a new command.
In the failure case, the first PCIe hotplug command is from
get_port_device_capability(), which sends a command to disable PCIe hotplug
interrupts without waiting for its completion, and the second command comes
from pcie_enable_notification() of pciehp driver, which enables hotplug
interrupts again.
Fix this by only disabling the hotplug interrupts when the pciehp driver is
not enabled.
Link: https://lore.kernel.org/r/20250303023630.78397-1-feng.tang@linux.alibaba.com
Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nishanth Aravamudan <naravamudan@nvidia.com>
Date: Fri Feb 7 14:56:00 2025 -0600
PCI: Avoid reset when disabled via sysfs
[ Upstream commit 479380efe1625e251008d24b2810283db60d6fcd ]
After d88f521da3ef ("PCI: Allow userspace to query and set device reset
mechanism"), userspace can disable reset of specific PCI devices by writing
an empty string to the sysfs reset_method file.
However, pci_slot_resettable() does not check pci_reset_supported(), which
means that pci_reset_function() will still reset the device even if
userspace has disabled all the reset methods.
I was able to reproduce this issue with a vfio device passed to a qemu
guest, where I had disabled PCI reset via sysfs.
Add an explicit check of pci_reset_supported() in both
pci_slot_resettable() and pci_bus_resettable() to ensure both the reset
status and reset execution are bypassed if an administrator disables it for
a device.
Link: https://lore.kernel.org/r/20250207205600.1846178-1-naravamudan@nvidia.com
Fixes: d88f521da3ef ("PCI: Allow userspace to query and set device reset mechanism")
Signed-off-by: Nishanth Aravamudan <naravamudan@nvidia.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: Amey Narkhede <ameynarkhede03@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Yishai Hadas <yishaih@nvidia.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jim Quinlan <james.quinlan@broadcom.com>
Date: Fri Feb 14 12:39:32 2025 -0500
PCI: brcmstb: Fix error path after a call to regulator_bulk_get()
[ Upstream commit 3651ad5249c51cf7eee078e12612557040a6bdb4 ]
If regulator_bulk_get() returns an error and no regulators
are created, we need to set their number to zero.
If we don't do this and the PCIe link up fails, a call to
regulator_bulk_free() will result in a kernel panic.
While at it, print the error value, as we cannot return an error
upwards as the kernel will WARN() on an error from add_bus().
Fixes: 9e6be018b263 ("PCI: brcmstb: Enable child bus device regulators from DT")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250214173944.47506-5-james.quinlan@broadcom.com
[kwilczynski: commit log, use comma in the message to match style with
other similar messages]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jim Quinlan <james.quinlan@broadcom.com>
Date: Fri Feb 14 12:39:33 2025 -0500
PCI: brcmstb: Fix potential premature regulator disabling
[ Upstream commit b7de1b60ecab2f7b6f05d8116e93228a0bbb8563 ]
The platform supports enabling and disabling regulators only on
ports below the Root Complex.
Thus, we need to verify this both when adding and removing the bus,
otherwise regulators may be disabled prematurely when a bus further
down the topology is removed.
Fixes: 9e6be018b263 ("PCI: brcmstb: Enable child bus device regulators from DT")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-6-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jim Quinlan <james.quinlan@broadcom.com>
Date: Fri Feb 14 12:39:29 2025 -0500
PCI: brcmstb: Set generation limit before PCIe link up
[ Upstream commit 72d36589c6b7bef6b30eb99fcb7082f72faca37f ]
When the user elects to limit the PCIe generation via the appropriate
devicetree property, apply the settings before the PCIe link up, not
after.
Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-2-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jim Quinlan <james.quinlan@broadcom.com>
Date: Fri Feb 14 12:39:30 2025 -0500
PCI: brcmstb: Use internal register to change link capability
[ Upstream commit 0c97321e11e0e9e18546f828492758f6aaecec59 ]
The driver has been mistakenly writing to a read-only (RO)
configuration space register (PCI_EXP_LNKCAP) to change the
PCIe link capability.
Although harmless in this case, the proper write destination
is an internal register that is reflected by PCI_EXP_LNKCAP.
Thus, fix the brcm_pcie_set_gen() function to correctly update
the link capability.
Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-3-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans Zhang <18255117159@163.com>
Date: Sat Feb 15 00:57:24 2025 +0800
PCI: cadence-ep: Fix the driver to send MSG TLP for INTx without data payload
[ Upstream commit 3ac47fbf4f6e8c3a7c3855fac68cc3246f90f850 ]
Per Cadence's "PCIe Controller IP for AX14" user guide, Version
1.04, Section 9.1.7.1, "AXI Subordinate to PCIe Address Translation
Registers", Table 9.4, bit 16 of the AXI Subordinate Address
(axi_s_awaddr), when set, corresponds to MSG with data and, when not set,
to MSG without data.
However, the driver is currently doing the opposite and due to this,
the INTx is never received on the host.
So, fix the driver to reflect the documentation and also make INTx work.
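As an illustration of the documented bit meaning, here is a minimal sketch. The macro and function names are hypothetical and are not the driver's identifiers; only the role of bit 16 is taken from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 16 of the AXI subordinate address. */
#define AXI_MSG_WITH_DATA_SKETCH (UINT32_C(1) << 16)

/*
 * Build the address offset for a message TLP. Per the text above, bit 16
 * set selects MSG with data and bit 16 clear selects MSG without data.
 */
static uint32_t msg_offset_sketch(uint32_t base_offset, bool with_data)
{
	if (with_data)
		return base_offset | AXI_MSG_WITH_DATA_SKETCH;
	return base_offset & ~AXI_MSG_WITH_DATA_SKETCH;
}
```

An INTx message carries no data payload, so under this reading it would be built with with_data set to false.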
Fixes: 37dddf14f1ae ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller")
Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Hans Zhang <hans.zhang@cixtech.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214165724.184599-1-18255117159@163.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Wed Mar 5 18:00:07 2025 +0300
PCI: dwc: ep: Return -ENOMEM for allocation failures
[ Upstream commit 8189aa56dbed0bfb46b7b30d4d231f57ab17b3f4 ]
If the bitmap or memory allocations fail, then dw_pcie_ep_init_registers()
will incorrectly return success.
Return -ENOMEM instead.
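The general pattern behind this kind of bug can be sketched as follows. This is a userspace illustration with hypothetical names, not the driver code; the simulate_alloc_failure parameter exists only to make the failure path reproducible.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * ret still holds 0 from an earlier successful step, so the error path
 * must set it explicitly before jumping to the cleanup label.
 */
static int init_registers_sketch(bool simulate_alloc_failure)
{
	int ret = 0; /* an earlier step succeeded */
	unsigned long *bitmap;

	bitmap = simulate_alloc_failure ? NULL : calloc(4, sizeof(*bitmap));
	if (!bitmap) {
		ret = -ENOMEM; /* without this, the caller would see success */
		goto err;
	}

	free(bitmap);
	return 0;

err:
	return ret;
}
```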
Fixes: 869bc5253406 ("PCI: dwc: ep: Fix DBI access failure for drivers requiring refclk from host")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/36dcb6fc-f292-4dd5-bd45-a8c6f9dc3df7@stanley.mountain
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Thu Mar 20 16:28:37 2025 +0200
PCI: Fix BAR resizing when VF BARs are assigned
[ Upstream commit 9ec19bfa78bd788945e2445b09de7b4482dee432 ]
__resource_resize_store() attempts to release all resources of the device
before attempting the resize. The loop, however, only covers standard BARs
(< PCI_STD_NUM_BARS). If a device has VF BARs that are assigned,
pci_reassign_bridge_resources() finds the bridge window still has some
assigned child resources and returns -ENOENT, which makes
pci_resize_resource() detect an error and abort the resize.
Change the release loop to cover all resources up to VF BARs, which allows
the resize operation to release the bridge windows and attempt to assign
them again with a different size.
If SR-IOV is enabled, disallow resize as it requires also releasing IOV
resources.
Link: https://lore.kernel.org/r/20250320142837.8027-1-ilpo.jarvinen@linux.intel.com
Fixes: 91fa127794ac ("PCI: Expose PCIe Resizable BAR support via sysfs")
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Sat Mar 1 19:42:54 2025 +0100
PCI: histb: Fix an error handling path in histb_pcie_probe()
[ Upstream commit b36fb50701619efca5f5450b355d42575cf532ed ]
If an error occurs after a successful phy_init() call, then phy_exit()
should be called.
Add the missing call, as already done in the remove function.
Fixes: bbd11bddb398 ("PCI: hisi: Add HiSilicon STB SoC PCIe controller driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[kwilczynski: remove unnecessary hipcie->phy NULL check from
histb_pcie_probe() and squash a patch that removes similar NULL
check for hipcie->phy from histb_pcie_remove() from
https://lore.kernel.org/linux-pci/c369b5d25e17a44984ae5a889ccc28a59a0737f7.1742058005.git.christophe.jaillet@wanadoo.fr]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/8301fc15cdea5d2dac21f57613e8e6922fb1ad95.1740854531.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Fri Mar 21 18:21:14 2025 +0200
PCI: pciehp: Don't enable HPIE when resuming in poll mode
[ Upstream commit 527664f738afb6f2c58022cd35e63801e5dc7aec ]
PCIe hotplug can operate in poll mode without interrupt handlers using a
polling kthread only. eb34da60edee ("PCI: pciehp: Disable hotplug
interrupt during suspend") failed to consider that and enables HPIE
(Hot-Plug Interrupt Enable) unconditionally when resuming the Port.
Only set HPIE if non-poll mode is in use. This makes
pcie_enable_interrupt() match how pcie_enable_notification() already
handles HPIE.
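A minimal sketch of the intended decision follows. The function and macro names are hypothetical; the value 0x0020 assumes the standard position of the Hot-Plug Interrupt Enable bit (bit 5) in the PCIe Slot Control register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hot-Plug Interrupt Enable, bit 5 of Slot Control (assumed value). */
#define SLTCTL_HPIE_SKETCH 0x0020

/* Bits to set in Slot Control on resume: HPIE only when not polling. */
static uint16_t resume_enable_bits_sketch(bool poll_mode)
{
	return poll_mode ? 0 : SLTCTL_HPIE_SKETCH;
}
```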
Link: https://lore.kernel.org/r/20250321162114.3939-1-ilpo.jarvinen@linux.intel.com
Fixes: eb34da60edee ("PCI: pciehp: Disable hotplug interrupt during suspend")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Mon Dec 16 19:56:08 2024 +0200
PCI: Remove add_align overwrite unrelated to size0
[ Upstream commit d06cc1e3809040e8250f69a4c656e3717e6b963c ]
Commit 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
relaxed the bridge window tail alignment rule for the non-optional part
(size0, no add_size/add_align). The change, however, also overwrote
add_align, which is only relevant when an optional size1-related
entry is added to the realloc head.
Correct this by removing the add_align overwrite.
Link: https://lore.kernel.org/r/20241216175632.4175-2-ilpo.jarvinen@linux.intel.com
Fixes: 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Fri Mar 7 11:46:34 2025 +0300
PCI: Remove stray put_device() in pci_register_host_bridge()
[ Upstream commit 6e8d06e5096c80cbf41313b4a204f43071ca42be ]
This put_device() was accidentally left over from when we changed the code
from using device_register() to calling device_add(). Delete it.
Link: https://lore.kernel.org/r/55b24870-89fb-4c91-b85d-744e35db53c2@stanley.mountain
Fixes: 9885440b16b8 ("PCI: Fix pci_host_bridge struct device release/free handling")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kai-Heng Feng <kaihengf@nvidia.com>
Date: Wed Dec 4 10:24:57 2024 +0800
PCI: Use downstream bridges for distributing resources
[ Upstream commit 1a596ad00ffe9b37fc60a93cbdd4daead3bf95f3 ]
7180c1d08639 ("PCI: Distribute available resources for root buses, too")
breaks BAR assignment on some devices:
pci 0006:03:00.0: BAR 0 [mem 0x6300c0000000-0x6300c1ffffff 64bit pref]: assigned
pci 0006:03:00.1: BAR 0 [mem 0x6300c2000000-0x6300c3ffffff 64bit pref]: assigned
pci 0006:03:00.2: BAR 0 [mem size 0x00800000 64bit pref]: can't assign; no space
pci 0006:03:00.0: VF BAR 0 [mem size 0x02000000 64bit pref]: can't assign; no space
pci 0006:03:00.1: VF BAR 0 [mem size 0x02000000 64bit pref]: can't assign; no space
The apertures of domain 0006 before 7180c1d08639:
6300c0000000-63ffffffffff : PCI Bus 0006:00
  6300c0000000-6300c9ffffff : PCI Bus 0006:01
    6300c0000000-6300c9ffffff : PCI Bus 0006:02 # 160MB
      6300c0000000-6300c8ffffff : PCI Bus 0006:03 # 144MB
        6300c0000000-6300c1ffffff : 0006:03:00.0 # 32MB
        6300c2000000-6300c3ffffff : 0006:03:00.1 # 32MB
        6300c4000000-6300c47fffff : 0006:03:00.2 # 8MB
        6300c4800000-6300c67fffff : 0006:03:00.0 # 32MB
        6300c6800000-6300c87fffff : 0006:03:00.1 # 32MB
      6300c9000000-6300c9bfffff : PCI Bus 0006:04 # 12MB
        6300c9000000-6300c9bfffff : PCI Bus 0006:05 # 12MB
          6300c9000000-6300c91fffff : PCI Bus 0006:06 # 2MB
          6300c9200000-6300c93fffff : PCI Bus 0006:07 # 2MB
          6300c9400000-6300c95fffff : PCI Bus 0006:08 # 2MB
          6300c9600000-6300c97fffff : PCI Bus 0006:09 # 2MB
After 7180c1d08639:
6300c0000000-63ffffffffff : PCI Bus 0006:00
  6300c0000000-6300c9ffffff : PCI Bus 0006:01
    6300c0000000-6300c9ffffff : PCI Bus 0006:02 # 160MB
      6300c0000000-6300c43fffff : PCI Bus 0006:03 # 68MB
        6300c0000000-6300c1ffffff : 0006:03:00.0 # 32MB
        6300c2000000-6300c3ffffff : 0006:03:00.1 # 32MB
        --- no space --- : 0006:03:00.2 # 8MB
        --- no space --- : 0006:03:00.0 # 32MB
        --- no space --- : 0006:03:00.1 # 32MB
      6300c4400000-6300c4dfffff : PCI Bus 0006:04 # 10MB
        6300c4400000-6300c4dfffff : PCI Bus 0006:05 # 10MB
          6300c4400000-6300c45fffff : PCI Bus 0006:06 # 2MB
          6300c4600000-6300c47fffff : PCI Bus 0006:07 # 2MB
          6300c4800000-6300c49fffff : PCI Bus 0006:08 # 2MB
          6300c4a00000-6300c4bfffff : PCI Bus 0006:09 # 2MB
We can see that the window to 0006:03 is shrunk too much and 0006:04
eats away the window for 0006:03:00.2.
The offending commit distributes the upstream bridge's resources multiple
times to every downstream bridge, which makes the aperture smaller than
desired because the calculation of io_per_b, mmio_per_b and mmio_pref_per_b
becomes incorrect.
Instead, distribute downstream bridges' own resources to resolve the issue.
Link: https://lore.kernel.org/r/20241204022457.51322-1-kaihengf@nvidia.com
Fixes: 7180c1d08639 ("PCI: Distribute available resources for root buses, too")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219540
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Carol Soto <csoto@nvidia.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
Date: Mon Feb 24 21:20:22 2025 +0530
PCI: xilinx-cpm: Fix IRQ domain leak in error path of probe
[ Upstream commit 57b0302240741e73fe51f88404b3866e0d2933ad ]
The IRQ domain allocated for the PCIe controller is not freed if
resource_list_first_type() returns NULL, leading to a resource leak.
Ensure the allocated IRQ domain is properly cleaned up in
the error path.
Fixes: 49e427e6bdd1 ("Merge branch 'pci/host-probe-refactor'")
Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
[kwilczynski: added missing Fixes: tag, refactored to use one of the goto labels]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250224155025.782179-2-thippeswamy.havalige@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leo Yan <leo.yan@arm.com>
Date: Tue Mar 4 11:12:34 2025 +0000
perf arm-spe: Fix load-store operation checking
[ Upstream commit e1d47850bbf79a541c9b3bacdd562f5e0112274d ]
The ARM_SPE_OP_LD and ARM_SPE_OP_ST operations are second-level operation
types; they overlap with other second-level operation types
belonging to SVE and branch operations. As a result, a non load-store
operation can be parsed for data source and memory sample.
To fix the issue, this commit introduces an is_ldst_op() macro for
checking LDST operations, and applies the check when synthesizing data
source and memory samples.
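The overlap can be illustrated with a small sketch. The bit layout and every name below are hypothetical and chosen only to show the idea; they are not the values used by perf.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical first-level operation classes in the low bits. */
#define OP_OTHER_SKETCH  (1u << 0)
#define OP_LDST_SKETCH   (1u << 1)
#define OP_BRANCH_SKETCH (1u << 2)

/* Hypothetical second-level types: bit 8 is reused across classes. */
#define OP_LD_SKETCH          (1u << 8) /* meaningful only with OP_LDST_SKETCH */
#define OP_BRANCH_COND_SKETCH (1u << 8) /* meaningful only with OP_BRANCH_SKETCH */

#define is_ldst_op_sketch(op) (((op) & OP_LDST_SKETCH) != 0)

/* Treat a record as a load only if it is a load-store operation first. */
static bool is_load_sketch(uint32_t op)
{
	return is_ldst_op_sketch(op) && (op & OP_LD_SKETCH) != 0;
}
```

A conditional branch in this layout has bit 8 set, so testing the second-level bit alone would misclassify it as a load; checking the first-level class first avoids that.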
Fixes: a89dbc9b988f ("perf arm-spe: Set sample's data source field")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250304111240.3378214-7-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Richter <tmricht@linux.ibm.com>
Date: Tue Mar 4 10:23:49 2025 +0100
perf bench: Fix perf bench syscall loop count
[ Upstream commit 957d194163bf983da98bf7ec7e4f86caff8cd0eb ]
Command 'perf bench syscall fork -l 100000' offers option -l to run for
a specified number of iterations. However this option is not always
observed. The number is silently limited to 10000 iterations as can be
seen:
Output before:
# perf bench syscall fork -l 100000
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
Total time: 23.388 [sec]
2338.809800 usecs/op
427 ops/sec
#
When the loop count is given explicitly with option -l or --loops, also
honor a higher number of iterations:
Output after:
# perf bench syscall fork -l 100000
# Running 'syscall/fork' benchmark:
# Executed 100,000 fork() calls
Total time: 716.982 [sec]
7169.829510 usecs/op
139 ops/sec
#
This patch fixes the issue for the basic, execve, fork and getpgid benchmarks.
Fixes: ece7f7c0507c ("perf bench syscall: Add fork syscall benchmark")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/20250304092349.2618082-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namhyung Kim <namhyung@kernel.org>
Date: Fri Mar 7 14:09:21 2025 -0800
perf bpf-filter: Fix a parsing error with comma
[ Upstream commit 35d13f841a3d8159ef20d5e32a9ed3faa27875bc ]
The previous change to support cgroup filters introduced a bug by
allowing pathnames to include commas. This confused the lexer into
treating an item and its trailing comma as a single token, which
resulted in a parse error:
$ sudo perf record -e cycles:P --filter 'period > 0, ip > 64' -- true
perf_bpf_filter: Error: Unexpected item: 0,
perf_bpf_filter: syntax error, unexpected BFT_ERROR, expecting BFT_NUM
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--filter <filter>
event filter
It should get "0" and "," separately.
The easiest fix would be to remove "," from the possible pathname
characters. As it's for cgroup names, it's probably ok to assume there
won't be commas in the pathname.
I found that the existing BPF filtering test didn't have any complex
filter condition with commas. Let's update the group filter test which
is supposed to test filter combinations like this.
Link: https://lore.kernel.org/r/20250307220922.434319-1-namhyung@kernel.org
Fixes: 91e88437d5156b20 ("perf bpf-filter: Support filtering on cgroups")
Reported-by: Sally Shi <sshii@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Ceresoli <luca.ceresoli@bootlin.com>
Date: Fri Jan 24 14:06:08 2025 +0100
perf build: Fix in-tree build due to symbolic link
[ Upstream commit 75100d848ef4b8ca39bb6dd3a21181e37dea27e2 ]
Building perf in-tree is broken after commit 890a1961c812 ("perf tools:
Create source symlink in perf object dir") which added a 'source' symlink
in the output dir pointing to the source dir.
With in-tree builds, the added 'SOURCE = ...' line is executed multiple
times (I observed 2 during the build plus 2 during installation). This is a
minor inefficiency, in theory not harmful because symlink creation is
assumed to be idempotent. But it is not.
Considering with in-tree builds:
srctree=/absolute/path/to/linux
OUTPUT=/absolute/path/to/linux/tools/perf
here's what happens:
1. ln -sf $(srctree)/tools/perf $(OUTPUT)/source
-> creates /absolute/path/to/linux/tools/perf/source
link to /absolute/path/to/linux/tools/perf
=> OK, that's what was intended
2. ln -sf $(srctree)/tools/perf $(OUTPUT)/source # same command as 1
-> creates /absolute/path/to/linux/tools/perf/perf
link to /absolute/path/to/linux/tools/perf
=> Not what was intended, not idempotent
3. Now the build _should_ create the 'perf' executable, but it fails
The reason is the tricky 'ln' command line. At the first invocation 'ln'
uses the 1st form:
ln [OPTION]... [-T] TARGET LINK_NAME
and creates a link to TARGET *called LINK_NAME*.
At the second invocation $(OUTPUT)/source exists, so 'ln' uses the 3rd
form:
ln [OPTION]... TARGET... DIRECTORY
and creates a link to TARGET *called TARGET* inside DIRECTORY.
Fix by adding -n/--no-dereference to "treat LINK_NAME as a normal file
if it is a symbolic link to a directory", as the manpage says.
Closes: https://lore.kernel.org/all/20241125182506.38af9907@booty/
Fixes: 890a1961c812 ("perf tools: Create source symlink in perf object dir")
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20250124-perf-fix-intree-build-v1-1-485dd7a855e4@bootlin.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Fri Feb 28 14:22:58 2025 -0800
perf debug: Avoid stack overflow in recursive error message
[ Upstream commit bda840191d2aae3b7cadc3ac21835dcf29487191 ]
In debug_file, pr_warning_once is called on error. As that function
itself calls debug_file, the recursion ends in a stack overflow. Switch
the location of the call so the recursion is avoided.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-2-irogers@google.com
Fixes: ec49230cf6dda704 ("perf debug: Expose debug file")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Brennan <stephen.s.brennan@oracle.com>
Date: Tue Mar 18 16:00:11 2025 -0700
perf dso: fix dso__is_kallsyms() check
[ Upstream commit ebf0b332732dcc64239119e554faa946562b0b93 ]
Kernel modules for which we cannot find a file on-disk will have a
dso->long_name that looks like "[module_name]". Prior to the commit
listed in the fixes, the dso->kernel field would be zero (for user
space), so dso__is_kallsyms() would return false. After the commit,
kernel module DSOs are correctly labeled, but the result is that
dso__is_kallsyms() erroneously returns true for those modules without a
filesystem path.
Later, build_id_cache__add() consults this value of is_kallsyms, and
when true, it copies /proc/kallsyms into the cache. Users with many
kernel modules without a filesystem path (e.g. ksplice or possibly
kernel live patch modules) have reported excessive disk space usage in
the build ID cache directory due to this behavior.
To reproduce the issue, it's enough to build a trivial out-of-tree hello
world kernel module, load it using insmod, and then use:
perf record -ag -- sleep 1
In the build ID directory, there will be a directory for your module
name containing a kallsyms file.
Fix this up by changing dso__is_kallsyms() to consult the
dso_binary_type enumeration, which is also symmetric to the above checks
for dso__is_vmlinux() and dso__is_kcore(). With this change, kallsyms is
not cached in the build-id cache for out-of-tree modules.
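The difference between a name-based heuristic and a type-based check can be sketched as below. All types and names are hypothetical stand-ins, and the name-based condition is an assumption about what such a heuristic might look like, not a quote of the perf source.

```c
#include <stdbool.h>

/* Hypothetical subset of a binary-type enumeration. */
enum binary_type_sketch {
	BT_KALLSYMS_SKETCH,
	BT_GUEST_KALLSYMS_SKETCH,
	BT_SYSTEM_PATH_KMODULE_SKETCH,
	BT_NOT_FOUND_SKETCH,
};

/* Hypothetical stand-in for the fields of a DSO that matter here. */
struct dso_sketch {
	bool kernel;
	const char *long_name;
	enum binary_type_sketch binary_type;
};

/* Name-based heuristic: any kernel DSO whose name is not an absolute path. */
static bool is_kallsyms_by_name(const struct dso_sketch *dso)
{
	return dso->kernel && dso->long_name[0] != '/';
}

/* Type-based check: only DSOs whose symbols actually come from kallsyms. */
static bool is_kallsyms_by_type(const struct dso_sketch *dso)
{
	return dso->binary_type == BT_KALLSYMS_SKETCH ||
	       dso->binary_type == BT_GUEST_KALLSYMS_SKETCH;
}
```

A module named "[hello]" with no file on disk passes the name-based heuristic once it is marked as a kernel DSO, but it does not pass the type-based check.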
Fixes: 02213cec64bbe ("perf maps: Mark module DSOs with kernel type")
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250318230012.2038790-1-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Fri Feb 28 14:22:59 2025 -0800
perf evlist: Add success path to evlist__create_syswide_maps
[ Upstream commit fe0ce8a9d85a48642880c9b78944cb0d23e779c5 ]
Over various refactorings evlist__create_syswide_maps has been made to
only ever return with -ENOMEM. Fix this so that when
perf_evlist__set_maps is successfully called, 0 is returned.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-3-irogers@google.com
Fixes: 8c0498b6891d7ca5 ("perf evlist: Fix create_syswide_maps() not propagating maps")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: James Clark <james.clark@linaro.org>
Date: Wed Feb 26 10:41:01 2025 +0000
perf pmu: Don't double count common sysfs and json events
[ Upstream commit c9d699e10fa6c0cdabcddcf991e7ff42af6b2503 ]
After pmu_add_cpu_aliases() is called, perf_pmu__num_events() returns an
incorrect value that double counts common events and doesn't match the
actual count of events in the alias list. This is because after
'cpu_aliases_added == true', the number of events returned is
'sysfs_aliases + cpu_json_aliases'. But when adding 'case
EVENT_SRC_SYSFS' events, 'sysfs_aliases' and 'cpu_json_aliases' are both
incremented together, failing to account for the fact that these overlap and
only add a single item to the list. Fix it by adding another counter for
overlapping events which doesn't influence 'cpu_json_aliases'.
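The counting scheme can be sketched as follows. The struct and the
counter name cpu_common_json_aliases are illustrative assumptions for
this sketch, not a statement of the exact fields in the perf sources:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified counters for one PMU. */
struct demo_pmu {
	size_t sysfs_aliases;           /* events found in sysfs */
	size_t cpu_json_aliases;        /* events described only in JSON */
	size_t cpu_common_json_aliases; /* sysfs events that JSON also describes */
};

/* A sysfs event that also has a JSON description is still one list entry. */
static void demo_add_sysfs_event(struct demo_pmu *pmu, bool also_in_json)
{
	pmu->sysfs_aliases++;
	if (also_in_json)
		pmu->cpu_common_json_aliases++;	/* deliberately not cpu_json_aliases */
}

static void demo_add_json_only_event(struct demo_pmu *pmu)
{
	pmu->cpu_json_aliases++;
}

/* Number of entries actually present in the alias list. */
static size_t demo_num_events(const struct demo_pmu *pmu)
{
	return pmu->sysfs_aliases + pmu->cpu_json_aliases;
}
```

With two sysfs events (one of them also in JSON) and one JSON-only
event, the list holds three entries, and the sum above reports three
rather than four.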
There doesn't seem to be a current issue because it's used in perf list
before pmu_add_cpu_aliases() so the correct value is returned. Other
uses in tests may also miss it for other reasons like only looking at
uncore events. However it's marked as a fixes commit in case any new fix
with new uses of perf_pmu__num_events() is backported.
Fixes: d9c5f5f94c2d ("perf pmu: Count sys and cpuid JSON events separately")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250226104111.564443-3-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Wed Mar 12 17:31:41 2025 -0300
perf python: Check if there is space to copy all the event
[ Upstream commit 89aaeaf84231157288035b366cb6300c1c6cac64 ]
The pyrf_event__new() method copies the event obtained from the perf
ring buffer to a structure that will then be turned into a python object
for further consumption, so it copies perf_event.header.size bytes to
its 'event' member:
$ pahole -C pyrf_event /tmp/build/perf-tools-next/python/perf.cpython-312-x86_64-linux-gnu.so
struct pyrf_event {
PyObject ob_base; /* 0 16 */
struct evsel * evsel; /* 16 8 */
struct perf_sample sample; /* 24 312 */
/* XXX last struct has 7 bytes of padding, 2 holes */
/* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
union perf_event event; /* 336 4168 */
/* size: 4504, cachelines: 71, members: 4 */
/* member types with holes: 1, total: 2 */
/* paddings: 1, sum paddings: 7 */
/* last cacheline: 24 bytes */
};
$
It was doing so without checking whether the event just obtained is
larger than that space. Fix it.
This isn't a proper, final solution, as we need to support larger
events, but for the time being we at least bounds check and document it.
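A minimal sketch of such a bounds check, with hypothetical demo_* types
standing in for the perf ones and a deliberately small buffer. The
-E2BIG return value is an arbitrary choice for this sketch:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, shrunken stand-ins for perf_event_header/union perf_event. */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total size of the record, header included */
};

union demo_event {
	struct demo_header header;
	unsigned char raw[64];
};

struct demo_pyrf_event {
	union demo_event event;
};

/* Copy header.size bytes only if they fit in the destination member. */
static int demo_copy_event(struct demo_pyrf_event *dst,
			   const union demo_event *src)
{
	if (src->header.size > sizeof(dst->event))
		return -E2BIG;	/* record larger than the space we have */

	memcpy(&dst->event, src, src->header.size);
	return 0;
}
```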
Fixes: 877108e42b1b9ba6 ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-7-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Wed Mar 12 17:31:39 2025 -0300
perf python: Decrement the refcount of just created event on failure
[ Upstream commit 3de5a2bf5b4847f7a59a184568f969f8fe05d57f ]
To avoid a leak if we have the python object but then something happens
and we need to abort the operation, decrement the reference count of the
newly created object.
Fixes: 377f698db12150a1 ("perf python: Add struct evsel into struct pyrf_event")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-5-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Wed Mar 12 17:31:40 2025 -0300
perf python: Don't keep a raw_data pointer to consumed ring buffer space
[ Upstream commit f3fed3ae34d606819d87a63d970cc3092a5be7ab ]
When processing tracepoints the perf python binding was parsing the
event before calling perf_mmap__consume(&md->core) in
pyrf_evlist__read_on_cpu().
But part of this event parsing was to set the perf_sample->raw_data
pointer to the payload of the event, which then could be overwritten by
another event before tracepoint fields were asked for via event.prev_comm
in a python program, for instance.
This also happened with other fields, but strings were where problems
were surfacing, as there is UTF-8 validation for the potentially garbled
data.
This ended up showing up as (with some added debugging messages):
( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 ) ( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 ) ( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 ) ( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 ) time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==>
( XXX '��' len=16, raw_size=68) ( field 'next_comm' ret=(nil), raw_size=68 ) Traceback (most recent call last):
File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
main()
File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main
event.next_comm,
^^^^^^^^^^^^^^^
AttributeError: 'perf.sample_event' object has no attribute 'next_comm'
When event.next_comm was asked for, the PyUnicode_FromString() python
API would fail and that tracepoint field wouldn't be available, stopping
the tools/perf/python/tracepoint.py test tool.
But, since we already do a copy of the whole event in pyrf_event__new,
just use it and while at it remove what was done in e8968e654191390a
("perf python: Fix pyrf_evlist__read_on_cpu event consuming") because we
don't really need to wait for parsing the sample before declaring the
event as consumed.
This copy is questionable as it stands, because it limits the maximum event +
sample_type and tracepoint payload to sizeof(union perf_event). This has
all been "working" because 'struct perf_record_mmap2', the largest entry
in 'union perf_event', is:
$ pahole -C perf_event ~/bin/perf | grep mmap2
struct perf_record_mmap2 mmap2; /* 0 4168 */
$
Fixes: bae57e3825a3dded ("perf python: Add support to resolve tracepoint fields")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-6-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Wed Mar 12 17:31:36 2025 -0300
perf python: Fixup description of sample.id event member
[ Upstream commit 1376c195e8ad327bb9f2d32e0acc5ac39e7cb30a ]
Some old cut'n'paste error, it's "ip", so the description should be
"event ip", not "event type".
Fixes: 877108e42b1b9ba6 ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-2-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namhyung Kim <namhyung@kernel.org>
Date: Mon Feb 10 22:07:44 2025 -0800
perf report: Switch data file correctly in TUI
[ Upstream commit 43c2b6139b188d8a756130147f7efd5ddf99f88d ]
The 's' key switches to a new data file and loads the data in the
same window. switch_data_file() shows a popup menu to select which
data file the user wants and updates the 'input_name' global variable.
But cmd_report() didn't update data.path using the new 'input_name'
and kept using the old file. This is a fairly old bug and
I assume people don't use this feature much. :)
Link: https://lore.kernel.org/r/20250211060745.294289-1-namhyung@kernel.org
Closes: https://lore.kernel.org/linux-perf-users/89e678bc-f0af-4929-a8a6-a2666f1294a4@linaro.org
Fixes: f5fc14124c5cefdd ("perf tools: Add data object to handle perf data file")
Reported-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Thu Jan 9 14:21:07 2025 -0800
perf stat: Fix find_stat for mixed legacy/non-legacy events
[ Upstream commit 8ce0d2da14d3fb62844dd0e95982c194326b1a5f ]
Legacy events typically don't have a PMU when added, leading to
mismatched legacy/non-legacy cases in find_stat. Use evsel__find_pmu
to make sure the evsel PMU is looked up. Update the evsel__find_pmu
code to look for the PMU using the extended config type or, for legacy
hardware/hw_cache events on non-hybrid systems, just use the core PMU.
Before:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
Performance counter stats for 'system wide':
215,309,764 cycles
44,326,491 cpu/instructions/
1.002555314 seconds time elapsed
```
After:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
Performance counter stats for 'system wide':
990,676,332 cycles
1,235,762,487 cpu/instructions/ # 1.25 insn per cycle
1.002667198 seconds time elapsed
```
Fixes: 3612ca8e2935 ("perf stat: Fix the hard-coded metrics calculation on the hybrid")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marcus Meissner <meissner@suse.de>
Date: Sun Mar 23 09:53:45 2025 +0100
perf tools: annotate asm_pure_loop.S
[ Upstream commit 9a352a90e88a041f4b26d359493e12a7f5ae1a6a ]
Annotate so it is built with non-executable stack.
Fixes: 8b97519711c3 ("perf test: Add asm pureloop test tool")
Signed-off-by: Marcus Meissner <meissner@suse.de>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250323085410.23751-1-meissner@suse.de
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Mon Mar 10 16:45:32 2025 -0300
perf units: Fix insufficient array space
[ Upstream commit cf67629f7f637fb988228abdb3aae46d0c1748fe ]
No need to specify the array size, let the compiler figure that out.
This addresses this compiler warning that was noticed while build
testing on fedora rawhide:
31 15.81 fedora:rawhide : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
util/units.c: In function 'unit_number__scnprintf':
util/units.c:67:24: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
67 | char unit[4] = "BKMG";
| ^~~~~~
cc1: all warnings being treated as errors
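A small sketch of the idea, using a hypothetical helper rather than the
real unit_number__scnprintf(). Letting the compiler size the array
leaves room for the terminating NUL, so sizeof(unit) is 5 for "BKMG":

```c
#include <stddef.h>

/* Hypothetical helper: pick a unit suffix for a byte count. */
static char demo_unit_suffix(unsigned long long n)
{
	/* No explicit size: four letters plus the terminating '\0'. */
	static const char unit[] = "BKMG";
	size_t i = 0;

	/* sizeof(unit) - 2 is the index of the last letter ('G'). */
	while (n >= 1024 && i < sizeof(unit) - 2) {
		n /= 1024;
		i++;
	}
	return unit[i];
}
```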
Fixes: 9808143ba2e54818 ("perf tools: Add unit_number__scnprintf function")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250310194534.265487-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Date: Thu Mar 13 20:15:59 2025 +0000
perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
[ Upstream commit 182f12f3193341c3400ae719a34c00a8a1204cff ]
The frontend_bound metric was miscalculated due to different scaling in
a couple of metrics it depends on. Change the scaling to match
AmpereOne.
Fixes: 16438b652b46 ("perf vendor events arm64 AmpereOneX: Add core PMU events and metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250313201559.11332-3-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yeoreum Yun <yeoreum.yun@arm.com>
Date: Wed Mar 26 08:20:03 2025 +0000
perf/core: Fix child_total_time_enabled accounting bug at task exit
[ Upstream commit a3c3c66670cee11eb13aa43905904bf29cb92d32 ]
The perf events code fails to account for total_time_enabled of
inactive events.
Here is a failure case for accounting total_time_enabled for
CPU PMU events:
sudo ./perf stat -vvv -e armv8_pmuv3_0/event=0x08/ -e armv8_pmuv3_1/event=0x08/ -- stress-ng --pthread=2 -t 2s
...
armv8_pmuv3_0/event=0x08/: 1138698008 2289429840 2174835740
armv8_pmuv3_1/event=0x08/: 1826791390 1950025700 847648440
(columns, left to right: count with child, total_time_enabled with child,
total_time_running with child)
Performance counter stats for 'stress-ng --pthread=2 -t 2s':
1,138,698,008 armv8_pmuv3_0/event=0x08/ (94.99%)
1,826,791,390 armv8_pmuv3_1/event=0x08/ (43.47%)
The two events above are opened on two different CPU PMUs; for example,
each event is opened for a cluster in an Arm big.LITTLE system, so they
will never run on the same CPU. In theory, the total enabled time should
be the same for both events, as the two events are opened and closed together.
As the results show, the two events' total enabled time including
child events is different (2289429840 vs 1950025700).
This is because child events are not accounted properly
if an event is in the INACTIVE state when the task exits:
perf_event_exit_event()
`> perf_remove_from_context()
`> __perf_remove_from_context()
`> perf_child_detach() -> Accumulate child_total_time_enabled
`> list_del_event() -> Update child event's time
The problem is that the time accumulation happens before the child event's
time is updated. Thus, it fails to account for the last period's time when
the event exits.
The perf core layer follows the rule that timekeeping is tied to state
change. To address the issue, make __perf_remove_from_context()
handle the task exit case by passing 'DETACH_EXIT' to it and
invoking perf_event_state() for the state change along with accounting the time.
Then, perf_child_detach() populates the time into the parent's time metrics.
After this patch, the bug is fixed:
sudo ./perf stat -vvv -e armv8_pmuv3_0/event=0x08/ -e armv8_pmuv3_1/event=0x08/ -- stress-ng --pthread=2 -t 10s
...
armv8_pmuv3_0/event=0x08/: 15396770398 32157963940 21898169000
armv8_pmuv3_1/event=0x08/: 22428964974 32157963940 10259794940
Performance counter stats for 'stress-ng --pthread=2 -t 10s':
15,396,770,398 armv8_pmuv3_0/event=0x08/ (68.10%)
22,428,964,974 armv8_pmuv3_1/event=0x08/ (31.90%)
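The ordering problem can be illustrated with a toy model. The demo_*
names and fields below are hypothetical simplifications, not the kernel
structures; the point is only that the child's time must be brought up
to date before it is folded into the parent:

```c
#include <stdint.h>

struct demo_event {
	uint64_t tstamp;			/* time of the last state change */
	uint64_t total_time_enabled;		/* accumulated up to tstamp */
	uint64_t child_total_time_enabled;	/* sum over exited children */
};

/* Fold the period since the last state change into the event's total. */
static void demo_update_time(struct demo_event *ev, uint64_t now)
{
	ev->total_time_enabled += now - ev->tstamp;
	ev->tstamp = now;
}

/* Buggy order: the parent is credited before the child is updated. */
static void demo_exit_buggy(struct demo_event *parent,
			    struct demo_event *child, uint64_t now)
{
	parent->child_total_time_enabled += child->total_time_enabled;
	demo_update_time(child, now);	/* too late, last period is lost */
}

/* Fixed order: update the child's time first, then accumulate. */
static void demo_exit_fixed(struct demo_event *parent,
			    struct demo_event *child, uint64_t now)
{
	demo_update_time(child, now);
	parent->child_total_time_enabled += child->total_time_enabled;
}
```

For a child with 50 units accumulated at time 100 that exits at time
130, the buggy order credits the parent with 50 and the fixed order
with 80.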
[ mingo: Clarified the changelog. ]
Fixes: ef54c1a476aef ("perf: Rework perf_event_exit_event()")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250326082003.1630986-1-yeoreum.yun@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon Nov 4 14:39:12 2024 +0100
perf/core: Fix perf_pmu_register() vs. perf_init_event()
[ Upstream commit 003659fec9f6d8c04738cb74b5384398ae8a7e88 ]
There is a fairly obvious race between perf_init_event() doing
idr_find() and perf_pmu_register() doing idr_alloc() with an
incompletely initialized PMU pointer.
Avoid this by doing idr_alloc() on a NULL pointer to register the id, and
swizzling in the real struct pmu pointer at the end using idr_replace().
Also make sure not to set struct pmu members after publishing
the struct pmu, duh.
[ introduce idr_cmpxchg() in order to better handle the idr_replace()
error case -- if it were to return an unexpected pointer, it will
already have replaced the value and there is no going back. ]
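The reserve-then-publish pattern can be sketched with a plain array
standing in for the IDR. Everything named demo_* is hypothetical, and
the sketch is single-threaded: real code additionally needs the locking
and memory ordering that the IDR and the perf core provide.

```c
#include <stddef.h>

#define DEMO_MAX_IDS 8

struct demo_pmu {
	int type;
	const char *name;
};

static struct demo_pmu *demo_table[DEMO_MAX_IDS];
static unsigned char demo_reserved[DEMO_MAX_IDS];

/* Reserve an id while leaving the slot NULL, so lookups see nothing yet. */
static int demo_reserve_id(void)
{
	for (int id = 0; id < DEMO_MAX_IDS; id++) {
		if (!demo_reserved[id]) {
			demo_reserved[id] = 1;
			demo_table[id] = NULL;
			return id;
		}
	}
	return -1;
}

/* Lookup: a reserved but unpublished id reads as "not there". */
static struct demo_pmu *demo_find(int id)
{
	if (id < 0 || id >= DEMO_MAX_IDS)
		return NULL;
	return demo_table[id];
}

static int demo_register(struct demo_pmu *pmu, const char *name)
{
	int id = demo_reserve_id();

	if (id < 0)
		return -1;

	/* Finish all initialization while the slot still reads as NULL. */
	pmu->type = id;
	pmu->name = name;

	/* Publish last; no member is written after this point. */
	demo_table[id] = pmu;
	return id;
}
```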
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20241104135517.858805880@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tao Chen <chen.dylane@linux.dev>
Date: Fri Mar 14 11:00:36 2025 +0800
perf/ring_buffer: Allow the EPOLLRDNORM flag for poll
[ Upstream commit c96fff391c095c11dc87dab35be72dee7d217cde ]
The poll man page says POLLRDNORM is equivalent to POLLIN. For poll(),
it seems that if the user sets pollfd with POLLRDNORM in userspace, perf_poll
will not return until timeout even if perf_output_wakeup is called,
whereas with POLLIN it returns.
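A rough model of why this happens: a waiter is only woken for events it
asked for, so a handler that reports readability with one flag alone
never satisfies a request for the other. The flag values and helper
below are illustrative assumptions; real code would use the constants
from <poll.h>:

```c
/* Illustrative flag values for this sketch only. */
#define DEMO_POLLIN     0x001u
#define DEMO_POLLRDNORM 0x040u

/* Events delivered to a waiter: only those it requested survive. */
static unsigned int demo_revents(unsigned int requested,
				 unsigned int reported)
{
	return requested & reported;
}
```

With a handler reporting only DEMO_POLLIN, a request for
DEMO_POLLRDNORM yields 0 and the caller keeps waiting; reporting
DEMO_POLLIN | DEMO_POLLRDNORM satisfies either request.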
Fixes: 76369139ceb9 ("perf: Split up buffer handling from core code")
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250314030036.2543180-1-chen.dylane@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra (Intel) <peterz@infradead.org>
Date: Tue Jan 21 07:23:00 2025 -0800
perf/x86/intel: Apply static call for drain_pebs
commit 314dfe10576912e1d786b13c5d4eee8c51b63caa upstream.
The x86_pmu_drain_pebs static call was introduced in commit 7c9903c9bf71
("x86/perf, static_call: Optimize x86_pmu methods"), but it's not really
used to replace the old method.
Apply the static call for drain_pebs.
Fixes: 7c9903c9bf71 ("x86/perf, static_call: Optimize x86_pmu methods")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20250121152303.3128733-1-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kan Liang <kan.liang@linux.intel.com>
Date: Tue Jan 21 07:23:01 2025 -0800
perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read
commit f9bdf1f953392c9edd69a7f884f78c0390127029 upstream.
The WARN_ON(this_cpu_read(cpu_hw_events.enabled)) in
intel_pmu_save_and_restart_reload() is triggered when doing a sampling
read of topdown events.
In an NMI handler, cpu_hw_events.enabled is set and used to indicate
the status of the core PMU. The generic pmu->pmu_disable_count, updated in
the perf_pmu_disable/enable pair, is not touched.
However, the perf_pmu_disable/enable pair is invoked for a sampling read
in an NMI handler. The cpuc->enabled is mistakenly set by
perf_pmu_enable().
Avoid disabling PMU if the core PMU is already disabled.
Merge the logic together.
Fixes: 7b2c05a15d29 ("perf/x86/intel: Generic support for hardware TopDown metrics")
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20250121152303.3128733-2-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: James Clark <james.clark@linaro.org>
Date: Wed Jan 29 15:44:05 2025 +0000
perf: Always feature test reallocarray
[ Upstream commit 4c4c0724d6521a8092b7c16f8f210c5869d95b17 ]
This is also used in util/comm.c now, so instead of selectively doing
the feature test, always do it. If it's ever used anywhere else it's
less likely to cause another build failure.
This doesn't remove the need to manually include libc_compat.h, and
missing that will still cause an error for glibc < 2.26. There isn't a
way to fix that without poisoning reallocarray like libbpf did, but that
has other downsides like making memory debugging tools less useful. So
for Perf keep it like this and we'll have to fix up any missed includes.
Fixes the following build error:
util/comm.c:152:31: error: implicit declaration of function
'reallocarray' [-Wimplicit-function-declaration]
152 | tmp = reallocarray(comm_strs->strs,
| ^~~~~~~~~~~~
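For context, a compatibility fallback of the kind such a header can
supply on C libraries without reallocarray() might look like the sketch
below. The name demo_reallocarray is hypothetical, and this is not a
copy of what libc_compat.h contains:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * realloc() for an array of nmemb elements of the given size, failing
 * cleanly instead of wrapping around when nmemb * size overflows.
 */
static void *demo_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```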
Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Ali Utku Selen <ali.utku.selen@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250129154405.777533-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: James Clark <james.clark@linaro.org>
Date: Wed Mar 19 10:16:10 2025 +0000
perf: intel-tpebs: Fix incorrect usage of zfree()
[ Upstream commit 6d2dcd635204c023eb5328ad7d38b198a5558c9b ]
zfree() requires an address; otherwise it frees what's in name, rather
than name itself. Pass the address of name to fix it.
This was the only incorrect occurrence in Perf found using a search.
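A simplified stand-in shows why the address matters. demo_zfree() below
is a hypothetical helper with an explicit char ** parameter, not the
real zfree() macro; it frees the pointed-to pointer and then clears it:

```c
#include <stdlib.h>

/* Free *ptr and reset it to NULL so it cannot be used or freed again. */
static void demo_zfree(char **ptr)
{
	free(*ptr);
	*ptr = NULL;
}

struct demo_entry {
	char *name;
};

static void demo_entry_release(struct demo_entry *e)
{
	/* Pass the address of the member, not the member's value. */
	demo_zfree(&e->name);
}
```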
Fixes: 8db5cabcf1b6 ("perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.")
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319101614.190922-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Heiko Stuebner <heiko.stuebner@cherry.de>
Date: Fri Dec 6 11:34:01 2024 +0100
phy: phy-rockchip-samsung-hdptx: Don't use dt aliases to determine phy-id
[ Upstream commit f08d1c08563846f9be79a4859e912c8795d690fd ]
The phy needs to know its identity in the system (phy0 or phy1 on rk3588)
for some actions and the driver currently contains code abusing of_alias
for that.
Devicetree aliases are always optional and should not be used for core
device functionality, so instead keep a list of phys on a soc in the
of_device_data and find the phy-id by comparing against the mapped
register-base.
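The lookup can be sketched as a search of a per-SoC table. The demo_*
names and the addresses are placeholders invented for this sketch; they
are not the driver's data structures or real register bases:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PHYS 2

/* Hypothetical per-SoC data: the register base of each phy instance. */
struct demo_soc_cfg {
	size_t num_phys;
	uint64_t phy_base[DEMO_MAX_PHYS];
};

/* Placeholder addresses for illustration only. */
static const struct demo_soc_cfg demo_cfg = {
	.num_phys = 2,
	.phy_base = { 0x1000, 0x2000 },
};

/* Return the index whose base matches, or -1 if the phy is unknown. */
static int demo_find_phy_id(const struct demo_soc_cfg *cfg, uint64_t base)
{
	for (size_t i = 0; i < cfg->num_phys; i++) {
		if (cfg->phy_base[i] == base)
			return (int)i;
	}
	return -1;
}
```

This keeps the identity derived from hardware facts the driver already
has, so it works whether or not aliases are present.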
Fixes: c4b09c562086 ("phy: phy-rockchip-samsung-hdptx: Add clock provider support")
Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://lore.kernel.org/r/20241206103401.1780416-3-heiko@sntech.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Mon Feb 10 21:44:51 2025 +0200
pinctrl: intel: Fix wrong bypass assignment in intel_pinctrl_probe_pwm()
[ Upstream commit 0eee258cdf172763502f142d85e967f27a573be0 ]
When instantiating PWM, the bypass should be set to false. The field
is used for the selected Intel SoCs that do not have PWM feature enabled
in their pin control IPs.
Fixes: eb78d3604d6b ("pinctrl: intel: Enumerate PWM device when community has a capability")
Reported-by: Alexis GUILLEMET <alexis.guillemet@dunasys.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Alexis GUILLEMET <alexis.guillemet@dunasys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Tue Mar 18 12:57:14 2025 +0200
pinctrl: npcm8xx: Fix incorrect struct npcm8xx_pincfg assignment
[ Upstream commit 113ec87b0f26a17b02c58aa2714a9b8f1020eed9 ]
Sparse is not happy about the implementation of NPCM8XX_PINCFG():
pinctrl-npcm8xx.c:1314:9: warning: obsolete array initializer, use C99 syntax
pinctrl-npcm8xx.c:1315:9: warning: obsolete array initializer, use C99 syntax
...
pinctrl-npcm8xx.c:1412:9: warning: obsolete array initializer, use C99 syntax
pinctrl-npcm8xx.c:1413:9: warning: too many warnings
The macro uses index-based assignment in the wrong way, i.e. it misses
the equal sign and hence the index is simply ignored, while the
entries are indexed naturally. This is not a problem as the pin
numbering repeats the natural order, but it might be in case of
shuffling the entries. Fix this by adding the missing equal sign and
reformatting a bit for better readability.
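For reference, the C99 designated-initializer form places each entry by
its index regardless of source order. The macro and struct below are
hypothetical reductions for illustration, not the driver's definitions:

```c
struct demo_pincfg {
	int flag;
	int fn0;
};

/* C99 syntax: "[index] = { ... }", with the equal sign. */
#define DEMO_PINCFG(idx, f, fn) [idx] = { .flag = (f), .fn0 = (fn) }

/* Entries deliberately listed out of order. */
static const struct demo_pincfg demo_pincfgs[] = {
	DEMO_PINCFG(2, 20, 200),
	DEMO_PINCFG(0, 0, 100),
	DEMO_PINCFG(1, 10, 150),
};
```

Here demo_pincfgs[2] holds the first listed entry, so shuffling the
lines does not change which pin a configuration applies to.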
Fixes: acf4884a5717 ("pinctrl: nuvoton: add NPCM8XX pinctrl and GPIO driver")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/20250318105932.2090926-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yue Haibing <yuehaibing@huawei.com>
Date: Sat Jan 18 11:13:34 2025 +0800
pinctrl: nuvoton: npcm8xx: Fix error handling in npcm8xx_gpio_fw()
[ Upstream commit d6c6fd77e5816e3f6689a2767cdd777797506f24 ]
fwnode_irq_get() was changed to not return 0. Fix this by checking
for a negative error code, and also update the error log.
Fixes: acf4884a5717 ("pinctrl: nuvoton: add NPCM8XX pinctrl and GPIO driver")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/20250118031334.243324-1-yuehaibing@huawei.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Date: Wed Mar 5 16:37:53 2025 +0000
pinctrl: renesas: rza2: Fix missing of_node_put() call
[ Upstream commit abcdeb4e299a11ecb5a3ea0cce00e68e8f540375 ]
of_parse_phandle_with_fixed_args() requires its caller to
call into of_node_put() on the node pointer from the output
structure, but such a call is currently missing.
Call into of_node_put() to rectify that.
Fixes: b59d0e782706 ("pinctrl: Add RZ/A2 pin and gpio controller")
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250305163753.34913-5-fabrizio.castro.jz@renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Date: Wed Mar 5 16:37:51 2025 +0000
pinctrl: renesas: rzg2l: Fix missing of_node_put() call
[ Upstream commit a5779e625e2b377f16a6675c432aaf299ce5028c ]
of_parse_phandle_with_fixed_args() requires its caller to
call into of_node_put() on the node pointer from the output
structure, but such a call is currently missing.
Call into of_node_put() to rectify that.
Fixes: c4c4637eb57f ("pinctrl: renesas: Add RZ/G2L pin and gpio controller driver")
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250305163753.34913-3-fabrizio.castro.jz@renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Date: Sat Feb 15 15:12:35 2025 +0200
pinctrl: renesas: rzg2l: Suppress binding attributes
[ Upstream commit ea4065345643f3163e812e58ed8add2c75c3ee46 ]
Suppress binding attributes for the rzg2l pinctrl driver, as it is an
essential block for Renesas SoCs. Unbinding the driver leads to
warnings from __device_links_no_driver() and can eventually render the
system inaccessible.
Fixes: c4c4637eb57f ("pinctrl: renesas: Add RZ/G2L pin and gpio controller driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250215131235.228274-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Date: Wed Mar 5 16:37:52 2025 +0000
pinctrl: renesas: rzv2m: Fix missing of_node_put() call
[ Upstream commit 5a550b00704d3a2cd9d766a9427b0f8166da37df ]
of_parse_phandle_with_fixed_args() requires its caller to
call into of_node_put() on the node pointer from the output
structure, but such a call is currently missing.
Call into of_node_put() to rectify that.
Fixes: 92a9b8252576 ("pinctrl: renesas: Add RZ/V2M pin and gpio controller driver")
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250305163753.34913-4-fabrizio.castro.jz@renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Prathamesh Shete <pshete@nvidia.com>
Date: Thu Mar 6 10:35:42 2025 +0530
pinctrl: tegra: Set SFIO mode to Mux Register
[ Upstream commit 17013f0acb322e5052ff9b9d0fab0ab5a4bfd828 ]
Tegra devices have an 'sfsel' bit field that determines whether a pin
operates in SFIO (Special Function I/O) or GPIO mode. Currently,
tegra_pinctrl_gpio_disable_free() sets this bit when releasing a GPIO.
However, tegra_pinctrl_set_mux() can be called independently in certain
code paths where gpio_disable_free() is not invoked. In such cases, failing
to set the SFIO mode could lead to incorrect pin configurations, resulting
in functional issues for peripherals relying on SFIO.
This patch ensures that whenever set_mux() is called, the SFIO mode is
correctly set in the Mux Register if the 'sfsel' bit is present. This
prevents situations where the pin remains in GPIO mode despite being
configured for SFIO use.
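The register update can be sketched as below. The struct, the 2-bit
width of the function field, and the use of -1 for "no sfsel bit" are
assumptions made for this sketch rather than details taken from the
driver:

```c
#include <stdint.h>

/* Hypothetical description of one pin group's mux register layout. */
struct demo_pingroup {
	int mux_bit;	/* lowest bit of the (assumed 2-bit) function field */
	int sfsel_bit;	/* bit selecting SFIO over GPIO, or -1 if absent */
};

/* Compute the new mux register value for the requested function. */
static uint32_t demo_set_mux(const struct demo_pingroup *g, uint32_t val,
			     unsigned int func)
{
	val &= ~(UINT32_C(0x3) << g->mux_bit);
	val |= (uint32_t)(func & 0x3) << g->mux_bit;

	/* Also select SFIO mode, so the pin does not stay in GPIO mode. */
	if (g->sfsel_bit >= 0)
		val |= UINT32_C(1) << g->sfsel_bit;

	return val;
}
```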
Fixes: 971dac7123c7 ("pinctrl: add a driver for NVIDIA Tegra")
Signed-off-by: Prathamesh Shete <pshete@nvidia.com>
Link: https://lore.kernel.org/20250306050542.16335-1-pshete@nvidia.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Mon Mar 10 22:48:29 2025 +0300
platform/x86/amd/pmf: fix cleanup in amd_pmf_init_smart_pc()
commit 5b1122fc4995f308b21d7cfc64ef9880ac834d20 upstream.
There are a few problems in this code:
First, if amd_pmf_tee_init() fails then the function returns directly
instead of cleaning up. We cannot simply do a "goto error;" because
the amd_pmf_tee_init() cleanup calls tee_shm_free(dev->fw_shm_pool);
and amd_pmf_tee_deinit() calls it as well leading to a double free.
I have re-written this code to use an unwind ladder to free the
allocations.
Second, if amd_pmf_start_policy_engine() fails on every iteration through
the loop then the code calls amd_pmf_tee_deinit() twice which is also a
double free. Call amd_pmf_tee_deinit() inside the loop for each failed
iteration. Also on that path the error codes are not necessarily
negative kernel error codes. Set the error code to -EINVAL.
There is a very subtle third bug which is that if the call to
input_register_device() in amd_pmf_register_input_device() fails then
we call input_unregister_device() on an input device that wasn't
registered. This will lead to a reference counting underflow
because of the device_del(&dev->dev) in __input_unregister_device().
It's unlikely that anyone would ever hit this bug in real life.
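For readers unfamiliar with the term, an unwind ladder releases
resources in reverse order of acquisition, with each label undoing
exactly one step. The sketch below uses hypothetical demo_* names and
plain malloc() in place of the driver's resources:

```c
#include <errno.h>
#include <stdlib.h>

struct demo_ctx {
	void *shm;
	void *policy;
	void *input;
};

/* fail_at (1..3) forces that allocation step to fail; 0 means none. */
static void *demo_alloc(int step, int fail_at)
{
	return step == fail_at ? NULL : malloc(16);
}

static int demo_init(struct demo_ctx *ctx, int fail_at)
{
	ctx->shm = demo_alloc(1, fail_at);
	if (!ctx->shm)
		return -ENOMEM;		/* nothing to undo yet */

	ctx->policy = demo_alloc(2, fail_at);
	if (!ctx->policy)
		goto err_free_shm;

	ctx->input = demo_alloc(3, fail_at);
	if (!ctx->input)
		goto err_free_policy;

	return 0;

	/* Each label frees one resource, then falls through to the next. */
err_free_policy:
	free(ctx->policy);
	ctx->policy = NULL;
err_free_shm:
	free(ctx->shm);
	ctx->shm = NULL;
	return -ENOMEM;
}
```

Because every resource is freed at exactly one label, no failure path
can free the same allocation twice.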
Fixes: 376a8c2a1443 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/232231fc-6a71-495e-971b-be2a76f6db4c@stanley.mountain
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Date: Wed Mar 5 10:28:41 2025 +0530
platform/x86/amd/pmf: Propagate PMF-TA return codes
[ Upstream commit 9ba93cb8212d62bccd8b41b8adb6656abf37280a ]
In the amd_pmf_invoke_cmd_init() function within the PMF driver, ensure
that the actual result from the PMF-TA is returned rather than a generic
EIO. This change allows for proper handling of errors originating from the
PMF-TA.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20250305045842.4117767-1-Shyam-sundar.S-k@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Date: Wed Mar 5 10:28:42 2025 +0530
platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA
[ Upstream commit 376a8c2a144397d9cf2a67d403dd64f4a7ff9104 ]
The PMF driver allocates a shared memory buffer using
tee_shm_alloc_kernel_buf() for communication with the PMF-TA.
The latest PMF-TA version introduces new structures with OEM debug
information and additional policy input conditions for evaluating the
policy binary. Consequently, the shared memory size must be increased to
ensure compatibility between the PMF driver and the updated PMF-TA.
To do so, introduce the new PMF-TA UUID and update the PMF shared memory
configuration to ensure compatibility with the latest PMF-TA version.
Additionally, export the TA UUID.
These updates will result in modifications to the prototypes of
amd_pmf_tee_init() and amd_pmf_ta_open_session().
Link: https://lore.kernel.org/all/55ac865f-b1c7-fa81-51c4-d211c7963e7e@linux.intel.com/
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20250305045842.4117767-2-Shyam-sundar.S-k@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David E. Box <david.e.box@linux.intel.com>
Date: Wed Feb 26 13:47:27 2025 -0800
platform/x86/intel/vsec: Add Diamond Rapids support
[ Upstream commit f317f38e7fbb15a0d8329289fef8cf034938fb4f ]
Add PCI ID for the Diamond Rapids Platforms
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20250226214728.1256747-1-david.e.box@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Armin Wolf <W_Armin@gmx.de>
Date: Wed Mar 5 06:30:07 2025 +0100
platform/x86: dell-ddv: Fix temperature calculation
[ Upstream commit 7a248294a3145bc65eb0d8980a0a8edbb1b92db4 ]
On the Dell Inspiron 3505 the battery temperature is always
0.1 degrees larger than the temperature shown inside the OEM
application.
Emulate this behaviour to avoid showing strange looking values
like 29.1 degrees.
Fixes: 0331b1b0ba653 ("platform/x86: dell-ddv: Fix temperature scaling")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://lore.kernel.org/r/20250305053009.378609-2-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Tue Mar 4 18:06:39 2025 +0200
platform/x86: dell-uart-backlight: Make dell_uart_bl_serdev_driver static
[ Upstream commit 4878e0b14c3e31a87ab147bd2dae443394cb5a2c ]
Sparse reports:
dell-uart-backlight.c:328:29: warning: symbol
'dell_uart_bl_serdev_driver' was not declared. Should it be static?
Fix it by making the symbol static.
Fixes: 484bae9e4d6ac ("platform/x86: Add new Dell UART backlight driver")
Reviewed-by: Mario Limonciello <maroi.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20250304160639.4295-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Panchenko <dmitry@d-systems.ee>
Date: Thu Feb 20 17:39:31 2025 +0200
platform/x86: intel-hid: fix volume buttons on Microsoft Surface Go 4 tablet
[ Upstream commit 2738d06fb4f01145b24c542fb06de538ffc56430 ]
Volume buttons on Microsoft Surface Go 4 tablet didn't send any events.
Add Surface Go 4 DMI match to button_array_table to fix this.
Signed-off-by: Dmitry Panchenko <dmitry@d-systems.ee>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20250220154016.3620917-1-dmitry@d-systems.ee
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date: Fri Mar 28 15:47:49 2025 -0700
platform/x86: ISST: Correct command storage data length
commit 9462e74c5c983cce34019bfb27f734552bebe59f upstream.
After resume/online, the turbo limit ratio (TRL) is only partially
restored if the admin explicitly changed TRL from user space.
A hash table is used to store SST mail box and MSR settings when modified
to restore those settings after resume or online. This uses a struct
isst_cmd field "data" to store these settings. This is a 64 bit field.
But isst_store_new_cmd() takes the value as a u32 argument, so the
upper 32 bits are truncated.
Change the argument to u64 from u32.
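The narrowing described above can be reproduced in a few lines of user-space C. This is only a sketch: struct cmd_entry, store_cmd_u32() and store_cmd_u64() are hypothetical stand-ins, not the driver's real definitions; the only detail taken from the description is that a 64-bit "data" field was filled through a 32-bit argument.

```c
#include <stdint.h>

/* Hypothetical stand-in for the stored command; only the 64-bit field matters. */
struct cmd_entry {
	uint64_t data;
};

/* Buggy shape: the parameter is 32 bits wide, so the caller's value is narrowed. */
static void store_cmd_u32(struct cmd_entry *e, uint32_t data)
{
	e->data = data;	/* upper 32 bits were already lost at the call site */
}

/* Fixed shape: the parameter width matches the width of the field. */
static void store_cmd_u64(struct cmd_entry *e, uint64_t data)
{
	e->data = data;
}
```

Passing a 64-bit value to store_cmd_u32() compiles without a diagnostic unless conversion warnings are enabled, which is probably why this kind of bug is easy to miss.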
Fixes: f607874f35cb ("platform/x86: ISST: Restore state on resume")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250328224749.2691272-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Tue Mar 4 18:06:38 2025 +0200
platform/x86: lenovo-yoga-tab2-pro-1380-fastcharger: Make symbol static
[ Upstream commit 886ca11a0c70efe5627a18557062e8a44370d78f ]
Sparse reports:
lenovo-yoga-tab2-pro-1380-fastcharger.c:222:29: warning: symbol
'yt2_1380_fc_serdev_driver' was not declared. Should it be static?
Fix that by making the symbol static.
Fixes: b2ed33e8d486a ("platform/x86: Add lenovo-yoga-tab2-pro-1380-fastcharger driver")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20250304160639.4295-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eduard Christian Dumitrescu <eduard.c.dumitrescu@gmail.com>
Date: Mon Mar 24 11:24:42 2025 -0400
platform/x86: thinkpad_acpi: disable ACPI fan access for T495* and E560
commit 2b9f84e7dc863afd63357b867cea246aeedda036 upstream.
T495, T495s, and E560 laptops have the FANG+FANW ACPI methods
(therefore fang_handle and fanw_handle are not NULL) but they do not
actually work, which results in a "No such device or address" error.
The DSDT table code for the FANG+FANW methods doesn't seem to do
anything special regarding the fan being secondary. The bug was
introduced in commit 57d0557dfa49 ("platform/x86: thinkpad_acpi: Add
Thinkpad Edge E531 fan support"), which added a new fan control method
via the FANG+FANW ACPI methods.
Add a quirk for T495, T495s, and E560 to avoid the FANG+FANW methods.
Fan access and control is restored after forcing the legacy non-ACPI
fan control method by setting both fang_handle and fanw_handle to NULL.
Reported-by: Vlastimil Holer <vlastimil.holer@gmail.com>
Fixes: 57d0557dfa49 ("platform/x86: thinkpad_acpi: Add Thinkpad Edge E531 fan support")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219643
Cc: stable@vger.kernel.org
Tested-by: Alireza Elikahi <scr0lll0ck1s4b0v3h0m3k3y@gmail.com>
Reviewed-by: Kurt Borja <kuurtb@gmail.com>
Signed-off-by: Eduard Christian Dumitrescu <eduard.c.dumitrescu@gmail.com>
Co-developed-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Link: https://lore.kernel.org/r/20250324152442.106113-1-ImanDevel@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Thu Feb 27 11:53:50 2025 +0100
PM: sleep: Adjust check before setting power.must_resume
[ Upstream commit eeb87d17aceab7803a5a5bcb6cf2817b745157cf ]
The check before setting power.must_resume in device_suspend_noirq()
does not take power.child_count into account, but it should do that, so
use pm_runtime_need_not_resume() in it for this purpose and adjust the
comment next to it accordingly.
Fixes: 107d47b2b95e ("PM: sleep: core: Simplify the SMART_SUSPEND flag handling")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/3353728.44csPzL39Z@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Thu Mar 13 17:00:00 2025 +0100
PM: sleep: Fix handling devices with direct_complete set on errors
[ Upstream commit 03f1444016b71feffa1dfb8a51f15ba592f94b13 ]
When dpm_suspend() fails, some devices with power.direct_complete set
may not have been handled by device_suspend() yet, so runtime PM has
not been disabled for them yet even though power.direct_complete is set.
Since device_resume() expects that runtime PM has been disabled for all
devices with power.direct_complete set, it will attempt to reenable
runtime PM for the devices that have not been processed by device_suspend()
which does not make sense. Had those devices had runtime PM disabled
before device_suspend() had run, device_resume() would have inadvertently
enabled runtime PM for them, but this is not expected to happen because
it would require ->prepare() callbacks to return positive values for
devices with runtime PM disabled, which would be invalid.
In practice, this issue is most likely benign because pm_runtime_enable()
will not allow the "disable depth" counter to underflow, but it causes a
warning message to be printed for each affected device.
To allow device_resume() to distinguish the "direct complete" devices
that have been processed by device_suspend() from those which have not
been handled by it, make device_suspend() set power.is_suspended for
"direct complete" devices.
Next, move the power.is_suspended check in device_resume() before the
power.direct_complete check in it to make it skip the "direct complete"
devices that have not been handled by device_suspend().
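The reordered checks can be illustrated with a small user-space model. This is a simplified sketch under stated assumptions, not the PM core code: struct dev_pm_model, model_suspend() and model_resume() are hypothetical names, and disable_depth only loosely imitates the runtime PM "disable depth" counter.

```c
#include <stdbool.h>

/* Hypothetical model of the per-device PM flags discussed above. */
struct dev_pm_model {
	bool direct_complete;	/* decided during the prepare phase */
	bool is_suspended;	/* set once the suspend step has handled the device */
	int disable_depth;	/* > 0 means runtime PM is disabled */
};

/* Suspend step: "direct complete" devices are now marked as handled too. */
static void model_suspend(struct dev_pm_model *d)
{
	if (d->direct_complete)
		d->disable_depth++;	/* runtime PM disabled for this device */
	d->is_suspended = true;
}

/* Resume step: returns true if it re-enabled runtime PM. */
static bool model_resume(struct dev_pm_model *d)
{
	/* Checked first: devices never handled by the suspend step are skipped. */
	if (!d->is_suspended)
		return false;
	d->is_suspended = false;

	if (d->direct_complete) {
		d->disable_depth--;	/* balanced with model_suspend() */
		return true;
	}
	return false;
}
```

With the is_suspended test placed first, a device that has direct_complete set but was never reached by the suspend step leaves model_resume() with its counter untouched.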
This change is based on a preliminary patch from Saravana Kannan.
Fixes: aae4518b3124 ("PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily")
Link: https://lore.kernel.org/linux-pm/20241114220921.2529905-2-saravanak@google.com/
Reported-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Link: https://patch.msgid.link/12627587.O9o76ZdvQC@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sicelo A. Mhlongo <absicsz@gmail.com>
Date: Mon Nov 25 17:29:30 2024 +0200
power: supply: bq27xxx_battery: do not update cached flags prematurely
[ Upstream commit 45291874a762dbb12a619dc2efaf84598859007a ]
Commit 243f8ffc883a1 ("power: supply: bq27xxx_battery: Notify also about
status changes") intended to notify userspace when the status changes,
based on the flags register. However, the cached state is updated too
early, before the flags are tested for any changes. Remove the premature
update.
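The ordering problem can be sketched as follows. The names struct batt_cache, update_buggy() and update_fixed() are hypothetical and the real driver caches more state than a single flags word; the sketch only shows why overwriting the cache before the comparison hides every change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache holding the last flags value read from the gauge. */
struct batt_cache {
	uint16_t flags;
};

/* Buggy order: the cache is overwritten first, so the test never sees a change. */
static bool update_buggy(struct batt_cache *c, uint16_t new_flags)
{
	c->flags = new_flags;
	return c->flags != new_flags;	/* always false */
}

/* Fixed order: compare against the old cached value, then update the cache. */
static bool update_fixed(struct batt_cache *c, uint16_t new_flags)
{
	bool changed = c->flags != new_flags;

	c->flags = new_flags;
	return changed;
}
```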
Fixes: 243f8ffc883a1 ("power: supply: bq27xxx_battery: Notify also about status changes")
Signed-off-by: Sicelo A. Mhlongo <absicsz@gmail.com>
Link: https://lore.kernel.org/r/20241125152945.47937-1-absicsz@gmail.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Artur Weber <aweber.kernel@gmail.com>
Date: Sun Mar 16 21:11:49 2025 +0100
power: supply: max77693: Fix wrong conversion of charge input threshold value
[ Upstream commit 30cc7b0d0e9341d419eb7da15fb5c22406dbe499 ]
The charge input threshold voltage register on the MAX77693 PMIC accepts
four values: 0x0 for 4.3v, 0x1 for 4.7v, 0x2 for 4.8v and 0x3 for 4.9v.
Due to an oversight, the driver calculated the values for 4.7v and above
starting from 0x0, rather than from 0x1 ([(4700000 - 4700000) / 100000]
gives 0).
Add 1 to the calculation to ensure that 4.7v is converted to a register
value of 0x1 and that the other two voltages are converted correctly as
well.
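The corrected mapping can be written out as a small stand-alone function. This is a sketch based only on the four register values listed above; threshold_uv_to_reg() is a hypothetical name and the -1 return for unsupported voltages is an assumption made for the example.

```c
/*
 * Map a charge input threshold in microvolts to the register value
 * described above: 4.3 V -> 0x0, 4.7 V -> 0x1, 4.8 V -> 0x2, 4.9 V -> 0x3.
 * Returns -1 for any other voltage (assumption for this sketch).
 */
static int threshold_uv_to_reg(unsigned int uvolt)
{
	switch (uvolt) {
	case 4300000:
		return 0x0;
	case 4700000:
	case 4800000:
	case 4900000:
		/* Without the "+ 1", 4.7 V would map to 0x0, i.e. 4.3 V. */
		return (int)((uvolt - 4700000) / 100000) + 1;
	default:
		return -1;
	}
}
```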
Fixes: 87c2d9067893 ("power: max77693: Add charger driver for Maxim 77693")
Signed-off-by: Artur Weber <aweber.kernel@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250316-max77693-charger-input-threshold-fix-v1-1-2b037d0ac722@gmail.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date: Thu Mar 6 11:24:28 2025 +0100
powerpc/kexec: fix physical address calculation in clear_utlb_entry()
[ Upstream commit 861efb8a48ee8b73ae4e8817509cd4e82fd52bc4 ]
In relocate_32.S, function clear_utlb_entry() goes into real mode. To
do so, it has to calculate the physical address based on the virtual
address. To get the virtual address it uses 'bl' which is problematic
(see commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()")). In addition, the calculation is done on a
wrong address because 'bl' loads LR with the address of the following
instruction, not the address of the target. So when the target is not
the instruction following the 'bl' instruction, it may lead to
unexpected behaviour.
Fix it by re-writing the code so that it goes via another path which
is based on 'bcl 20,31,.+4', which is the right instruction to use for
that.
Fixes: 683430200315 ("powerpc/47x: Kernel support for KEXEC")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/dc4f9616fba9c05c5dbf9b4b5480eb1c362adc17.1741256651.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 31 21:26:46 2025 -0700
rcu-tasks: Always inline rcu_irq_work_resched()
[ Upstream commit 6309a5c43b0dc629851f25b2e5ef8beff61d08e5 ]
Thanks to CONFIG_DEBUG_SECTION_MISMATCH, empty functions can be
generated out of line. rcu_irq_work_resched() can be called from
noinstr code, so make sure it's always inlined.
Fixes: 564506495ca9 ("rcu/context-tracking: Move deferred nocb resched to context tracking")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/e84f15f013c07e4c410d972e75620c53b62c1b3e.1743481539.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/d1eca076-fdde-484a-b33e-70e0d167c36d@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Roman Gushchin <roman.gushchin@linux.dev>
Date: Thu Feb 27 16:54:20 2025 +0000
RDMA/core: Don't expose hw_counters outside of init net namespace
[ Upstream commit a1ecb30f90856b0be4168ad51b8875148e285c1f ]
Commit 467f432a521a ("RDMA/core: Split port and device counter sysfs
attributes") accidentally almost exposed hw counters to non-init net
namespaces. It didn't expose them fully, as an attempt to read any of
those counters leads to a crash like this one:
[42021.807566] BUG: kernel NULL pointer dereference, address: 0000000000000028
[42021.814463] #PF: supervisor read access in kernel mode
[42021.819549] #PF: error_code(0x0000) - not-present page
[42021.824636] PGD 0 P4D 0
[42021.827145] Oops: 0000 [#1] SMP PTI
[42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump: loaded Tainted: G S W I XXX
[42021.841697] Hardware name: XXX
[42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff ff <48> 8b 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
[42021.873931] RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287
[42021.879108] RAX: ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
[42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI: ffff940c7517aef0
[42021.893230] RBP: ffff97fe90f03e70 R08: ffff94085f1aa000 R09: 0000000000000000
[42021.900294] R10: ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
[42021.907355] R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
[42021.914418] FS: 00007fda1a3b9700(0000) GS:ffff94453fb80000(0000) knlGS:0000000000000000
[42021.922423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42021.928130] CR2: 0000000000000028 CR3: 00000042dcfb8003 CR4: 00000000003726f0
[42021.935194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42021.949324] Call Trace:
[42021.951756] <TASK>
[42021.953842] [<ffffffff86c58674>] ? show_regs+0x64/0x70
[42021.959030] [<ffffffff86c58468>] ? __die+0x78/0xc0
[42021.963874] [<ffffffff86c9ef75>] ? page_fault_oops+0x2b5/0x3b0
[42021.969749] [<ffffffff87674b92>] ? exc_page_fault+0x1a2/0x3c0
[42021.975549] [<ffffffff87801326>] ? asm_exc_page_fault+0x26/0x30
[42021.981517] [<ffffffffc0775680>] ? __pfx_show_hw_stats+0x10/0x10 [ib_core]
[42021.988482] [<ffffffffc077564e>] ? hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.995438] [<ffffffff86ac7f8e>] dev_attr_show+0x1e/0x50
[42022.000803] [<ffffffff86a3eeb1>] sysfs_kf_seq_show+0x81/0xe0
[42022.006508] [<ffffffff86a11134>] seq_read_iter+0xf4/0x410
[42022.011954] [<ffffffff869f4b2e>] vfs_read+0x16e/0x2f0
[42022.017058] [<ffffffff869f50ee>] ksys_read+0x6e/0xe0
[42022.022073] [<ffffffff8766f1ca>] do_syscall_64+0x6a/0xa0
[42022.027441] [<ffffffff8780013b>] entry_SYSCALL_64_after_hwframe+0x78/0xe2
The problem can be reproduced using the following steps:
ip netns add foo
ip netns exec foo bash
cat /sys/class/infiniband/mlx4_0/hw_counters/*
The panic occurs because casting the device pointer into an
ib_device pointer using container_of() in hw_stat_device_show() is
wrong and leads to memory corruption.
However, the real problem is that hw counters should never have been
exposed outside of the init net namespace.
Fix this by saving the index of the corresponding attribute group
(it might be 1 or 2 depending on the presence of driver-specific
attributes) and zeroing the pointer to hw_counters group for compat
devices during the initialization.
With this fix applied hw_counters are not available in a non-init
net namespace:
find /sys/class/infiniband/mlx4_0/ -name hw_counters
/sys/class/infiniband/mlx4_0/ports/1/hw_counters
/sys/class/infiniband/mlx4_0/ports/2/hw_counters
/sys/class/infiniband/mlx4_0/hw_counters
ip netns add foo
ip netns exec foo bash
find /sys/class/infiniband/mlx4_0/ -name hw_counters
Fixes: 467f432a521a ("RDMA/core: Split port and device counter sysfs attributes")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Maher Sanalla <msanalla@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: https://patch.msgid.link/20250227165420.3430301-1-roman.gushchin@linux.dev
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wang Liang <wangliang74@huawei.com>
Date: Thu Mar 13 17:24:21 2025 +0800
RDMA/core: Fix use-after-free when rename device name
[ Upstream commit 1d6a9e7449e2a0c1e2934eee7880ba8bd1e464cd ]
Syzbot reported a slab-use-after-free with the following call trace:
==================================================================
BUG: KASAN: slab-use-after-free in nla_put+0xd3/0x150 lib/nlattr.c:1099
Read of size 5 at addr ffff888140ea1c60 by task syz.0.988/10025
CPU: 0 UID: 0 PID: 10025 Comm: syz.0.988
Not tainted 6.14.0-rc4-syzkaller-00859-gf77f12010f67 #0
Hardware name: Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0x16e/0x5b0 mm/kasan/report.c:521
kasan_report+0x143/0x180 mm/kasan/report.c:634
kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
__asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105
nla_put+0xd3/0x150 lib/nlattr.c:1099
nla_put_string include/net/netlink.h:1621 [inline]
fill_nldev_handle+0x16e/0x200 drivers/infiniband/core/nldev.c:265
rdma_nl_notify_event+0x561/0xef0 drivers/infiniband/core/nldev.c:2857
ib_device_notify_register+0x22/0x230 drivers/infiniband/core/device.c:1344
ib_register_device+0x1292/0x1460 drivers/infiniband/core/device.c:1460
rxe_register_device+0x233/0x350 drivers/infiniband/sw/rxe/rxe_verbs.c:1540
rxe_net_add+0x74/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:550
rxe_newlink+0xde/0x1a0 drivers/infiniband/sw/rxe/rxe.c:212
nldev_newlink+0x5ea/0x680 drivers/infiniband/core/nldev.c:1795
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x6dd/0x9e0 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1339
netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:709 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:724
____sys_sendmsg+0x53a/0x860 net/socket.c:2564
___sys_sendmsg net/socket.c:2618 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2650
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f42d1b8d169
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 ...
RSP: 002b:00007f42d2960038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f42d1da6320 RCX: 00007f42d1b8d169
RDX: 0000000000000000 RSI: 00004000000002c0 RDI: 000000000000000c
RBP: 00007f42d1c0e2a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f42d1da6320 R15: 00007ffe399344a8
</TASK>
Allocated by task 10025:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__do_kmalloc_node mm/slub.c:4294 [inline]
__kmalloc_node_track_caller_noprof+0x28b/0x4c0 mm/slub.c:4313
__kmemdup_nul mm/util.c:61 [inline]
kstrdup+0x42/0x100 mm/util.c:81
kobject_set_name_vargs+0x61/0x120 lib/kobject.c:274
dev_set_name+0xd5/0x120 drivers/base/core.c:3468
assign_name drivers/infiniband/core/device.c:1202 [inline]
ib_register_device+0x178/0x1460 drivers/infiniband/core/device.c:1384
rxe_register_device+0x233/0x350 drivers/infiniband/sw/rxe/rxe_verbs.c:1540
rxe_net_add+0x74/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:550
rxe_newlink+0xde/0x1a0 drivers/infiniband/sw/rxe/rxe.c:212
nldev_newlink+0x5ea/0x680 drivers/infiniband/core/nldev.c:1795
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x6dd/0x9e0 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1339
netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:709 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:724
____sys_sendmsg+0x53a/0x860 net/socket.c:2564
___sys_sendmsg net/socket.c:2618 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2650
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 10035:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:576
poison_slab_object mm/kasan/common.c:247 [inline]
__kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
kasan_slab_free include/linux/kasan.h:233 [inline]
slab_free_hook mm/slub.c:2353 [inline]
slab_free mm/slub.c:4609 [inline]
kfree+0x196/0x430 mm/slub.c:4757
kobject_rename+0x38f/0x410 lib/kobject.c:524
device_rename+0x16a/0x200 drivers/base/core.c:4525
ib_device_rename+0x270/0x710 drivers/infiniband/core/device.c:402
nldev_set_doit+0x30e/0x4c0 drivers/infiniband/core/nldev.c:1146
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x6dd/0x9e0 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1339
netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:709 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:724
____sys_sendmsg+0x53a/0x860 net/socket.c:2564
___sys_sendmsg net/socket.c:2618 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2650
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
This happens because, when a device is renamed, the old name is freed in
ib_device_rename() with the lock held, but ib_device_notify_register()
may access the device name locklessly when handling
RDMA_REGISTER_EVENT or RDMA_NETDEV_ATTACH_EVENT.
Fix this by holding devices_rwsem in ib_device_notify_register().
Reported-by: syzbot+f60349ba1f9f08df349f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=25bc6f0ed2b88b9eb9b8
Fixes: 9cbed5aab5ae ("RDMA/nldev: Add support for RDMA monitoring")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20250313092421.944658-1-wangliang74@huawei.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cheng Xu <chengyou@linux.alibaba.com>
Date: Thu Mar 6 20:04:40 2025 +0800
RDMA/erdma: Prevent use-after-free in erdma_accept_newconn()
[ Upstream commit 83437689249e6a17b25e27712fbee292e42e7855 ]
After erdma_cep_put(new_cep) is called, new_cep may be freed, so
the dereference that follows it is a use-after-free (UAF). Fix this issue.
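The general rule behind this kind of fix is to finish every access to an object before dropping the reference that may free it. A minimal user-space sketch, assuming a hypothetical refcounted struct conn_ep (none of these names come from the erdma driver):

```c
#include <stdlib.h>

/* Hypothetical refcounted endpoint; not the driver's structure. */
struct conn_ep {
	int refcount;
	int sock_fd;
};

static struct conn_ep *ep_alloc(int sock_fd)
{
	struct conn_ep *ep = calloc(1, sizeof(*ep));

	if (ep) {
		ep->refcount = 1;
		ep->sock_fd = sock_fd;
	}
	return ep;
}

/* Drop one reference; the object is freed when the last one goes away. */
static void ep_put(struct conn_ep *ep)
{
	if (--ep->refcount == 0)
		free(ep);
}

/* Correct order: touch the object first, drop the reference last. */
static void ep_detach_and_put(struct conn_ep *ep)
{
	ep->sock_fd = -1;	/* must happen before ep_put() */
	ep_put(ep);		/* ep may be invalid after this line */
}
```

Swapping the two statements in ep_detach_and_put() would write to freed memory whenever the put drops the last reference.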
Fixes: 920d93eac8b9 ("RDMA/erdma: Add connection management (CM) support")
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kees Bakker <kees@ijzerbout.nl>
Date: Fri Feb 21 20:39:03 2025 +0100
RDMA/mana_ib: Ensure variable err is initialized
[ Upstream commit be35a3127d60964b338da95c7bfaaf4a01b330d4 ]
In the function mana_ib_gd_create_dma_region(), if there are no DMA
blocks to process, the variable `err` remains uninitialized.
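The pattern is easy to reproduce whenever a return value is assigned only inside a loop body. A sketch, assuming hypothetical helpers send_block() and process_blocks() that stand in for the per-block work (the failing index is arbitrary):

```c
#include <stddef.h>

/* Hypothetical per-block operation: fails for index 5, succeeds otherwise. */
static int send_block(size_t idx)
{
	return idx == 5 ? -1 : 0;
}

static int process_blocks(size_t nblocks)
{
	int err = 0;	/* defined result even if the loop body never runs */
	size_t i;

	for (i = 0; i < nblocks; i++) {
		err = send_block(i);
		if (err)
			break;
	}
	return err;
}
```

Without the initializer, process_blocks(0) would return an indeterminate value.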
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Signed-off-by: Kees Bakker <kees@ijzerbout.nl>
Link: https://patch.msgid.link/20250221195833.7516C16290A@bout3.ijzerbout.nl
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date: Thu Mar 13 16:29:54 2025 +0200
RDMA/mlx5: Fix calculation of total invalidated pages
[ Upstream commit 79195147644653ebffadece31a42181e4c48c07d ]
When invalidating an address range in mlx5, there is an optimization to
do UMR operations in chunks.
Previously, the invalidation counter was incorrectly updated for the
same indexes within a chunk. Now, the invalidation counter is updated
only when a chunk is complete and mlx5r_umr_update_xlt() is called.
This ensures that the counter accurately represents the number of pages
invalidated using UMR.
Fixes: a3de94e3d61e ("IB/mlx5: Introduce ODP diagnostic counters")
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://patch.msgid.link/560deb2433318e5947282b070c915f3c81fef77f.1741875692.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Patrisious Haddad <phaddad@nvidia.com>
Date: Thu Mar 13 16:29:53 2025 +0200
RDMA/mlx5: Fix mlx5_poll_one() cur_qp update flow
[ Upstream commit 5ed3b0cb3f827072e93b4c5b6e2b8106fd7cccbd ]
When cur_qp isn't NULL, in order to avoid fetching the QP from
the radix tree again we check if the next cqe QP is identical to
the one we already have.
The bug however is that we are checking if the QP is identical by
checking the QP number inside the CQE against the QP number inside the
mlx5_ib_qp, but that's wrong since the QP number from the CQE is from
FW so it should be matched against mlx5_core_qp which is our FW QP
number.
Otherwise we could use the wrong QP when handling a CQE which could
cause the kernel trace below.
This issue is mainly noticeable over QPs 0 & 1, since for now they are
the only QPs in our driver for which the QP number inside mlx5_ib_qp
doesn't match the QP number inside mlx5_core_qp.
BUG: kernel NULL pointer dereference, address: 0000000000000012
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP
CPU: 0 UID: 0 PID: 7927 Comm: kworker/u62:1 Not tainted 6.14.0-rc3+ #189
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
RIP: 0010:mlx5_ib_poll_cq+0x4c7/0xd90 [mlx5_ib]
Code: 03 00 00 8d 58 ff 21 cb 66 39 d3 74 39 48 c7 c7 3c 89 6e a0 0f b7 db e8 b7 d2 b3 e0 49 8b 86 60 03 00 00 48 c7 c7 4a 89 6e a0 <0f> b7 5c 98 02 e8 9f d2 b3 e0 41 0f b7 86 78 03 00 00 83 e8 01 21
RSP: 0018:ffff88810511bd60 EFLAGS: 00010046
RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88885fa1b3c0 RDI: ffffffffa06e894a
RBP: 00000000000000b0 R08: 0000000000000000 R09: ffff88810511bc10
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88810d593000
R13: ffff88810e579108 R14: ffff888105146000 R15: 00000000000000b0
FS: 0000000000000000(0000) GS:ffff88885fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 00000001077e6001 CR4: 0000000000370eb0
Call Trace:
<TASK>
? __die+0x20/0x60
? page_fault_oops+0x150/0x3e0
? exc_page_fault+0x74/0x130
? asm_exc_page_fault+0x22/0x30
? mlx5_ib_poll_cq+0x4c7/0xd90 [mlx5_ib]
__ib_process_cq+0x5a/0x150 [ib_core]
ib_cq_poll_work+0x31/0x90 [ib_core]
process_one_work+0x169/0x320
worker_thread+0x288/0x3a0
? work_busy+0xb0/0xb0
kthread+0xd7/0x1f0
? kthreads_online_cpu+0x130/0x130
? kthreads_online_cpu+0x130/0x130
ret_from_fork+0x2d/0x50
? kthreads_online_cpu+0x130/0x130
ret_from_fork_asm+0x11/0x20
</TASK>
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/4ada09d41f1e36db62c44a9b25c209ea5f054316.1741875692.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Guralnik <michaelgur@nvidia.com>
Date: Thu Mar 13 16:29:48 2025 +0200
RDMA/mlx5: Fix MR cache initialization error flow
[ Upstream commit a0130ef84b00c68ba0b79ee974a0f01459741421 ]
Destroy all previously created cache entries and work queue when rolling
back the MR cache initialization upon an error.
Fixes: 73d09b2fe833 ("RDMA/mlx5: Introduce mlx5r_cache_rb_key")
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/c41d525fb3c72e28dd38511bf3aaccb5d584063e.1741875692.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Guralnik <michaelgur@nvidia.com>
Date: Thu Mar 13 16:29:51 2025 +0200
RDMA/mlx5: Fix page_size variable overflow
[ Upstream commit f0c2427412b43cdf1b7b0944749ea17ddb97d5a5 ]
Change all variables storing mlx5_umem_mkc_find_best_pgsz() result to
unsigned long to support page sizes whose shift (log2) is larger than 31
and avoid overflow.
For example: If we try to register 4GB of memory that is contiguous in
physical memory, the driver will optimize the page_size and try to use
an mkey with 4GB entity size. The 'unsigned int' page_size variable will
overflow to '0' and we'll hit the WARN_ON() in alloc_cacheable_mr().
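The truncation can be reproduced outside the kernel. The following is a
userspace sketch only, not the driver code: the helper names are
hypothetical, and uint64_t stands in for the kernel's 64-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep the page size in a 32-bit 'unsigned int',
 * as the pre-fix variables did. A 4GB size (1 << 32) wraps to 0. */
static unsigned int store_pgsz_u32(unsigned int page_shift)
{
	assert(page_shift < 64);
	return (unsigned int)(UINT64_C(1) << page_shift);
}

/* Hypothetical helper: keep the page size in a 64-bit variable instead. */
static uint64_t store_pgsz_u64(unsigned int page_shift)
{
	assert(page_shift < 64);
	return UINT64_C(1) << page_shift;
}
```

With a shift of 32 the first helper returns 0, which is the value that
trips the WARN_ON(), while the second keeps the full 4GB size.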
WARNING: CPU: 2 PID: 1203 at drivers/infiniband/hw/mlx5/mr.c:1124 alloc_cacheable_mr+0x22/0x580 [mlx5_ib]
Modules linked in: mlx5_ib mlx5_core bonding ip6_gre ip6_tunnel tunnel6 ip_gre gre rdma_rxe rdma_ucm ib_uverbs ib_ipoib ib_umad rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm fuse ib_core [last unloaded: mlx5_core]
CPU: 2 UID: 70878 PID: 1203 Comm: rdma_resource_l Tainted: G W 6.14.0-rc4-dirty #43
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:alloc_cacheable_mr+0x22/0x580 [mlx5_ib]
Code: 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 41 52 53 48 83 ec 30 f6 46 28 04 4c 8b 77 08 75 21 <0f> 0b 49 c7 c2 ea ff ff ff 48 8d 65 d0 4c 89 d0 5b 41 5a 41 5c 41
RSP: 0018:ffffc900006ffac8 EFLAGS: 00010246
RAX: 0000000004c0d0d0 RBX: ffff888217a22000 RCX: 0000000000100001
RDX: 00007fb7ac480000 RSI: ffff8882037b1240 RDI: ffff8882046f0600
RBP: ffffc900006ffb28 R08: 0000000000000001 R09: 0000000000000000
R10: 00000000000007e0 R11: ffffea0008011d40 R12: ffff8882037b1240
R13: ffff8882046f0600 R14: ffff888217a22000 R15: ffffc900006ffe00
FS: 00007fb7ed013340(0000) GS:ffff88885fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb7ed1d8000 CR3: 00000001fd8f6006 CR4: 0000000000772eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0x81/0x130
? alloc_cacheable_mr+0x22/0x580 [mlx5_ib]
? report_bug+0xfc/0x1e0
? handle_bug+0x55/0x90
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? alloc_cacheable_mr+0x22/0x580 [mlx5_ib]
create_real_mr+0x54/0x150 [mlx5_ib]
ib_uverbs_reg_mr+0x17f/0x2a0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xca/0x140 [ib_uverbs]
ib_uverbs_run_method+0x6d0/0x780 [ib_uverbs]
? __pfx_ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x10/0x10 [ib_uverbs]
ib_uverbs_cmd_verbs+0x19b/0x360 [ib_uverbs]
? walk_system_ram_range+0x79/0xd0
? ___pte_offset_map+0x1b/0x110
? __pte_offset_map_lock+0x80/0x100
ib_uverbs_ioctl+0xac/0x110 [ib_uverbs]
__x64_sys_ioctl+0x94/0xb0
do_syscall_64+0x50/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fb7ecf0737b
Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 2a 0f 00 f7 d8 64 89 01 48
RSP: 002b:00007ffdbe03ecc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffdbe03edb8 RCX: 00007fb7ecf0737b
RDX: 00007ffdbe03eda0 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffdbe03ed80 R08: 00007fb7ecc84010 R09: 00007ffdbe03eed4
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffdbe03eed4
R13: 000000000000000c R14: 000000000000000c R15: 00007fb7ecc84150
</TASK>
Fixes: cef7dde8836a ("net/mlx5: Expand mkey page size to support 6 bits")
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/2479a4a3f6fd9bd032e1b6d396274a89c4c5e22f.1741875692.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Frieder Schrempf <frieder.schrempf@kontron.de>
Date: Wed Dec 18 16:27:28 2024 +0100
regulator: pca9450: Fix enable register for LDO5
[ Upstream commit f5aab0438ef17f01c5ecd25e61ae6a03f82a4586 ]
The LDO5 regulator has two configuration registers, but only
LDO5CTRL_L contains the bits for enabling/disabling the regulator.
Fixes: 0935ff5f1f0a ("regulator: pca9450: add pca9450 pmic driver")
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://patch.msgid.link/20241218152842.97483-6-frieder@fris.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Wed Mar 19 18:01:05 2025 +0800
remoteproc: core: Clear table_sz when rproc_shutdown
[ Upstream commit efdde3d73ab25cef4ff2d06783b0aad8b093c0e4 ]
The following case can trigger a kernel dump:
Use U-Boot to start the remote processor (rproc) with a resource table
published to a fixed address by rproc. After the kernel boots up,
stop the rproc, load a new firmware which doesn't have a resource table,
and start the rproc.
When starting rproc with a firmware that has no resource table,
`memcpy(loaded_table, rproc->cached_table, rproc->table_sz)` will
trigger the dump, because rproc->cached_table was set to NULL during the
last stop operation, but rproc->table_sz still holds the old size.
This issue was found on i.MX8MP and i.MX9.
The dump is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000010af63000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 UID: 0 PID: 1060 Comm: sh Not tainted 6.14.0-rc7-next-20250317-dirty #38
Hardware name: NXP i.MX8MPlus EVK board (DT)
pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __pi_memcpy_generic+0x110/0x22c
lr : rproc_start+0x88/0x1e0
Call trace:
__pi_memcpy_generic+0x110/0x22c (P)
rproc_boot+0x198/0x57c
state_store+0x40/0x104
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x7c/0x94
kernfs_fop_write_iter+0x120/0x1cc
vfs_write+0x240/0x378
ksys_write+0x70/0x108
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x48/0x10c
el0_svc_common.constprop.0+0xc0/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xcc
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x198/0x19c
Clear rproc->table_sz to address the issue.
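The stale-size problem can be modelled in userspace. This is a minimal
sketch, not the remoteproc core: struct fake_rproc and both functions are
hypothetical stand-ins that only mirror the two fields named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two struct rproc fields discussed above. */
struct fake_rproc {
	void *cached_table;
	size_t table_sz;
};

/* Model of the stop path: dropping the cached table must also reset its
 * size, otherwise a later start sees a NULL pointer with a non-zero size. */
static void fake_rproc_shutdown(struct fake_rproc *r)
{
	free(r->cached_table);
	r->cached_table = NULL;
	r->table_sz = 0;	/* the step this model adds, as in the fix */
}

/* Model of the start path: copy table_sz bytes into the loaded table and
 * return the number of bytes copied. */
static size_t fake_rproc_start(const struct fake_rproc *r, void *loaded_table)
{
	/* A stale size combined with a NULL source is the reported crash. */
	assert(r->table_sz == 0 || r->cached_table != NULL);
	if (r->table_sz)
		memcpy(loaded_table, r->cached_table, r->table_sz);
	return r->table_sz;
}
```

Without the table_sz reset in the shutdown model, the second start would
pass a NULL source and a non-zero length to memcpy().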
Fixes: 9dc9507f1880 ("remoteproc: Properly deal with the resource table when detaching")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250319100106.3622619-1-peng.fan@oss.nxp.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca.weiss@fairphone.com>
Date: Fri Mar 14 09:24:31 2025 +0100
remoteproc: qcom: pas: add minidump_id to SC7280 WPSS
[ Upstream commit d2909538bff0189d4d038f4e903c70be5f5c2bfc ]
Add the minidump ID to the wpss resources, based on msm-5.4 devicetree.
Fixes: 300ed425dfa9 ("remoteproc: qcom_q6v5_pas: Add SC7280 ADSP, CDSP & WPSS")
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/r/20250314-sc7280-wpss-minidump-v1-1-d869d53fd432@fairphone.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca@lucaweiss.eu>
Date: Mon Feb 17 23:05:18 2025 +0100
remoteproc: qcom_q6v5_mss: Handle platforms with one power domain
[ Upstream commit 4641840341f37dc8231e0840ec1514b4061b4322 ]
For example, MSM8974 has the mx voltage rail exposed as a regulator and
only the cx voltage rail is exposed as a power domain. This power domain
(cx) is attached internally by the power domain code and cannot be
attached in this driver.
Fixes: 8750cf392394 ("remoteproc: qcom_q6v5_mss: Allow replacing regulators with power domains")
Co-developed-by: Matti Lehtimäki <matti.lehtimaki@gmail.com>
Signed-off-by: Matti Lehtimäki <matti.lehtimaki@gmail.com>
Reviewed-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Luca Weiss <luca@lucaweiss.eu>
Link: https://lore.kernel.org/r/20250217-msm8226-modem-v5-4-2bc74b80e0ae@lucaweiss.eu
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca@lucaweiss.eu>
Date: Tue Jan 28 22:54:00 2025 +0100
remoteproc: qcom_q6v5_pas: Make single-PD handling more robust
[ Upstream commit e917b73234b02aa4966325e7380d2559bf127ba9 ]
Only go into the if condition for single-PD handling when there's
actually just one power domain specified there. Otherwise it'll be an
issue in the dts and we should fail in the regular code path.
This also mirrors the latest changes in the qcom_q6v5_mss driver.
Suggested-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Fixes: 17ee2fb4e856 ("remoteproc: qcom: pas: Vote for active/proxy power domains")
Signed-off-by: Luca Weiss <luca@lucaweiss.eu>
Reviewed-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Link: https://lore.kernel.org/r/20250128-pas-singlepd-v1-2-85d9ae4b0093@lucaweiss.eu
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca@lucaweiss.eu>
Date: Tue Jan 28 22:53:59 2025 +0100
remoteproc: qcom_q6v5_pas: Use resource with CX PD for MSM8226
[ Upstream commit ba785ff4162a65f18ed501019637a998b752b5ad ]
MSM8226 requires the CX power domain, so use the msm8996_adsp_resource
which has cx under proxy_pd_names and is otherwise equivalent.
Suggested-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Fixes: fb4f07cc9399 ("remoteproc: qcom: pas: Add MSM8226 ADSP support")
Signed-off-by: Luca Weiss <luca@lucaweiss.eu>
Reviewed-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Link: https://lore.kernel.org/r/20250128-pas-singlepd-v1-1-85d9ae4b0093@lucaweiss.eu
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Candice Li <candice.li@amd.com>
Date: Wed Mar 26 13:41:01 2025 +0800
Remove unnecessary firmware version check for gc v9_4_2
commit 5b3c08ae9ed324743f5f7286940d45caeb656e6e upstream.
GC v9_4_2 uses a new versioning scheme for CP firmware, making
the warning ("CP firmware version too old, please update!") irrelevant.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Feng Yang <yangfeng@kylinos.cn>
Date: Sun Feb 23 15:01:06 2025 +0800
ring-buffer: Fix bytes_dropped calculation issue
[ Upstream commit c73f0b69648501978e8b3e8fa7eef7f4197d0481 ]
The calculation of bytes_dropped and bytes_dropped_nested is reversed.
Although it does not affect the final calculation of total_dropped,
it should still be fixed.
Link: https://lore.kernel.org/20250223070106.6781-1-yangfeng59949@163.com
Fixes: 6c43e554a2a5 ("ring-buffer: Add ring buffer startup selftest")
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Palmer Dabbelt <palmer@rivosinc.com>
Date: Wed Mar 26 15:45:07 2025 -0700
RISC-V: errata: Use medany for relocatable builds
[ Upstream commit bb58e1579f431d42469b6aed0f03eff383ba6db5 ]
We're trying to mix non-PIC/PIE objects into the otherwise-PIE
relocatable kernels, to avoid GOT/PLT references during early boot
alternative resolution (which happens before the GOT/PLT are set up).
riscv64-unknown-linux-gnu-ld: arch/riscv/errata/sifive/errata.o: relocation R_RISCV_HI20 against `tlb_flush_all_threshold' can not be used when making a shared object; recompile with -fPIC
riscv64-unknown-linux-gnu-ld: arch/riscv/errata/thead/errata.o: relocation R_RISCV_HI20 against `riscv_cbom_block_size' can not be used when making a shared object; recompile with -fPIC
Fixes: 8dc2a7e8027f ("riscv: Fix relocatable kernels with early alternatives using -fno-pie")
Link: https://lore.kernel.org/r/20250326224506.27165-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Atish Patra <atishp@rivosinc.com>
Date: Mon Mar 3 14:53:06 2025 -0800
RISC-V: KVM: Disable the kernel perf counter during configure
[ Upstream commit bbb622488749478955485765ddff9d56be4a7e4b ]
The perf event should be marked disabled during creation, as it is not
ready to be scheduled until there is an SBI PMU start call or config
matching is called with auto start. Otherwise, event add/start gets
called during the perf_event_create_kernel_counter() function.
The event will be enabled and scheduled to run via perf_event_enable()
in either of the above-mentioned scenarios.
Fixes: 0cb74b65d2e5 ("RISC-V: KVM: Implement perf support without sampling")
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-1-41d177e45929@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yao Zi <ziyao@disroot.org>
Date: Wed Mar 26 05:14:46 2025 +0000
riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
[ Upstream commit 28093cfef5dd62f4cbd537f2bdf6f0bf85309c45 ]
Commit 58ff537109ac ("riscv: Omit optimized string routines when
using KASAN") introduced calls to EXPORT_SYMBOL() in assembly string
routines, which result in R_RISCV_64 relocations against the
.export_symbol section. As these routines are reused by the RISC-V
purgatory and our relocator doesn't recognize these relocations,
kexec-file-load fails with dmesg output like
[ 11.344251] kexec_image: Unknown rela relocation: 2
[ 11.345972] kexec_image: Error loading purgatory ret=-8
Let's support the R_RISCV_64 relocation to fix kexec on 64-bit RISC-V.
The 32-bit variant isn't covered since KEXEC_FILE and KEXEC_PURGATORY
aren't available there.
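For illustration, handling an absolute 64-bit relocation amounts to storing
the computed value at the target location. The sketch below is a simplified
userspace model, not the kernel's purgatory relocator: apply_rela() and its
return values are hypothetical, and only the relocation number 2 is taken
from the log above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define R_RISCV_64 2	/* ELF relocation type; the "2" in the dmesg line */

/* Hypothetical, simplified relocator step. 'loc' is the address to patch
 * and 'val' is the symbol value plus addend. Returns 0 on success and -1
 * for a relocation type this sketch does not handle. */
static int apply_rela(unsigned int type, void *loc, uint64_t val)
{
	assert(loc != NULL);
	switch (type) {
	case R_RISCV_64:
		/* Absolute 64-bit relocation: write the value as-is. */
		memcpy(loc, &val, sizeof(val));
		return 0;
	default:
		return -1;
	}
}
```

Before the fix, type 2 would have fallen into the equivalent of the default
branch, which is what produced the "Unknown rela relocation" message.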
Fixes: 58ff537109ac ("riscv: Omit optimized string routines when using KASAN")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20250326051445.55131-2-ziyao@disroot.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Björn Töpel <bjorn@rivosinc.com>
Date: Fri Mar 28 09:53:11 2025 +0100
riscv/purgatory: 4B align purgatory_start
[ Upstream commit 3f7023171df43641a8a8a1c9a12124501e589010 ]
When a crashkernel is launched on RISC-V, the entry to purgatory is
done by trapping via the stvec CSR. From riscv_kexec_norelocate():
| ...
| /*
| * Switch to physical addressing
| * This will also trigger a jump to CSR_STVEC
| * which in this case is the address of the new
| * kernel.
| */
| csrw CSR_STVEC, a2
| csrw CSR_SATP, zero
stvec requires that the address is 4B aligned, which was not the case,
e.g.:
| Loaded purgatory at 0xffffc000
| kexec_file: kexec_file_load: type:1, start:0xffffd232 head:0x4 flags:0x6
The address 0xffffd232 is not 4B aligned.
Correct by adding proper function alignment.
With this change, crashkernels loaded with kexec-file will be able to
properly enter the purgatory.
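The alignment requirement can be checked with a mask on the low address
bits. A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when 'addr' is a multiple of 'align'; 'align' must be a power of
 * two so that (align - 1) is a mask of the low bits. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}
```

For the start address in the log, 0xffffd232 & 3 is 2, so the check fails,
whereas the load address 0xffffc000 passes it.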
Fixes: 736e30af583fb ("RISC-V: Add purgatory")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250328085313.1193815-1-bjorn@kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date: Mon Mar 17 08:25:51 2025 +0100
riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
[ Upstream commit 83d78ac677b9fdd8ea763507c6fe02d6bf415f3a ]
Ryan sent a fix [1] for arm64 that applies to riscv too: in some hugetlb
functions, we must not use the pte value to get the size of a mapping
because the pte may not be present.
So use the already present size parameter for huge_pte_clear() and the
newly introduced size parameter for huge_ptep_get_and_clear(). And make
sure to gather A/D bits only on present ptes.
Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
Link: https://lore.kernel.org/all/20250217140419.1702389-1-ryan.roberts@arm.com/ [1]
Link: https://lore.kernel.org/r/20250317072551.572169-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Juhan Jin <juhan.jin@foxmail.com>
Date: Thu Feb 6 13:28:36 2025 -0600
riscv: ftrace: Add parentheses in macro definitions of make_call_t0 and make_call_ra
[ Upstream commit 5f1a58ed91a040d4625d854f9bb3dd4995919202 ]
This patch adds parentheses to parameters caller and callee of macros
make_call_t0 and make_call_ra. Every existing invocation of these two
macros uses a single variable for each argument, so the absence of the
parentheses seems okay. However, future invocations might use more
complex expressions as arguments. For example, a future invocation might
look like this: make_call_t0(a - b, c, call). Without parentheses in the
macro definition, the macro invocation expands to:
...
unsigned int offset = (unsigned long) c - (unsigned long) a - b;
...
which is clearly wrong.
The use of parentheses ensures arguments are correctly evaluated and
potentially saves future users of make_call_t0 and make_call_ra debugging
trouble.
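The effect is easy to reproduce with a reduced pair of macros. These are
hypothetical reductions of the offset computation quoted above, not the
kernel's make_call_t0()/make_call_ra() definitions:

```c
#include <assert.h>

/* Parameters used bare: a cast binds only to the first token of an
 * expression argument. */
#define OFFSET_UNSAFE(caller, callee) \
	((unsigned long) callee - (unsigned long) caller)

/* Parameters parenthesized: each argument is evaluated as a unit. */
#define OFFSET_SAFE(caller, callee) \
	((unsigned long) (callee) - (unsigned long) (caller))

static unsigned long offset_unsafe(unsigned long a, unsigned long b,
				   unsigned long c)
{
	/* Expands to c - a - b. */
	return OFFSET_UNSAFE(a - b, c);
}

static unsigned long offset_safe(unsigned long a, unsigned long b,
				 unsigned long c)
{
	/* Expands to c - (a - b), as intended. */
	assert(a >= b);	/* keep the illustration free of wraparound */
	return OFFSET_SAFE(a - b, c);
}
```

With a = 100, b = 40 and c = 500 the unparenthesized form yields 360 while
the parenthesized form yields the intended 440.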
Fixes: 6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half")
Signed-off-by: Juhan Jin <juhan.jin@foxmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/tencent_AE90AA59903A628E87E9F80E563DA5BA5508@qq.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lubomir Rintel <lkundrak@v3.sk>
Date: Tue Mar 25 10:58:41 2025 +0100
rndis_host: Flag RNDIS modems as WWAN devices
[ Upstream commit 67d1a8956d2d62fe6b4c13ebabb57806098511d8 ]
Set FLAG_WWAN instead of FLAG_ETHERNET for RNDIS interfaces on Mobile
Broadband Modems, as opposed to regular Ethernet adapters.
Otherwise NetworkManager gets confused, misjudges the device type,
and doesn't know it should connect a modem to get the device to work.
The result depends on the ModemManager version -- an older
ModemManager would end up disconnecting a device after an unsuccessful
probe attempt (if it connected without needing to unlock a SIM), while
a newer one might spawn a separate PPP connection over a tty interface
instead, resulting in general confusion and no end of chaos.
The only way to get this to work reliably is to fix the device type
and have a good enough version of ModemManager (or equivalent).
Fixes: 63ba395cd7a5 ("rndis_host: support Novatel Verizon USB730L")
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://patch.msgid.link/20250325095842.1567999-1-lkundrak@v3.sk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Zhang <markzhang@nvidia.com>
Date: Tue Mar 25 11:02:26 2025 +0200
rtnetlink: Allocate vfinfo size for VF GUIDs when supported
[ Upstream commit 23f00807619d15063d676218f36c5dfeda1eb420 ]
Commit 30aad41721e0 ("net/core: Add support for getting VF GUIDs")
added support for getting VF port and node GUIDs in netlink ifinfo
messages, but their size was not taken into consideration in the
function that allocates the netlink message, causing the following
warning when a netlink message is filled with many VF port and node
GUIDs:
# echo 64 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
# ip link show dev ib0
RTNETLINK answers: Message too long
Cannot send link get request: Message too long
Kernel warning:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1930 at net/core/rtnetlink.c:4151 rtnl_getlink+0x586/0x5a0
Modules linked in: xt_conntrack xt_MASQUERADE nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay mlx5_ib macsec mlx5_core tls rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm iw_cm ib_ipoib fuse ib_cm ib_core
CPU: 2 UID: 0 PID: 1930 Comm: ip Not tainted 6.14.0-rc2+ #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:rtnl_getlink+0x586/0x5a0
Code: cb 82 e8 3d af 0a 00 4d 85 ff 0f 84 08 ff ff ff 4c 89 ff 41 be ea ff ff ff e8 66 63 5b ff 49 c7 07 80 4f cb 82 e9 36 fc ff ff <0f> 0b e9 16 fe ff ff e8 de a0 56 00 66 66 2e 0f 1f 84 00 00 00 00
RSP: 0018:ffff888113557348 EFLAGS: 00010246
RAX: 00000000ffffffa6 RBX: ffff88817e87aa34 RCX: dffffc0000000000
RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffff88817e87afb8
RBP: 0000000000000009 R08: ffffffff821f44aa R09: 0000000000000000
R10: ffff8881260f79a8 R11: ffff88817e87af00 R12: ffff88817e87aa00
R13: ffffffff8563d300 R14: 00000000ffffffa6 R15: 00000000ffffffff
FS: 00007f63a5dbf280(0000) GS:ffff88881ee00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f63a5ba4493 CR3: 00000001700fe002 CR4: 0000000000772eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0xa5/0x230
? rtnl_getlink+0x586/0x5a0
? report_bug+0x22d/0x240
? handle_bug+0x53/0xa0
? exc_invalid_op+0x14/0x50
? asm_exc_invalid_op+0x16/0x20
? skb_trim+0x6a/0x80
? rtnl_getlink+0x586/0x5a0
? __pfx_rtnl_getlink+0x10/0x10
? rtnetlink_rcv_msg+0x1e5/0x860
? __pfx___mutex_lock+0x10/0x10
? rcu_is_watching+0x34/0x60
? __pfx_lock_acquire+0x10/0x10
? stack_trace_save+0x90/0xd0
? filter_irq_stacks+0x1d/0x70
? kasan_save_stack+0x30/0x40
? kasan_save_stack+0x20/0x40
? kasan_save_track+0x10/0x30
rtnetlink_rcv_msg+0x21c/0x860
? entry_SYSCALL_64_after_hwframe+0x76/0x7e
? __pfx_rtnetlink_rcv_msg+0x10/0x10
? arch_stack_walk+0x9e/0xf0
? rcu_is_watching+0x34/0x60
? lock_acquire+0xd5/0x410
? rcu_is_watching+0x34/0x60
netlink_rcv_skb+0xe0/0x210
? __pfx_rtnetlink_rcv_msg+0x10/0x10
? __pfx_netlink_rcv_skb+0x10/0x10
? rcu_is_watching+0x34/0x60
? __pfx___netlink_lookup+0x10/0x10
? lock_release+0x62/0x200
? netlink_deliver_tap+0xfd/0x290
? rcu_is_watching+0x34/0x60
? lock_release+0x62/0x200
? netlink_deliver_tap+0x95/0x290
netlink_unicast+0x31f/0x480
? __pfx_netlink_unicast+0x10/0x10
? rcu_is_watching+0x34/0x60
? lock_acquire+0xd5/0x410
netlink_sendmsg+0x369/0x660
? lock_release+0x62/0x200
? __pfx_netlink_sendmsg+0x10/0x10
? import_ubuf+0xb9/0xf0
? __import_iovec+0x254/0x2b0
? lock_release+0x62/0x200
? __pfx_netlink_sendmsg+0x10/0x10
____sys_sendmsg+0x559/0x5a0
? __pfx_____sys_sendmsg+0x10/0x10
? __pfx_copy_msghdr_from_user+0x10/0x10
? rcu_is_watching+0x34/0x60
? do_read_fault+0x213/0x4a0
? rcu_is_watching+0x34/0x60
___sys_sendmsg+0xe4/0x150
? __pfx____sys_sendmsg+0x10/0x10
? do_fault+0x2cc/0x6f0
? handle_pte_fault+0x2e3/0x3d0
? __pfx_handle_pte_fault+0x10/0x10
? preempt_count_sub+0x14/0xc0
? __down_read_trylock+0x150/0x270
? __handle_mm_fault+0x404/0x8e0
? __pfx___handle_mm_fault+0x10/0x10
? lock_release+0x62/0x200
? __rcu_read_unlock+0x65/0x90
? rcu_is_watching+0x34/0x60
__sys_sendmsg+0xd5/0x150
? __pfx___sys_sendmsg+0x10/0x10
? __up_read+0x192/0x480
? lock_release+0x62/0x200
? __rcu_read_unlock+0x65/0x90
? rcu_is_watching+0x34/0x60
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f63a5b13367
Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
RSP: 002b:00007fff8c726bc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000067b687c2 RCX: 00007f63a5b13367
RDX: 0000000000000000 RSI: 00007fff8c726c30 RDI: 0000000000000004
RBP: 00007fff8c726cb8 R08: 0000000000000000 R09: 0000000000000034
R10: 00007fff8c726c7c R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 00007fff8c726cd0 R15: 00007fff8c726cd0
</TASK>
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff813f9e58>] copy_process+0xd08/0x2830
softirqs last enabled at (0): [<ffffffff813f9e58>] copy_process+0xd08/0x2830
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---
Thus, when calculating the ifinfo message size, take the VF GUID sizes
into account when supported.
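As a rough illustration of how much space was unaccounted for, the sketch
below models netlink attribute sizing in userspace. It is not the rtnetlink
code: the sk_ names are hypothetical, and it assumes each GUID is carried as
one attribute whose payload size is passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NLA_ALIGNTO 4u	/* netlink attributes are padded to 4 bytes */
#define SK_NLA_HDRLEN  4u	/* 2-byte length plus 2-byte type */

/* Size of one attribute: header plus payload, rounded up to the alignment. */
static size_t sk_nla_total_size(size_t payload)
{
	return (SK_NLA_HDRLEN + payload + SK_NLA_ALIGNTO - 1) &
	       ~(size_t)(SK_NLA_ALIGNTO - 1);
}

/* Extra bytes needed for the node and port GUID of every VF; zero when
 * the device does not report GUIDs. */
static size_t sk_vf_guid_bytes(unsigned int num_vfs, int guids_supported,
			       size_t guid_payload)
{
	assert(guid_payload > 0);
	if (!guids_supported)
		return 0;
	return (size_t)num_vfs * 2 * sk_nla_total_size(guid_payload);
}
```

Under the assumption of a 16-byte payload per GUID attribute, 64 VFs need
2560 additional bytes, which is the kind of shortfall that produces the
"Message too long" error above.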
Fixes: 30aad41721e0 ("net/core: Add support for getting VF GUIDs")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20250325090226.749730-1-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: WANG Rui <wangrui@loongson.cn>
Date: Sun Mar 30 16:30:20 2025 +0800
rust: Fix enabling Rust and building with GCC for LoongArch
commit 13c23cb4ed09466d73f1beae8956810b95add6ef upstream.
This patch fixes a build issue on LoongArch when Rust is enabled and
compiled with GCC by explicitly setting the bindgen target and skipping
C flags that Clang doesn't support.
Cc: stable@vger.kernel.org
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alice Ryhl <aliceryhl@google.com>
Date: Mon Mar 3 08:45:12 2025 +0000
rust: fix signature of rust_fmt_argument
[ Upstream commit 901b3290bd4dc35e613d13abd03c129e754dd3dd ]
Without this change, the rest of this series will emit the following
error message:
error[E0308]: `if` and `else` have incompatible types
--> <linux>/rust/kernel/print.rs:22:22
|
21 | #[export]
| --------- expected because of this
22 | unsafe extern "C" fn rust_fmt_argument(
| ^^^^^^^^^^^^^^^^^ expected `u8`, found `i8`
|
= note: expected fn item `unsafe extern "C" fn(*mut u8, *mut u8, *mut c_void) -> *mut u8 {bindings::rust_fmt_argument}`
found fn item `unsafe extern "C" fn(*mut i8, *mut i8, *const c_void) -> *mut i8 {print::rust_fmt_argument}`
The error may be different depending on the architecture.
To fix this, change the void pointer argument to use a const pointer,
and change the imports to use crate::ffi instead of core::ffi for
integer types.
Fixes: 787983da7718 ("vsprintf: add new `%pA` format specifier")
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250303-export-macro-v3-1-41fbad85a27f@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sven Schnelle <svens@linux.ibm.com>
Date: Thu Mar 20 13:25:38 2025 +0100
s390/entry: Fix setting _CIF_MCCK_GUEST with lowcore relocation
[ Upstream commit 121df45b37a1016ee6828c2ca3ba825f3e18a8c1 ]
When lowcore relocation is enabled, the machine check handler doesn't
use the lowcore address when setting _CIF_MCCK_GUEST. Fix this by
adding the missing base register.
Fixes: 0001b7bbc53a ("s390/entry: Make mchk_int_handler() ready for lowcore relocation")
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niklas Schnelle <schnelle@linux.ibm.com>
Date: Fri Feb 21 12:51:48 2025 +0100
s390: Remove ioremap_wt() and pgprot_writethrough()
[ Upstream commit c94bff63e49302d4ce36502a85a2710a67332a4f ]
It turns out that while the s390 architecture calls its memory-I/O mapping
variants write-through and write-back, the implementation of ioremap_wt()
and pgprot_writethrough() does not match the Linux notion of ioremap_wt().
In particular Linux expects ioremap_wt() to be weaker still than
ioremap_wc(), allowing not just gathering and re-ordering but also reads
to be served from cache. Instead s390's implementation is equivalent to
normal ioremap() while its ioremap_wc() allows re-ordering.
Note that there are no known users of ioremap_wt() on s390 and the
resulting behavior is in line with asm-generic defining ioremap_wt() as
ioremap(), if undefined, so no breakage is expected.
As s390 does not have a mapping type matching the Linux notion of
ioremap_wt() and pgprot_writethrough(), simply drop them and rely on the
asm-generic fallbacks instead.
Fixes: b02002cc4c0f ("s390/pci: Implement ioremap_wc/prot() with MIO")
Fixes: b43b3fff042d ("s390: mm: convert to GENERIC_IOREMAP")
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shrikanth Hegde <sshegde@linux.ibm.com>
Date: Thu Mar 6 10:59:53 2025 +0530
sched/deadline: Use online cpus for validating runtime
[ Upstream commit 14672f059d83f591afb2ee1fff56858efe055e5a ]
The ftrace selftest reported a failure because writing -1 to
sched_rt_runtime_us returns -EBUSY. This happens when the possible
CPUs are different from active CPUs.
Active CPUs are part of one root domain, while the remaining CPUs are part
of def_root_domain. Since the active cpumask is being used, this results in
cpus=0 when a non-active CPU is used in the loop.
Fix it by looping over the online CPUs instead for validating the
bandwidth calculations.
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/r/20250306052954.452005-2-sshegde@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tianchen Ding <dtcccc@linux.alibaba.com>
Date: Tue Feb 11 14:36:59 2025 +0800
sched/eevdf: Force propagating min_slice of cfs_rq when {en,de}queue tasks
[ Upstream commit 563bc2161b94571ea425bbe2cf69fd38e24cdedf ]
When a task is enqueued and its parent cgroup se is already on_rq, this
parent cgroup se will not be enqueued again, and hence root->min_slice
is left unchanged. The same issue happens when a task is dequeued and its
parent cgroup se has other runnable entities, so the parent cgroup se
will not be dequeued.
Force propagating min_slice when se doesn't need to be enqueued or
dequeued. Ensure the se hierarchy always gets the latest min_slice.
Fixes: aef6987d8954 ("sched/eevdf: Propagate min_slice up the cgroup hierarchy")
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250211063659.7180-1-dtcccc@linux.alibaba.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 31 21:26:44 2025 -0700
sched/smt: Always inline sched_smt_active()
[ Upstream commit 09f37f2d7b21ff35b8b533f9ab8cfad2fe8f72f6 ]
sched_smt_active() can be called from noinstr code, so it should always
be inlined. The CONFIG_SCHED_SMT version already has __always_inline.
Do the same for its !CONFIG_SCHED_SMT counterpart.
Fixes the following warning:
vmlinux.o: error: objtool: intel_idle_ibrs+0x13: call to sched_smt_active() leaves .noinstr.text section
Fixes: 321a874a7ef8 ("sched/smt: Expose sched_smt_present static key")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/1d03907b0a247cf7fb5c1d518de378864f603060.1743481539.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/r/202503311434.lyw2Tveh-lkp@intel.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: zihan zhou <15645113830zzh@gmail.com>
Date: Sat Feb 8 16:08:52 2025 +0800
sched: Cancel the slice protection of the idle entity
[ Upstream commit f553741ac8c0e467a3b873e305f34b902e50b86d ]
A waking non-idle entity should be able to preempt an idle entity at any
time, but because of the slice protection of the idle entity, the non-idle
entity has to wait, so just cancel that protection.
This patch is aimed at minimizing the impact of SCHED_IDLE on
SCHED_NORMAL. For example, a task with SCHED_IDLE policy that sleeps for
1s and then runs for 3 ms, running cyclictest on the same cpu, has a
maximum latency of 3 ms, which is caused by the slice protection of the
idle entity. This is unreasonable. With this patch, the cyclictest latency
under the same conditions is basically the same on a cpu with idle
processes and on an empty cpu.
[peterz: add helpers]
Fixes: 63304558ba5d ("sched/eevdf: Curb wakeup-preemption")
Signed-off-by: zihan zhou <15645113830zzh@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20250208080850.16300-1-15645113830zzh@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Mon Mar 31 09:15:32 2025 +0000
sctp: add mutual exclusion in proc_sctp_do_udp_port()
[ Upstream commit 10206302af856791fbcc27a33ed3c3eb09b2793d ]
We must serialize calls to sctp_udp_sock_stop() and sctp_udp_sock_start()
or risk a crash as syzbot reported:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000000d: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
CPU: 1 UID: 0 PID: 6551 Comm: syz.1.44 Not tainted 6.14.0-syzkaller-g7f2ff7b62617 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3653
Call Trace:
<TASK>
udp_tunnel_sock_release+0x68/0x80 net/ipv4/udp_tunnel_core.c:181
sctp_udp_sock_stop+0x71/0x160 net/sctp/protocol.c:930
proc_sctp_do_udp_port+0x264/0x450 net/sctp/sysctl.c:553
proc_sys_call_handler+0x3d0/0x5b0 fs/proc/proc_sysctl.c:601
iter_file_splice_write+0x91c/0x1150 fs/splice.c:738
do_splice_from fs/splice.c:935 [inline]
direct_splice_actor+0x18f/0x6c0 fs/splice.c:1158
splice_direct_to_actor+0x342/0xa30 fs/splice.c:1102
do_splice_direct_actor fs/splice.c:1201 [inline]
do_splice_direct+0x174/0x240 fs/splice.c:1227
do_sendfile+0xafd/0xe50 fs/read_write.c:1368
__do_sys_sendfile64 fs/read_write.c:1429 [inline]
__se_sys_sendfile64 fs/read_write.c:1415 [inline]
__x64_sys_sendfile64+0x1d8/0x220 fs/read_write.c:1415
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
Fixes: 046c052b475e ("sctp: enable udp tunneling socks")
Reported-by: syzbot+fae49d997eb56fa7c74d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67ea5c01.050a0220.1547ec.012b.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250331091532.224982-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tengda Wu <wutengda@huaweicloud.com>
Date: Wed Jan 22 10:28:38 2025 +0800
selftests/bpf: Fix freplace_link segfault in tailcalls prog test
[ Upstream commit a63a631c9b5cb25a1c17dd2cb18c63df91e978b1 ]
There are two bpf_link__destroy(freplace_link) calls in
test_tailcall_bpf2bpf_freplace(). After the first bpf_link__destroy()
is called, if the following bpf_map_{update,delete}_elem() throws an
exception, it will jump to the "out" label and call bpf_link__destroy()
again, causing double free and eventually leading to a segfault.
Fix it by directly resetting freplace_link to NULL after the first
bpf_link__destroy() call.
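The pattern can be sketched in plain userspace C. This is an illustration
under stated assumptions, not the selftest code: `struct link`,
`link_destroy()`, `run_test()` and `destroy_calls` are hypothetical
stand-ins, and the only property borrowed from bpf_link__destroy() is that
it does nothing when handed NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_link; not the libbpf type. */
struct link {
	int fd;
};

static int destroy_calls;	/* how many objects were actually freed */

/* Like bpf_link__destroy(), accept NULL and do nothing in that case. */
static void link_destroy(struct link *l)
{
	if (!l)
		return;
	destroy_calls++;
	free(l);
}

/* Returns 0 on success, -1 if the (simulated) map update fails. */
static int run_test(int map_update_fails)
{
	struct link *freplace_link = malloc(sizeof(*freplace_link));
	int err = -1;

	assert(freplace_link);	/* sketch: treat allocation failure as fatal */

	link_destroy(freplace_link);
	freplace_link = NULL;	/* the fix: forget the freed pointer */

	if (map_update_fails)
		goto out;
	err = 0;
out:
	link_destroy(freplace_link);	/* now a harmless no-op */
	return err;
}
```

Without the reset to NULL, the call under the "out" label would free the
same object a second time whenever the simulated map update fails.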
Fixes: 021611d33e78 ("selftests/bpf: Add test to verify tailcall and freplace restrictions")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/bpf/20250122022838.1079157-1-wutengda@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Viktor Malik <vmalik@redhat.com>
Date: Thu Mar 13 13:28:52 2025 +0100
selftests/bpf: Fix string read in strncmp benchmark
[ Upstream commit de07b182899227d5fd1ca7a1a7d495ecd453d49c ]
The strncmp benchmark uses the bpf_strncmp helper and a hand-written
loop to compare two strings. The values of the strings are filled from
userspace. One of the strings is non-const (in .bss) while the other is
const (in .rodata) since that is the requirement of bpf_strncmp.
The problem is that in the hand-written loop, Clang optimizes the reads
from the const string to always return 0 which breaks the benchmark.
Use barrier_var to prevent the optimization.
The effect can be seen on the strncmp-no-helper variant.
Before this change:
# ./bench strncmp-no-helper
Setting up benchmark 'strncmp-no-helper'...
Benchmark 'strncmp-no-helper' started.
Iter 0 (112.309us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Iter 1 (-23.238us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Iter 2 ( 58.994us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Iter 3 (-30.466us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Iter 4 ( 29.996us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Iter 5 ( 16.949us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Iter 6 (-60.035us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s
Summary: hits 0.000 ± 0.000M/s ( 0.000M/prod), drops 0.000 ± 0.000M/s, total operations 0.000 ± 0.000M/s
After this change:
# ./bench strncmp-no-helper
Setting up benchmark 'strncmp-no-helper'...
Benchmark 'strncmp-no-helper' started.
Iter 0 ( 77.711us): hits 5.534M/s ( 5.534M/prod), drops 0.000M/s, total operations 5.534M/s
Iter 1 ( 11.215us): hits 6.006M/s ( 6.006M/prod), drops 0.000M/s, total operations 6.006M/s
Iter 2 (-14.253us): hits 5.931M/s ( 5.931M/prod), drops 0.000M/s, total operations 5.931M/s
Iter 3 ( 59.087us): hits 6.005M/s ( 6.005M/prod), drops 0.000M/s, total operations 6.005M/s
Iter 4 (-21.379us): hits 6.010M/s ( 6.010M/prod), drops 0.000M/s, total operations 6.010M/s
Iter 5 (-20.310us): hits 5.861M/s ( 5.861M/prod), drops 0.000M/s, total operations 5.861M/s
Iter 6 ( 53.937us): hits 6.004M/s ( 6.004M/prod), drops 0.000M/s, total operations 6.004M/s
Summary: hits 5.969 ± 0.061M/s ( 5.969M/prod), drops 0.000 ± 0.000M/s, total operations 5.969 ± 0.061M/s
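A userspace sketch of the idea follows. It is not the benchmark source:
`cmp_no_helper()` is a hypothetical name, the `barrier_var()` definition is
assumed to match the one in libbpf's bpf_helpers.h, and the exact variable
the real patch applies the barrier to may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror libbpf's barrier_var(): an empty asm statement that
 * makes the compiler treat 'var' as modified, so it cannot rely on what
 * it believes it knows about the value. */
#define barrier_var(var) __asm__ __volatile__("" : "+r"(var))

/* Hand-written comparison loop; returns <0, 0 or >0 like strncmp(). */
static int cmp_no_helper(const char *s1, const char *s2, size_t n)
{
	size_t i;

	assert(s1 && s2);

	/* Hide where s2 points so the loads from it are really performed. */
	barrier_var(s2);

	for (i = 0; i < n; i++) {
		if (s1[i] != s2[i])
			return (unsigned char)s1[i] - (unsigned char)s2[i];
		if (!s1[i])
			break;
	}
	return 0;
}
```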
Fixes: 9c42652f8be3 ("selftests/bpf: Add benchmark for bpf_strncmp() helper")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20250313122852.1365202-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Date: Fri Jan 31 12:35:22 2025 +0530
selftests/bpf: Select NUMA_NO_NODE to create map
[ Upstream commit 4107a1aeb20ed4cdad6a0d49de92ea0f933c71b7 ]
On powerpc, a CPU does not necessarily originate from NUMA node 0.
This contrasts with architectures like x86, where CPU 0 is not
hot-pluggable, making NUMA node 0 a consistently valid node.
This discrepancy can lead to failures when creating a map on NUMA
node 0, which is initialized by default, if no CPUs are allocated
from NUMA node 0.
This patch fixes the issue by setting NUMA_NO_NODE (-1) for map
creation for this selftest.
Fixes: 96eabe7a40aa ("bpf: Allow selecting numa node during map creation")
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/cf1f61468b47425ecf3728689bc9636ddd1d910e.1738302337.git.skb99@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cyan Yang <cyan.yang@sifive.com>
Date: Wed Mar 12 12:38:40 2025 +0800
selftests/mm/cow: fix the incorrect error handling
[ Upstream commit f841ad9ca5007167c02de143980c9dc703f90b3d ]
Error handling doesn't check the correct return value. This patch will
fix it.
Link: https://lkml.kernel.org/r/20250312043840.71799-1-cyan.yang@sifive.com
Fixes: f4b5fd6946e2 ("selftests/vm: anon_cow: THP tests")
Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Tue Mar 11 12:52:45 2025 +0100
selftests: netfilter: skip br_netfilter queue tests if kernel is tainted
[ Upstream commit c21b02fd9cbf15aed6e32c89e0fd70070281e3d1 ]
These scripts fail if the kernel is tainted which leads to wrong test
failure reports in CI environments when an unrelated test triggers some
splat.
Check taint state at start of script and SKIP if it's already dodgy.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tim Schumacher <tim.schumacher1@huawei.com>
Date: Fri Mar 7 10:56:43 2025 +0100
selinux: Chain up tool resolving errors in install_policy.sh
[ Upstream commit 6ae0042f4d3f331e841495eb0a3d51598e593ec2 ]
Subshell evaluations are not exempt from errexit, so if a command is
not available, `which` will fail and exit the script as a whole.
This causes the helpful error messages to not be printed if they are
tacked on using a `$?` comparison.
Resolve the issue by using chains of logical operators, which are not
subject to the effects of errexit.
Fixes: e37c1877ba5b1 ("scripts/selinux: modernize mdp")
Signed-off-by: Tim Schumacher <tim.schumacher1@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Andreev <andreev@swemel.ru>
Date: Fri Jan 17 19:36:42 2025 +0300
smack: dont compile ipv6 code unless ipv6 is configured
[ Upstream commit bfcf4004bcbce2cb674b4e8dbd31ce0891766bac ]
I want to be sure that ipv6-specific code
is not compiled in kernel binaries
if ipv6 is not configured.
[1] got rid of an "unused variable" warning but, in doing so,
it also mandated compilation of a handful of ipv6-specific
functions in ipv4-only kernel configurations:
smk_ipv6_localhost, smack_ipv6host_label, smk_ipv6_check.
Their compiled bodies are likely to be removed by the compiler
from the resulting binary, but, to be on the safe side,
I remove them from the compiler's view.
[1]
Fixes: 00720f0e7f28 ("smack: avoid unused 'sip' variable warning")
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Andreev <andreev@swemel.ru>
Date: Sun Jan 26 17:07:27 2025 +0300
smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label
[ Upstream commit 6cce0cc3861337b3ad8d4ac131d6e47efa0954ec ]
Since inception [1], SMACK initializes ipv* child socket security
for connection-oriented communications (tcp/sctp/dccp)
during accept() syscall, in the security_sock_graft() hook:
| void smack_sock_graft(struct sock *sk, ...)
| {
| // only ipv4 and ipv6 are eligible here
| // ...
| ssp = sk->sk_security; // socket security
| ssp->smk_in = skp; // process label: smk_of_current()
| ssp->smk_out = skp; // process label: smk_of_current()
| }
This approach is incorrect for two reasons:
A) initialization occurs too late for child socket security:
The child socket is created by the kernel once the handshake
completes (e.g., for tcp: after receiving ack for syn+ack).
Data can legitimately start arriving to the child socket
immediately, long before the application calls accept()
on the socket.
Those data are (currently — were) processed by SMACK using
incorrect child socket security attributes.
B) Incoming connection requests are handled using the listening
socket's security, hence, the child socket must inherit the
listening socket's security attributes.
smack_sock_graft() initializes the child socket's security with
a process label, as is done for a new socket().
But ... the process label is not necessarily the same as the
listening socket label. A privileged application may legitimately
set other in/out labels for a listening socket.
When this happens, SMACK processes incoming packets using
incorrect socket security attributes.
In [2] Michael Lontke noticed (A) and fixed it in [3] by adding
socket initialization into security_sk_clone_security() hook like
| void smack_sk_clone_security(struct sock *oldsk, struct sock *newsk)
| {
| *(struct socket_smack *)newsk->sk_security =
| *(struct socket_smack *)oldsk->sk_security;
| }
This initializes the child socket security with the parent (listening)
socket security at the appropriate time.
I was forced to revisit this old story because
smack_sock_graft() was left in place by [3] and continues overwriting
the child socket's labels with the process label,
and there might be a reason for this, so I undertook a study.
If the process label differs from the listening socket's labels,
the following occurs for ipv4:
assigning the smk_out is not accompanied by netlbl_sock_setattr,
so the outgoing packet's cipso label does not change.
So, the only effect of this assignment for interhost communications
is a divergence between the program-visible “out” socket label and
the cipso network label. For intrahost communications this label,
however, becomes visible via secmark netfilter marking, and is
checked for access rights by the client, receiving side.
Assigning the smk_in affects both interhost and intrahost
communications: the server begins to check access rights against
a wrong label.
An access check against a wrong label (smk_in or smk_out)
unsurprisingly fails, breaking the connection.
The above affects protocols that call security_sock_graft()
during accept(), namely: {tcp,dccp,sctp}/{ipv4,ipv6}
One extra security_sock_graft() caller, crypto/af_alg.c`af_alg_accept
is not affected, because smack_sock_graft() does nothing for PF_ALG.
To reproduce, assign non-default in/out labels to a listening socket,
set up rules between these labels and the client label, attempt to
connect and send some data.
Ipv6 specific: ipv6 packets do not convey SMACK labels. To reproduce
the issue in interhost communications set opposite labels in
/smack/ipv6host on both hosts.
Ipv6 intrahost communications do not require tricking, because SMACK
labels are conveyed via secmark netfilter marking.
So, currently smack_sock_graft() is not useful, but harmful,
therefore, I have removed it.
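The effect of the two hooks can be modelled in a few lines of userspace C.
This is a sketch for illustration only: `struct sock_sec`, `clone_security()`,
`graft()` and `inherits_listener()` are hypothetical, labels are plain
strings, and nothing here is the SMACK implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct socket_smack. */
struct sock_sec {
	const char *smk_in;
	const char *smk_out;
};

/* Models smack_sk_clone_security(): the child copies the listener. */
static void clone_security(const struct sock_sec *oldsk,
			   struct sock_sec *newsk)
{
	assert(oldsk && newsk);
	*newsk = *oldsk;
}

/* Models the removed smack_sock_graft(): both labels are overwritten
 * with the label of the process calling accept(). */
static void graft(struct sock_sec *sk, const char *process_label)
{
	sk->smk_in = process_label;
	sk->smk_out = process_label;
}

/* Returns 1 if the child still carries the listener's labels. */
static int inherits_listener(const struct sock_sec *child,
			     const struct sock_sec *listener)
{
	return !strcmp(child->smk_in, listener->smk_in) &&
	       !strcmp(child->smk_out, listener->smk_out);
}
```

In this model a child that is only cloned keeps the listener's labels, while
a child that is cloned and then grafted ends up with the process label.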
This fixes the issue for {tcp,dccp}/{ipv4,ipv6},
but not sctp/{ipv4,ipv6}.
Although this change is necessary for sctp+smack to function
correctly, it is not sufficient because:
sctp/ipv4 does not call security_sk_clone() and
sctp/ipv6 ignores SMACK completely.
These are separate issues that belong to other subsystems
and should be addressed separately.
[1] 2008-02-04,
Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel")
[2] Michael Lontke, 2022-08-31, SMACK LSM checks wrong object label
during ingress network traffic
Link: https://lore.kernel.org/linux-security-module/6324997ce4fc092c5020a4add075257f9c5f6442.camel@elektrobit.com/
[3] 2022-08-31, michael.lontke,
commit 4ca165fc6c49 ("SMACK: Add sk_clone_security LSM hook")
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wang Zhaolong <wangzhaolong1@huawei.com>
Date: Tue Feb 18 22:30:05 2025 +0800
smb: client: Fix netns refcount imbalance causing leaks and use-after-free
[ Upstream commit 4e7f1644f2ac6d01dc584f6301c3b1d5aac4eaef ]
Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
namespace.") attempted to fix a netns use-after-free issue by manually
adjusting reference counts via sk->sk_net_refcnt and sock_inuse_add().
However, a later commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock
after rmmod") pointed out that the approach of manually setting
sk->sk_net_refcnt in the first commit was technically incorrect, as
sk->sk_net_refcnt should only be set for user sockets. It led to issues
like TCP timers not being cleared properly on close. The second commit
moved to a model of just holding an extra netns reference for
server->ssocket using get_net(), and dropping it when the server is torn
down.
But there remain some gaps in the get_net()/put_net() balancing added by
these commits. The incomplete reference handling in these fixes results
in two issues:
1. Netns refcount leaks[1]
The problem process is as follows:
```
mount.cifs cifsd
cifs_do_mount
cifs_mount
cifs_mount_get_session
cifs_get_tcp_session
get_net() /* First get net. */
ip_connect
generic_ip_connect /* Try port 445 */
get_net()
->connect() /* Failed */
put_net()
generic_ip_connect /* Try port 139 */
get_net() /* Missing matching put_net() for this get_net().*/
cifs_get_smb_ses
cifs_negotiate_protocol
smb2_negotiate
SMB2_negotiate
cifs_send_recv
wait_for_response
cifs_demultiplex_thread
cifs_read_from_socket
cifs_readv_from_socket
cifs_reconnect
cifs_abort_connection
sock_release();
server->ssocket = NULL;
/* Missing put_net() here. */
generic_ip_connect
get_net()
->connect() /* Failed */
put_net()
sock_release();
server->ssocket = NULL;
free_rsp_buf
...
clean_demultiplex_info
/* It's only called once here. */
put_net()
```
When cifs_reconnect() is triggered, the server->ssocket is released
without a corresponding put_net() for the reference acquired earlier in
generic_ip_connect(). It then calls generic_ip_connect() again, which
takes another reference with get_net(). After that, server->ssocket is
set to NULL in the error path of generic_ip_connect(), and the net count
cannot be released in the final clean_demultiplex_info() function.
2. Potential use-after-free
The current refcounting scheme can lead to a potential use-after-free issue
in the following scenario:
```
cifs_do_mount
cifs_mount
cifs_mount_get_session
cifs_get_tcp_session
get_net() /* First get net */
ip_connect
generic_ip_connect
get_net()
bind_socket
kernel_bind /* failed */
put_net()
/* after out_err_crypto_release label */
put_net()
/* after out_err label */
put_net()
```
In the exception handling process where binding the socket fails, the
get_net() and put_net() calls are unbalanced, which may cause the
server->net reference count to drop to zero and be prematurely released.
To address both issues, this patch ties the netns reference counting to
the server->ssocket and server lifecycles. The extra reference is now
acquired when the server or socket is created, and released when the
socket is destroyed or the server is torn down.
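A minimal userspace model of that ownership rule is sketched below. It is
not the cifs code: the structures and every function here, including the
local `get_net()`/`put_net()`, are hypothetical stand-ins that only count
references.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a netns with a reference count. */
struct net { int refcnt; };
struct sock { struct net *net; };
struct server {
	struct net *net;
	struct sock *ssocket;
	struct sock sock_storage;
};

static void get_net(struct net *n) { n->refcnt++; }
static void put_net(struct net *n) { assert(n->refcnt > 0); n->refcnt--; }

/* The server holds one reference for its whole lifetime. */
static void server_init(struct server *srv, struct net *n)
{
	srv->net = n;
	srv->ssocket = NULL;
	get_net(n);
}

/* Creating the socket takes the reference that the socket owns. */
static void socket_create(struct server *srv)
{
	srv->sock_storage.net = srv->net;
	srv->ssocket = &srv->sock_storage;
	get_net(srv->net);
}

/* Destroying the socket always drops exactly that reference. */
static void socket_release(struct server *srv)
{
	if (!srv->ssocket)
		return;
	put_net(srv->ssocket->net);
	srv->ssocket = NULL;
}

/* Returns 0 on success; on failure the socket and its reference are gone. */
static int connect_once(struct server *srv, int fail)
{
	socket_create(srv);
	if (fail) {
		socket_release(srv);
		return -1;
	}
	return 0;
}

static void server_teardown(struct server *srv)
{
	socket_release(srv);
	put_net(srv->net);
}
```

Because every get is paired with the object that owns it, failed connects
and reconnects in this model leave the count where it started.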
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=219792
Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
Fixes: e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod")
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Wed Feb 12 23:26:09 2025 +0900
smb: common: change the data type of num_aces to le16
[ Upstream commit 62e7dd0a39c2d0d7ff03274c36df971f1b3d2d0d ]
Section 2.4.5 of [MS-DTYP].pdf describes the data type of num_aces as le16:
AceCount (2 bytes): An unsigned 16-bit integer that specifies the count
of the number of ACE records in the ACL.
Change it to le16 and add reserved field to smb_acl struct.
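The resulting layout can be sketched as follows. This is an illustration,
not the kernel header: `struct smb_acl_hdr`, `get_le16()` and
`acl_num_aces()` are hypothetical names, the fields other than num_aces are
assumptions, and plain `uint16_t` stands in for the kernel's `__le16`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the ACL header after the change: num_aces is 16 bits wide
 * and the following 16 bits are reserved. On-wire values are
 * little-endian. */
struct smb_acl_hdr {
	uint16_t revision;
	uint16_t size;
	uint16_t num_aces;
	uint16_t reserved;
};

/* Decode a little-endian 16-bit value regardless of host byte order. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read AceCount from a raw ACL header (8 bytes in this layout). */
static uint16_t acl_num_aces(const uint8_t *buf, size_t len)
{
	assert(buf && len >= sizeof(struct smb_acl_hdr));
	return get_le16(buf + offsetof(struct smb_acl_hdr, num_aces));
}
```

With this layout a non-zero value in the reserved bytes no longer leaks into
the ACE count, which it would if the two fields were read as one 32-bit
integer.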
Reported-by: Igor Leite Ladessa <igor-ladessa@hotmail.com>
Tested-by: Igor Leite Ladessa <igor-ladessa@hotmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date: Thu Dec 5 12:48:44 2024 +0900
soundwire: slave: fix an OF node reference leak in soundwire slave device
[ Upstream commit aac2f8363f773ae1f65aab140e06e2084ac6b787 ]
When initializing a soundwire slave device, an OF node is stored in the
device with its refcount incremented. However, the refcount is not
decremented in .release(), so call of_node_put() in
sdw_slave_release().
Fixes: a2e484585ad3 ("soundwire: core: add device tree support for slave devices")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20241205034844.2784964-1-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Fainelli <florian.fainelli@broadcom.com>
Date: Tue Apr 1 15:42:38 2025 -0700
spi: bcm2835: Do not call gpiod_put() on invalid descriptor
[ Upstream commit d6691010523fe1016f482a1e1defcc6289eeea48 ]
If we are unable to lookup the chip-select GPIO, the error path will
call bcm2835_spi_cleanup() which unconditionally calls gpiod_put() on
the cs->gpio variable which we just determined was invalid.
Fixes: 21f252cd29f0 ("spi: bcm2835: reduce the abuse of the GPIO API")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250401224238.2854256-1-florian.fainelli@broadcom.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Fainelli <florian.fainelli@broadcom.com>
Date: Tue Apr 1 16:36:03 2025 -0700
spi: bcm2835: Restore native CS probing when pinctrl-bcm2835 is absent
[ Upstream commit e19c1272c80a5ecce387c1b0c3b995f4edf9c525 ]
The lookup table forces the use of the "pinctrl-bcm2835" GPIO chip
provider and essentially assumes that there is going to be such a
provider, and if not, we will fail to set-up the SPI device.
While this is true on Raspberry Pi based systems (2835/36/37, 2711,
2712), this is not true on 7712/77122 Broadcom STB systems which use the
SPI driver, but not the GPIO driver.
There used to be an early check:
chip = gpiochip_find("pinctrl-bcm2835", chip_match_name);
if (!chip)
return 0;
which would accomplish that nicely. Bring something similar back by
checking for the compatible strings matched by the pinctrl-bcm2835.c
driver: if there is no Device Tree node matching those compatible
strings, then we won't find any GPIO provider registered by the
"pinctrl-bcm2835" driver.
Fixes: 21f252cd29f0 ("spi: bcm2835: reduce the abuse of the GPIO API")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250401233603.2938955-1-florian.fainelli@broadcom.com
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Mon Mar 31 08:33:32 2025 -0700
spi: cadence: Fix out-of-bounds array access in cdns_mrvl_xspi_setup_clock()
[ Upstream commit 7ba0847fa1c22e7801cebfe5f7b75aee4fae317e ]
If requested_clk > 128, cdns_mrvl_xspi_setup_clock() iterates over the
entire cdns_mrvl_xspi_clk_div_list array without breaking out early,
causing 'i' to go beyond the array bounds.
Fix that by stopping the loop when it gets to the last entry, clamping
the clock to the minimum 6.25 MHz.
Fixes the following warning with an UBSAN kernel:
vmlinux.o: warning: objtool: cdns_mrvl_xspi_setup_clock: unexpected end of section .text.cdns_mrvl_xspi_setup_clock
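The bounded loop can be sketched like this. The divider table, the 800 MHz
base clock and the name `pick_div_index()` are assumptions made for the
example and are not taken from the driver; only the shape of the loop
reflects the fix described above.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical divider table and base clock; the driver's values may differ. */
static const unsigned int clk_div_list[] = { 4, 8, 16, 32, 64, 128 };
#define BASE_CLK_HZ 800000000u

/* Pick the first divider whose resulting clock does not exceed
 * requested_hz. The loop stops at the last entry, so a request that no
 * divider can satisfy falls back to the slowest clock instead of
 * running off the end of the array. */
static size_t pick_div_index(unsigned int requested_hz)
{
	size_t i = 0;

	while (i < ARRAY_SIZE(clk_div_list) - 1) {
		if (BASE_CLK_HZ / clk_div_list[i] <= requested_hz)
			break;
		i++;
	}
	assert(i < ARRAY_SIZE(clk_div_list));
	return i;
}
```

With these assumed values the slowest selectable clock is 800 MHz / 128,
which is the 6.25 MHz floor mentioned above.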
Fixes: 26d34fdc4971 ("spi: cadence: Add clock configuration for Marvell xSPI overlay")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503282236.UhfRsF3B-lkp@intel.com/
Link: https://lore.kernel.org/r/gs2ooxfkblnee6cc5yfcxh7nu4wvoqnuv4lrllkhccxgcac2jg@7snmwd73jkhs
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://patch.msgid.link/h6bef6wof6zpjfp3jbhrkigqsnykdfy6j4qmmvb6gsabhianhj@k57a7hwpa3bj
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Wed Mar 12 19:38:28 2025 -0400
spufs: fix a leak in spufs_create_context()
[ Upstream commit 0f5cce3fc55b08ee4da3372baccf4bcd36a98396 ]
Leak fixes back in 2008 missed one case - if we are trying to set affinity
and spufs_mkdir() fails, we need to drop the reference to neighbor.
Fixes: 58119068cb27 "[POWERPC] spufs: Fix memory leak on SPU affinity"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Sat Mar 8 19:26:31 2025 -0500
spufs: fix a leak on spufs_new_file() failure
[ Upstream commit d1ca8698ca1332625d83ea0d753747be66f9906d ]
It's called from spufs_fill_dir(), and caller of that will do
spufs_rmdir() in case of failure. That does remove everything
we'd managed to create, but... the problem dentry is still
negative. IOW, it needs to be explicitly dropped.
Fixes: 3f51dd91c807 "[PATCH] spufs: fix spufs_fill_dir error path"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Wed Mar 12 19:18:39 2025 -0400
spufs: fix gang directory lifetimes
[ Upstream commit c134deabf4784e155d360744d4a6a835b9de4dd4 ]
prior to "[POWERPC] spufs: Fix gang destroy leaks" we used to have
a problem with gang lifetimes - creation of a gang returns opened
gang directory, which normally gets removed when that gets closed,
but if somebody has created a context belonging to that gang and
kept it alive until the gang got closed, removal failed and we
ended up with a leak.
Unfortunately, it had been fixed the wrong way. Dentry of gang
directory was no longer pinned, and rmdir on close was gone.
One problem was that failure of open kept calling simple_rmdir()
as cleanup, which meant an unbalanced dput(). Another bug was
in the success case - gang creation incremented link count on
root directory, but that was no longer undone when gang got
destroyed.
Fix consists of
* reverting the commit in question
* adding a counter to gang, protected by ->i_rwsem
of gang directory inode.
* having it set to 1 at creation time, dropped
in both spufs_dir_close() and spufs_gang_close() and bumped
in spufs_create_context(), provided that it's not 0.
* using simple_recursive_removal() to take the gang
directory out when counter reaches zero.
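The counter rules in the list above can be modelled in userspace as follows.
This is a sketch, not the spufs code: `struct gang`, its fields and the three
functions are hypothetical, locking is omitted (the kernel uses ->i_rwsem),
and a flag stands in for the actual directory removal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a gang; 'alive' plays the role of the
 * counter the fix adds. */
struct gang {
	int alive;
	bool dir_removed;
};

static void gang_create(struct gang *g)
{
	g->alive = 1;		/* set to 1 at creation time */
	g->dir_removed = false;
}

/* Used when creating a context in the gang: bump only if not already 0. */
static bool gang_get_unless_dead(struct gang *g)
{
	if (g->alive == 0)
		return false;
	g->alive++;
	return true;
}

/* Used when the gang or one of its context directories is closed. */
static void gang_put(struct gang *g)
{
	assert(g->alive > 0);
	if (--g->alive == 0)
		g->dir_removed = true;	/* stands in for simple_recursive_removal() */
}
```

In this model the directory survives the gang being closed while a context
still exists, and goes away when the last holder drops its count.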
Fixes: 877907d37da9 "[POWERPC] spufs: Fix gang destroy leaks"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Date: Sat Feb 22 19:36:17 2025 +0000
staging: rtl8723bs: select CONFIG_CRYPTO_LIB_AES
[ Upstream commit b2a9a6a26b7e954297e51822e396572026480bad ]
This fixes the following issue:
ERROR: modpost: "aes_expandkey" [drivers/staging/rtl8723bs/r8723bs.ko]
undefined!
ERROR: modpost: "aes_encrypt" [drivers/staging/rtl8723bs/r8723bs.ko]
undefined!
Fixes: 7d40753d8820 ("staging: rtl8723bs: use in-kernel aes encryption in OMAC1 routines")
Fixes: 3d3a170f6d80 ("staging: rtl8723bs: use in-kernel aes encryption")
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/tencent_0BDDF3A721708D16A2E7C3DAFF0FEC79A105@qq.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Wahren <wahrenst@gmx.net>
Date: Sun Mar 9 13:50:11 2025 +0100
staging: vchiq_arm: Fix possible NPR of keep-alive thread
[ Upstream commit 3db89bc6d973e2bcaa852f6409c98c228f39a926 ]
In case vchiq_platform_conn_state_changed() is never called or fails before
driver removal, ka_thread won't be a valid pointer to a task_struct. So
do the necessary checks before calling kthread_stop to avoid a crash.
Fixes: 863a756aaf49 ("staging: vc04_services: vchiq_core: Stop kthreads on vchiq module unload")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20250309125014.37166-3-wahrenst@gmx.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Wahren <wahrenst@gmx.net>
Date: Sun Mar 9 13:50:10 2025 +0100
staging: vchiq_arm: Register debugfs after cdev
[ Upstream commit 63f4dbb196db60a8536ba3d1b835d597a83f6cbb ]
The commit 2a4d15a4ae98 ("staging: vchiq: Refactor vchiq cdev code")
moved the debugfs directory creation before vchiq character device
registration. In case the latter fails, the debugfs directory won't
be cleaned up.
Fixes: 2a4d15a4ae98 ("staging: vchiq: Refactor vchiq cdev code")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20250309125014.37166-2-wahrenst@gmx.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: xueqin Luo <luoxueqin@kylinos.cn>
Date: Thu Feb 6 16:14:36 2025 +0800
thermal: core: Remove duplicate struct declaration
[ Upstream commit 9e6ec8cf64e2973f0ec74f09023988cabd218426 ]
The struct thermal_zone_device is already declared on line 32, so the
duplicate declaration has been removed.
Fixes: b1ae92dcfa8e ("thermal: core: Make struct thermal_zone_device definition internal")
Signed-off-by: xueqin Luo <luoxueqin@kylinos.cn>
Link: https://lore.kernel.org/r/20250206081436.51785-1-luoxueqin@kylinos.cn
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chenyuan Yang <chenyuan0y@gmail.com>
Date: Wed Mar 12 23:36:11 2025 -0500
thermal: int340x: Add NULL check for adev
[ Upstream commit 2542a3f70e563a9e70e7ded314286535a3321bdb ]
Not all devices have an ACPI companion fwnode, so adev might be NULL.
This is similar to the commit cd2fd6eab480
("platform/x86: int3472: Check for adev == NULL").
Add a check for adev not being set and return -ENODEV in that case to
avoid a possible NULL pointer deref in int3402_thermal_probe().
Note, under the same directory, int3400_thermal_probe() has such a
check.
Fixes: 77e337c6e23e ("Thermal: introduce INT3402 thermal driver")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250313043611.1212116-1-chenyuan0y@gmail.com
[ rjw: Subject edit, added Fixes: ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Len Brown <len.brown@intel.com>
Date: Sun Apr 6 11:18:39 2025 -0400
tools/power turbostat: report CoreThr per measurement interval
[ Upstream commit f729775f79a9c942c6c82ed6b44bd030afe10423 ]
The CoreThr column displays total thermal throttling events
since boot time.
Change it to report events during the measurement interval.
This is more useful for showing a user the current conditions.
Total events since boot time are still available to the user via
/sys/devices/system/cpu/cpu*/thermal_throttle/*
Document CoreThr on turbostat.8
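The per-interval value is the difference between two cumulative readings.
A minimal sketch, assuming a hypothetical `struct sample` and helper name
rather than turbostat's own data structures:

```c
#include <assert.h>

/* Hypothetical sample holding the cumulative throttle count as read at
 * one point in time. */
struct sample {
	unsigned long long core_throttle_cnt;
};

/* Throttling events that occurred between two samples. */
static unsigned long long core_thr_interval(const struct sample *before,
					    const struct sample *after)
{
	/* The since-boot count only grows. */
	assert(after->core_throttle_cnt >= before->core_throttle_cnt);
	return after->core_throttle_cnt - before->core_throttle_cnt;
}
```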
Fixes: eae97e053fe30 ("turbostat: Support thermal throttle count print")
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Tue Feb 25 11:36:00 2025 -0800
tools/x86: Fix linux/unaligned.h include path in lib/insn.c
[ Upstream commit fad07a5c0f07ad0884e1cb4362fe28c083b5b811 ]
tools/arch/x86/include/linux doesn't exist, but the build works by
virtue of a -I. When building with bazel, this fails. Use angle brackets
to include unaligned.h so there isn't an invalid relative include.
Fixes: 5f60d5f6bbc1 ("move asm/unaligned.h to linux/unaligned.h")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20250225193600.90037-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date: Fri Dec 27 13:07:57 2024 +0900
tracing/hist: Add poll(POLLIN) support on hist file
[ Upstream commit 1bd13edbbed6e7e396f1aab92b224a4775218e68 ]
Add poll syscall support on the `hist` file. The waiter will be woken
up with POLLIN when the histogram is updated.
Currently, there is no way to wait for a specific event in userspace.
So the user needs to peek at `trace` periodically, or wait on `trace_pipe`.
But it is not a good idea to peek at `trace` for an event that
happens randomly. And `trace_pipe` does not return until a page is
filled with events.
This allows a user to wait for a specific event on the `hist` file. The user
can set a histogram trigger on the event they want to monitor
and poll() on its `hist` file. Since this poll() returns POLLIN, the next
poll() will return soon unless a read() happens on that hist file.
NOTE: To read the hist file again, you must set the file offset to 0,
but just for monitoring the event, you may not need to read the
histogram.
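From userspace, the usage described above could look roughly like the following Python sketch. The helper names are hypothetical, and the caller supplies the path to the `hist` file, since the tracefs mount point and the event being monitored depend on the system.

```python
import os
import select


def wait_for_readable(fd: int, timeout_ms: int) -> bool:
    """Block until fd reports POLLIN or the timeout expires."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(revents & select.POLLIN for _, revents in events)


def monitor_hist_once(hist_path: str, timeout_ms: int = 1000) -> bool:
    """Open a hist file, wait for one update, then read it."""
    fd = os.open(hist_path, os.O_RDONLY)
    try:
        if not wait_for_readable(fd, timeout_ms):
            return False
        # Read from offset 0; per the note above, the offset must be 0
        # whenever the hist file is read again on the same descriptor.
        os.lseek(fd, 0, os.SEEK_SET)
        os.read(fd, 4096)
        return True
    finally:
        os.close(fd)
```

wait_for_readable() works on any pollable descriptor, so it can be tried on a pipe before pointing monitor_hist_once() at a real `hist` file.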
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173527247756.464571.14236296701625509931.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Stable-dep-of: 0b4ffbe4888a ("tracing: Correct the refcount if the hist/hist_debug file fails to open")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date: Fri Dec 27 13:08:07 2024 +0900
tracing/hist: Support POLLPRI event for poll on histogram
[ Upstream commit 66fc6f521a0b91051ce6968a216a30bc52267bf8 ]
Since POLLIN will not be flushed until the hist file is read, the user
needs to repeatedly read() and poll() on the hist file for monitoring the
event continuously. But the read() is somewhat redundant when the user is
only monitoring for event updates.
Add POLLPRI poll event on the hist file so the event returns when a
histogram is updated after open(), poll() or read(). Thus it is possible
to wait for the next event without having to issue a read().
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173527248770.464571.2536902137325258133.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Stable-dep-of: 0b4ffbe4888a ("tracing: Correct the refcount if the hist/hist_debug file fails to open")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Date: Fri Mar 21 09:52:49 2025 +0000
tracing/osnoise: Fix possible recursive locking for cpus_read_lock()
commit 7e6b3fcc9c5294aeafed0dbe1a09a1bc899bd0f2 upstream.
Lockdep reports this deadlock log:
osnoise: could not start sampling thread
============================================
WARNING: possible recursive locking detected
--------------------------------------------
CPU0
----
lock(cpu_hotplug_lock);
lock(cpu_hotplug_lock);
Call Trace:
<TASK>
print_deadlock_bug+0x282/0x3c0
__lock_acquire+0x1610/0x29a0
lock_acquire+0xcb/0x2d0
cpus_read_lock+0x49/0x120
stop_per_cpu_kthreads+0x7/0x60
start_kthread+0x103/0x120
osnoise_hotplug_workfn+0x5e/0x90
process_one_work+0x44f/0xb30
worker_thread+0x33e/0x5e0
kthread+0x206/0x3b0
ret_from_fork+0x31/0x50
ret_from_fork_asm+0x11/0x20
</TASK>
This is the deadlock scenario:
osnoise_hotplug_workfn()
guard(cpus_read_lock)(); // first lock call
start_kthread(cpu)
if (IS_ERR(kthread)) {
stop_per_cpu_kthreads(); {
cpus_read_lock(); // second lock call. Cause the AA deadlock
}
}
It is not necessary to call stop_per_cpu_kthreads(), which stops the osnoise
kthreads on every other CPU in the system, if a failure occurs during
hotplug of a certain CPU.
For start_per_cpu_kthreads(), if the start_kthread() call fails,
this function calls stop_per_cpu_kthreads() to handle the error.
Therefore, similarly, there is no need to call stop_per_cpu_kthreads()
again within start_kthread().
So just remove stop_per_cpu_kthreads() from start_kthread to solve this issue.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250321095249.2739397-1-ranxiaokai627@163.com
Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tengda Wu <wutengda@huaweicloud.com>
Date: Fri Mar 14 06:53:35 2025 +0000
tracing: Correct the refcount if the hist/hist_debug file fails to open
[ Upstream commit 0b4ffbe4888a2c71185eaf5c1a02dd3586a9bc04 ]
The function event_{hist,hist_debug}_open() maintains the refcount of
'file->tr' and 'file' through tracing_open_file_tr(). However, it does
not roll back these counts on subsequent failure paths, resulting in a
refcount leak.
A very obvious case is that if the hist/hist_debug file belongs to a
specific instance, the refcount leak will prevent the deletion of that
instance, as it relies on the condition 'tr->ref == 1' within
__remove_instance().
Fix this by calling tracing_release_file_tr() on all failure paths in
event_{hist,hist_debug}_open() to correct the refcount.
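The leak and its fix can be modeled with a toy reference counter. The Python sketch below is only an illustration: the class and function names are hypothetical stand-ins, and the comments mark which kernel helper each step is meant to mirror.

```python
class TraceInstance:
    """Toy stand-in for a trace instance with a reference count."""

    def __init__(self):
        self.ref = 1  # baseline reference held by the instance itself


def open_hist(instance, setup_fails, release_on_error):
    """Model an open path: take a reference, then possibly fail in setup."""
    instance.ref += 1              # mirrors tracing_open_file_tr()
    if setup_fails:
        if release_on_error:
            instance.ref -= 1      # mirrors tracing_release_file_tr()
        return -1
    return 0


def can_remove(instance):
    """Removal is only allowed when no extra reference is held."""
    return instance.ref == 1


leaky = TraceInstance()
open_hist(leaky, setup_fails=True, release_on_error=False)
fixed = TraceInstance()
open_hist(fixed, setup_fails=True, release_on_error=True)
print(can_remove(leaky), can_remove(fixed))
```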
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Zheng Yejian <zhengyejian1@huawei.com>
Link: https://lore.kernel.org/20250314065335.1202817-1-wutengda@huaweicloud.com
Fixes: 1cc111b9cddc ("tracing: Fix uaf issue when open the hist or hist_debug file")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Steven Rostedt <rostedt@goodmis.org>
Date: Sun Mar 23 15:21:51 2025 -0400
tracing: Do not use PERF enums when perf is not defined
commit 8eb1518642738c6892bd629b46043513a3bf1a6a upstream.
An update was made to up the module ref count when a synthetic event is
registered for both trace and perf events. But if perf is not configured
in, the perf enums used will cause the kernel to fail to build.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Link: https://lore.kernel.org/20250323152151.528b5ced@batman.local.home
Fixes: 21581dd4e7ff ("tracing: Ensure module defining synth event cannot be unloaded while tracing")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503232230.TeREVy8R-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Douglas Raillard <douglas.raillard@arm.com>
Date: Tue Mar 18 18:09:05 2025 +0000
tracing: Ensure module defining synth event cannot be unloaded while tracing
commit 21581dd4e7ff6c07d0ab577e3c32b13a74b31522 upstream.
Currently, using synth_event_delete() will fail if the event is being
used (tracing in progress), but that is normally done in the module exit
function. At that stage, failing is problematic as returning a non-zero
status means the module will become locked (impossible to unload or
reload again).
Instead, ensure the module exit function does not get called in the
first place by increasing the module refcnt when the event is enabled.
Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 35ca5207c2d11 ("tracing: Add synthetic event command generation functions")
Link: https://lore.kernel.org/20250318180906.226841-1-douglas.raillard@arm.com
Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Douglas Raillard <douglas.raillard@arm.com>
Date: Tue Mar 25 16:52:02 2025 +0000
tracing: Fix synth event printk format for str fields
commit 4d38328eb442dc06aec4350fd9594ffa6488af02 upstream.
The printk format for synth event uses "%.*s" to print string fields,
but then only passes the pointer part as var arg.
Replace %.*s with %s as the C string is guaranteed to be null-terminated.
The output in print fmt should never have been updated as __get_str()
handles the string limit because it can access the length of the string in
the string meta data that is saved in the ring buffer.
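The mismatch is between how many arguments the format consumes and how many are supplied. Python's % operator follows the same printf convention for "*", so it can illustrate the point. This is only an analogy: Python raises an exception, whereas a C varargs mismatch is undefined behavior. The event name used as sample data is arbitrary.

```python
# "%.*s" takes its precision from the argument list, so it needs an
# integer followed by the string.
print("%.*s" % (3, "sched_switch"))

# "%s" needs only the string itself.
print("%s" % ("sched_switch",))


def format_with_missing_precision():
    """Try "%.*s" with only the string supplied and report the outcome."""
    try:
        "%.*s" % ("sched_switch",)
    except TypeError:
        return "rejected"
    return "accepted"


print(format_with_missing_precision())
```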
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 8db4d6bfbbf92 ("tracing: Change synthetic event string format to limit printed length")
Link: https://lore.kernel.org/20250325165202.541088-1-douglas.raillard@arm.com
Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tengda Wu <wutengda@huaweicloud.com>
Date: Thu Mar 20 12:21:37 2025 +0000
tracing: Fix use-after-free in print_graph_function_flags during tracer switching
commit 7f81f27b1093e4895e87b74143c59c055c3b1906 upstream.
Kairui reported a UAF issue in print_graph_function_flags() during
ftrace stress testing [1]. This issue can be reproduced by putting a
'mdelay(10)' after 'mutex_unlock(&trace_types_lock)' in s_start(),
and executing the following script:
$ echo function_graph > current_tracer
$ cat trace > /dev/null &
$ sleep 5 # Ensure the 'cat' reaches the 'mdelay(10)' point
$ echo timerlat > current_tracer
The root cause lies in the two calls to print_graph_function_flags
within print_trace_line during each s_show():
* One through 'iter->trace->print_line()';
* Another through 'event->funcs->trace()', which is hidden in
print_trace_fmt() before print_trace_line returns.
Tracer switching only updates the former, while the latter continues
to use the print_line function of the old tracer, which in the script
above is print_graph_function_flags.
Moreover, when switching from the 'function_graph' tracer to the
'timerlat' tracer, s_start only calls graph_trace_close of the
'function_graph' tracer to free 'iter->private', but does not set
it to NULL. This provides an opportunity for 'event->funcs->trace()'
to use an invalid 'iter->private'.
To fix this issue, set 'iter->private' to NULL immediately after
freeing it in graph_trace_close(), ensuring that an invalid pointer
is not passed to other tracers. Additionally, clean up the unnecessary
'iter->private = NULL' during each 'cat trace' when using wakeup and
irqsoff tracers.
[1] https://lore.kernel.org/all/20231112150030.84609-1-ryncsn@gmail.com/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Zheng Yejian <zhengyejian1@huawei.com>
Link: https://lore.kernel.org/20250320122137.23635-1-wutengda@huaweicloud.com
Fixes: eecb91b9f98d ("tracing: Fix memleak due to race between current_tracer and trace")
Closes: https://lore.kernel.org/all/CAMgjq7BW79KDSCyp+tZHjShSzHsScSiJxn5ffskp-QzVM06fxw@mail.gmail.com/
Reported-by: Kairui Song <kasong@tencent.com>
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Steven Rostedt <rostedt@goodmis.org>
Date: Thu Dec 19 15:12:05 2024 -0500
tracing: Switch trace_events_hist.c code over to use guard()
[ Upstream commit 2b36a97aeeb71b1e4a48bfedc7f21f44aeb1e6fb ]
There are a couple of functions in trace_events_hist.c that have "goto out" or
equivalent on error in order to release locks that were taken. This can be
error prone or just simply make the code more complex.
Switch every location that ends with unlocking a mutex on error over to
using the guard(mutex)() infrastructure to let the compiler worry about
releasing locks. This makes the code easier to read and understand.
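As a loose analogy in Python (not the kernel's guard() macro itself), the difference between manually unlocking on every exit path and a scope-based release looks like this. The function names are hypothetical.

```python
import threading

lock = threading.Lock()


def update_with_manual_unlock(value):
    """Every exit path has to reach the unlock, like a 'goto out' pattern."""
    lock.acquire()
    try:
        if value < 0:
            return -1
        return value * 2
    finally:
        lock.release()


def update_with_scoped_lock(value):
    """The lock is dropped automatically whenever the block is left."""
    with lock:
        if value < 0:
            return -1
        return value * 2
```

Both functions behave the same; the scoped form simply has no separate unlock step to forget on an early return.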
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/20241219201345.694601480@goodmis.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Stable-dep-of: 0b4ffbe4888a ("tracing: Correct the refcount if the hist/hist_debug file fails to open")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Date: Mon Mar 17 08:00:20 2025 +0100
tty: n_tty: use uint for space returned by tty_write_room()
[ Upstream commit d97aa066678bd1e2951ee93db9690835dfe57ab6 ]
tty_write_room() returns an "unsigned int". So in case some insane
driver (like my tty test driver) returns (legitimate) UINT_MAX from its
tty_operations::write_room(), n_tty is confused in several places.
For example, in process_output_block(), the result of tty_write_room()
is stored into (signed) "int". So this UINT_MAX suddenly becomes -1. And
that is extended to ssize_t and returned from process_output_block().
This causes a write() to such a node to receive -EPERM (which is -1).
Fix that by using proper "unsigned int" and proper "== 0" test. And
return 0 constant directly in that "if", so that it is immediately clear
what is returned ("space" equals 0 at that point).
Similarly for process_output() and __process_echoes().
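The conversion described above can be reproduced with a short Python sketch that reinterprets the unsigned value as a 32-bit signed int. The helper name is hypothetical, and ctypes is used only to mimic the C conversion.

```python
import ctypes
import errno

UINT_MAX = 2**32 - 1


def as_signed_int(value: int) -> int:
    """Reinterpret a 32-bit unsigned value as a signed 32-bit int."""
    return ctypes.c_int32(value).value


space = as_signed_int(UINT_MAX)
print(space)                  # the value the signed "int" ends up holding
print(-errno.EPERM == space)  # whether it collides with -EPERM
```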
Note this does not fix any in-tree driver as of now.
If you want "Fixes: something", it would be commit 03b3b1a2405c ("tty:
make tty_operations::write_room return uint"). I intentionally do not
mark this patch by a real tag below.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20250317070046.24386-6-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sherry Sun <sherry.sun@nxp.com>
Date: Mon Mar 24 10:10:51 2025 +0800
tty: serial: fsl_lpuart: Fix unused variable 'sport' build warning
commit 9f8fe348ac9544f6855f82565e754bf085d81f88 upstream.
Remove the unused variable 'sport' to avoid the kernel build warning.
Fixes: 3cc16ae096f1 ("tty: serial: fsl_lpuart: use port struct directly to simply code")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503210614.2qGlnbIq-lkp@intel.com/
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/20250324021051.162676-1-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sherry Sun <sherry.sun@nxp.com>
Date: Wed Mar 12 10:39:03 2025 +0800
tty: serial: fsl_lpuart: use port struct directly to simply code
commit 3cc16ae096f164ae0c6b98416c25a01db5f3a529 upstream.
Most lpuart functions have the parameter struct uart_port *port, but
still use &sport->port to get the uart_port instead of using it
directly. Let's simplify the code logic and use this struct directly
instead of converting it from struct sport.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/20250312023904.1343351-3-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sherry Sun <sherry.sun@nxp.com>
Date: Wed Mar 12 10:39:02 2025 +0800
tty: serial: fsl_lpuart: Use u32 and u8 for register variables
[ Upstream commit b6a8f6ab2c53e5ea3c7f2a3978db378a89bb7595 ]
Use u32 and u8 rather than unsigned long or unsigned char for register
variables for clarity and consistency.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/20250312023904.1343351-2-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: e98ab45ec518 ("tty: serial: lpuart: only disable CTS instead of overwriting the whole UARTMODIR register")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sherry Sun <sherry.sun@nxp.com>
Date: Fri Mar 7 14:54:46 2025 +0800
tty: serial: lpuart: only disable CTS instead of overwriting the whole UARTMODIR register
[ Upstream commit e98ab45ec5182605d2e00114cba3bbf46b0ea27f ]
There is no need to overwrite the whole UARTMODIR register before waiting
for the transmit engine to complete. The actual target here is only to
disable CTS flow control, so that dirty data in the TX FIFO cannot block
the transmit engine from completing.
Also delete the following duplicate CTS disable configuration.
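The difference between overwriting a register and clearing a single bit can be sketched in Python. The bit value and register contents below are hypothetical and chosen only for illustration. Modeling the old behavior as a write of zero is also an assumption.

```python
# Hypothetical mask for the CTS-enable bit, for illustration only.
TXCTSE = 0x1


def overwrite_register(modir: int) -> int:
    """Replace the whole register value (modeled here as writing zero)."""
    return 0


def disable_cts_only(modir: int) -> int:
    """Clear just the CTS-enable bit and keep every other bit."""
    return modir & ~TXCTSE & 0xFFFFFFFF


modir = 0x7  # hypothetical register contents with CTS enabled
print(hex(overwrite_register(modir)), hex(disable_cts_only(modir)))
```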
Fixes: d5a2e0834364 ("tty: serial: lpuart: disable flow control while waiting for the transmit engine to complete")
Cc: stable <stable@kernel.org>
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/20250307065446.1122482-1-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guillaume Nault <gnault@redhat.com>
Date: Sat Mar 29 01:33:44 2025 +0100
tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().
[ Upstream commit 8930424777e43257f5bf6f0f0f53defd0d30415c ]
Because skb_tunnel_check_pmtu() doesn't handle PACKET_HOST packets,
commit 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper
pmtud support.") forced skb->pkt_type to PACKET_OUTGOING for
openvswitch packets that are sent using the OVS_ACTION_ATTR_OUTPUT
action. This allowed such packets to invoke the
iptunnel_pmtud_check_icmp() or iptunnel_pmtud_check_icmpv6() helpers
and thus trigger PMTU update on the input device.
However, this also broke other parts of PMTU discovery. Since these
packets don't have the PACKET_HOST type anymore, they won't trigger the
sending of ICMP Fragmentation Needed or Packet Too Big messages to
remote hosts when oversized (see the skb_in->pkt_type condition in
__icmp_send() for example).
These two skb->pkt_type checks are therefore incompatible as one
requires skb->pkt_type to be PACKET_HOST, while the other requires it
to be anything but PACKET_HOST.
It makes sense to not trigger ICMP messages for non-PACKET_HOST packets
as these messages should be generated only for incoming l2-unicast
packets. However there doesn't seem to be any reason for
skb_tunnel_check_pmtu() to ignore PACKET_HOST packets.
Allow both cases to work by allowing skb_tunnel_check_pmtu() to work on
PACKET_HOST packets and not overriding skb->pkt_type in openvswitch
anymore.
Fixes: 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper pmtud support.")
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/eac941652b86fddf8909df9b3bf0d97bc9444793.1743208264.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Lei <ming.lei@redhat.com>
Date: Thu Mar 27 17:51:10 2025 +0800
ublk: make sure ubq->canceling is set when queue is frozen
[ Upstream commit 8741d0737921ec1c03cf59aebf4d01400c2b461a ]
The ublk driver now depends on `ubq->canceling` to decide whether the request
can be dispatched via uring_cmd & io_uring_cmd_complete_in_task().
Once ubq->canceling is set, the uring_cmd can be done via ublk_cancel_cmd()
and io_uring_cmd_done().
So set ubq->canceling when the queue is frozen. This makes sure that the
flag can be observed from ublk_queue_rq() reliably, and avoids a
use-after-free on uring_cmd.
Fixes: 216c8f5ef0f2 ("ublk: replace monitor with cancelable uring_cmd")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250327095123.179113-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mario Limonciello <mario.limonciello@amd.com>
Date: Thu Feb 20 23:40:03 2025 -0600
ucsi_ccg: Don't show failed to get FW build information error
[ Upstream commit c16006852732dc4fe37c14b81f9b4458df05b832 ]
The error `failed to get FW build information` was added for what looks
to be misdetection of the device property firmware-name.
If the property is missing (such as on non-NVIDIA hardware), this error shows up.
Move the error into the scope of the property parser for "firmware-name"
to avoid showing errors on systems without the firmware-name property.
Fixes: 5c9ae5a87573d ("usb: typec: ucsi: ccg: add firmware flashing support")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250221054137.1631765-2-superm1@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Tue Apr 1 11:44:43 2025 -0700
udp: Fix memory accounting leak.
[ Upstream commit df207de9d9e7a4d92f8567e2c539d9c8c12fd99d ]
Matt Dowling reported a weird UDP memory usage issue.
Under normal operation, the UDP memory usage reported in /proc/net/sockstat
remains close to zero. However, it occasionally spiked to 524,288 pages
and never dropped. Moreover, the value doubled when the application was
terminated. Finally, it caused intermittent packet drops.
We can reproduce the issue with the script below [0]:
1. /proc/net/sockstat reports 0 pages
# cat /proc/net/sockstat | grep UDP:
UDP: inuse 1 mem 0
2. Run the script till the report reaches 524,288
# python3 test.py & sleep 5
# cat /proc/net/sockstat | grep UDP:
UDP: inuse 3 mem 524288 <-- (INT_MAX + 1) >> PAGE_SHIFT
3. Kill the socket and confirm the number never drops
# pkill python3 && sleep 5
# cat /proc/net/sockstat | grep UDP:
UDP: inuse 1 mem 524288
4. (necessary since v6.0) Trigger proto_memory_pcpu_drain()
# python3 test.py & sleep 1 && pkill python3
5. The number doubles
# cat /proc/net/sockstat | grep UDP:
UDP: inuse 1 mem 1048577
The application set INT_MAX to SO_RCVBUF, which triggered an integer
overflow in udp_rmem_release().
When a socket is close()d, udp_destruct_common() purges its receive
queue and sums up skb->truesize in the queue. This total is calculated
and stored in a local unsigned integer variable.
The total size is then passed to udp_rmem_release() to adjust memory
accounting. However, because the function takes a signed integer
argument, the total size can wrap around, causing an overflow.
Then, the released amount is calculated as follows:
1) Add size to sk->sk_forward_alloc.
2) Round down sk->sk_forward_alloc to the nearest lower multiple of
PAGE_SIZE and assign it to amount.
3) Subtract amount from sk->sk_forward_alloc.
4) Pass amount >> PAGE_SHIFT to __sk_mem_reduce_allocated().
When the issue occurred, the total in udp_destruct_common() was 2147484480
(INT_MAX + 833), which was cast to -2147482816 in udp_rmem_release().
At 1) sk->sk_forward_alloc is changed from 3264 to -2147479552, and
2) sets -2147479552 to amount. 3) reverts the wraparound, so we don't
see a warning in inet_sock_destruct(). However, udp_memory_allocated
ends up doubling at 4).
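The arithmetic in steps 1) to 4) can be replayed with the numbers quoted above. The Python sketch below uses ctypes only to mimic 32-bit signed wraparound and assumes 4 KiB pages (PAGE_SHIFT of 12). The helper name is hypothetical.

```python
import ctypes

PAGE_SHIFT = 12                 # assumes 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def to_int32(value: int) -> int:
    """Wrap a Python integer to a 32-bit signed value."""
    return ctypes.c_int32(value).value


total = 2147484480                                # unsigned total from the purge
size = to_int32(total)                            # as seen by the signed parameter
forward_alloc = to_int32(3264 + size)             # step 1
amount = forward_alloc & ~(PAGE_SIZE - 1)         # step 2: round down to a page multiple
forward_alloc = to_int32(forward_alloc - amount)  # step 3
pages = amount >> PAGE_SHIFT                      # step 4: page count handed on
print(size, amount, forward_alloc, pages)
```

The page count comes out negative, so "releasing" it increases the accounted memory instead of reducing it.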
Since commit 3cd3399dd7a8 ("net: implement per-cpu reserves for
memory_allocated"), memory usage no longer doubles immediately after
a socket is close()d because __sk_mem_reduce_allocated() caches the
amount in udp_memory_per_cpu_fw_alloc. However, the next time a UDP
socket receives a packet, the subtraction takes effect, causing UDP
memory usage to double.
This issue makes further memory allocation fail once the socket's
sk->sk_rmem_alloc exceeds net.ipv4.udp_rmem_min, resulting in packet
drops.
To prevent this issue, let's use unsigned int for the calculation and
call sk_forward_alloc_add() only once for the small delta.
Note that first_packet_length() also potentially has the same problem.
[0]:
from socket import *
SO_RCVBUFFORCE = 33
INT_MAX = (2 ** 31) - 1
s = socket(AF_INET, SOCK_DGRAM)
s.bind(('', 0))
s.setsockopt(SOL_SOCKET, SO_RCVBUFFORCE, INT_MAX)
c = socket(AF_INET, SOCK_DGRAM)
c.connect(s.getsockname())
data = b'a' * 100
while True:
c.send(data)
Fixes: f970bd9e3a06 ("udp: implement memory accounting helpers")
Reported-by: Matt Dowling <madowlin@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250401184501.67377-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Tue Apr 1 11:44:42 2025 -0700
udp: Fix multiple wraparounds of sk->sk_rmem_alloc.
[ Upstream commit 5a465a0da13ee9fbd7d3cd0b2893309b0fe4b7e3 ]
__udp_enqueue_schedule_skb() has the following condition:
if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
goto drop;
sk->sk_rcvbuf is initialised by net.core.rmem_default and later can
be configured by SO_RCVBUF, which is limited by net.core.rmem_max,
or SO_RCVBUFFORCE.
If we set INT_MAX to sk->sk_rcvbuf, the condition is always false
as sk->sk_rmem_alloc is also signed int.
Then, the size of the incoming skb is added to sk->sk_rmem_alloc
unconditionally.
This results in integer overflow (possibly multiple times) on
sk->sk_rmem_alloc and allows a single socket to have skb up to
net.core.udp_mem[1].
For example, if we set a large value to udp_mem[1] and INT_MAX to
sk->sk_rcvbuf and flood packets to the socket, we can see multiple
overflows:
# cat /proc/net/sockstat | grep UDP:
UDP: inuse 3 mem 7956736 <-- (7956736 << 12) bytes > INT_MAX * 15
^- PAGE_SHIFT
# ss -uam
State Recv-Q ...
UNCONN -1757018048 ... <-- flipping the sign repeatedly
skmem:(r2537949248,rb2147483646,t0,tb212992,f1984,w0,o0,bl0,d0)
Previously, we had a boundary check for INT_MAX, which was removed by
commit 6a1f12dd85a8 ("udp: relax atomic operation on sk->sk_rmem_alloc").
A complete fix would be to revert it and cap the right operand by
INT_MAX:
rmem = atomic_add_return(size, &sk->sk_rmem_alloc);
if (rmem > min(size + (unsigned int)sk->sk_rcvbuf, INT_MAX))
goto uncharge_drop;
but we do not want to add the expensive atomic_add_return() back just
for the corner case.
Casting rmem to unsigned int prevents multiple wraparounds, but we still
allow a single wraparound.
# cat /proc/net/sockstat | grep UDP:
UDP: inuse 3 mem 524288 <-- (INT_MAX + 1) >> 12
# ss -uam
State Recv-Q ...
UNCONN -2147482816 ... <-- INT_MAX + 831 bytes
skmem:(r2147484480,rb2147483646,t0,tb212992,f3264,w0,o0,bl0,d14468947)
So, let's define rmem and rcvbuf as unsigned int and check skb->truesize
only when rcvbuf is large enough to lower the overflow possibility.
Note that we still have a small chance to see overflow if multiple skbs
to the same socket are processed on different cores at the same time and
each size does not exceed the limit but the total size does.
Note also that we must ignore skb->truesize for a small buffer as
explained in commit 363dc73acacb ("udp: be less conservative with
sock rmem accounting").
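The effect of comparing as signed versus unsigned can be checked with the values from the ss output above. This Python sketch only illustrates the comparison, not the full fix (which also checks skb->truesize). The function names are hypothetical.

```python
import ctypes


def signed_check_drops(rmem_bits: int, rcvbuf: int) -> bool:
    """Compare rmem against rcvbuf as 32-bit signed integers."""
    return ctypes.c_int32(rmem_bits).value > ctypes.c_int32(rcvbuf).value


def unsigned_check_drops(rmem_bits: int, rcvbuf: int) -> bool:
    """Compare rmem against rcvbuf as 32-bit unsigned integers."""
    return ctypes.c_uint32(rmem_bits).value > ctypes.c_uint32(rcvbuf).value


rcvbuf = 2147483646       # "rb" value from the ss output
rmem_bits = 2537949248    # "r" value from the ss output
print(ctypes.c_int32(rmem_bits).value)
print(signed_check_drops(rmem_bits, rcvbuf),
      unsigned_check_drops(rmem_bits, rcvbuf))
```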
Fixes: 6a1f12dd85a8 ("udp: relax atomic operation on sk->sk_rmem_alloc")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250401184501.67377-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benjamin Berg <benjamin.berg@intel.com>
Date: Fri Feb 14 10:28:22 2025 +0100
um: hostfs: avoid issues on inode number reuse by host
[ Upstream commit 0bc754d1e31f40f4a343b692096d9e092ccc0370 ]
Some file systems (e.g. ext4) may reuse inode numbers once the inode is
not in use anymore. Usually hostfs will keep an FD open for each inode,
but this is not always the case. In the case of sockets, this cannot
even be done properly.
As such, the following sequence of events was possible:
* application creates and deletes a socket
* hostfs creates/deletes the socket on the host
* inode is still in the hostfs cache
* hostfs creates a new file
* ext4 on the outside reuses the inode number
* hostfs finds the socket inode for the newly created file
* application receives -ENXIO when opening the file
As mentioned, this can only happen if the deleted file is a special file
that is never opened on the host (i.e. no .open fop).
As such, to prevent issues, it is sufficient to check that the inode
has the expected type. That said, also add a check for the inode birth
time, just to be on the safe side.
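The idea of validating a cached inode against the host file can be sketched as follows. This is a Python model with hypothetical names and sample values, not the hostfs code: a cached entry only matches if the inode number, the file type and the birth time all agree.

```python
import stat
from dataclasses import dataclass


@dataclass
class CachedInode:
    ino: int
    mode: int       # st_mode-style value that includes the file type bits
    btime_ns: int   # birth time in nanoseconds


def matches_host_file(cached: CachedInode, ino: int, mode: int, btime_ns: int) -> bool:
    """Accept the cached inode only if number, type and birth time agree."""
    return (cached.ino == ino
            and stat.S_IFMT(cached.mode) == stat.S_IFMT(mode)
            and cached.btime_ns == btime_ns)


# Hypothetical stale cache entry for a deleted socket.
stale_socket = CachedInode(ino=42, mode=stat.S_IFSOCK | 0o755, btime_ns=1_000)
# A new regular file that happens to reuse inode number 42.
print(matches_host_file(stale_socket, 42, stat.S_IFREG | 0o644, 2_000))
```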
Fixes: 74ce793bcbde ("hostfs: Fix ephemeral inodes")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Tested-by: Mickaël Salaün <mic@digikod.net>
Link: https://patch.msgid.link/20250214092822.1241575-1-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Gow <davidgow@google.com>
Date: Mon Feb 10 18:53:51 2025 +0800
um: Pass the correct Rust target and options with gcc
[ Upstream commit 5550187c4c21740942c32a9ae56f9f472a104cb4 ]
In order to work around some issues with disabling SSE on older versions
of gcc (compilation would fail upon seeing a function declaration
containing a float, even if it was never called or defined), the
corresponding CFLAGS and RUSTFLAGS were only set when using clang.
However, this led to two problems:
- Newer gcc versions also wouldn't get the correct flags, despite not
having the bug.
- The RUSTFLAGS for setting the rust target definition were not set,
despite being unrelated. This works by chance for x86_64, as the
built-in default target is close enough, but not for 32-bit x86.
Move the target definition outside the conditional block, and update the
condition to take into account the gcc version.
Fixes: a3046a618a28 ("um: Only disable SSE on clang to work around old GCC bugs")
Signed-off-by: David Gow <davidgow@google.com>
Link: https://patch.msgid.link/20250210105353.2238769-2-davidgow@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benjamin Berg <benjamin.berg@intel.com>
Date: Mon Feb 10 17:09:26 2025 +0100
um: remove copy_from_kernel_nofault_allowed
[ Upstream commit 84a6fc378471fbeaf48f8604566a5a33a3d63c18 ]
There is no need to override the default version of this function
anymore as UML now has proper _nofault memory access functions.
Doing this also fixes the fact that the implementation was incorrect as
using mincore() will incorrectly flag pages as inaccessible if they were
swapped out by the host.
Fixes: f75b1b1bedfb ("um: Implement probe_kernel_read()")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20250210160926.420133-3-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiri Olsa <jolsa@kernel.org>
Date: Wed Feb 12 23:04:33 2025 +0100
uprobes/x86: Harden uretprobe syscall trampoline check
commit fa6192adc32f4fdfe5b74edd5b210e12afd6ecc0 upstream.
Jann reported a possible issue when trampoline_check_ip returns an
address near the bottom of the address space that is allowed to
call into the syscall if uretprobes are not set up:
https://lore.kernel.org/bpf/202502081235.5A6F352985@keescook/T/#m9d416df341b8fbc11737dacbcd29f0054413cbbf
Though the mmap minimum address restrictions will typically prevent
creating mappings there, let's make sure uretprobe syscall checks
for that.
Fixes: ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250212220433.3624297-1-jolsa@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Niklas Neronin <niklas.neronin@linux.intel.com>
Date: Thu Mar 6 16:49:47 2025 +0200
usb: xhci: correct debug message page size calculation
[ Upstream commit 55741c723318905e6d5161bf1e12749020b161e3 ]
The ffs() function returns the index of the first set bit, starting from 1.
If no bits are set, it returns zero. This behavior causes an off-by-one
page size in the debug message, as the page size calculation [1]
is zero-based, while ffs() is one-based.
Fix this by subtracting one from the result of ffs(). Note that since
variable 'val' is unsigned, subtracting one from zero will result in the
maximum unsigned integer value. Consequently, the condition 'if (val < 16)'
will still function correctly.
[1], Page size: (2^(n+12)), where 'n' is the set page size bit.
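The off-by-one described above can be illustrated with a small userspace sketch. It is not the xhci driver code: page_size_kib() is a hypothetical helper, and it uses the POSIX ffs() from <strings.h>, which has the same one-based convention as the kernel's ffs().

```c
#include <strings.h>   /* POSIX ffs(): one-based index of the first set bit */

/*
 * Hypothetical helper (not driver code): translate a page-size bitmask
 * into a page size in KiB, or return 0 if no usable bit is set.
 */
static unsigned int page_size_kib(unsigned int mask)
{
	/* Subtract one so that 'n' is the zero-based bit index. */
	unsigned int n = (unsigned int)ffs((int)mask) - 1;

	/*
	 * mask == 0 makes ffs() return 0, so 'n' wraps around to the
	 * maximum unsigned value and is rejected by this range check.
	 */
	if (n >= 16)
		return 0;

	/* Page size is 2^(n+12) bytes, which is 2^(n+2) KiB. */
	return 1u << (n + 2);
}
```

With bit 0 set, the helper reports 4 KiB. Without the subtraction, the same mask would be reported as 8 KiB, which is the off-by-one the commit corrects.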
Fixes: 81720ec5320c ("usb: host: xhci: use ffs() in xhci_mem_init()")
Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-9-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ying Lu <luying1@xiaomi.com>
Date: Wed Apr 2 16:58:59 2025 +0800
usbnet:fix NPE during rx_complete
commit 51de3600093429e3b712e5f091d767babc5dd6df upstream.
Missing usbnet_going_away Check in Critical Path.
The usb_submit_urb function lacks a usbnet_going_away
validation, whereas __usbnet_queue_skb includes this check.
This inconsistency creates a race condition where:
A URB request may succeed, but the corresponding SKB data
fails to be queued.
Subsequent processing
(e.g., rx_complete → defer_bh → __skb_unlink(skb, list))
then attempts to access skb->next, triggering a NULL pointer
dereference (kernel panic).
Fixes: 04e906839a05 ("usbnet: fix cyclical race on disconnect with work queue")
Cc: stable@vger.kernel.org
Signed-off-by: Ying Lu <luying1@xiaomi.com>
Link: https://patch.msgid.link/4c9ef2efaa07eb7f9a5042b74348a67e5a3a7aea.1743584159.git.luying1@xiaomi.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mike Christie <michael.christie@oracle.com>
Date: Wed Jan 29 15:09:22 2025 -0600
vhost-scsi: Fix handling of multiple calls to vhost_scsi_set_endpoint
[ Upstream commit 5dd639a1646ef5fe8f4bf270fad47c5c3755b9b6 ]
If vhost_scsi_set_endpoint is called multiple times without a
vhost_scsi_clear_endpoint between them, we can hit multiple bugs
found by Haoran Zhang:
1. Use-after-free when no tpgs are found:
This fixes a use after free that occurs when vhost_scsi_set_endpoint is
called more than once and calls after the first call do not find any
tpgs to add to the vs_tpg. When vhost_scsi_set_endpoint first finds
tpgs to add to the vs_tpg array match=true, so we will do:
vhost_vq_set_backend(vq, vs_tpg);
...
kfree(vs->vs_tpg);
vs->vs_tpg = vs_tpg;
If vhost_scsi_set_endpoint is called again and no tpgs are found
match=false so we skip the vhost_vq_set_backend call leaving the
pointer to the vs_tpg we then free via:
kfree(vs->vs_tpg);
vs->vs_tpg = vs_tpg;
If a scsi request is then sent we do:
vhost_scsi_handle_vq -> vhost_scsi_get_req -> vhost_vq_get_backend
which sees the vs_tpg we just did a kfree on.
2. Tpg dir removal hang:
This patch fixes an issue where we cannot remove a LIO/target layer
tpg (and structs above it like the target) dir due to the refcount
dropping to -1.
The problem is that if vhost_scsi_set_endpoint detects a tpg is already
in the vs->vs_tpg array or if the tpg has been removed so
target_depend_item fails, the undepend goto handler will do
target_undepend_item on all tpgs in the vs_tpg array dropping their
refcount to 0. At this time vs_tpg contains both the tpgs we have added
in the current vhost_scsi_set_endpoint call as well as tpgs we added in
previous calls which are also in vs->vs_tpg.
Later, when vhost_scsi_clear_endpoint runs it will do
target_undepend_item on all the tpgs in the vs->vs_tpg which will drop
their refcount to -1. Userspace will then not be able to remove the tpg
and will hang when it tries to do rmdir on the tpg dir.
3. Tpg leak:
This fixes a bug where we can leak tpgs and cause them to be
un-removable because the target name is overwritten when
vhost_scsi_set_endpoint is called multiple times but with different
target names.
The bug occurs if a user has called VHOST_SCSI_SET_ENDPOINT and setup
a vhost-scsi device to target/tpg mapping, then calls
VHOST_SCSI_SET_ENDPOINT again with a new target name that has tpgs we
haven't seen before (target1 has tpg1 but target2 has tpg2). When this
happens we don't teardown the old target tpg mapping and just overwrite
the target name and the vs->vs_tpg array. Later when we do
vhost_scsi_clear_endpoint, we are passed in either target1 or target2's
name and we will only match that target's tpgs when we loop over the
vs->vs_tpg. We will then return from the function without doing
target_undepend_item on the tpgs.
Because of all these bugs, it looks like being able to call
vhost_scsi_set_endpoint multiple times was never supported. The major
user, QEMU, already has checks to prevent this use case. So to fix the
issues, this patch prevents vhost_scsi_set_endpoint from being called
if it's already successfully added tpgs. To add, remove or change the
tpg config or target name, you must do a vhost_scsi_clear_endpoint
first.
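The resulting rule (set once, then clear before setting again) can be modelled in a few lines of userspace C. This is only a sketch of the policy, not the vhost-scsi code: struct model_endpoint, both model_* functions, MODEL_MAX_TPGS and the specific errno values are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_TPGS 4   /* hypothetical array size for this model */

struct model_endpoint {
	void **tpgs;   /* NULL until a set call has successfully added tpgs */
};

/* Refuse a second set call until the endpoint has been cleared. */
static int model_set_endpoint(struct model_endpoint *ep, size_t nr_found)
{
	if (ep->tpgs)
		return -EEXIST;   /* illustrative errno, not taken from the patch */
	if (nr_found == 0)
		return -ENOENT;   /* nothing added, endpoint stays unset */

	ep->tpgs = calloc(MODEL_MAX_TPGS, sizeof(*ep->tpgs));
	if (!ep->tpgs)
		return -ENOMEM;
	return 0;
}

/* Tear down the mapping so that a later set call is allowed again. */
static void model_clear_endpoint(struct model_endpoint *ep)
{
	free(ep->tpgs);
	ep->tpgs = NULL;
}
```

Because a second set call never replaces the array, the model never frees an array that is still installed as a backend and never mixes tpgs from different calls, which are the conditions behind the three bugs listed above.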
Fixes: 25b98b64e284 ("vhost scsi: alloc cmds per vq instead of session")
Fixes: 4f7f46d32c98 ("tcm_vhost: Use vq->private_data to indicate if the endpoint is setup")
Reported-by: Haoran Zhang <wh1sper@zju.edu.cn>
Closes: https://lore.kernel.org/virtualization/e418a5ee-45ca-4d18-9b5d-6f8b6b1add8e@oracle.com/T/#me6c0041ce376677419b9b2563494172a01487ecb
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20250129210922.121533-1-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefano Garzarella <sgarzare@redhat.com>
Date: Fri Mar 28 15:15:28 2025 +0100
vsock: avoid timeout during connect() if the socket is closing
[ Upstream commit fccd2b711d9628c7ce0111d5e4938652101ee30a ]
When a peer attempts to establish a connection, vsock_connect() contains
a loop that waits for the state to be TCP_ESTABLISHED. However, the
other peer can be fast enough to accept the connection and close it
immediately, thus moving the state to TCP_CLOSING.
When this happens, the peer in the vsock_connect() is properly woken up,
but since the state is not TCP_ESTABLISHED, it goes back to sleep
until the timeout expires, returning -ETIMEDOUT.
If the socket state is TCP_CLOSING, waiting for the timeout is pointless.
vsock_connect() can return immediately without errors or delay since the
connection actually happened. The socket will be in a closing state,
but this is not an issue, and subsequent calls will fail as expected.
We discovered this issue while developing a test that accepts and
immediately closes connections to stress the transport switch between
two connect() calls, where the first one was interrupted by a signal
(see Closes link).
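A simplified userspace model of the wait loop may help to see the change in behaviour. It is not the vsock_connect() implementation: the enum, model_connect() and the idea of passing the state seen at each wakeup as an array are assumptions for illustration, and running out of array entries stands in for the timeout expiring.

```c
#include <errno.h>
#include <stddef.h>

enum model_state { MODEL_SYN_SENT, MODEL_ESTABLISHED, MODEL_CLOSING };

/*
 * wakeups[i] is the socket state observed at the i-th wakeup.
 * Returns 0 once the connection is known to have happened, or
 * -ETIMEDOUT if no wakeup ever shows such a state.
 */
static int model_connect(const enum model_state *wakeups, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* A closing socket was connected first, so stop waiting. */
		if (wakeups[i] == MODEL_ESTABLISHED ||
		    wakeups[i] == MODEL_CLOSING)
			return 0;
	}
	return -ETIMEDOUT;
}
```

If the MODEL_CLOSING test is dropped, a peer that accepts and immediately closes leaves the caller waiting until the timeout, which is the behaviour the commit removes.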
Reported-by: Luigi Leonardi <leonardi@redhat.com>
Closes: https://lore.kernel.org/virtualization/bq6hxrolno2vmtqwcvb5bljfpb7mvwb3kohrvaed6auz5vxrfv@ijmd2f3grobn/
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Link: https://patch.msgid.link/20250328141528.420719-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chenyuan Yang <chenyuan0y@gmail.com>
Date: Sat Jan 11 12:18:03 2025 -0600
w1: fix NULL pointer dereference in probe
[ Upstream commit 0dd6770a72f138dabea9eae87f3da6ffa68f0d06 ]
The w1_uart_probe() function calls w1_uart_serdev_open() (which includes
devm_serdev_device_open()) before setting the client ops via
serdev_device_set_client_ops(). This ordering can trigger a NULL pointer
dereference in the serdev controller's receive_buf handler, as it assumes
serdev->ops is valid when SERPORT_ACTIVE is set.
This is similar to the issue fixed in commit 5e700b384ec1
("platform/chrome: cros_ec_uart: properly fix race condition") where
devm_serdev_device_open() was called before fully initializing the
device.
Fix the race by ensuring client ops are set before enabling the port via
w1_uart_serdev_open().
Fixes: a3c08804364e ("w1: add UART w1 bus driver")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
Acked-by: Christoph Winklhofer <cj.winklhofer@gmail.com>
Link: https://lore.kernel.org/r/20250111181803.2283611-1-chenyuan0y@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Sandeen <sandeen@redhat.com>
Date: Thu Feb 27 11:41:08 2025 -0600
watch_queue: fix pipe accounting mismatch
[ Upstream commit f13abc1e8e1a3b7455511c4e122750127f6bc9b0 ]
Currently, watch_queue_set_size() modifies the pipe buffers charged to
user->pipe_bufs without updating the pipe->nr_accounted on the pipe
itself, due to the if (!pipe_has_watch_queue()) test in
pipe_resize_ring(). This means that when the pipe is ultimately freed,
we decrement user->pipe_bufs by something other than what we had
charged to it, potentially leading to an underflow. This in turn can
cause subsequent too_many_pipe_buffers_soft() tests to fail with -EPERM.
To remedy this, explicitly account for the pipe usage in
watch_queue_set_size() to match the number set via account_pipe_buffers().
(It's unclear why watch_queue_set_size() does not update nr_accounted;
it may be due to intentional overprovisioning in watch_queue_set_size()?)
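The mismatch can be reproduced with a small userspace model of the accounting. The struct and function names below are hypothetical stand-ins for the kernel objects named above, and the 'fixed' flag only exists in this sketch to switch between the old and the corrected behaviour.

```c
struct model_user {
	unsigned long pipe_bufs;      /* total buffers charged to the user */
};

struct model_pipe {
	struct model_user *user;
	unsigned long nr_accounted;   /* what this pipe believes it charged */
};

/* Replace an old charge with a new one, in the spirit of account_pipe_buffers(). */
static void model_account(struct model_user *u, unsigned long old_charge,
			  unsigned long new_charge)
{
	u->pipe_bufs = u->pipe_bufs - old_charge + new_charge;
}

/* Resize: the user charge always changes, nr_accounted only when 'fixed'. */
static void model_set_size(struct model_pipe *p, unsigned long nr_pages,
			   int fixed)
{
	model_account(p->user, p->nr_accounted, nr_pages);
	if (fixed)
		p->nr_accounted = nr_pages;
}

/* Free: give back whatever the pipe believes it charged. */
static void model_free(struct model_pipe *p)
{
	model_account(p->user, p->nr_accounted, 0);
}
```

Starting from 16 charged buffers and resizing to 4, the unfixed path leaves nr_accounted at 16, so freeing subtracts 16 from a balance of 4 and the unsigned counter wraps. With the pipe's own count updated, the balance returns to zero.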
Fixes: e95aada4cb93d ("pipe: wakeup wr_wait after setting max_usage")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Link: https://lore.kernel.org/r/206682a8-0604-49e5-8224-fdbe0c12b460@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li Huafei <lihuafei1@huawei.com>
Date: Tue Oct 22 03:30:03 2024 +0800
watchdog/hardlockup/perf: Fix perf_event memory leak
[ Upstream commit d6834d9c990333bfa433bc1816e2417f268eebbe ]
During stress-testing, we found a kmemleak report for perf_event:
unreferenced object 0xff110001410a33e0 (size 1328):
comm "kworker/4:11", pid 288, jiffies 4294916004
hex dump (first 32 bytes):
b8 be c2 3b 02 00 11 ff 22 01 00 00 00 00 ad de ...;....".......
f0 33 0a 41 01 00 11 ff f0 33 0a 41 01 00 11 ff .3.A.....3.A....
backtrace (crc 24eb7b3a):
[<00000000e211b653>] kmem_cache_alloc_node_noprof+0x269/0x2e0
[<000000009d0985fa>] perf_event_alloc+0x5f/0xcf0
[<00000000084ad4a2>] perf_event_create_kernel_counter+0x38/0x1b0
[<00000000fde96401>] hardlockup_detector_event_create+0x50/0xe0
[<0000000051183158>] watchdog_hardlockup_enable+0x17/0x70
[<00000000ac89727f>] softlockup_start_fn+0x15/0x40
...
Our stress test includes CPU online and offline cycles, and updating the
watchdog configuration.
After reading the code, I found that there may be a race between cleaning up
perf_event after updating watchdog and disabling event when the CPU goes offline:
CPU0 CPU1 CPU2
(update watchdog) (hotplug offline CPU1)
... _cpu_down(CPU1)
cpus_read_lock() // waiting for cpu lock
softlockup_start_all
smp_call_on_cpu(CPU1)
softlockup_start_fn
...
watchdog_hardlockup_enable(CPU1)
perf create E1
watchdog_ev[CPU1] = E1
cpus_read_unlock()
cpus_write_lock()
cpuhp_kick_ap_work(CPU1)
cpuhp_thread_fun
...
watchdog_hardlockup_disable(CPU1)
watchdog_ev[CPU1] = NULL
dead_event[CPU1] = E1
__lockup_detector_cleanup
for each dead_events_mask
release each dead_event
/*
* CPU1 has not been added to
* dead_events_mask, then E1
* will not be released
*/
CPU1 -> dead_events_mask
cpumask_clear(&dead_events_mask)
// dead_events_mask is cleared, E1 is leaked
In this case, the leaked perf_event E1 matches the perf_event leak
reported by kmemleak. Due to the low probability of problem recurrence
(only reported once), I added some hack delays in the code:
static void __lockup_detector_reconfigure(void)
{
...
watchdog_hardlockup_start();
cpus_read_unlock();
+ mdelay(100);
/*
* Must be called outside the cpus locked section to prevent
* recursive locking in the perf code.
...
}
void watchdog_hardlockup_disable(unsigned int cpu)
{
...
perf_event_disable(event);
this_cpu_write(watchdog_ev, NULL);
this_cpu_write(dead_event, event);
+ mdelay(100);
cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
atomic_dec(&watchdog_cpus);
...
}
void hardlockup_detector_perf_cleanup(void)
{
...
perf_event_release_kernel(event);
per_cpu(dead_event, cpu) = NULL;
}
+ mdelay(100);
cpumask_clear(&dead_events_mask);
}
With these delays in place, simultaneously cycling CPUs on/off and
switching the watchdog reproduces the leak almost every time.
The problem here is that releasing perf_event is not within the CPU
hotplug read-write lock. Commit:
941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")
introduced deferred release to solve the deadlock caused by calling
get_online_cpus() when releasing perf_event. Later, commit:
efe951d3de91 ("perf/x86: Fix perf,x86,cpuhp deadlock")
removed the get_online_cpus() call on the perf_event release path to solve
another deadlock problem.
Therefore, it is now possible to move the release of perf_event back
into the CPU hotplug read-write lock, and release the event immediately
after disabling it.
Fixes: 941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241021193004.308303-1-lihuafei1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthias Proske <email@matthias-proske.de>
Date: Wed Feb 12 19:59:35 2025 +0100
wifi: brcmfmac: keep power during suspend if board requires it
[ Upstream commit 8c3170628a9ce24a59647bd24f897e666af919b8 ]
After commit 92cadedd9d5f ("brcmfmac: Avoid keeping power to SDIO card
unless WOWL is used"), the wifi adapter by default is turned off on
suspend and then re-probed on resume.
This conflicts with some embedded boards that must remain powered.
They will fail on resume with:
brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
ieee80211 phy1: brcmf_bus_started: failed: -110
ieee80211 phy1: brcmf_attach: dongle is not responding: err=-110
brcmfmac: brcmf_sdio_firmware_callback: brcmf_attach failed
This commit checks for the Device Tree property 'cap-power-off-cards'.
If this property is not set, it means that we do not have the capability
to power off and should therefore remain powered.
Signed-off-by: Matthias Proske <email@matthias-proske.de>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20250212185941.146958-2-email@matthias-proske.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johannes Berg <johannes.berg@intel.com>
Date: Sun Feb 9 14:34:45 2025 +0200
wifi: iwlwifi: fw: allocate chained SG tables for dump
[ Upstream commit 7774e3920029398ad49dc848b23840593f14d515 ]
The firmware dumps can be pretty big, and since we use single
pages for each SG table entry, even the table itself may end
up being an order-5 allocation. Build chained tables so that
we need not allocate a higher-order table here.
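As a rough worked example of how a single flat table reaches a high allocation order, the sketch below computes the order needed for a table with one entry per page of dump data. The helper name is hypothetical, and the 32-byte entry size and 4 KiB page size used in the note that follows are assumptions (typical for a 64-bit build), not values taken from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical helper: allocation order (log2 of the page count, rounded
 * up) of one flat table holding one entry per page of dump data.
 */
static unsigned int model_table_order(size_t dump_bytes, size_t entry_size,
				      size_t page_size)
{
	size_t entries = (dump_bytes + page_size - 1) / page_size;
	size_t table_bytes = entries * entry_size;
	size_t table_pages = (table_bytes + page_size - 1) / page_size;
	unsigned int order = 0;

	/* Smallest order such that 2^order pages hold the whole table. */
	while (((size_t)1 << order) < table_pages)
		order++;
	return order;
}
```

Under those assumptions a 16 MiB dump needs 4096 entries, or 128 KiB of table, which is an order-5 allocation. Chaining smaller tables keeps each individual allocation small.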
This could be improved and cleaned up, e.g. by using the SG
pool code or simply kvmalloc(), but all of that would require
also updating the devcoredump first since that frees it all,
so we need to be more careful. SG pool might also run against
the CONFIG_ARCH_NO_SG_CHAIN limitation, which is irrelevant
here.
Also use _devcd_free_sgtable() for the error paths now, much
simpler especially since it's in two places now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250209143303.697c7a465ac9.Iea982df46b5c075bfb77ade36f187d99a70c63db@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date: Sun Feb 9 14:34:50 2025 +0200
wifi: iwlwifi: mvm: use the right version of the rate API
[ Upstream commit a03e2082e678ea10d0d8bdf3ed933eb05a8ddbb0 ]
The firmware uses the newer version of the API in recent devices. For
older devices, we translate the rate to the new format.
Don't parse the rate with old parsing macros.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250209143303.13d70cdcbb4e.Ic92193bce4013b70a823cfef250ee79c16cf7c17@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Wetzel <Alexander@wetzel-home.de>
Date: Tue Feb 4 13:31:29 2025 +0100
wifi: mac80211: Cleanup sta TXQs on flush
[ Upstream commit 5b999006e35ea9c11116ddff7e375b256421d0af ]
Drop the sta TXQs on flush when the driver does not support
flush.
ieee80211_set_disassoc() tries to clean up everything for the sta.
But it ignored queued frames in the sta TX queues when the driver
doesn't support the flush driver op.
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Link: https://patch.msgid.link/20250204123129.9162-1-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johannes Berg <johannes.berg@intel.com>
Date: Thu Mar 6 12:37:58 2025 +0200
wifi: mac80211: fix SA Query processing in MLO
[ Upstream commit 9a267ce4a3fca93a34a8881046f97bcf472228c8 ]
When MLO is used and SA Query processing isn't done by
userspace (e.g. wpa_supplicant w/o CONFIG_OCV), then
the mac80211 code kicks in but uses the wrong addresses.
Fix them.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250306123626.bab48bb49061.I9391b22f1360d20ac8c4e92604de23f27696ba8f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Wetzel <Alexander@wetzel-home.de>
Date: Thu Feb 13 22:43:30 2025 +0100
wifi: mac80211: Fix sparse warning for monitor_sdata
commit 861d0445e72e9e33797f2ceef882c74decb16a87 upstream.
Use rcu_access_pointer() to avoid sparse warning in
drv_remove_interface().
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502130534.bVrZZBK0-lkp@intel.com/
Fixes: 646262c71aca ("wifi: mac80211: remove debugfs dir for virtual monitor")
Link: https://patch.msgid.link/20250213214330.6113-1-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date: Thu Mar 6 12:37:55 2025 +0200
wifi: mac80211: flush the station before moving it to UN-AUTHORIZED state
[ Upstream commit 43e04077170799d0e6289f3e928f727e401b3d79 ]
We first want to flush the station to make sure we no longer have any
frames being transmitted to the station before the station is moved to
un-authorized state. Failing to do that will lead to races: a frame may
be sent after the station's state has been changed.
Since the API clearly states that the driver can't fail the sta_state()
transition down the list of states, we can easily flush the station
first, and only then call the driver's sta_state().
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250306123626.450bc40e8b04.I636ba96843c77f13309c15c9fd6eb0c5a52a7976@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Wetzel <Alexander@wetzel-home.de>
Date: Tue Feb 4 17:42:40 2025 +0100
wifi: mac80211: remove debugfs dir for virtual monitor
[ Upstream commit 646262c71aca87bb66945933abe4e620796d6c5a ]
Don't call ieee80211_debugfs_recreate_netdev() for virtual monitor
interface when deleting it.
The virtual monitor interface shouldn't have debugfs entries and trying
to update them will *create* them on deletion.
And when the virtual monitor interface is created/destroyed multiple
times we'll get warnings about debugfs name conflicts.
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Link: https://patch.msgid.link/20250204164240.370153-1-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Date: Tue Feb 18 11:33:42 2025 +0800
wifi: mt76: mt7921: fix kernel panic due to null pointer dereference
commit adc3fd2a2277b7cc0b61692463771bf9bd298036 upstream.
Address a kernel panic caused by a null pointer dereference in the
`mt792x_rx_get_wcid` function. The issue arises because the `deflink` structure
is not properly initialized with the `sta` context. This patch ensures that the
`deflink` structure is correctly linked to the `sta` context, preventing the
null pointer dereference.
BUG: kernel NULL pointer dereference, address: 0000000000000400
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 UID: 0 PID: 470 Comm: mt76-usb-rx phy Not tainted 6.12.13-gentoo-dist #1
Hardware name: /AMD HUDSON-M1, BIOS 4.6.4 11/15/2011
RIP: 0010:mt792x_rx_get_wcid+0x48/0x140 [mt792x_lib]
RSP: 0018:ffffa147c055fd98 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8e9ecb652000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8e9ecb652000
RBP: 0000000000000685 R08: ffff8e9ec6570000 R09: 0000000000000000
R10: ffff8e9ecd2ca000 R11: ffff8e9f22a217c0 R12: 0000000038010119
R13: 0000000080843801 R14: ffff8e9ec6570000 R15: ffff8e9ecb652000
FS: 0000000000000000(0000) GS:ffff8e9f22a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000400 CR3: 000000000d2ea000 CR4: 00000000000006f0
Call Trace:
<TASK>
? __die_body.cold+0x19/0x27
? page_fault_oops+0x15a/0x2f0
? search_module_extables+0x19/0x60
? search_bpf_extables+0x5f/0x80
? exc_page_fault+0x7e/0x180
? asm_exc_page_fault+0x26/0x30
? mt792x_rx_get_wcid+0x48/0x140 [mt792x_lib]
mt7921_queue_rx_skb+0x1c6/0xaa0 [mt7921_common]
mt76u_alloc_queues+0x784/0x810 [mt76_usb]
? __pfx___mt76_worker_fn+0x10/0x10 [mt76]
__mt76_worker_fn+0x4f/0x80 [mt76]
kthread+0xd2/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
---[ end trace 0000000000000000 ]---
Reported-by: Nick Morrow <usbwifi2024@gmail.com>
Closes: https://github.com/morrownr/USB-WiFi/issues/577
Cc: stable@vger.kernel.org
Fixes: 90c10286b176 ("wifi: mt76: mt7925: Update mt792x_rx_get_wcid for per-link STA")
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Tested-by: Salah Coronya <salah.coronya@gmail.com>
Link: https://patch.msgid.link/20250218033343.1999648-1-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Date: Tue Mar 4 19:36:47 2025 +0800
wifi: mt76: mt7925: remove unused acpi function for clc
commit b4ea6fdfc08375aae59c7e7059653b9877171fe4 upstream.
The code for handling ACPI configuration in CLC was copied from the mt7921
driver but is not used in the mt7925 implementation. Remove the
unused functionality to clean up the codebase.
Cc: stable@vger.kernel.org
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20250304113649.867387-4-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jann Horn <jannh@google.com>
Date: Tue Mar 25 03:01:23 2025 +0100
x86/dumpstack: Fix inaccurate unwinding from exception stacks due to misplaced assignment
[ Upstream commit 2c118f50d7fd4d9aefc4533a26f83338b2906b7a ]
Commit:
2e4be0d011f2 ("x86/show_trace_log_lvl: Ensure stack pointer is aligned, again")
was intended to ensure alignment of the stack pointer; but it also moved
the initialization of the "stack" variable down into the loop header.
This was likely intended as a no-op cleanup, since the commit
message does not mention it; however, this caused a behavioral change
because the value of "regs" is different between the two places.
Originally, get_stack_pointer() used the regs provided by the caller; after
that commit, get_stack_pointer() instead uses the regs at the top of the
stack frame the unwinder is looking at. Often, there are no such regs at
all, and "regs" is NULL, causing get_stack_pointer() to fall back to the
task's current stack pointer, which is not what we want here, but probably
happens to mostly work. Other times, the original regs will point to
another regs frame - in that case, the linear guess unwind logic in
show_trace_log_lvl() will start unwinding too far up the stack, causing the
first frame found by the proper unwinder to never be visited, resulting in
a stack trace consisting purely of guess lines.
Fix it by moving the "stack = " assignment back where it belongs.
Fixes: 2e4be0d011f2 ("x86/show_trace_log_lvl: Ensure stack pointer is aligned, again")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250325-2025-03-unwind-fixes-v1-2-acd774364768@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Tue Dec 10 16:16:50 2024 +0100
x86/entry: Add __init to ia32_emulation_override_cmdline()
[ Upstream commit d55f31e29047f2f987286d55928ae75775111fe7 ]
ia32_emulation_override_cmdline() is an early_param() arg and these
are only needed at boot time. In fact, all other early_param() functions
in arch/x86 seem to have '__init' annotation and
ia32_emulation_override_cmdline() is the only exception.
Fixes: a11e097504ac ("x86: Make IA32_EMULATION boot time configurable")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/all/20241210151650.1746022-1-vkuznets%40redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jann Horn <jannh@google.com>
Date: Tue Mar 25 03:01:22 2025 +0100
x86/entry: Fix ORC unwinder for PUSH_REGS with save_ret=1
[ Upstream commit 57e2428f8df8263275344566e02c277648a4b7f1 ]
PUSH_REGS with save_ret=1 is used by interrupt entry helper functions that
initially start with a UNWIND_HINT_FUNC ORC state.
However, save_ret=1 means that we clobber the helper function's return
address (and then later restore the return address further down on the
stack); after that point, the only thing on the stack we can unwind through
is the IRET frame, so use UNWIND_HINT_IRET_REGS until we have a full
pt_regs frame.
( An alternate approach would be to move the pt_regs->di overwrite down
such that it is the final step of pt_regs setup; but I don't want to
rearrange entry code just to make unwinding a tiny bit more elegant. )
Fixes: 9e809d15d6b6 ("x86/entry: Reduce the code footprint of the 'idtentry' macro")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250325-2025-03-unwind-fixes-v1-1-acd774364768@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Gao <chao.gao@intel.com>
Date: Mon Mar 17 22:06:11 2025 +0800
x86/fpu/xstate: Fix inconsistencies in guest FPU xfeatures
[ Upstream commit dda366083e5ff307a4a728757db874bbfe7550be ]
Guest FPUs manage vCPU FPU states. They are allocated via
fpu_alloc_guest_fpstate() and are resized in fpstate_realloc() when XFD
features are enabled.
Since the introduction of guest FPUs, there have been inconsistencies in
the kernel buffer size and xfeatures:
1. fpu_alloc_guest_fpstate() uses fpu_user_cfg since its introduction. See:
69f6ed1d14c6 ("x86/fpu: Provide infrastructure for KVM FPU cleanup")
36487e6228c4 ("x86/fpu: Prepare guest FPU for dynamically enabled FPU features")
2. __fpstate_reset() references fpu_kernel_cfg to set storage attributes.
3. fpu->guest_perm uses fpu_kernel_cfg, affecting fpstate_realloc().
A recent commit in the tip:x86/fpu tree partially addressed the inconsistency
between (1) and (3) by using fpu_kernel_cfg for size calculation in (1),
but left fpu_guest->xfeatures and fpu_guest->perm still referencing
fpu_user_cfg:
https://lore.kernel.org/all/20250218141045.85201-1-stanspas@amazon.de/
1937e18cc3cf ("x86/fpu: Fix guest FPU state buffer allocation size")
The inconsistencies within fpu_alloc_guest_fpstate() and across the
mentioned functions cause confusion.
Fix them by using fpu_kernel_cfg consistently in fpu_alloc_guest_fpstate(),
except for fields related to the UABI buffer. Referencing fpu_kernel_cfg
won't impact functionalities, as:
1. fpu_guest->perm is overwritten shortly in fpu_init_guest_permissions()
with fpstate->guest_perm, which already uses fpu_kernel_cfg.
2. fpu_guest->xfeatures is solely used to check if XFD features are enabled.
Including supervisor xfeatures doesn't affect the check.
Fixes: 36487e6228c4 ("x86/fpu: Prepare guest FPU for dynamically enabled FPU features")
Suggested-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Link: https://lore.kernel.org/r/20250317140613.1761633-1-chao.gao@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benjamin Berg <benjamin.berg@intel.com>
Date: Wed Feb 26 14:31:36 2025 +0100
x86/fpu: Avoid copying dynamic FP state from init_task in arch_dup_task_struct()
[ Upstream commit 5d3b81d4d8520efe888536b6906dc10fd1a228a8 ]
The init_task instance of struct task_struct is statically allocated and
may not contain the full FP state for userspace. As such, limit the copy
to the valid area of both init_task and 'dst' and ensure all memory is
initialized.
Note that the FP state is only needed for userspace, and as such it is
entirely reasonable for init_task to not contain parts of it.
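The "copy the valid area, initialize the rest" idea corresponds to the semantics of the kernel's memcpy_and_pad() helper. The function below is a userspace stand-in written for illustration, not the kernel implementation, and the model_ prefix marks it as hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for memcpy_and_pad(): copy at most dest_len bytes
 * from src, and fill any remaining space in dest with 'pad'.
 */
static void model_memcpy_and_pad(void *dest, size_t dest_len,
				 const void *src, size_t count, int pad)
{
	if (dest_len > count) {
		/* Source is shorter: copy all of it, then pad the tail. */
		memcpy(dest, src, count);
		memset((char *)dest + count, pad, dest_len - count);
	} else {
		/* Destination is the limit: copy only what fits. */
		memcpy(dest, src, dest_len);
	}
}
```

Either way the destination ends up fully initialized and the source is never read past its valid length, regardless of which of the two objects is larger.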
Fixes: 5aaeb5c01c5b ("x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250226133136.816901-1-benjamin@sipsolutions.net
----
v2:
- Fix code if arch_task_struct_size < sizeof(init_task) by using
memcpy_and_pad.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stanislav Spassov <stanspas@amazon.de>
Date: Tue Feb 18 14:10:45 2025 +0000
x86/fpu: Fix guest FPU state buffer allocation size
[ Upstream commit 1937e18cc3cf27e2b3ef70e8c161437051ab7608 ]
Ongoing work on an optimization to batch-preallocate vCPU state buffers
for KVM revealed a mismatch between the allocation sizes used in
fpu_alloc_guest_fpstate() and fpstate_realloc(). While the former
allocates a buffer sized to fit the default set of XSAVE features
in UABI form (as per fpu_user_cfg), the latter uses its ksize argument
derived (for the requested set of features) in the same way as the sizes
found in fpu_kernel_cfg, i.e. using the compacted in-kernel
representation.
The correct size to use for guest FPU state should indeed be the
kernel one as seen in fpstate_realloc(). The original issue likely
went unnoticed through a combination of UABI size typically being
larger than or equal to kernel size, and/or both amounting to the
same number of allocated 4K pages.
Fixes: 69f6ed1d14c6 ("x86/fpu: Provide infrastructure for KVM FPU cleanup")
Signed-off-by: Stanislav Spassov <stanspas@amazon.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250218141045.85201-1-stanspas@amazon.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Naman Jain <namjain@linux.microsoft.com>
Date: Thu Jan 16 06:12:24 2025 +0000
x86/hyperv/vtl: Stop kernel from probing VTL0 low memory
[ Upstream commit 59115e2e25f42924181055ed7cc1d123af7598b7 ]
When Linux runs in a Hyper-V VTL (Virtual Trust Level), the kernel in VTL2
tries to access VTL0 low memory in probe_roms. This memory is not
described in the e820 map. Set the probe_roms call to a no-op
during boot for the VTL2 kernel to avoid this. The issue was identified
in OpenVMM, which detects invalid accesses initiated from a kernel running
in VTL2.
Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Tested-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Roman Kisel <romank@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250116061224.1701-1-namjain@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250116061224.1701-1-namjain@linux.microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tianyu Lan <tiala@microsoft.com>
Date: Thu Mar 13 04:52:17 2025 -0400
x86/hyperv: Fix check of return value from snp_set_vmsa()
commit e792d843aa3c9d039074cdce728d5803262e57a7 upstream.
snp_set_vmsa() returns 0 on success, so fix the check of its return value
accordingly.
Cc: stable@vger.kernel.org
Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest")
Signed-off-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250313085217.45483-1-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250313085217.45483-1-ltykernel@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Kelley <mhklinux@outlook.com>
Date: Wed Feb 26 12:06:06 2025 -0800
x86/hyperv: Fix output argument to hypercall that changes page visibility
[ Upstream commit 09beefefb57bbc3a06d98f319d85db4d719d7bcb ]
The hypercall in hv_mark_gpa_visibility() is invoked with an input
argument and an output argument. The output argument ostensibly returns
the number of pages that were processed. But in fact, the hypercall does
not provide any output, so the output argument is spurious.
The spurious argument is harmless because Hyper-V ignores it, but in the
interest of correctness and to avoid the potential for future problems,
remove it.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250226200612.2062-2-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250226200612.2062-2-mhklinux@outlook.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Wed Feb 26 22:37:05 2025 +0100
x86/Kconfig: Add cmpxchg8b support back to Geode CPUs
commit 6ac43f2be982ea54b75206dccd33f4cf81bfdc39 upstream.
An older cleanup of mine inadvertently removed geode-gx1 and geode-lx
from the list of CPUs that are known to support a working cmpxchg8b.
Fixes: 88a2b4edda3d ("x86/Kconfig: Rework CONFIG_X86_PAE dependency")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250226213714.4040853-2-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Shuai Xue <xueshuai@linux.alibaba.com>
Date: Wed Mar 12 19:28:50 2025 +0800
x86/mce: use is_copy_from_user() to determine copy-from-user context
commit 1a15bb8303b6b104e78028b6c68f76a0d4562134 upstream.
Patch series "mm/hwpoison: Fix regressions in memory failure handling",
v4.
## 1. What am I trying to do:
This patchset resolves two critical regressions related to memory failure
handling that have appeared in the upstream kernel since version 5.17, as
compared to 5.10 LTS.
- copyin case: poison found in user page while kernel copying from user space
- instr case: poison found while instruction fetching in user space
## 2. What is the expected outcome and why
- For copyin case:
Kernel can recover from poison found where kernel is doing get_user() or
copy_from_user() if those places get an error return and the kernel returns
-EFAULT to the process instead of crashing. More specifically, the MCE handler
checks the fixup handler type to decide whether an in-kernel #MC can be
recovered. When EX_TYPE_UACCESS is found, the PC jumps to the recovery code
specified in _ASM_EXTABLE_FAULT() and returns -EFAULT to user space.
- For instr case:
If poison is found while fetching an instruction in user space, full recovery
is possible. The user process takes a #PF, and Linux allocates a new page and
fills it by reading from storage.
## 3. What actually happens and why
- For copyin case: kernel panic since v5.17
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and later patches updated the
extable fixup type for copy-from-user operations, changing it from
EX_TYPE_UACCESS to EX_TYPE_EFAULT_REG. This breaks the previous EX_TYPE_UACCESS
handling when poison is found in get_user() or copy_from_user().
- For instr case: user process is killed by a SIGBUS signal due to #CMCI
and #MCE race
When an uncorrected memory error is consumed there is a race between the
CMCI from the memory controller reporting an uncorrected error with a UCNA
signature, and the core reporting an SRAR signature machine check when
the data is about to be consumed.
### Background: why *UN*corrected errors tied to *C*MCI in Intel platform [1]
Prior to Icelake memory controllers reported patrol scrub events that
detected a previously unseen uncorrected error in memory by signaling a
broadcast machine check with an SRAO (Software Recoverable Action
Optional) signature in the machine check bank. This was overkill because
it's not an urgent problem that no core is on the verge of consuming that
bad data. It's also found that multi SRAO UCE may cause nested MCE
interrupts and finally become an IERR.
Hence, Intel downgrades the machine check bank signature of patrol scrub
from SRAO to UCNA (Uncorrected, No Action required), and signal changed to
#CMCI. Just to add to the confusion, Linux does take an action (in
uc_decode_notifier()) to try to offline the page despite the UC*NA*
signature name.
### Background: why #CMCI and #MCE race when poison is consuming in
Intel platform [1]
Having decided that CMCI/UCNA is the best action for patrol scrub errors,
the memory controller uses it for reads too. But the memory controller is
executing asynchronously from the core, and can't tell the difference
between a "real" read and a speculative read. So it will do CMCI/UCNA if
an error is found in any read.
Thus:
1) Core is clever and thinks address A is needed soon, issues a
speculative read.
2) Core finds it is going to use address A soon after sending the read
request
3) The CMCI from the memory controller is in a race with MCE from the
core that will soon try to retire the load from address A.
Quite often (because speculation has got better) the CMCI from the memory
controller is delivered before the core is committed to the instruction
reading address A, so the interrupt is taken, and Linux offlines the page
(marking it as poison).
## Why user process is killed for instr case
Commit 046545a661af ("mm/hwpoison: fix error page recovered but reported
"not recovered"") tries to fix the noisy message "Memory error not recovered"
and skips duplicate SIGBUSs due to the race. But it also introduced a bug:
kill_accessing_process() returns -EHWPOISON for the instr case and, as a
result, kill_me_maybe() sends a SIGBUS to the user process.
## 4. The fix, in my opinion, should be:
- For copyin case:
The key point is whether the error context is in a read from user memory.
We do not care about the ex-type if we know it's a MOV reading from
userspace.
is_copy_from_user() returns true when both of the following two checks are
true:
- the current instruction is copy
- source address is user memory
If copy_user is true, we set
m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
Then do_machine_check() will try fixup_exception() first.
- For instr case: let kill_accessing_process() return 0 to prevent a SIGBUS.
- For patch 3:
The return value of memory_failure() turned out to be quite important while
discussing the instr case regression with Tony and Miaohe for patch 2, so add
a comment about the return value.
This patch (of 3):
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
("x86/futex: Remove .fixup usage") updated the extable fixup type for
copy-from-user operations, changing it from EX_TYPE_UACCESS to
EX_TYPE_EFAULT_REG. Consequently, the error context for copy-from-user
operations no longer functions as an in-kernel recovery context, resulting
in kernel panics with the message:
"Machine check: Data load in unrecoverable area of kernel."
To address this, it is crucial to identify whether an error context involves a
read operation from user memory. The function is_copy_from_user() can be
used to check that:
- the current operation is a copy
- the source address is in user memory
When these conditions are met, is_copy_from_user() will return true,
confirming that it is indeed a direct copy from user memory. This check
is essential for correctly handling the context of errors in these
operations without relying on the extable fixup types that previously
allowed for in-kernel recovery.
So, use is_copy_from_user() to determine whether a context is a direct copy
from user memory.
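The decision described in this message can be sketched as a small
classification function. Everything here is illustrative: the struct, the
sketch_* names and the flag values are assumptions made for the example and
are not the kernel's definitions; only the rule itself (copy instruction
plus user-memory source implies both COPYIN and RECOV) comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values only; not taken from the kernel headers. */
#define SKETCH_IN_KERNEL_RECOV  (1u << 0)
#define SKETCH_IN_KERNEL_COPYIN (1u << 1)

/* Hypothetical summary of the faulting context. */
struct sketch_ctx {
	bool insn_is_copy;	/* the current instruction is a copy */
	bool src_is_user;	/* its source address is user memory */
};

/* True only when both checks hold. */
static bool sketch_is_copy_from_user(const struct sketch_ctx *c)
{
	return c->insn_is_copy && c->src_is_user;
}

/* Mark the error recoverable without consulting the extable fixup type. */
static uint32_t sketch_classify(const struct sketch_ctx *c)
{
	uint32_t kflags = 0;

	if (sketch_is_copy_from_user(c))
		kflags |= SKETCH_IN_KERNEL_COPYIN | SKETCH_IN_KERNEL_RECOV;
	return kflags;
}
```

A context that fails either check gets no recovery flags in this sketch.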
Link: https://lkml.kernel.org/r/20250312112852.82415-1-xueshuai@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250312112852.82415-2-xueshuai@linux.alibaba.com
Fixes: 4c132d1d844a ("x86/futex: Remove .fixup usage")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Thu Mar 27 19:05:02 2025 -0400
x86/microcode/AMD: Fix __apply_microcode_amd()'s return value
commit 31ab12df723543047c3fc19cb8f8c4498ec6267f upstream.
When verify_sha256_digest() fails, __apply_microcode_amd() should propagate
the failure by returning false (and not -1 which is promoted to true).
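Why -1 ends up as true can be shown with plain C: any non-zero value
converted to bool becomes true. The function names below are hypothetical
and only model the return path, not the microcode loader itself.

```c
#include <stdbool.h>

/* Buggy shape: the error value -1 is converted to bool and becomes true. */
static bool sketch_apply_buggy(bool digest_ok)
{
	int err = -1;

	if (!digest_ok)
		return err;	/* non-zero, so the caller sees "success" */
	return true;
}

/* Fixed shape: a failed check is reported as false. */
static bool sketch_apply_fixed(bool digest_ok)
{
	if (!digest_ok)
		return false;
	return true;
}
```

With a failing digest, the buggy variant reports success to its caller while
the fixed variant reports failure.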
Fixes: 50cef76d5cb0 ("x86/microcode/AMD: Load only SHA256-checksummed patches")
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250327230503.1850368-2-boris.ostrovsky@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mike Rapoport (Microsoft) <rppt@kernel.org>
Date: Sun Jan 26 09:47:25 2025 +0200
x86/mm/pat: cpa-test: fix length for CPA_ARRAY test
[ Upstream commit 33ea120582a638b2f2e380a50686c2b1d7cce795 ]
The CPA_ARRAY test always uses len[1] as numpages argument to
change_page_attr_set() although the addresses array is different each
iteration of the test loop.
Replace len[1] with len[i] to have numpages matching the addresses array.
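The indexing mistake can be reproduced in a few lines. This sketch assumes a
hypothetical sketch_touch_pages() standing in for the attribute-changing
call; it only reports how many pages it was asked to handle, so the two loops
can be compared.

```c
#include <stddef.h>

#define SKETCH_NTEST 4

/* Hypothetical stand-in: returns the number of pages it was asked to touch. */
static size_t sketch_touch_pages(const unsigned long *addrs, size_t numpages)
{
	(void)addrs;
	return numpages;
}

/* Buggy loop: every iteration passes len[1], whatever i is. */
static size_t sketch_total_buggy(const size_t len[SKETCH_NTEST])
{
	size_t total = 0;

	for (size_t i = 0; i < SKETCH_NTEST; i++)
		total += sketch_touch_pages(NULL, len[1]);
	return total;
}

/* Fixed loop: numpages matches the array used in iteration i. */
static size_t sketch_total_fixed(const size_t len[SKETCH_NTEST])
{
	size_t total = 0;

	for (size_t i = 0; i < SKETCH_NTEST; i++)
		total += sketch_touch_pages(NULL, len[i]);
	return total;
}
```

For lengths {1, 2, 3, 4} the buggy loop covers 8 pages and the fixed loop
covers 10, which shows the mismatch between addresses and page counts.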
Fixes: ecc729f1f471 ("x86/mm/cpa: Add ARRAY and PAGES_ARRAY selftests")
Signed-off-by: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250126074733.1384926-2-rppt@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Hildenbrand <david@redhat.com>
Date: Fri Mar 21 12:23:23 2025 +0100
x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()
[ Upstream commit dc84bc2aba85a1508f04a936f9f9a15f64ebfb31 ]
If track_pfn_copy() fails, we already added the dst VMA to the maple
tree. As fork() fails, we'll cleanup the maple tree, and stumble over
the dst VMA for which we neither performed any reservation nor copied
any page tables.
Consequently untrack_pfn() will see VM_PAT and try obtaining the
PAT information from the page table -- which fails because the page
table was not copied.
The easiest fix would be to simply clear the VM_PAT flag of the dst VMA
if track_pfn_copy() fails. However, the whole idea of "simply"
clearing the VM_PAT flag is shaky as well: if we passed track_pfn_copy()
and performed a reservation, but copying the page tables fails, we'll
simply clear the VM_PAT flag, not properly undoing the reservation ...
which is also wrong.
So let's fix it properly: set the VM_PAT flag only if the reservation
succeeded (leaving it clear initially), and undo the reservation if
anything goes wrong while copying the page tables: clearing the VM_PAT
flag after undoing the reservation.
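The ordering described above can be modelled with a tiny state machine. This
is a sketch under stated assumptions: struct sketch_vma and
sketch_copy_range() are hypothetical, and a counter stands in for the real
PAT reservation.

```c
#include <stdbool.h>

/* Hypothetical model of the destination VMA state. */
struct sketch_vma {
	bool pat;		/* models the VM_PAT flag */
	int reservations;	/* models outstanding PAT reservations */
};

static int sketch_copy_range(struct sketch_vma *dst,
			     bool reserve_ok, bool copy_ok)
{
	dst->pat = false;		/* flag stays clear initially */

	if (!reserve_ok)
		return -1;		/* nothing reserved, nothing to undo */

	dst->reservations++;
	dst->pat = true;		/* set only after a successful reservation */

	if (!copy_ok) {
		dst->reservations--;	/* undo the reservation first ... */
		dst->pat = false;	/* ... then clear the flag */
		return -1;
	}
	return 0;
}
```

In both failure cases the model ends with the flag clear and no reservation
outstanding, so later cleanup has nothing PAT-related to look up.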
Note that any copied page table entries will get zapped when the VMA will
get removed later, after copy_page_range() succeeded; as VM_PAT is not set
then, we won't try cleaning VM_PAT up once more and untrack_pfn() will be
happy. Note that leaving these page tables in place without a reservation
is not a problem, as we are aborting fork(); this process will never run.
A reproducer can trigger this usually at the first try:
https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/reproducers/pat_fork.c
WARNING: CPU: 26 PID: 11650 at arch/x86/mm/pat/memtype.c:983 get_pat_info+0xf6/0x110
Modules linked in: ...
CPU: 26 UID: 0 PID: 11650 Comm: repro3 Not tainted 6.12.0-rc5+ #92
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:get_pat_info+0xf6/0x110
...
Call Trace:
<TASK>
...
untrack_pfn+0x52/0x110
unmap_single_vma+0xa6/0xe0
unmap_vmas+0x105/0x1f0
exit_mmap+0xf6/0x460
__mmput+0x4b/0x120
copy_process+0x1bf6/0x2aa0
kernel_clone+0xab/0x440
__do_sys_clone+0x66/0x90
do_syscall_64+0x95/0x180
Likely this case was missed in:
d155df53f310 ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed")
... and instead of undoing the reservation we simply cleared the VM_PAT flag.
Keep the documentation of these functions in include/linux/pgtable.h,
one place is more than sufficient -- we should clean that up for the other
functions like track_pfn_remap/untrack_pfn separately.
Fixes: d155df53f310 ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed")
Fixes: 2ab640379a0a ("x86: PAT: hooks in generic vm code to help archs to track pfnmap regions - v3")
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yuxin wang <wang1315768607@163.com>
Reported-by: Marius Fleischer <fleischermarius@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250321112323.153741-1-david@redhat.com
Closes: https://lore.kernel.org/lkml/CABOYnLx_dnqzpCW99G81DmOr+2UzdmZMk=T3uxwNxwz+R1RAwg@mail.gmail.com/
Closes: https://lore.kernel.org/lkml/CAJg=8jwijTP5fre8woS4JVJQ8iUA6v+iNcsOgtj9Zfpc3obDOQ@mail.gmail.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jann Horn <jannh@google.com>
Date: Fri Jan 3 19:39:38 2025 +0100
x86/mm: Fix flush_tlb_range() when used for zapping normal PMDs
commit 3ef938c3503563bfc2ac15083557f880d29c2e64 upstream.
On the following path, flush_tlb_range() can be used for zapping normal
PMD entries (PMD entries that point to page tables) together with the PTE
entries in the pointed-to page table:
collapse_pte_mapped_thp
pmdp_collapse_flush
flush_tlb_range
The arm64 version of flush_tlb_range() has a comment describing that it can
be used for page table removal, and does not use any last-level
invalidation optimizations. Fix the X86 version by making it behave the
same way.
Currently, X86 only uses this information for the following two purposes,
which I think means the issue doesn't have much impact:
- In native_flush_tlb_multi() for checking if lazy TLB CPUs need to be
IPI'd to avoid issues with speculative page table walks.
- In Hyper-V TLB paravirtualization, again for lazy TLB stuff.
The patch "x86/mm: only invalidate final translations with INVLPGB" which
is currently under review (see
<https://lore.kernel.org/all/20241230175550.4046587-13-riel@surriel.com/>)
would probably be making the impact of this a lot worse.
Fixes: 016c4d92cd16 ("x86/mm/tlb: Add freed_tables argument to flush_tlb_mm_range")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20250103-x86-collapse-flush-fix-v1-1-3c521856cfa6@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Wed Feb 26 22:37:14 2025 +0100
x86/platform: Only allow CONFIG_EISA for 32-bit
[ Upstream commit 976ba8da2f3c2f1e997f4f620da83ae65c0e3728 ]
The CONFIG_EISA menu was cleaned up in 2018, but this inadvertently
brought the option back on 64-bit machines: ISA remains guarded by
a CONFIG_X86_32 check, but EISA no longer depends on ISA.
The last Intel machines with EISA support used a 82375EB PCI/EISA bridge
from 1993 that could be paired with the 440FX chipset on early Pentium-II
CPUs, long before the first x86-64 products.
Fixes: 6630a8e50105 ("eisa: consolidate EISA Kconfig entry in drivers/eisa")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250226213714.4040853-11-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: James Morse <james.morse@arm.com>
Date: Tue Mar 11 18:36:46 2025 +0000
x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors
[ Upstream commit a121798ae669351ec0697c94f71c3a692b2a755b ]
Commit
6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")
added logic that causes resctrl to search for the CLOSID with the fewest dirty
cache lines when creating a new control group, if requested by the arch code.
This depends on the values read from the llc_occupancy counters. The logic is
applicable to architectures where the CLOSID effectively forms part of the
monitoring identifier and so do not allow complete freedom to choose an unused
monitoring identifier for a given CLOSID.
This support missed that some platforms may not have these counters. This
causes a NULL pointer dereference when creating a new control group as the
array was not allocated by dom_data_init().
As this feature isn't necessary on platforms that don't have cache occupancy
monitors, add this to the check that occurs when a new control group is
allocated.
Fixes: 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64
Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20250311183715.16445-2-james.morse@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kevin Loughlin <kevinloughlin@google.com>
Date: Fri Nov 22 20:23:22 2024 +0000
x86/sev: Add missing RIP_REL_REF() invocations during sme_enable()
[ Upstream commit 72dafb567760320f2de7447cd6e979bf9d4e5d17 ]
The following commit:
1c811d403afd ("x86/sev: Fix position dependent variable references in startup code")
introduced RIP_REL_REF() to force RIP-relative accesses to global variables,
as needed to prevent crashes during early SEV/SME startup code.
For completeness, RIP_REL_REF() should be used with additional variables during
sme_enable():
https://lore.kernel.org/all/CAMj1kXHnA0fJu6zh634=fbJswp59kSRAbhW+ubDGj1+NYwZJ-Q@mail.gmail.com/
Access these vars with RIP_REL_REF() to prevent the problem from recurring.
Fixes: 1c811d403afd ("x86/sev: Fix position dependent variable references in startup code")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20241122202322.977678-1-kevinloughlin@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladis Dronov <vdronov@redhat.com>
Date: Sun Mar 9 18:22:16 2025 +0100
x86/sgx: Warn explicitly if X86_FEATURE_SGX_LC is not enabled
[ Upstream commit 65be5c95d08eedda570a6c888a12384c77fe7614 ]
The kernel requires X86_FEATURE_SGX_LC to be able to create SGX enclaves,
not just X86_FEATURE_SGX.
Quite a lot of hardware has X86_FEATURE_SGX but not
X86_FEATURE_SGX_LC. A kernel running on such hardware does not create
the /dev/sgx_enclave file and does so silently.
Explicitly warn if X86_FEATURE_SGX_LC is not enabled to properly notify
users that the kernel disabled the SGX driver.
The X86_FEATURE_SGX_LC, a.k.a. SGX Launch Control, is a CPU feature
that enables LE (Launch Enclave) hash MSRs to be writable (with
additional opt-in required in the 'feature control' MSR) when running
enclaves, i.e. using a custom root key rather than the Intel proprietary
key for enclave signing.
I've hit this issue myself and have spent some time researching where
my /dev/sgx_enclave file went on SGX-enabled hardware.
Related links:
https://github.com/intel/linux-sgx/issues/837
https://patchwork.kernel.org/project/platform-driver-x86/patch/20180827185507.17087-3-jarkko.sakkinen@linux.intel.com/
[ mingo: Made the error message a bit more verbose, and added other cases
where the kernel fails to create the /dev/sgx_enclave device node. ]
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Kai Huang <kai.huang@intel.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250309172215.21777-2-vdronov@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vishal Annapurve <vannapurve@google.com>
Date: Fri Feb 28 01:44:15 2025 +0000
x86/tdx: Fix arch_safe_halt() execution for TDX VMs
commit 9f98a4f4e7216dbe366010b4cdcab6b220f229c4 upstream.
Direct HLT instruction execution causes #VEs for TDX VMs which is routed
to hypervisor via TDCALL. If HLT is executed in STI-shadow, resulting #VE
handler will enable interrupts before TDCALL is routed to hypervisor
leading to missed wakeup events, as current TDX spec doesn't expose
interruptibility state information to allow #VE handler to selectively
enable interrupts.
Commit bfe6ed0c6727 ("x86/tdx: Add HLT support for TDX guests")
prevented the idle routines from executing HLT instruction in STI-shadow.
But it missed the paravirt routine which can be reached via this path
as an example:
kvm_wait() =>
safe_halt() =>
raw_safe_halt() =>
arch_safe_halt() =>
irq.safe_halt() =>
pv_native_safe_halt()
To reliably handle arch_safe_halt() for TDX VMs, introduce explicit
dependency on CONFIG_PARAVIRT and override paravirt halt()/safe_halt()
routines with TDX-safe versions that execute direct TDCALL and needed
interrupt flag updates. Executing direct TDCALL brings in additional
benefit of avoiding HLT related #VEs altogether.
As tested by Ryan Afranji:
"Tested with the specjbb2015 benchmark. It has heavy lock contention which leads
to many halt calls. TDX VMs suffered a poor score before this patchset.
Verified the major performance improvement with this patchset applied."
Fixes: bfe6ed0c6727 ("x86/tdx: Add HLT support for TDX guests")
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Ryan Afranji <afranji@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250228014416.3925664-3-vannapurve@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Fri Mar 14 12:28:59 2025 -0700
x86/traps: Make exc_double_fault() consistently noreturn
[ Upstream commit 8085fcd78c1a3dbdf2278732579009d41ce0bc4e ]
The CONFIG_X86_ESPFIX64 version of exc_double_fault() can return to its
caller, but the !CONFIG_X86_ESPFIX64 version never does. In the latter
case the compiler and/or objtool may consider it to be implicitly
noreturn.
However, due to the currently inflexible way objtool detects noreturns,
a function's noreturn status needs to be consistent across configs.
The current workaround for this issue is to suppress unreachable
warnings for exc_double_fault()'s callers. Unfortunately that can
result in ORC coverage gaps and potentially worse issues like inert
static calls and silently disabled CPU mitigations.
Instead, prevent exc_double_fault() from ever being implicitly marked
noreturn by forcing a return behind a never-taken conditional.
Until a more integrated noreturn detection method exists, this is likely
the least objectionable workaround.
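One way to put a call behind a never-taken (or always-taken) conditional that
the compiler cannot see through is to hide a constant behind an empty inline
asm. The sketch below assumes a GCC or Clang compatible compiler; the
sketch_* names are hypothetical and a counting callback stands in for the
fatal path, so this is not presented as the code the patch adds.

```c
#include <stdbool.h>

/*
 * Returns true, but the empty asm makes the value opaque to the optimizer
 * (GCC/Clang extension), so code after a conditional on it stays reachable
 * from the compiler's point of view.
 */
static inline bool sketch_always_true(void)
{
	bool x = true;

	__asm__ volatile("" : "+r"(x));
	return x;
}

static int sketch_fatal_calls;

/* Stand-in for the fatal path; a real one would not return. */
static void sketch_fatal(void)
{
	sketch_fatal_calls++;
}

/* The handler cannot be inferred to be noreturn: a return path exists. */
static int sketch_handler(void (*fatal)(void))
{
	if (sketch_always_true())
		fatal();
	return 0;
}
```

Because the condition is opaque, the function keeps an ordinary return in
every configuration, which is the consistency the message asks for.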
Fixes: 55eeab2a8a11 ("objtool: Ignore exc_double_fault() __noreturn warnings")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/d1f4026f8dc35d0de6cc61f2684e0cb6484009d1.1741975349.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guilherme G. Piccoli <gpiccoli@igalia.com>
Date: Sat Feb 15 17:58:16 2025 -0300
x86/tsc: Always save/restore TSC sched_clock() on suspend/resume
commit d90c9de9de2f1712df56de6e4f7d6982d358cabe upstream.
TSC could be reset in deep ACPI sleep states, even with invariant TSC.
That's the reason we have sched_clock() save/restore functions, to deal
with this situation. But what happens is that such functions are guarded
with a check for the stability of sched_clock - if not considered stable,
the save/restore routines aren't executed.
On top of that, we have a clear comment in native_sched_clock() saying
that *even* with TSC unstable, we continue using TSC for sched_clock due
to its speed.
In other words, if we have a situation of TSC getting detected as unstable,
it marks the sched_clock as unstable as well, so subsequent S3 sleep cycles
could bring bogus sched_clock values due to the lack of the save/restore
mechanism, causing warnings like this:
[22.954918] ------------[ cut here ]------------
[22.954923] Delta way too big! 18446743750843854390 ts=18446744072977390405 before=322133536015 after=322133536015 write stamp=18446744072977390405
[22.954923] If you just came from a suspend/resume,
[22.954923] please switch to the trace global clock:
[22.954923] echo global > /sys/kernel/tracing/trace_clock
[22.954923] or add trace_clock=global to the kernel command line
[22.954937] WARNING: CPU: 2 PID: 5728 at kernel/trace/ring_buffer.c:2890 rb_add_timestamp+0x193/0x1c0
Notice that the above was reproduced even with "trace_clock=global".
The fix for that is to _always_ save/restore the sched_clock on suspend
cycle _if TSC is used_ as sched_clock - only if we fall back to jiffies
does the sched_clock_stable() check become relevant to save/restore the
sched_clock.
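The change in the gating condition can be written as two predicates. The
function and parameter names are hypothetical; the sketch only encodes the
rule stated above, not the actual suspend/resume code.

```c
#include <stdbool.h>

/* Old rule as described: save/restore only if sched_clock is stable. */
static bool sketch_save_old(bool tsc_is_sched_clock, bool clock_stable)
{
	(void)tsc_is_sched_clock;
	return clock_stable;
}

/*
 * New rule: always save/restore when TSC backs sched_clock; the stability
 * check only matters for the jiffies fallback.
 */
static bool sketch_save_new(bool tsc_is_sched_clock, bool clock_stable)
{
	return tsc_is_sched_clock || clock_stable;
}
```

The case that differs is an unstable clock that still uses TSC: the old rule
skips the save/restore, the new rule performs it.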
Debugged-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250215210314.351480-1-gpiccoli@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Herton R. Krzesinski <herton@redhat.com>
Date: Thu Mar 20 11:22:13 2025 -0300
x86/uaccess: Improve performance by aligning writes to 8 bytes in copy_user_generic(), on non-FSRM/ERMS CPUs
[ Upstream commit b5322b6ec06a6c58650f52abcd2492000396363b ]
History of the performance regression:
======================================
Since the following series of user copy updates were merged upstream
~2 years ago via:
a5624566431d ("Merge branch 'x86-rep-insns': x86 user copy clarifications")
.. copy_user_generic() on x86_64 stopped aligning the
writes to the destination to an 8-byte boundary for the non-FSRM case.
Previously, this was done through the ALIGN_DESTINATION macro that
was used in the now removed copy_user_generic_unrolled function.
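What aligning destination writes to 8 bytes means can be sketched in portable
C. This is only an illustration of the idea, not the assembly in question:
the sketch_* helpers are hypothetical and memcpy() stands in for the
byte-wise and word-wise moves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes to copy first so that dst lands on an 8-byte boundary. */
static size_t sketch_head_bytes(const void *dst, size_t len)
{
	size_t head = (8 - ((uintptr_t)dst & 7)) & 7;

	return head < len ? head : len;
}

/* Copy with the bulk of the writes going to 8-byte aligned addresses. */
static void sketch_copy_aligned(unsigned char *dst,
				const unsigned char *src, size_t len)
{
	size_t head = sketch_head_bytes(dst, len);

	memcpy(dst, src, head);		/* unaligned prefix */
	dst += head;
	src += head;
	len -= head;

	while (len >= 8) {		/* aligned 8-byte writes */
		memcpy(dst, src, 8);
		dst += 8;
		src += 8;
		len -= 8;
	}
	memcpy(dst, src, len);		/* remaining tail bytes */
}
```

For a destination that sits 3 bytes past an 8-byte boundary, the prefix is
5 bytes, after which every 8-byte store is aligned.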
Turns out this change causes some loss of performance/throughput on
some use cases and specific CPU/platforms without FSRM and ERMS.
Lately I got two reports of performance/throughput issues after a
RHEL 9 kernel pulled the same upstream series with updates to user
copy functions. Both reports involved specific networking/TCP-related
tests run with iperf3.
Partial upstream fix
====================
The first report concerned Linux bridge testing using VMs on a
specific machine with an AMD CPU (EPYC 7402). After a brief
investigation it turned out that the later change via:
ca96b162bfd2 ("x86: bring back rep movsq for user access on CPUs without ERMS")
... helped/fixed the performance issue.
However, after that later commit/fix was applied, I got another
regression report, this time in a multistream TCP test on a 100Gbit mlx5
NIC, also running on an AMD-based platform (AMD EPYC 7302 CPU) and again
using iperf3 to run the test. That regression showed up with the later
fix/commit already applied, so that fix alone did not tell the whole story.
Testing performed to pinpoint residual regression
=================================================
So I narrowed down the second regression use case by running it on
localhost, without traffic through a NIC, in order to focus on CPU usage
and avoid being limited by other factors such as network bandwidth.
I used another system, also with an AMD CPU (AMD EPYC 7742). Basically,
I ran iperf3 in server and client mode on the same system, for example:
- Start the server binding it to CPU core/thread 19:
$ taskset -c 19 iperf3 -D -s -B 127.0.0.1 -p 12000
- Start the client always binding/running on CPU core/thread 17, using
perf to get statistics:
$ perf stat -o stat.txt taskset -c 17 iperf3 -c 127.0.0.1 -b 0/1000 -V \
-n 50G --repeating-payload -l 16384 -p 12000 --cport 12001 2>&1 \
> stat-19.txt
The client was always pinned to CPU 17. For iperf3 in server mode, I did
test runs pinned to CPU 19, 21 or 23, or not pinned to any specific CPU.
So the test basically consisted of four runs of the same commands,
changing only the CPU to which the server is pinned, or not pinning it
at all by removing the taskset call before the server command. The CPUs
were chosen based on the NUMA node they are on; this is the relevant
output of lscpu on the system:
$ lscpu
...
Model name: AMD EPYC 7742 64-Core Processor
...
Caches (sum of all):
L1d: 2 MiB (64 instances)
L1i: 2 MiB (64 instances)
L2: 32 MiB (64 instances)
L3: 256 MiB (16 instances)
NUMA:
NUMA node(s): 4
NUMA node0 CPU(s): 0,1,8,9,16,17,24,25,32,33,40,41,48,49,56,57,64,65,72,73,80,81,88,89,96,97,104,105,112,113,120,121
NUMA node1 CPU(s): 2,3,10,11,18,19,26,27,34,35,42,43,50,51,58,59,66,67,74,75,82,83,90,91,98,99,106,107,114,115,122,123
NUMA node2 CPU(s): 4,5,12,13,20,21,28,29,36,37,44,45,52,53,60,61,68,69,76,77,84,85,92,93,100,101,108,109,116,117,124,125
NUMA node3 CPU(s): 6,7,14,15,22,23,30,31,38,39,46,47,54,55,62,63,70,71,78,79,86,87,94,95,102,103,110,111,118,119,126,127
...
So for the server runs, I picked CPUs that are not on the same node as
the client. With that setup I was able to measure relevant performance
differences when changing the alignment of the writes to the
destination in copy_user_generic().
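In the lscpu output above, CPUs appear to be assigned to nodes in pairs,
round-robin over the 4 nodes. The hypothetical helper below encodes that
pattern for this one machine only (it is an assumption read off the
listing, not a general rule, and not a kernel or libnuma API). It shows
that client CPU 17 is on node0 while server CPUs 19, 21 and 23 are on
nodes 1, 2 and 3:

```c
#include <assert.h>

/*
 * Hypothetical helper: NUMA node of a CPU on the EPYC 7742 test system,
 * derived from the lscpu listing (CPUs 2k and 2k+1 share node k % 4).
 * Only valid for that specific topology.
 */
static int test_box_numa_node(int cpu)
{
	assert(cpu >= 0 && cpu < 128);	/* the listing covers CPUs 0..127 */
	return (cpu / 2) % 4;
}
```

On a real system the mapping should be taken from lscpu or sysfs rather
than from a formula like this.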
Testing shows up to +81% performance improvement under iperf3
=============================================================
Here's a summary of the iperf3 runs:
# Vanilla upstream alignment:
             CPU   RATE           SYS           TIME          sender-receiver
Server bind   19:  13.0Gbits/sec  28.371851000  33.233499566  86.9%-70.8%
Server bind   21:  12.9Gbits/sec  28.283381000  33.586486621  85.8%-69.9%
Server bind   23:  11.1Gbits/sec  33.660190000  39.012243176  87.7%-64.5%
Server bind none:  18.9Gbits/sec  19.215339000  22.875117865  86.0%-80.5%
# With the attached patch (aligning writes in non ERMS/FSRM case):
             CPU   RATE           SYS           TIME          sender-receiver
Server bind   19:  20.8Gbits/sec  14.897284000  20.811101382  75.7%-89.0%
Server bind   21:  20.4Gbits/sec  15.205055000  21.263165909  75.4%-89.7%
Server bind   23:  20.2Gbits/sec  15.433801000  21.456175000  75.5%-89.8%
Server bind none:  26.1Gbits/sec  12.534022000  16.632447315  79.8%-89.6%
So I consistently got better results when aligning the writes. The
results above were obtained on 6.14.0-rc6/rc7 based kernels. SYS is the
sys time and TIME is the total time to run/transfer 50G of data. The
last field is the CPU usage of the sender/receiver iperf3 processes.
It's also worth noting that each pair of iperf3 runs may give slightly
different results, but I always got consistently higher results with
the write alignment for this specific test of running the processes
on CPUs in different NUMA nodes.
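The relative gains can be recomputed from the RATE column of the two
tables above. The sketch below is only a convenience calculation; the
struct and function names are made up for this example, and the rates
are the rounded values as printed:

```c
#include <assert.h>
#include <stddef.h>

/* RATE column from the two tables, in Gbits/sec (rounded as printed). */
struct run {
	const char *server_cpu;
	double vanilla;
	double patched;
};

static const struct run runs[] = {
	{ "19",   13.0, 20.8 },
	{ "21",   12.9, 20.4 },
	{ "23",   11.1, 20.2 },
	{ "none", 18.9, 26.1 },
};

/* Relative throughput gain of the patched kernel, in percent. */
static double gain_percent(const struct run *r)
{
	assert(r->vanilla > 0.0);
	return (r->patched - r->vanilla) / r->vanilla * 100.0;
}

/* Index into runs[] of the run with the largest relative gain. */
static size_t best_run(void)
{
	size_t best = 0;

	for (size_t i = 1; i < sizeof(runs) / sizeof(runs[0]); i++)
		if (gain_percent(&runs[i]) > gain_percent(&runs[best]))
			best = i;
	return best;
}
```

With these rounded rates the largest gain is the run with the server
bound to CPU 23, at just under 82%, in line with the "+81%" in the
heading; the unpinned run gains roughly 38%.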
Linus Torvalds helped/provided this version of the patch. Initially I
proposed a version which aligned writes for all cases in
rep_movs_alternative, but it used two extra registers. Linus then
provided an enhanced version that only aligns the write in the
large_movsq case and doesn't require extra registers. That is
sufficient, since the problem only happens on AMD CPUs without
ERMS/FSRM like the ones mentioned above. I also validated that
aligning only in the large_movsq case is really enough to get the
performance back.
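One way to align the destination of a large copy without a byte loop is
to copy the first 8 bytes unaligned and then advance both pointers to
the next 8-byte boundary of the destination, letting the aligned part
overlap the bytes already written. The user-space C sketch below
illustrates that general idea under the assumption that the copy is at
least 8 bytes long; copy_align_dst() is a hypothetical name, and this is
not a transcription of the assembly in the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: copy len >= 8 bytes with 8-byte aligned bulk writes. */
static void copy_align_dst(unsigned char *dst, const unsigned char *src,
			   size_t len)
{
	size_t skip, words, rem, i;

	assert(len >= 8);

	/* First, possibly unaligned, 8-byte word. */
	memcpy(dst, src, 8);

	/* Distance to the next aligned destination address: 1..8 bytes. */
	skip = 8 - ((uintptr_t)dst & 7u);
	dst += skip;
	src += skip;
	len -= skip;

	/* Aligned 8-byte writes, then the remaining 0..7 tail bytes. */
	words = len / 8;
	rem = len % 8;
	for (i = 0; i < words; i++)
		memcpy(dst + 8 * i, src + 8 * i, 8);
	memcpy(dst + 8 * words, src + 8 * words, rem);
}
```

Because skip is at most 8, every byte between the original and the
aligned destination was already written by the first copy, so no data
is missed and only the source reads may remain unaligned.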
I also tested this patch on an old Intel-based non-ERMS/FSRM system
(with Xeon E5-2667 - Sandy Bridge based) and saw no problems: no
performance improvement, but no regression either, using the same
iperf3-based benchmark. Also, newer Intel processors after Sandy Bridge
usually have ERMS and should not be affected by this change.
[ mingo: Updated the changelog. ]
Fixes: ca96b162bfd2 ("x86: bring back rep movsq for user access on CPUs without ERMS")
Fixes: 034ff37d3407 ("x86: rewrite '__copy_user_nocache' function")
Reported-by: Ondrej Lichtner <olichtne@redhat.com>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250320142213.2623518-1-herton@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>