-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kernel fixes #4
Kernel fixes #4
Conversation
shankerd04
commented
Jan 6, 2023
- Two patches are for defconfig
- EFI-RTC fix, working with EFI & RTC maintainers
- Increase the number of IRQ descriptors, started discussion with the maintainer
Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
This reverts commit 93f3643. Signed-off-by: Dimitri John Ledkov <[email protected]>
…ter) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Dimitri John Ledkov <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Dimitri John Ledkov <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1991664 cbd/kteam-tools have been updated to install gcc-12 toolchain. dkms dynamically checks and tries to use the same compiler version as the kernel build. When backporting, the toolchain version should be set in full to the versioned gcc-12, make it so. This is to support building dkms modules with matching gcc in jammy. Signed-off-by: Dimitri John Ledkov <[email protected]> Signed-off-by: Andrea Righi <[email protected]> (cherry picked from commit e96b739)
Ignore: yes Signed-off-by: Dimitri John Ledkov <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1992162 Properties: no-test-build Signed-off-by: Dimitri John Ledkov <[email protected]>
Signed-off-by: Dimitri John Ledkov <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 ARM Performance Monitoring Unit Table describes the properties of PMU support in ARM-based system. The APMT table contains a list of nodes, each represents a PMU in the system that conforms to ARM CoreSight PMU architecture. The properties of each node include information required to access the PMU (e.g. MMIO base address, interrupt number) and also identification. For more detailed information, please refer to the specification below: * APMT: https://developer.arm.com/documentation/den0117/latest * ARM Coresight PMU: https://developer.arm.com/documentation/ihi0091/latest The initial support adds the detection of APMT table and generic infrastructure to create platform devices for ARM CoreSight PMUs. Similar to IORT the root pointer of APMT is preserved during runtime and each PMU platform device is given a pointer to the corresponding APMT node. Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Sudeep Holla <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Add missing kerneldoc and fix alignment on one of the arguments of apmt_add_platform_device function. Signed-off-by: Besar Wicaksono <[email protected]> Link: https://lore.kernel.org/r/[email protected] [will: Fixed up additional indentation issue] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Add support for ARM CoreSight PMU driver framework and interfaces. The driver provides generic implementation to operate uncore PMU based on ARM CoreSight PMU architecture. The driver also provides interface to get vendor/implementation specific information, for example event attributes and formating. The specification used in this implementation can be found below: * ACPI Arm Performance Monitoring Unit table: https://developer.arm.com/documentation/den0117/latest * ARM Coresight PMU architecture: https://developer.arm.com/documentation/ihi0091/latest Reviewed-by: Suzuki K Poulose <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Add support for NVIDIA System Cache Fabric (SCF) and Memory Control Fabric (MCF) PMU attributes for CoreSight PMU implementation in NVIDIA devices. Acked-by: Suzuki K Poulose <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Building an arm64 allmodconfig target results in the following failure from modpost: | ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/arm_cspmu.o | ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/nvidia_cspmu.o | make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1 | make: *** [Makefile:1944: modpost] Error 2 Add the missing MODULE_LICENSE() macros, following the license of the source files and symbol exports. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Building on x86_64 allmodconfig failed: | drivers/perf/arm_cspmu/arm_cspmu.c:1114:29: error: implicit | declaration of function 'get_acpi_id_for_cpu' get_acpi_id_for_cpu is a helper function from ARM64. Fix by adding ARM64 dependency. Signed-off-by: Besar Wicaksono <[email protected]> Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Build on arm64 allmodconfig failed with: | depmod: ERROR: Cycle detected: arm_cspmu -> nvidia_cspmu -> arm_cspmu | depmod: ERROR: Found 2 modules in dependency cycles! The arm_cspmu.c provides standard functions to operate the PMU and the vendor code provides vendor specific attributes. Both need to be built as single kernel module. Update the makefile to compile sources under arm_cspmu into one module. Signed-off-by: Besar Wicaksono <[email protected]> Reviewed-and-Tested-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 acpi_dev_parent is introduced in 6.1rc4. Fix by accessing acpi_device.parent directly. Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380 Enable driver for Coresight PMU arch device. Signed-off-by: Besar Wicaksono <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392 Cases like VFIO wish to attach a device to an existing domain that was not allocated specifically from the device. This raises a condition where the IOMMU driver can fail the domain attach because the domain and device are incompatible with each other. This is a soft failure that can be resolved by using a different domain. Provide a dedicated errno EINVAL from the IOMMU driver during attach that the reason why the attach failed is because of domain incompatibility. VFIO can use this to know that the attach is a soft failure and it should continue searching. Otherwise, the attach will be a hard failure and VFIO will return the code to userspace. Update kdocs to add rules of return value to the attach_dev op and APIs. Link: https://lore.kernel.org/r/bd56d93c18621104a0fa1b0de31e9b760b81b769.1666042872.git.nicolinc@nvidia.com Suggested-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit 0020885 linux-next) Signed-off-by: Nicolin Chen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392 Following the new rules in include/linux/iommu.h kdocs, EINVAL now can be used to indicate that domain and device are incompatible by a caller that treats it as a soft failure and tries attaching to another domain. On the other hand, there are ->attach_dev callback functions returning it for obvious device-specific errors. They will result in some inefficiency in the caller handling routine. Update these places to corresponding errnos following the new rules. Link: https://lore.kernel.org/r/5924c03bea637f05feb2a20d624bae086b555ec5.1666042872.git.nicolinc@nvidia.com Reviewed-by: Jean-Philippe Brucker <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit bd7ebb7 linux-next) Signed-off-by: Nicolin Chen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392 Following the new rules in include/linux/iommu.h kdocs, update all drivers ->attach_dev callback functions to return EINVAL in the failure paths that are related to domain incompatibility. Also, drop adjacent error prints to prevent a kernel log spam. Link: https://lore.kernel.org/r/f52a07f7320da94afe575c9631340d0019a203a7.1666042873.git.nicolinc@nvidia.com Reviewed-by: Jean-Philippe Brucker <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit f4a1477 linux-next) Signed-off-by: Nicolin Chen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392 The mtk_iommu and virtio drivers have places in the ->attach_dev callback functions that return hardcode errnos instead of the returned values, but callers of these ->attach_dv callback functions may care. Propagate them directly without the extra conversions. Link: https://lore.kernel.org/r/ca8c5a447b87002334f83325f28823008b4ce420.1666042873.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jean-Philippe Brucker <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Yong Wu <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit 04cee82 linux-next) Signed-off-by: Nicolin Chen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527 Tegra Grace and later chips can support upto 4 chip select lines for QUAD SPI. Added new compatible for Tegra Grace. Signed-off-by: Krishna Yarlagadda <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit b761341 linux-next) Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527 Return value should be updated to zero in combined sequence routine if transfer is completed successfully. Currently it holds timeout value resulting in errors. Signed-off-by: Krishna Yarlagadda <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit 8777dd9 linux-next) Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527 The following error messages are observed on boot for Tegra234 ... ERR KERN tegra-qspi 3270000.spi: cannot use DMA: -19 ERR KERN tegra-qspi 3270000.spi: falling back to PIO Tegra234 does not support DMA for the QSPI and so initialising the DMA is expected to fail. The above error messages are misleading for devices that don't support DMA and so fix this by skipping the DMA initialisation for devices that don't support DMA. Signed-off-by: Jon Hunter <[email protected]> Acked-by: Thierry Reding <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit ae4b3c1 linux-next) Signed-off-by: Brad Figg <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527 controller data alloc is done with client device data causing duplicate resource error. Allocate memory using controller device when using devm Fixes: f89d2cc ("spi: tegra210-quad: use devm call for cdata memory") Signed-off-by: Krishna Yarlagadda <[email protected]> Reviewed-by: Jon Hunter <[email protected]> Tested-by: Jon Hunter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit 2197aa6 linux-next) Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527 Jetson AGX board has flash device on QSPI controller and SPI instance on 40-pin expander. Enable Tegra SPI & QSPI drivers in defconfig as modules. Signed-off-by: Krishna Yarlagadda <[email protected]> Signed-off-by: Thierry Reding <[email protected]> (cherry picked from commit 0ddf10a linux-next) Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jose Ogando <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
This patch increases the default value from 256 to 512 to use all cores of Grace. Tested-by: Rohit Khanna <[email protected]> Reviewed-by: Rohit Khanna <[email protected]> Reviewed-by: Jason Sequeira <[email protected]> Signed-off-by: Shanker Donthineni <[email protected]>
b814ca4
to
2995e6a
Compare
The default NUMA NODE count of 2^4 is not enough to support Grace platforms. This patch increases the count to 2^6. Reviewed-by: Jason Sequeira <[email protected]> Reviewed-by: Rohit Khanna <[email protected]> Signed-off-by: Shanker Donthineni <[email protected]>
The current implementation of rtc-efi is expecting all the 4 time services GET{SET}_TIME{WAKEUP} must be supported by UEFI firmware. As per the EFI_RT_PROPERTIES_TABLE, the platform specific implementations can choose to enable selective time services based on the RTC device capabilities. This patch does the following changes to provide GET/SET RTC services on platforms that do not support the WAKEUP feature. 1) Relax time services cap check when creating a platform device. 2) Clear RTC_FEATURE_ALARM bit in the absence of WAKEUP services. 3) Conditional alarm entries in '/proc/driver/rtc'. Reviewed-by: Rohit Khanna <[email protected]> Reviewed-by: Jason Sequeira <[email protected]> Signed-off-by: Shanker Donthineni <[email protected]>
The default value of NR_IRQS is not sufficient to support GICv4.1 features and ~56K LPIs. This parameter would be too small for certain server platforms where it has many IO devices and is capable of direct injection of vSGI and vLPI features. Currently, maximum of 64 + 8192 (IRQ_BITMAP_BITS) IRQ descriptors are allowed. The vCPU creation fails after reaching count ~400 with kvm-arm.vgic_v4_enable=1. This patch increases NR_IRQS to 1^19 to cover 56K LPIs and 262144 vSGIs (16K vPEs x 16). Reviewed-by: Rohit Khanna <[email protected]> Reviewed-by: Jason Sequeira <[email protected]> Signed-off-by: Shanker Donthineni <[email protected]>
2995e6a
to
68a30a8
Compare
The optimized Ubuntu kernel for Nvidia does not use defconfig it uses the configuration as defined by files under debian.nvidia-5.19. Do you want me to apply your defconfig changes to the Ubuntu configuration files? |
Yes, both options are must for CG4 |
Are the other two patches going upstream and if so do you have a target for which merge cycle you will be targeting? |
Started discussing with maintainers recently.
|
Thank you for quickly answering my questions. I'll get these to Canonical and get them incorporated as soon as possible. |
BugLink: https://bugs.launchpad.net/bugs/1994068 [ Upstream commit 84a5358 ] The SRv6 layer allows defining HMAC data that can later be used to sign IPv6 Segment Routing Headers. This configuration is realised via netlink through four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual length of the SECRET attribute, it is possible to provide invalid combinations (e.g., secret = "", secretlen = 64). This case is not checked in the code and with an appropriately crafted netlink message, an out-of-bounds read of up to 64 bytes (max secret length) can occur past the skb end pointer and into skb_shared_info: Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 208 memcpy(hinfo->secret, secret, slen); (gdb) bt #0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 #1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600, extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>, family=<optimized out>) at net/netlink/genetlink.c:731 #2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00, family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775 #3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792 #4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>) at net/netlink/af_netlink.c:2501 #5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803 #6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000) at net/netlink/af_netlink.c:1319 #7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>) at net/netlink/af_netlink.c:1345 #8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921 ... (gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end $1 = 0xffff88800b1b76c0 (gdb) p/x secret $2 = 0xffff88800b1b76c0 (gdb) p slen $3 = 64 '@' The OOB data can then be read back from userspace by dumping HMAC state. This commit fixes this by ensuring SECRETLEN cannot exceed the actual length of SECRET. Reported-by: Lucas Leong <[email protected]> Tested: verified that EINVAL is correctly returned when secretlen > len(secret) Fixes: 4f4853d ("ipv6: sr: implement API to control SR HMAC structure") Signed-off-by: David Lebrun <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1994074 [ Upstream commit 0e400d6 ] Fix a NULL dereference of the struct bonding.rr_tx_counter member because if a bond is initially created with an initial mode != zero (Round Robin) the memory required for the counter is never created and when the mode is changed there is never any attempt to verify the memory is allocated upon switching modes. This causes the following Oops on an aarch64 machine: [ 334.686773] Unable to handle kernel paging request at virtual address ffff2c91ac905000 [ 334.694703] Mem abort info: [ 334.697486] ESR = 0x0000000096000004 [ 334.701234] EC = 0x25: DABT (current EL), IL = 32 bits [ 334.706536] SET = 0, FnV = 0 [ 334.709579] EA = 0, S1PTW = 0 [ 334.712719] FSC = 0x04: level 0 translation fault [ 334.717586] Data abort info: [ 334.720454] ISV = 0, ISS = 0x00000004 [ 334.724288] CM = 0, WnR = 0 [ 334.727244] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000008044d662000 [ 334.733944] [ffff2c91ac905000] pgd=0000000000000000, p4d=0000000000000000 [ 334.740734] Internal error: Oops: 96000004 [#1] SMP [ 334.745602] Modules linked in: bonding tls veth rfkill sunrpc arm_spe_pmu vfat fat acpi_ipmi ipmi_ssif ixgbe igb i40e mdio ipmi_devintf ipmi_msghandler arm_cmn arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram crct10dif_ce ast ghash_ce sbsa_gwdt nvme drm_vram_helper drm_ttm_helper nvme_core ttm xgene_hwmon [ 334.772217] CPU: 7 PID: 2214 Comm: ping Not tainted 6.0.0-rc4-00133-g64ae13ed4784 #4 [ 334.779950] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021 [ 334.789244] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 334.796196] pc : bond_rr_gen_slave_id+0x40/0x124 [bonding] [ 334.801691] lr : bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding] [ 334.807962] sp : ffff8000221733e0 [ 334.811265] x29: ffff8000221733e0 x28: ffffdbac8572d198 x27: ffff80002217357c [ 334.818392] x26: 000000000000002a x25: ffffdbacb33ee000 x24: ffff07ff980fa000 [ 334.825519] x23: ffffdbacb2e398ba x22: ffff07ff98102000 x21: ffff07ff981029c0 [ 334.832646] x20: 0000000000000001 x19: ffff07ff981029c0 x18: 0000000000000014 [ 334.839773] x17: 0000000000000000 x16: ffffdbacb1004364 x15: 0000aaaabe2f5a62 [ 334.846899] x14: ffff07ff8e55d968 x13: ffff07ff8e55db30 x12: 0000000000000000 [ 334.854026] x11: ffffdbacb21532e8 x10: 0000000000000001 x9 : ffffdbac857178ec [ 334.861153] x8 : ffff07ff9f6e5a28 x7 : 0000000000000000 x6 : 000000007c2b3742 [ 334.868279] x5 : ffff2c91ac905000 x4 : ffff2c91ac905000 x3 : ffff07ff9f554400 [ 334.875406] x2 : ffff2c91ac905000 x1 : 0000000000000001 x0 : ffff07ff981029c0 [ 334.882532] Call trace: [ 334.884967] bond_rr_gen_slave_id+0x40/0x124 [bonding] [ 334.890109] bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding] [ 334.896033] __bond_start_xmit+0x128/0x3a0 [bonding] [ 334.901001] bond_start_xmit+0x54/0xb0 [bonding] [ 334.905622] dev_hard_start_xmit+0xb4/0x220 [ 334.909798] __dev_queue_xmit+0x1a0/0x720 [ 334.913799] arp_xmit+0x3c/0xbc [ 334.916932] arp_send_dst+0x98/0xd0 [ 334.920410] arp_solicit+0xe8/0x230 [ 334.923888] neigh_probe+0x60/0xb0 [ 334.927279] __neigh_event_send+0x3b0/0x470 [ 334.931453] neigh_resolve_output+0x70/0x90 [ 334.935626] ip_finish_output2+0x158/0x514 [ 334.939714] __ip_finish_output+0xac/0x1a4 [ 334.943800] ip_finish_output+0x40/0xfc [ 334.947626] ip_output+0xf8/0x1a4 [ 334.950931] ip_send_skb+0x5c/0x100 [ 334.954410] ip_push_pending_frames+0x3c/0x60 [ 334.958758] raw_sendmsg+0x458/0x6d0 [ 334.962325] inet_sendmsg+0x50/0x80 [ 334.965805] sock_sendmsg+0x60/0x6c [ 334.969286] __sys_sendto+0xc8/0x134 [ 334.972853] __arm64_sys_sendto+0x34/0x4c [ 334.976854] invoke_syscall+0x78/0x100 [ 334.980594] el0_svc_common.constprop.0+0x4c/0xf4 [ 334.985287] do_el0_svc+0x38/0x4c [ 334.988591] el0_svc+0x34/0x10c [ 334.991724] el0t_64_sync_handler+0x11c/0x150 [ 334.996072] el0t_64_sync+0x190/0x194 [ 334.999726] Code: b9001062 f9403c02 d53cd044 8b040042 (b8210040) [ 335.005810] ---[ end trace 0000000000000000 ]--- [ 335.010416] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 335.017279] SMP: stopping secondary CPUs [ 335.021374] Kernel Offset: 0x5baca8eb0000 from 0xffff800008000000 [ 335.027456] PHYS_OFFSET: 0x80000000 [ 335.030932] CPU features: 0x0000,0085c029,19805c82 [ 335.035713] Memory Limit: none [ 335.038756] Rebooting in 180 seconds.. The fix is to allocate the memory in bond_open() which is guaranteed to be called before any packets are processed. Fixes: 848ca91 ("net: bonding: Use per-cpu rr_tx_counter") CC: Jussi Maki <[email protected]> Signed-off-by: Jonathan Toppins <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1994074 commit 29a5b8a upstream. When walking through an inode extents, the ext4_ext_binsearch_idx() function assumes that the extent header has been previously validated. However, there are no checks that verify that the number of entries (eh->eh_entries) is non-zero when depth is > 0. And this will lead to problems because the EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this: [ 135.245946] ------------[ cut here ]------------ [ 135.247579] kernel BUG at fs/ext4/extents.c:2258! [ 135.249045] invalid opcode: 0000 [#1] PREEMPT SMP [ 135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4 [ 135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014 [ 135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0 [ 135.256475] Code: [ 135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246 [ 135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023 [ 135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c [ 135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c [ 135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024 [ 135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000 [ 135.272394] FS: 00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 135.274510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0 [ 135.277952] Call Trace: [ 135.278635] <TASK> [ 135.279247] ? preempt_count_add+0x6d/0xa0 [ 135.280358] ? percpu_counter_add_batch+0x55/0xb0 [ 135.281612] ? _raw_read_unlock+0x18/0x30 [ 135.282704] ext4_map_blocks+0x294/0x5a0 [ 135.283745] ? xa_load+0x6f/0xa0 [ 135.284562] ext4_mpage_readpages+0x3d6/0x770 [ 135.285646] read_pages+0x67/0x1d0 [ 135.286492] ? folio_add_lru+0x51/0x80 [ 135.287441] page_cache_ra_unbounded+0x124/0x170 [ 135.288510] filemap_get_pages+0x23d/0x5a0 [ 135.289457] ? path_openat+0xa72/0xdd0 [ 135.290332] filemap_read+0xbf/0x300 [ 135.291158] ? _raw_spin_lock_irqsave+0x17/0x40 [ 135.292192] new_sync_read+0x103/0x170 [ 135.293014] vfs_read+0x15d/0x180 [ 135.293745] ksys_read+0xa1/0xe0 [ 135.294461] do_syscall_64+0x3c/0x80 [ 135.295284] entry_SYSCALL_64_after_hwframe+0x46/0xb0 This patch simply adds an extra check in __ext4_ext_check(), verifying that eh_entries is not 0 when eh_depth is > 0. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283 Cc: Baokun Li <[email protected]> Cc: [email protected] Signed-off-by: Luís Henriques <[email protected]> Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Baokun Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1996540 [ Upstream commit 2bc9cd6 ] If an IIO driver uses callbacks from another IIO driver and calls iio_channel_start_all_cb() from one of its buffer setup ops, then lockdep complains due to the lock nesting, as in the below example with lmp91000. Since the locks are being taken on different IIO devices, there is no actual deadlock. Fix the warning by telling lockdep to use a different class for each iio_device. ============================================ WARNING: possible recursive locking detected -------------------------------------------- python3/23 is trying to acquire lock: (&indio_dev->mlock){+.+.}-{3:3}, at: iio_update_buffers but task is already holding lock: (&indio_dev->mlock){+.+.}-{3:3}, at: enable_store other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&indio_dev->mlock); lock(&indio_dev->mlock); *** DEADLOCK *** May be due to missing lock nesting notation 5 locks held by python3/23: #0: (sb_writers#5){.+.+}-{0:0}, at: ksys_write #1: (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter #2: (kn->active#14){.+.+}-{0:0}, at: kernfs_fop_write_iter #3: (&indio_dev->mlock){+.+.}-{3:3}, at: enable_store #4: (&iio_dev_opaque->info_exist_lock){+.+.}-{3:3}, at: iio_update_buffers Call Trace: __mutex_lock iio_update_buffers iio_channel_start_all_cb lmp91000_buffer_postenable __iio_update_buffers enable_store Fixes: 67e1730 ("iio: potentiostat: add LMP91000 support") Signed-off-by: Vincent Whitchurch <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>