Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel fixes #4

Closed
wants to merge 42 commits into from
Closed

Conversation

shankerd04
Copy link
Contributor

  • Two patches are for defconfig
  • EFI-RTC fix, working with EFI & RTC maintainers
  • Increase the number of IRQ descriptors, started discussion with the maintainer

ianmay81 and others added 30 commits September 27, 2022 23:20
Ignore: yes
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1991664

cbd/kteam-tools have been updated to install gcc-12 toolchain. dkms
dynamically checks and tries to use the same compiler version as the
kernel build. When backporting, the toolchain version should be set in
full to the versioned gcc-12, make it so. This is to support building
dkms modules with matching gcc in jammy.

Signed-off-by: Dimitri John Ledkov <[email protected]>
Signed-off-by: Andrea Righi <[email protected]>
(cherry picked from commit e96b739)
Ignore: yes
Signed-off-by: Dimitri John Ledkov <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1992162
Properties: no-test-build
Signed-off-by: Dimitri John Ledkov <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

ARM Performance Monitoring Unit Table describes the properties of PMU
support in ARM-based system. The APMT table contains a list of nodes,
each represents a PMU in the system that conforms to ARM CoreSight PMU
architecture. The properties of each node include information required
to access the PMU (e.g. MMIO base address, interrupt number) and also
identification. For more detailed information, please refer to the
specification below:
 * APMT: https://developer.arm.com/documentation/den0117/latest
 * ARM Coresight PMU:
        https://developer.arm.com/documentation/ihi0091/latest

The initial support adds the detection of APMT table and generic
infrastructure to create platform devices for ARM CoreSight PMUs.
Similar to IORT the root pointer of APMT is preserved during runtime
and each PMU platform device is given a pointer to the corresponding
APMT node.

Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Add missing kerneldoc and fix alignment on one of the arguments of
apmt_add_platform_device function.

Signed-off-by: Besar Wicaksono <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[will: Fixed up additional indentation issue]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Add support for ARM CoreSight PMU driver framework and interfaces.
The driver provides generic implementation to operate uncore PMU based
on ARM CoreSight PMU architecture. The driver also provides interface
to get vendor/implementation specific information, for example event
attributes and formating.

The specification used in this implementation can be found below:
 * ACPI Arm Performance Monitoring Unit table:
        https://developer.arm.com/documentation/den0117/latest
 * ARM Coresight PMU architecture:
        https://developer.arm.com/documentation/ihi0091/latest

Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Add support for NVIDIA System Cache Fabric (SCF) and Memory Control
Fabric (MCF) PMU attributes for CoreSight PMU implementation in
NVIDIA devices.

Acked-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Building an arm64 allmodconfig target results in the following failure
from modpost:

  | ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/arm_cspmu.o
  | ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/nvidia_cspmu.o
  | make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1
  | make: *** [Makefile:1944: modpost] Error 2

Add the missing MODULE_LICENSE() macros, following the license of the
source files and symbol exports.

Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Building on x86_64 allmodconfig failed:
  | drivers/perf/arm_cspmu/arm_cspmu.c:1114:29: error: implicit
  |    declaration of function 'get_acpi_id_for_cpu'

get_acpi_id_for_cpu is a helper function from ARM64.
Fix by adding ARM64 dependency.

Signed-off-by: Besar Wicaksono <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Build on arm64 allmodconfig failed with:
  | depmod: ERROR: Cycle detected: arm_cspmu -> nvidia_cspmu -> arm_cspmu
  | depmod: ERROR: Found 2 modules in dependency cycles!

The arm_cspmu.c provides standard functions to operate the PMU and the
vendor code provides vendor specific attributes. Both need to be built as
single kernel module.

Update the makefile to compile sources under arm_cspmu into one module.

Signed-off-by: Besar Wicaksono <[email protected]>
Reviewed-and-Tested-by: Suzuki K Poulose <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

acpi_dev_parent is introduced in 6.1rc4. Fix by accessing
acpi_device.parent directly.

Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998380

Enable driver for Coresight PMU arch device.

Signed-off-by: Besar Wicaksono <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392

Cases like VFIO wish to attach a device to an existing domain that was
not allocated specifically from the device. This raises a condition
where the IOMMU driver can fail the domain attach because the domain and
device are incompatible with each other.

This is a soft failure that can be resolved by using a different domain.

Provide a dedicated errno EINVAL from the IOMMU driver during attach that
the reason why the attach failed is because of domain incompatibility.

VFIO can use this to know that the attach is a soft failure and it should
continue searching. Otherwise, the attach will be a hard failure and VFIO
will return the code to userspace.

Update kdocs to add rules of return value to the attach_dev op and APIs.

Link: https://lore.kernel.org/r/bd56d93c18621104a0fa1b0de31e9b760b81b769.1666042872.git.nicolinc@nvidia.com
Suggested-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
(cherry picked from commit 0020885 linux-next)
Signed-off-by: Nicolin Chen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
nicolinc and others added 9 commits December 6, 2022 09:09
BugLink: https://bugs.launchpad.net/bugs/1998392

Following the new rules in include/linux/iommu.h kdocs, EINVAL now can be
used to indicate that domain and device are incompatible by a caller that
treats it as a soft failure and tries attaching to another domain.

On the other hand, there are ->attach_dev callback functions returning it
for obvious device-specific errors. They will result in some inefficiency
in the caller handling routine.

Update these places to corresponding errnos following the new rules.

Link: https://lore.kernel.org/r/5924c03bea637f05feb2a20d624bae086b555ec5.1666042872.git.nicolinc@nvidia.com
Reviewed-by: Jean-Philippe Brucker <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
(cherry picked from commit bd7ebb7 linux-next)
Signed-off-by: Nicolin Chen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392

Following the new rules in include/linux/iommu.h kdocs, update all drivers
->attach_dev callback functions to return EINVAL in the failure paths that
are related to domain incompatibility.

Also, drop adjacent error prints to prevent a kernel log spam.

Link: https://lore.kernel.org/r/f52a07f7320da94afe575c9631340d0019a203a7.1666042873.git.nicolinc@nvidia.com
Reviewed-by: Jean-Philippe Brucker <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
(cherry picked from commit f4a1477 linux-next)
Signed-off-by: Nicolin Chen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998392

The mtk_iommu and virtio drivers have places in the ->attach_dev callback
functions that return hardcode errnos instead of the returned values, but
callers of these ->attach_dv callback functions may care. Propagate them
directly without the extra conversions.

Link: https://lore.kernel.org/r/ca8c5a447b87002334f83325f28823008b4ce420.1666042873.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jean-Philippe Brucker <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Yong Wu <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
(cherry picked from commit 04cee82 linux-next)
Signed-off-by: Nicolin Chen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527

Tegra Grace and later chips can support upto 4 chip select lines
for QUAD SPI. Added new compatible for Tegra Grace.

Signed-off-by: Krishna Yarlagadda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
(cherry picked from commit b761341 linux-next)
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527

Return value should be updated to zero in combined sequence routine
if transfer is completed successfully. Currently it holds timeout value
resulting in errors.

Signed-off-by: Krishna Yarlagadda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
(cherry picked from commit 8777dd9 linux-next)
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527

The following error messages are observed on boot for Tegra234 ...

 ERR KERN tegra-qspi 3270000.spi: cannot use DMA: -19
 ERR KERN tegra-qspi 3270000.spi: falling back to PIO

Tegra234 does not support DMA for the QSPI and so initialising the DMA
is expected to fail. The above error messages are misleading for devices
that don't support DMA and so fix this by skipping the DMA
initialisation for devices that don't support DMA.

Signed-off-by: Jon Hunter <[email protected]>
Acked-by: Thierry Reding <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
(cherry picked from commit ae4b3c1 linux-next)
Signed-off-by: Brad Figg <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527

controller data alloc is done with client device data causing duplicate
resource error. Allocate memory using controller device when using devm

Fixes: f89d2cc ("spi: tegra210-quad: use devm call for cdata memory")

Signed-off-by: Krishna Yarlagadda <[email protected]>
Reviewed-by: Jon Hunter <[email protected]>
Tested-by: Jon Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
(cherry picked from commit 2197aa6 linux-next)
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1998527

Jetson AGX board has flash device on QSPI controller and SPI instance
on 40-pin expander.

Enable Tegra SPI & QSPI drivers in defconfig as modules.

Signed-off-by: Krishna Yarlagadda <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
(cherry picked from commit 0ddf10a linux-next)
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jose Ogando <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
This patch increases the default value from 256 to 512 to use
all cores of Grace.

Tested-by: Rohit Khanna <[email protected]>
Reviewed-by: Rohit Khanna <[email protected]>
Reviewed-by: Jason Sequeira <[email protected]>
Signed-off-by: Shanker Donthineni <[email protected]>
The default NUMA NODE count of 2^4 is not enough to support Grace
platforms. This patch increases the count to 2^6.

Reviewed-by: Jason Sequeira <[email protected]>
Reviewed-by: Rohit Khanna <[email protected]>
Signed-off-by: Shanker Donthineni <[email protected]>
The current implementation of rtc-efi is expecting all the 4
time services GET{SET}_TIME{WAKEUP} must be supported by UEFI
firmware. As per the EFI_RT_PROPERTIES_TABLE, the platform
specific implementations can choose to enable selective time
services based on the RTC device capabilities.

This patch does the following changes to provide GET/SET RTC
services on platforms that do not support the WAKEUP feature.

1) Relax time services cap check when creating a platform device.
2) Clear RTC_FEATURE_ALARM bit in the absence of WAKEUP services.
3) Conditional alarm entries in '/proc/driver/rtc'.

Reviewed-by: Rohit Khanna <[email protected]>
Reviewed-by: Jason Sequeira <[email protected]>
Signed-off-by: Shanker Donthineni <[email protected]>
The default value of NR_IRQS is not sufficient to support GICv4.1
features and ~56K LPIs. This parameter would be too small for certain
server platforms where it has many IO devices and is capable of
direct injection of vSGI and vLPI features.

Currently, maximum of 64 + 8192 (IRQ_BITMAP_BITS) IRQ descriptors
are allowed. The vCPU creation fails after reaching count ~400 with
kvm-arm.vgic_v4_enable=1.

This patch increases NR_IRQS to 1^19 to cover 56K LPIs and 262144
vSGIs (16K vPEs x 16).

Reviewed-by: Rohit Khanna <[email protected]>
Reviewed-by: Jason Sequeira <[email protected]>
Signed-off-by: Shanker Donthineni <[email protected]>
@nvidia-bfigg
Copy link
Contributor

The optimized Ubuntu kernel for Nvidia does not use defconfig it uses the configuration as defined by files under debian.nvidia-5.19. Do you want me to apply your defconfig changes to the Ubuntu configuration files?

@shankerd04
Copy link
Contributor Author

shankerd04 commented Jan 6, 2023

The optimized Ubuntu kernel for Nvidia does not use defconfig it uses the configuration as defined by files under debian.nvidia-5.19. Do you want me to apply your defconfig changes to the Ubuntu configuration files?

Yes, both options are must for CG4

@nvidia-bfigg
Copy link
Contributor

Are the other two patches going upstream and if so do you have a target for which merge cycle you will be targeting?

@shankerd04
Copy link
Contributor Author

Are the other two patches going upstream and if so do you have a target for which merge cycle you will be targeting?

Started discussing with maintainers recently.

  1. https://lkml.org/lkml/2023/1/3/168 --> likely to be in v6.2 and might be back-ported to stable branches
  2. https://lkml.org/lkml/2023/1/5/625 --> Most probably this patch will be replaced by a scalable solution that fixes other ARCHs

@nvidia-bfigg
Copy link
Contributor

Thank you for quickly answering my questions. I'll get these to Canonical and get them incorporated as soon as possible.

nvidia-bfigg pushed a commit that referenced this pull request Jan 19, 2023
BugLink: https://bugs.launchpad.net/bugs/1994068

[ Upstream commit 84a5358 ]

The SRv6 layer allows defining HMAC data that can later be used to sign IPv6
Segment Routing Headers. This configuration is realised via netlink through
four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and
SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual
length of the SECRET attribute, it is possible to provide invalid combinations
(e.g., secret = "", secretlen = 64). This case is not checked in the code and
with an appropriately crafted netlink message, an out-of-bounds read of up
to 64 bytes (max secret length) can occur past the skb end pointer and into
skb_shared_info:

Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
208		memcpy(hinfo->secret, secret, slen);
(gdb) bt
 #0  seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
 #1  0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600,
    extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>,
    family=<optimized out>) at net/netlink/genetlink.c:731
 #2  0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00,
    family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775
 #3  genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792
 #4  0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>)
    at net/netlink/af_netlink.c:2501
 #5  0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803
 #6  0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000)
    at net/netlink/af_netlink.c:1319
 #7  netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>)
    at net/netlink/af_netlink.c:1345
 #8  0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921
...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end
$1 = 0xffff88800b1b76c0
(gdb) p/x secret
$2 = 0xffff88800b1b76c0
(gdb) p slen
$3 = 64 '@'

The OOB data can then be read back from userspace by dumping HMAC state. This
commit fixes this by ensuring SECRETLEN cannot exceed the actual length of
SECRET.

Reported-by: Lucas Leong <[email protected]>
Tested: verified that EINVAL is correctly returned when secretlen > len(secret)
Fixes: 4f4853d ("ipv6: sr: implement API to control SR HMAC structure")
Signed-off-by: David Lebrun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
nvidia-bfigg pushed a commit that referenced this pull request Jan 19, 2023
BugLink: https://bugs.launchpad.net/bugs/1994074

[ Upstream commit 0e400d6 ]

Fix a NULL dereference of the struct bonding.rr_tx_counter member because
if a bond is initially created with an initial mode != zero (Round Robin)
the memory required for the counter is never created and when the mode is
changed there is never any attempt to verify the memory is allocated upon
switching modes.

This causes the following Oops on an aarch64 machine:
    [  334.686773] Unable to handle kernel paging request at virtual address ffff2c91ac905000
    [  334.694703] Mem abort info:
    [  334.697486]   ESR = 0x0000000096000004
    [  334.701234]   EC = 0x25: DABT (current EL), IL = 32 bits
    [  334.706536]   SET = 0, FnV = 0
    [  334.709579]   EA = 0, S1PTW = 0
    [  334.712719]   FSC = 0x04: level 0 translation fault
    [  334.717586] Data abort info:
    [  334.720454]   ISV = 0, ISS = 0x00000004
    [  334.724288]   CM = 0, WnR = 0
    [  334.727244] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000008044d662000
    [  334.733944] [ffff2c91ac905000] pgd=0000000000000000, p4d=0000000000000000
    [  334.740734] Internal error: Oops: 96000004 [#1] SMP
    [  334.745602] Modules linked in: bonding tls veth rfkill sunrpc arm_spe_pmu vfat fat acpi_ipmi ipmi_ssif ixgbe igb i40e mdio ipmi_devintf ipmi_msghandler arm_cmn arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram crct10dif_ce ast ghash_ce sbsa_gwdt nvme drm_vram_helper drm_ttm_helper nvme_core ttm xgene_hwmon
    [  334.772217] CPU: 7 PID: 2214 Comm: ping Not tainted 6.0.0-rc4-00133-g64ae13ed4784 #4
    [  334.779950] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021
    [  334.789244] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  334.796196] pc : bond_rr_gen_slave_id+0x40/0x124 [bonding]
    [  334.801691] lr : bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
    [  334.807962] sp : ffff8000221733e0
    [  334.811265] x29: ffff8000221733e0 x28: ffffdbac8572d198 x27: ffff80002217357c
    [  334.818392] x26: 000000000000002a x25: ffffdbacb33ee000 x24: ffff07ff980fa000
    [  334.825519] x23: ffffdbacb2e398ba x22: ffff07ff98102000 x21: ffff07ff981029c0
    [  334.832646] x20: 0000000000000001 x19: ffff07ff981029c0 x18: 0000000000000014
    [  334.839773] x17: 0000000000000000 x16: ffffdbacb1004364 x15: 0000aaaabe2f5a62
    [  334.846899] x14: ffff07ff8e55d968 x13: ffff07ff8e55db30 x12: 0000000000000000
    [  334.854026] x11: ffffdbacb21532e8 x10: 0000000000000001 x9 : ffffdbac857178ec
    [  334.861153] x8 : ffff07ff9f6e5a28 x7 : 0000000000000000 x6 : 000000007c2b3742
    [  334.868279] x5 : ffff2c91ac905000 x4 : ffff2c91ac905000 x3 : ffff07ff9f554400
    [  334.875406] x2 : ffff2c91ac905000 x1 : 0000000000000001 x0 : ffff07ff981029c0
    [  334.882532] Call trace:
    [  334.884967]  bond_rr_gen_slave_id+0x40/0x124 [bonding]
    [  334.890109]  bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
    [  334.896033]  __bond_start_xmit+0x128/0x3a0 [bonding]
    [  334.901001]  bond_start_xmit+0x54/0xb0 [bonding]
    [  334.905622]  dev_hard_start_xmit+0xb4/0x220
    [  334.909798]  __dev_queue_xmit+0x1a0/0x720
    [  334.913799]  arp_xmit+0x3c/0xbc
    [  334.916932]  arp_send_dst+0x98/0xd0
    [  334.920410]  arp_solicit+0xe8/0x230
    [  334.923888]  neigh_probe+0x60/0xb0
    [  334.927279]  __neigh_event_send+0x3b0/0x470
    [  334.931453]  neigh_resolve_output+0x70/0x90
    [  334.935626]  ip_finish_output2+0x158/0x514
    [  334.939714]  __ip_finish_output+0xac/0x1a4
    [  334.943800]  ip_finish_output+0x40/0xfc
    [  334.947626]  ip_output+0xf8/0x1a4
    [  334.950931]  ip_send_skb+0x5c/0x100
    [  334.954410]  ip_push_pending_frames+0x3c/0x60
    [  334.958758]  raw_sendmsg+0x458/0x6d0
    [  334.962325]  inet_sendmsg+0x50/0x80
    [  334.965805]  sock_sendmsg+0x60/0x6c
    [  334.969286]  __sys_sendto+0xc8/0x134
    [  334.972853]  __arm64_sys_sendto+0x34/0x4c
    [  334.976854]  invoke_syscall+0x78/0x100
    [  334.980594]  el0_svc_common.constprop.0+0x4c/0xf4
    [  334.985287]  do_el0_svc+0x38/0x4c
    [  334.988591]  el0_svc+0x34/0x10c
    [  334.991724]  el0t_64_sync_handler+0x11c/0x150
    [  334.996072]  el0t_64_sync+0x190/0x194
    [  334.999726] Code: b9001062 f9403c02 d53cd044 8b040042 (b8210040)
    [  335.005810] ---[ end trace 0000000000000000 ]---
    [  335.010416] Kernel panic - not syncing: Oops: Fatal exception in interrupt
    [  335.017279] SMP: stopping secondary CPUs
    [  335.021374] Kernel Offset: 0x5baca8eb0000 from 0xffff800008000000
    [  335.027456] PHYS_OFFSET: 0x80000000
    [  335.030932] CPU features: 0x0000,0085c029,19805c82
    [  335.035713] Memory Limit: none
    [  335.038756] Rebooting in 180 seconds..

The fix is to allocate the memory in bond_open() which is guaranteed
to be called before any packets are processed.

Fixes: 848ca91 ("net: bonding: Use per-cpu rr_tx_counter")
CC: Jussi Maki <[email protected]>
Signed-off-by: Jonathan Toppins <[email protected]>
Acked-by: Jay Vosburgh <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
nvidia-bfigg pushed a commit that referenced this pull request Jan 19, 2023
BugLink: https://bugs.launchpad.net/bugs/1994074

commit 29a5b8a upstream.

When walking through an inode extents, the ext4_ext_binsearch_idx() function
assumes that the extent header has been previously validated.  However, there
are no checks that verify that the number of entries (eh->eh_entries) is
non-zero when depth is > 0.  And this will lead to problems because the
EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:

[  135.245946] ------------[ cut here ]------------
[  135.247579] kernel BUG at fs/ext4/extents.c:2258!
[  135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
[  135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
[  135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
[  135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
[  135.256475] Code:
[  135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
[  135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
[  135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
[  135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
[  135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
[  135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
[  135.272394] FS:  00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[  135.274510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
[  135.277952] Call Trace:
[  135.278635]  <TASK>
[  135.279247]  ? preempt_count_add+0x6d/0xa0
[  135.280358]  ? percpu_counter_add_batch+0x55/0xb0
[  135.281612]  ? _raw_read_unlock+0x18/0x30
[  135.282704]  ext4_map_blocks+0x294/0x5a0
[  135.283745]  ? xa_load+0x6f/0xa0
[  135.284562]  ext4_mpage_readpages+0x3d6/0x770
[  135.285646]  read_pages+0x67/0x1d0
[  135.286492]  ? folio_add_lru+0x51/0x80
[  135.287441]  page_cache_ra_unbounded+0x124/0x170
[  135.288510]  filemap_get_pages+0x23d/0x5a0
[  135.289457]  ? path_openat+0xa72/0xdd0
[  135.290332]  filemap_read+0xbf/0x300
[  135.291158]  ? _raw_spin_lock_irqsave+0x17/0x40
[  135.292192]  new_sync_read+0x103/0x170
[  135.293014]  vfs_read+0x15d/0x180
[  135.293745]  ksys_read+0xa1/0xe0
[  135.294461]  do_syscall_64+0x3c/0x80
[  135.295284]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

This patch simply adds an extra check in __ext4_ext_check(), verifying that
eh_entries is not 0 when eh_depth is > 0.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
Cc: Baokun Li <[email protected]>
Cc: [email protected]
Signed-off-by: Luís Henriques <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Baokun Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
nvidia-bfigg pushed a commit that referenced this pull request Jan 22, 2023
BugLink: https://bugs.launchpad.net/bugs/1996540

[ Upstream commit 2bc9cd6 ]

If an IIO driver uses callbacks from another IIO driver and calls
iio_channel_start_all_cb() from one of its buffer setup ops, then
lockdep complains due to the lock nesting, as in the below example with
lmp91000.

Since the locks are being taken on different IIO devices, there is no
actual deadlock.  Fix the warning by telling lockdep to use a different
class for each iio_device.

 ============================================
 WARNING: possible recursive locking detected
 --------------------------------------------
 python3/23 is trying to acquire lock:
 (&indio_dev->mlock){+.+.}-{3:3}, at: iio_update_buffers

 but task is already holding lock:
 (&indio_dev->mlock){+.+.}-{3:3}, at: enable_store

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&indio_dev->mlock);
   lock(&indio_dev->mlock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 5 locks held by python3/23:
  #0: (sb_writers#5){.+.+}-{0:0}, at: ksys_write
  #1: (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter
  #2: (kn->active#14){.+.+}-{0:0}, at: kernfs_fop_write_iter
  #3: (&indio_dev->mlock){+.+.}-{3:3}, at: enable_store
  #4: (&iio_dev_opaque->info_exist_lock){+.+.}-{3:3}, at: iio_update_buffers

 Call Trace:
  __mutex_lock
  iio_update_buffers
  iio_channel_start_all_cb
  lmp91000_buffer_postenable
  __iio_update_buffers
  enable_store

Fixes: 67e1730 ("iio: potentiostat: add LMP91000 support")
Signed-off-by: Vincent Whitchurch <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants