sfc driver does not load in a bare metal with i40e driver #7
Comments
I have not been able to reproduce this issue with the rhel8.3 4.18.0-240.el8.x86_64 kernel and fad4522. It looks like a dependency on the mtd module. Is the mtd module inserted, would
Hi @maciejj-xilinx, it did not help. I am not certain why we would need a Memory Technology Device (mtd) for the sfc driver. If it were required, this should have affected ixgbe-based systems as well. Are you reproducing the error on an i40e-based system?
Thanks for trying.
Hi @maciejj-xilinx, I tried to build Onload against the i40e driver and the same issue persists as before. I checked the commit @abower-xilinx made and hence made an attempt. This is an FYI.
Hello @maciejj-xilinx, I have tried to compile the latest Onload on top of the i40e driver again and it looks like we are still hitting the same issue as described in this ticket. Can you please let me know the status of the issue you have raised internally?
I have raised the priority of the issue. We depend on another internal team to have this fixed.
Hello @maciejj-xilinx, any updates on this issue? Just checking; if there are any updates, I will test it again.
Hey @shirshen12
UPDATE: we cannot reproduce the issue with 4.18.0-240.15.1.el8_3.x86_64 on rhel8.3. Regards,
Thanks for sharing the exact kernel version. @maciejj-xilinx, can you also share the exact i40e driver version? I am updating the driver version to obtain zerocopy support; are you also updating the driver?
My earlier survey of the standalone i40e driver code led me to believe that zerocopy gets disabled when it is built against Red Hat/CentOS 8 kernels. The distro driver code did not seem to suffer from this issue.
Closing this ticket since the i40e driver is now compatible with Onload.
Deferring oo_exit_hook() fixes a stuck C++ application:

#0 0x00007fd2d7afb87b in ioctl () from /lib64/libc.so.6
#1 0x00007fd2d80c0621 in oo_resource_op (cmd=3221510722, io=0x7ffd15be696c, fp=<optimized out>) at /home/iteterev/lab/onload_internal/src/include/onload/mmap.h:104
#2 __oo_eplock_lock (timeout=<synthetic pointer>, maybe_wedged=0, ni=0x20c8480) at /home/iteterev/lab/onload_internal/src/lib/transport/ip/eplock_slow.c:35
#3 __ef_eplock_lock_slow (ni=ni@entry=0x20c8480, timeout=timeout@entry=-1, maybe_wedged=maybe_wedged@entry=0) at /home/iteterev/lab/onload_internal/src/lib/transport/ip/eplock_slow.c:72
#4 0x00007fd2d80d7dbf in ef_eplock_lock (ni=0x20c8480) at /home/iteterev/lab/onload_internal/src/include/onload/eplock.h:61
#5 __ci_netif_lock_count (stat=0x7fd2d5c5b62c, ni=0x20c8480) at /home/iteterev/lab/onload_internal/src/include/ci/internal/ip_shared_ops.h:79
#6 ci_tcp_setsockopt (ep=ep@entry=0x20c8460, fd=6, level=level@entry=1, optname=optname@entry=9, optval=optval@entry=0x7ffd15be6acc, optlen=optlen@entry=4) at /home/iteterev/lab/onload_internal/src/lib/transport/ip/tcp_sockopts.c:580
#7 0x00007fd2d8010da7 in citp_tcp_setsockopt (fdinfo=0x20c8420, level=1, optname=9, optval=0x7ffd15be6acc, optlen=4) at /home/iteterev/lab/onload_internal/src/lib/transport/unix/tcp_fd.c:1594
#8 0x00007fd2d7fde088 in onload_setsockopt (fd=6, level=1, optname=9, optval=0x7ffd15be6acc, optlen=4) at /home/iteterev/lab/onload_internal/src/lib/transport/unix/sockcall_intercept.c:737
#9 0x00007fd2d7dcb7dd in ?? ()
#10 0x00007fd2d83392e0 in ?? () from /home/iteterev/lab/onload_internal/build/gnu_x86_64/lib/transport/unix/libcitransport0.so
#11 0x000000000060102c in data_start ()
#12 0x00007fd2d8339540 in ?? () from /home/iteterev/lab/onload_internal/build/gnu_x86_64/lib/transport/unix/libcitransport0.so
#13 0x00000001d85426c0 in ?? ()
#14 0x00007fd2d7fcbe08 in ?? ()
#15 0x00007fd2d7a433c7 in __cxa_finalize () from /lib64/libc.so.6
#16 0x00007fd2d7dcb757 in ?? ()
#17 0x00007ffd15be6be0 in ?? ()
#18 0x00007fd2d834f2a6 in _dl_fini () from /lib64/ld-linux-x86-64.so.2

Here, _fini() is a function that calls all library destructors. The problem is that _fini() decides to run the C++ library destructor *after* Onload and makes it operate on an invalid Onload state.

The patch leverages the fact that Glibc sets up _fini() after running the last library constructor, so by manually installing the exit handler (instead of providing a library destructor), Onload wins the race with _fini().

There's still an issue if the user library sets a custom exit handler with atexit() or on_exit() and makes intercepted system calls from there.

Tested:
* RHEL 7.9/glibc 2.17
* RHEL 8.2/glibc 2.28
* RHEL 9.1/glibc 2.34

Thanks-to: Richard Hughes <[email protected]>
Thanks-to: Siân James <[email protected]>
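The ordering argument in the commit message can be illustrated with a toy model. The sketch below is not Glibc or Onload code: `simulate_exit`, `onload_teardown`, `cxx_destructor` and the event strings are hypothetical names made up for illustration. It assumes only what the commit message states (the `_fini()` stand-in is installed after the last constructor) plus the standard rule that exit handlers run in reverse order of registration. The destructor order in the failing case is hard-coded to match the report, where Onload is torn down first.

```python
def simulate_exit(onload_uses_exit_handler):
    """Return the ordered events of a modelled process exit (toy model only)."""
    events = []
    onload_alive = [True]  # one-element list so nested functions can mutate it

    def onload_teardown():
        # Hypothetical stand-in for Onload's exit hook invalidating its state.
        onload_alive[0] = False
        events.append("onload torn down")

    def cxx_destructor():
        # Hypothetical stand-in for a C++ destructor making an intercepted call.
        if onload_alive[0]:
            events.append("intercepted call ok")
        else:
            events.append("intercepted call on invalid state")

    exit_handlers = []  # run last-registered-first, like atexit() handlers
    destructors = []    # run by the _fini() stand-in, in the order listed here

    # Phase 1: library constructors run.
    if onload_uses_exit_handler:
        exit_handlers.append(onload_teardown)  # the patched behaviour
    else:
        destructors.append(onload_teardown)    # the original behaviour
    destructors.append(cxx_destructor)

    # Phase 2: after the last constructor, the runtime installs _fini().
    def fini():
        for destructor in destructors:
            destructor()

    exit_handlers.append(fini)

    # Phase 3: process exit runs the handlers in reverse registration order.
    while exit_handlers:
        exit_handlers.pop()()
    return events


if __name__ == "__main__":
    print("library destructor:", simulate_exit(False))
    print("exit handler:      ", simulate_exit(True))
```

In this model the exit handler registered during construction sits below `_fini()` in the handler stack, so it runs only after every destructor has finished. The model does not cover the remaining case named in the commit message, where user code registers its own atexit() or on_exit() handler and makes intercepted calls from it.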
Hello @maciejj-xilinx
I have been trying to get Onload working on the following setup:
40GbE NIC (X710, 4 x 10 bonded)
i40e driver, version: 2.8.20-k (it's a stock version; same issue with the 2.13 i40e driver update as well)
64 core Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
CentOS 8, 4.18.0-240.1.1.el8_3.x86_64
Onload compiles fine, but when I go to reload it, I see this error:
Please see dmesg as well below:
Any help is appreciated.