XDP & AF_XDP for IAVF based on libie and page-pool #19
base: net-next
Commits on Mar 24, 2023
net: intel: introduce Intel Ethernet common library
It's no secret that there's a ton of code duplication between two or more Intel Ethernet modules. Before introducing new changes, which would need to be copied over yet again, start decoupling the already existing duplicate functionality into a new module, which will be shared between several Intel Ethernet drivers.

Add the lookup table which converts an 8/10-bit hardware packet type into a parsed bitfield structure for easy checking of packet format parameters, such as payload level, IP version, etc. This is currently used by i40e, ice and iavf, and it's the same in all three drivers. The only difference introduced in this implementation is that instead of defining a 256-element array (or 1024 in the case of ice), add an unlikely() condition to limit the input to 154 (the current maximum non-reserved packet type). There's no reason to waste 600 (or even 3600) bytes only to not hurt very unlikely exception packets.

The hash computation function now takes the payload level directly as a pkt_hash_type. There are a couple of cases when non-IP ptypes are marked as L3 payload, and in the previous versions their hash level would be 2, not 3. But skb_set_hash() only sees the difference between L4 and non-L4, thus this won't change anything at all.

The module is behind a hidden Kconfig symbol, which the drivers will select when needed. The exports are behind the 'LIBIE' namespace to limit the scope of the functions.

Signed-off-by: Alexander Lobakin <[email protected]>
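A minimal sketch of the bounded lookup described above; the structure layout and the names (libie_rx_ptype_parsed, libie_rx_ptype_lut, LIBIE_RX_PTYPE_NUM) are illustrative, not the exact libie API:

#include <linux/compiler.h>
#include <linux/types.h>

#define LIBIE_RX_PTYPE_NUM	154	/* highest non-reserved ptype + 1 */

struct libie_rx_ptype_parsed {
	u16	outer_ip:2;		/* none / IPv4 / IPv6 */
	u16	payload_layer:2;	/* feeds skb_set_hash() level */
	/* ... inner protocol, tunnel type, etc. ... */
};

extern const struct libie_rx_ptype_parsed
libie_rx_ptype_lut[LIBIE_RX_PTYPE_NUM];

static inline struct libie_rx_ptype_parsed libie_parse_rx_ptype(u32 ptype)
{
	/* Reserved ptypes are exceptional: don't spend 600+ bytes of
	 * table on them, fall back to the "undefined" entry instead.
	 */
	if (unlikely(ptype >= LIBIE_RX_PTYPE_NUM))
		ptype = 0;

	return libie_rx_ptype_lut[ptype];
}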
Commit: 67bcf37
iavf: kill "legacy-rx" for good
Ever since build_skb() became stable, the old way of allocating an skb to store the headers separately, which would then be copied manually, has been slower, less flexible and thus obsolete:

* it had higher pressure on MM, since it actually allocates new pages, which then get split and refcount-biased (NAPI page cache);
* it implies a memcpy() of packet headers (40+ bytes per frame);
* the actual header length was calculated via eth_get_headlen(), which invokes Flow Dissector and thus wastes a bunch of CPU cycles;
* XDP makes it even more weird, since it has required headroom for a long time, and tailroom as well since multi-buffer ("mbuf") landed. Take a look at the ice driver, which is built around work-arounds to make XDP work with it.

Even on some quite low-end hardware (not a common case for 100G NICs) it was performing worse. The only advantage "legacy-rx" had is that it didn't require any reserved headroom and tailroom. But iavf didn't use this, as it always splits pages into two halves of 2k, while that saving would only be useful when striding. And again, XDP effectively removes that sole pro.

There's a train of features to land in IAVF soon: Page Pool, XDP, XSk, multi-buffer etc. Each new one would require adding more and more Danse Macabre for absolutely no reason, besides making the hotpath less and less efficient.

Remove the "feature" with all the related code. This includes at least one very hot branch (typically hit on each new frame), which was either always-true or always-false at least for a complete NAPI bulk of 64 frames, the whole private-flags cruft and so on. Some stats:

Function: add/remove: 0/2 grow/shrink: 0/7 up/down: 0/-774 (-774)
RO Data: add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-40 (-40)

Signed-off-by: Alexander Lobakin <[email protected]>
Commit: 0e490c5
iavf: optimize Rx buffer allocation a bunch
The Rx hotpath code of IAVF is not well-optimized, TBH. Before doing any further buffer model changes, shake it up a bit. Notably:

1. Cache more variables on the stack. DMA device, Rx page size, NTC -- these are the most common things used all throughout the hotpath, often in loops on each iteration. Instead of fetching (or even calculating, as with the page size) them from the ring all the time, cache them on the stack at the beginning of the NAPI polling callback. NTC will be written back at the end, the rest are used read-only, so no sync is needed.

2. Don't move the recycled buffers around the ring. The idea of passing the page of the right-now-recycled buffer to a different buffer -- in this case, the first one that needs to be allocated -- moreover, on each new frame, is fundamentally wrong. It involves a few fetches and branches, and then writes (and one Rx buffer struct is at least 32 bytes) where they're completely unneeded, but buys nothing: the result is the same as if we recycled it in place, at the same position where it was used. So drop this and let the main refilling function take care of all the buffers which were processed and now need to be recycled/refilled.

3. Don't allocate with %GFP_ATOMIC on ifup. This involved introducing the @gfp parameter to a couple of functions. Doesn't change anything for Rx -> softirq.

4. 1 budget unit == 1 descriptor, not skb. There could be an underflow when receiving a lot of fragmented frames. If each of them consisted of 2 frags, it would mean that we'd process 64 descriptors at the point where we pass the 32nd skb to the stack. But the driver would count that only as half the budget, which could make NAPI re-enable interrupts prematurely and create unnecessary CPU load. (See the sketch after this list.)

5. Shortcut the !size case. It's super rare, but possible -- for example, if the last buffer of a fragmented frame contained only the FCS, which was then stripped by the HW. Instead of checking for size several times during processing, quickly reuse the buffer and jump to the skb fields part.

6. Refill the ring after finishing the polling loop. Previously, the loop wasn't starting a new iteration after the 64th desc, meaning that we were always leaving 16 buffers non-refilled until the next NAPI poll. It's better to refill them while they're still hot, so do that right after exiting the loop as well. For a full cycle of 64 descs, there will be 4 refills of 16 descs from now on.

Function: add/remove: 4/2 grow/shrink: 0/5 up/down: 473/-647 (-174)

+ up to 2% performance.

Signed-off-by: Alexander Lobakin <[email protected]>
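A schematic poll loop illustrating point 4; the function name is hypothetical and the elided parts are marked:

static int iavf_clean_rx_irq_sketch(struct iavf_ring *rx_ring, int budget)
{
	u32 cleaned = 0;

	while (likely(cleaned < budget)) {
		/* ... fetch the next descriptor, add its frag to the
		 * skb being built ...
		 */

		cleaned++;	/* 1 budget unit == 1 descriptor, not skb */

		/* only when EOP is set does the skb go up the stack */
	}

	/* NAPI treats "cleaned == budget" as "more work to do" */
	return cleaned;
}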
Commit: 88798b9
iavf: remove page splitting/recycling
As an intermediate step, remove all page splitting/recycling code. Just always allocate a new page and don't touch its refcount, so that it gets freed by the core stack later.

The change allows us to greatly simplify certain parts of the code:

Function: add/remove: 2/3 grow/shrink: 0/5 up/down: 543/-963 (-420)

&iavf_rx_buf can now even retire in favor of just storing an array of pages used for Rx. Their DMA addresses can be stored in page::dma_addr -- use Page Pool's function for that. No surprise perf loses up to 30% here, but that regression will go away once PP lands.

Signed-off-by: Alexander Lobakin <[email protected]>
Commit: 46eb61c
iavf: always use a full order-0 page
The current scheme of trying to pick the smallest buffer possible for the current MTU in order to flip/split pages is not very optimal. For example, on the default MTU of 1500 it gives only 192 bytes of headroom, while XDP may require up to 258. It also involves unnecessary code complication, which is sometimes even hard to follow.

As page split is no more, always allocate order-0 pages. This optimizes performance a bit and drops some bytes off the object code. Next, always pick the maximum buffer length available for this %PAGE_SIZE to set it up in the hardware. This means it now becomes a constant value, which also has its positive impact. On x86_64 this means (without XDP):

4096	page
64	head, 320 tail
3712	HW buffer size
3686	max MTU w/o frags

Previously, the maximum MTU w/o splitting a frame into several buffers was 3046. The increased buffer size allows us to reach the maximum frame size w/ frags supported by HW: 16382 bytes (MTU 16356). Reflect it in the netdev config as well. Relying on the max single buffer size when calculating the MTU was not correct.

Move around a couple of fields in &iavf_ring after the ::rx_buf_len removal to reduce holes and improve cache locality.

Instead of providing the Rx definitions exclusively for IAVF, do that in the libie header, as they can and will be reused in the rest of the drivers. Non-PP drivers could still use at least some of them and lose a couple of copied lines.

Function: add/remove: 0/0 grow/shrink: 3/9 up/down: 18/-265 (-247)

+ it even reclaims half a percent of performance, nice.

Signed-off-by: Alexander Lobakin <[email protected]>
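The arithmetic above, spelled out; the macro names are illustrative, the values are those quoted for a 4k-page x86_64 build:

#include <linux/skbuff.h>

/* NET_SKB_PAD-sized headroom when XDP is off */
#define LIBIE_SKB_HEADROOM	64
/* struct skb_shared_info, cacheline-aligned: 320 bytes on x86_64 */
#define LIBIE_SKB_TAILROOM	SKB_DATA_ALIGN(sizeof(struct skb_shared_info))

/* 4096 - 64 - 320 = 3712 bytes of HW-writable buffer per order-0 page.
 * Max MTU w/o frags: 3712 - (ETH_HLEN + ETH_FCS_LEN + 2 * VLAN_HLEN)
 * = 3712 - 26 = 3686.
 */
#define LIBIE_RX_BUF_LEN	(PAGE_SIZE - LIBIE_SKB_HEADROOM - \
				 LIBIE_SKB_TAILROOM)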
Commit: 95a993b
net: page_pool: allow DMA mapping with %DMA_ATTR_WEAK_ORDERING
Add a new flag, %PP_FLAG_DMA_MAP_WEAK, which will tell PP to map pages with %DMA_ATTR_WEAK_ORDERING. To keep the code simple and optimized, map the following PP flags to DMA map attr flags:

%PP_FLAG_DMA_MAP      => %DMA_ATTR_SKIP_CPU_SYNC
%PP_FLAG_DMA_MAP_WEAK => %DMA_ATTR_WEAK_ORDERING

The first pair is done to be able to just pass it directly to dma_map_page_attrs(). When a driver wants Page Pool to maintain DMA mappings, it always sets this flag. Page Pool always skips CPU syncs when mapping in order to do that separately later, so having those two 1:1 avoids introducing ifs and/or bit-ors and keeps the code more compact.

Signed-off-by: Alexander Lobakin <[email protected]>
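A sketch of how the 1:1 mapping could be consumed; the _SKETCH flag names and the helper are illustrative, only dma_map_page_attrs() and its attrs argument are the real API:

#include <linux/dma-mapping.h>
#include <net/page_pool.h>

/* under this scheme, the PP flags literally are the DMA attrs */
#define PP_FLAG_DMA_MAP_SKETCH		DMA_ATTR_SKIP_CPU_SYNC
#define PP_FLAG_DMA_MAP_WEAK_SKETCH	DMA_ATTR_WEAK_ORDERING

static bool page_pool_map_sketch(struct page_pool *pool, struct page *page)
{
	dma_addr_t dma;

	/* no translation needed: pass the PP flags straight through */
	dma = dma_map_page_attrs(pool->p.dev, page, 0,
				 PAGE_SIZE << pool->p.order, pool->p.dma_dir,
				 pool->p.flags & (PP_FLAG_DMA_MAP_SKETCH |
						  PP_FLAG_DMA_MAP_WEAK_SKETCH));
	if (dma_mapping_error(pool->p.dev, dma))
		return false;

	page_pool_set_dma_addr(page, dma);
	return true;
}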
Commit: 49b4e5c
net: page_pool: add DMA-sync-for-CPU inline helpers
Each driver is responsible for syncing buffers written by HW for CPU before accessing them. Almost every PP-enabled driver uses the same pattern, which could be shorthanded into a static inline to make the driver code a bit more compact.

Introduce a pair of such functions. The first one takes the actual size of the data written by HW and is the main one to be used on Rx. The second picks max_len from the PP params and is designed for more extreme cases when the size is unknown, but the buffer still needs to be synced.

Also constify the pointer arguments of page_pool_get_dma_dir() and page_pool_get_dma_addr() to give a bit more room for optimization, as both of them are read-only.

Signed-off-by: Alexander Lobakin <[email protected]>
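The pair could look roughly like this; a sketch, the exact upstream signatures may differ:

static inline void page_pool_dma_sync_for_cpu(const struct page_pool *pool,
					      const struct page *page,
					      u32 offset, u32 dma_sync_size)
{
	/* sync exactly what the HW wrote, starting past the headroom */
	dma_sync_single_range_for_cpu(pool->p.dev,
				      page_pool_get_dma_addr(page),
				      offset + pool->p.offset, dma_sync_size,
				      page_pool_get_dma_dir(pool));
}

static inline void
page_pool_dma_sync_full_for_cpu(const struct page_pool *pool,
				const struct page *page)
{
	/* size unknown -> sync the whole max_len region */
	page_pool_dma_sync_for_cpu(pool, page, 0, pool->p.max_len);
}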
Commit: b71da32
Now that the IAVF driver simply uses dev_alloc_page() + free_page() with no custom recycling logic and one whole page per frame, it can easily be switched to using the Page Pool API instead.

Introduce libie_rx_page_pool_create(), a wrapper for creating a PP with the default libie settings applicable to all Intel hardware, and replace the alloc/free calls with the corresponding PP functions, including the newly added sync-for-CPU helpers. Use skb_mark_for_recycle() to bring back the recycling and restore the initial performance.

Among the important object code changes, it's worth mentioning that __iavf_alloc_rx_pages() is now inlined due to its greatly reduced size. The resulting driver is on par with the pre-series code and 1-2% slower than the "optimized" version right before the recycling removal. But the number of locs and object code bytes slaughtered is much more important here, not to mention that there's still vast space for optimization and improvements.

Signed-off-by: Alexander Lobakin <[email protected]>
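A hypothetical shape of the wrapper, reusing the illustrative macros from the earlier sketches; the parameters and defaults shown are assumptions, not the series' exact choices:

#include <net/page_pool.h>

struct page_pool *libie_rx_page_pool_create(struct device *dev, u32 size)
{
	const struct page_pool_params pp = {
		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
		.order		= 0,		/* always order-0 pages */
		.pool_size	= size,		/* ring length */
		.nid		= NUMA_NO_NODE,
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
		.max_len	= LIBIE_RX_BUF_LEN,
		.offset		= LIBIE_SKB_HEADROOM,
	};

	return page_pool_create(&pp);
}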
Commit: b71ce3c
Next stop, per-queue private stats. They differ only subtly from driver to driver, and the differences can easily be resolved. Define common structures, inline helpers and Ethtool helpers to collect, update and export the statistics.

Use u64_stats_t right from the start, as well as the corresponding helpers, to ensure tear-free operations. For the NAPI parts of both Rx and Tx, also define small onstack containers to update them in polling loops and then sync the actual containers once a loop ends.

The drivers will be switched to use this API later on a per-driver basis, along with the conversion to PP.

Signed-off-by: Alexander Lobakin <[email protected]>
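An illustrative cut of the "onstack container + tear-free sync" pattern; all names are hypothetical:

#include <linux/u64_stats_sync.h>

/* cheap per-poll scratch, updated once per frame */
struct libie_rq_onstack_stats {
	u32	packets;
	u32	bytes;
};

/* the persistent, tear-free container read by Ethtool */
struct libie_rq_stats {
	struct u64_stats_sync	syncp;
	u64_stats_t		packets;
	u64_stats_t		bytes;
};

static void libie_rq_napi_stats_add(struct libie_rq_stats *qs,
				    const struct libie_rq_onstack_stats *os)
{
	/* one writer-side critical section per NAPI poll, not per frame */
	u64_stats_update_begin(&qs->syncp);
	u64_stats_add(&qs->packets, os->packets);
	u64_stats_add(&qs->bytes, os->bytes);
	u64_stats_update_end(&qs->syncp);
}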
Commit: 3d884e3
libie: add per-queue Page Pool stats
Expand the libie generic per-queue stats with the generic Page Pool stats provided by the API itself, when CONFIG_PAGE_POOL_STATS is enabled. When it's not, there'll be no such fields in the stats structure, so no space is wasted.

They are also a bit special in terms of how they are obtained. A &page_pool obviously accumulates statistics only until it's destroyed, which happens on ifdown. So, in order to not lose anything, get the stats and store them in the queue container before destroying a pool. This container survives ifups/downs, so it basically stores the statistics accumulated since the very first pool was allocated on this queue. When the stats need to be exported, first get the numbers from this container and then add the "live" numbers -- the ones that the currently active pool returns. The resulting values will always represent the actual device-lifetime stats.

There's a cast from &page_pool_stats to `u64 *` in a couple of functions, but they are guarded with stats asserts to make sure it's safe to do. FWIW it saves a lot of object code.

Signed-off-by: Alexander Lobakin <[email protected]>
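A sketch of the save-before-destroy step; the libie helper names are hypothetical, while page_pool_get_stats() is the real accessor, gated by CONFIG_PAGE_POOL_STATS:

#ifdef CONFIG_PAGE_POOL_STATS
static void libie_rq_stats_sync_pp(struct libie_rq_stats *qs,
				   struct page_pool *pool)
{
	struct page_pool_stats pps = { };

	/* fold the dying pool's counters into the persistent container,
	 * so the device-lifetime totals survive the ifdown/ifup cycle
	 */
	if (page_pool_get_stats(pool, &pps))
		libie_rq_stats_add_pp(qs, &pps);	/* hypothetical */
}
#endif /* CONFIG_PAGE_POOL_STATS */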
Commit: 16d126c
iavf: switch queue stats to libie
iavf is pretty much ready for using the generic libie stats, so drop all the custom code and just use the generic definitions. The only gap is that it previously lacked a counter of Tx queue stops. It's present in the other drivers, so add it here as well.

The rest is straightforward. There were two fields in the Tx stats struct which didn't belong there. The first one had never been used, so wipe it; and move the other to the queue structure. Plus, move around a couple of fields in &iavf_ring to account for the stats structs' alignment.

Signed-off-by: Alexander Lobakin <[email protected]>
Commit: 1f934b6
selftests/bpf: robustify test_xdp_do_redirect with more payload magics
Currently, the test relies on the fact that only dropped ("xmitted") frames will be recycled, and that if a frame becomes an skb, it will be freed later by the stack and never come back to its page_pool. So it easily gets broken by attempts to recycle skbs:

test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
test_xdp_do_redirect:FAIL:pkt_count_zero unexpected pkt_count_zero: actual 9936 != expected 2
test_xdp_do_redirect:PASS:pkt_count_tc 0 nsec

That huge mismatch happens because after the TC ingress hook zeroes the magic, the page gets recycled when the skb is freed, not returned to the MM layer. "Live frames" mode initializes only new pages and keeps the recycled ones as is by design, so they appear with a zeroed magic on the Rx path again.

Expand the possible magic values from two -- 0 (was "xmitted"/dropped or hit the TC hook) and 0x42 (hit the input XDP prog) -- to three: the new one will mark frames that hit the TC hook, so that they elide both @pkt_count_zero and @pkt_count_xdp. They can then be recycled to their page_pool or returned to the page allocator; either way, this won't affect the counters. Just make sure to mark them as "input" (0x42) when they appear on the Rx path again.

Also make an enum from those magics, so that they are always visible and can be changed in just one place at any time. This also eases adding any new marks later on.

Signed-off-by: Alexander Lobakin <[email protected]>
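The enum could look like this; the names and the third value are assumptions, only 0 and 0x42 come from the existing test:

enum {
	FRAME_MAGIC_XMIT	= 0,	/* dropped / "xmitted" frames */
	FRAME_MAGIC_IN		= 0x42,	/* hit the input XDP program */
	FRAME_MAGIC_TC		= 0x66,	/* hit the TC ingress hook */
};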
Commit: 4e60236
net: page_pool, skbuff: make skb_mark_for_recycle() always available
skb_mark_for_recycle() is guarded with CONFIG_PAGE_POOL, which creates unneeded complication when using it in generic code. For now, it's only used in drivers that always select Page Pool, so this works.

Move the guards so that the preprocessor cuts out only the operation itself, while the function remains a noop on !PAGE_POOL systems, but stays available there as well.

No functional changes.

Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/oe-kbuild-all/[email protected]
Signed-off-by: Alexander Lobakin <[email protected]>
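The resulting shape -- the guard moves inside the function body, so the symbol exists unconditionally:

static inline void skb_mark_for_recycle(struct sk_buff *skb)
{
#ifdef CONFIG_PAGE_POOL
	skb->pp_recycle = 1;
#endif
}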
Commit: 2e28dff
xdp: recycle Page Pool backed skbs built from XDP frames
__xdp_build_skb_from_frame() state(d):

/* Until page_pool get SKB return path, release DMA here */

Page Pool got skb page recycling in April 2021, but this function was missed. xdp_release_frame() is relevant only for Page Pool backed frames, and it detaches the page from the corresponding page_pool in order to make it freeable via page_frag_free(). Instead, the function can just mark the output skb as eligible for recycling if the frame is backed by a PP. No change for other memory model types (the same condition check as before). cpumap redirect and veth on Page Pool drivers now become zero-alloc (or almost).

Signed-off-by: Alexander Lobakin <[email protected]>
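The relevant hunk, sketched; xdp_frame::mem and MEM_TYPE_PAGE_POOL are the real fields, the rest of the skb-building function is elided:

#include <net/xdp.h>

static void xdp_skb_recycle_sketch(struct xdp_frame *xdpf,
				   struct sk_buff *skb)
{
	/* was: xdp_release_frame(xdpf), detaching the page from its pool;
	 * now: let the PP recycling path reclaim it on skb free
	 */
	if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
		skb_mark_for_recycle(skb);
}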
Commit: df09c63
xdp: remove unused {__,}xdp_release_frame()
__xdp_build_skb_from_frame() was the last user of {__,}xdp_release_frame(), which detaches pages from their page_pool. All the consumers now recycle Page Pool skbs and pages, except the mlx5, stmmac and tsnep drivers, which use page_pool_release_page() directly (this might change one day). It's safe to assume this functionality is not needed anymore and can be removed (in favor of recycling).

Signed-off-by: Alexander Lobakin <[email protected]>
Commit: 4fcbfcd
Commits on Apr 14, 2023
iavf: optimize Rx hotpath a bunch -- vol. 2
Signed-off-by: Alexander Lobakin <[email protected]>
Commit: 897dcfd
iavf: fixup for optimize vol. 2
Signed-off-by: Alexander Lobakin <[email protected]>
Commit: d6ea05c
i40e: Unify handling of zero ring length in 'configure queue'
The current version of the Intel 'ice' driver allows zero to be used as the ring length in the 'configure queue' VIRTCHNL message. Such a value indicates that the ring should not be configured. Implement the same handling in the i40e driver. Instead of returning an 'invalid parameter' error for zero-sized rings, just skip that ring during queue pair configuration.

That unified handling is needed for the AF_XDP implementation for the 'iavf' driver. In that use case, we sometimes need to configure only the Tx ring for a given queue pair.

Signed-off-by: Michal Kubiak <[email protected]>
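Schematically, the unified i40e side could look like this; the function name is hypothetical, struct virtchnl_queue_pair_info is the real VIRTCHNL type:

#include <linux/avf/virtchnl.h>

static int i40e_cfg_queue_pair_sketch(struct virtchnl_queue_pair_info *qpi)
{
	/* ring_len == 0 now means "leave this ring unconfigured",
	 * as in ice, instead of returning -EINVAL
	 */
	if (qpi->txq.ring_len > 0) {
		/* ... configure the Tx ring ... */
	}

	if (qpi->rxq.ring_len > 0) {
		/* ... configure the Rx ring ... */
	}

	return 0;
}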
Commit: 8aca6e1
iavf: Remove IAVF_TX_FLAGS_FD_SB flag
This flag was never set, so remove it and simplify the buffer cleaning process. Signed-off-by: Larysa Zaremba <[email protected]>
Commit: 557b392
iavf: Use separate ring masks for TX and RX in q_vector
Replace the existing ring mask (common for RX and TX rings) in iavf_q_vector with two masks dedicated to handling RX and TX rings separately. The virtchnl interface allows separate masks to be used for different ring types, so there is no need to merge them into a single mask. Also, after adding XDP support to iavf, the number of RX and TX rings can be asymmetric. Therefore, this patch is a necessary preparation for XDP support. Signed-off-by: Michal Kubiak <[email protected]>
Commit: 80f873c
iavf: Prepare VIRTCHNL functions to support XDP
The XDP and AF_XDP features are initialized using .ndo functions. Those functions are always synchronous and may require some serious queue reconfiguration, including changing the number of queues. Performing such a reconfiguration implies sending a bunch of VIRTCHNL messages to the PF in order to disable queues, re-enable and re-configure them, or update the RSS LUT. By definition, those VIRTCHNL messages are sent asynchronously, so the result of each VIRTCHNL operation can be received from the PF via the admin queue only after some time.

Moreover, the previous implementation of some VIRTCHNL functions (e.g. 'iavf_disable_queues()' or 'iavf_enable_queues()') did not allow calling them selectively for specific queues only.

In order to address those problems and cover all scenarios of XDP and AF_XDP initialization, implement a polling mechanism with a timeout that blocks the execution of the XDP .ndo functions until the result of the VIRTCHNL operation on the PF is known to the driver. Also, refactor the existing VIRTCHNL API by adding functions for selective queue enabling, disabling and configuration.

Signed-off-by: Michal Kubiak <[email protected]>
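A minimal sketch of the blocking-poll idea; both state-accessor helpers named here are hypothetical, the timeout value is an assumption:

#include <linux/delay.h>
#include <linux/jiffies.h>

static int iavf_wait_for_virtchnl_op(struct iavf_adapter *adapter,
				     enum virtchnl_ops op)
{
	unsigned long timeout = jiffies + msecs_to_jiffies(500);

	/* replies arrive asynchronously via the admin queue */
	while (!iavf_virtchnl_op_done(adapter, op)) {	/* hypothetical */
		if (time_after(jiffies, timeout))
			return -ETIMEDOUT;

		usleep_range(1000, 2000);
	}

	return iavf_virtchnl_op_result(adapter, op);	/* hypothetical */
}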
Commit: 455dee1
iavf: Refactor ring initialization functions to handle XDP
Introduce modular functions to allocate and initialize Rx and Tx rings in order to prepare the initialization procedure to easily fit the XDP setup. Signed-off-by: Michal Kubiak <[email protected]>
Commit: d1aea70
iavf: Prepare rings to support XDP
Extend the basic structures of the driver (e.g. 'iavf_adapter', 'iavf_ring') by adding members necessary to support XDP. Register those members using the required functions from the BPF API. Implement support for the XDP_TX and XDP_REDIRECT actions by adding additional XDP Tx queues to transmit packets without interfering with regular Tx traffic. Finally, add the required XDP setup and release calls to the queue allocation and deallocation functions, respectively.

Signed-off-by: Michal Kubiak <[email protected]>
Signed-off-by: Larysa Zaremba <[email protected]>
Commit: 76bf498
iavf: don't hardcode DMA direction, headroom and buffer len on Rx
Signed-off-by: Alexander Lobakin <[email protected]>
Commit: 629e722
iavf: Handle XDP_SETUP_PROG command in .ndo_bpf
Add the .ndo_bpf function to handle the XDP_SETUP_PROG command. In order to avoid synchronization issues, implement functions dedicated to re-initializing only those parts of the interface which are really necessary for setting up the XDP program. Such an approach is much lighter than performing a full reset of the driver, and thanks to it we immediately know the result of traffic initialization, compared to the reset task, which triggers some asynchronous events (e.g. link speed negotiation).

Signed-off-by: Michal Kubiak <[email protected]>
Signed-off-by: Larysa Zaremba <[email protected]>
Commit: 9f85b9d
iavf: Add XDP_PASS and XDP_DROP support
Implement basic XDP program setup, refactor data path to use xdp_buff, implement XDP_PASS and XDP_DROP actions. Signed-off-by: Larysa Zaremba <[email protected]>
Commit: c59cad5
Implement sending packets from an XDP ring. The XDP path functions are separate from the general Tx routines, because this allows us to simplify and therefore speed up the process. It also makes the code more friendly to future XDP-specific optimizations.

Signed-off-by: Larysa Zaremba <[email protected]>
Commit: 19cfad9
iavf: Implement XDP redirect path
Implement the XDP_REDIRECT action and the ndo_xdp_xmit() callback. For now, packets redirected from a CPU with an index greater than the number of XDP queues are just dropped with an error. This is a rather common situation, especially when the VF is configured to run on a host, and will be addressed in later patches. The patch also refactors the Rx XDP handling to use a switch statement due to the increased number of actions.

Signed-off-by: Larysa Zaremba <[email protected]>
Commit: ed6e942
Port of commit 22bf877 ("ice: introduce XDP_TX fallback path"). The patch handles the case when the queue count is not sufficient for the current number of CPUs. To avoid dropping some packets redirected from other interfaces, XDP TxQs are allowed to be shared between CPUs, which imposes a locking requirement. The static key approach has little to no performance penalty when sharing is not needed.

This mechanism is even more applicable when dealing with VFs. In fact, the maximum number of queue pairs that an ice PF can give to an iavf VF is 16, which allows up to 8 XDP TxQs, so without XDP TxQ sharing, some redirected packets could be dropped even on a 10-CPU system.

Signed-off-by: Larysa Zaremba <[email protected]>
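A sketch of the fallback locking; the static key and the ring's lock field are named hypothetically, DEFINE_STATIC_KEY_FALSE() and static_branch_unlikely() are the real API:

#include <linux/jump_label.h>
#include <linux/spinlock.h>

static DEFINE_STATIC_KEY_FALSE(iavf_xdp_locking_key);	/* hypothetical */

static void iavf_xdp_ring_lock(struct iavf_ring *xdp_ring)
{
	/* patched out to a plain fall-through while every CPU owns
	 * its own XDP TxQ; enabled only when TxQs must be shared
	 */
	if (static_branch_unlikely(&iavf_xdp_locking_key))
		spin_lock(&xdp_ring->tx_lock);		/* hypothetical field */
}

static void iavf_xdp_ring_unlock(struct iavf_ring *xdp_ring)
{
	if (static_branch_unlikely(&iavf_xdp_locking_key))
		spin_unlock(&xdp_ring->tx_lock);
}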
Commit: 55adf95
iavf: Enable XDP netdev features
Enable NETDEV_XDP_ACT_BASIC and NETDEV_XDP_ACT_REDIRECT XDP features in netdev. Signed-off-by: Michal Kubiak <[email protected]>
Commit: 8f9b548
iavf: Add AF_XDP initialization
Add necessary functions and data structures to support AF_XDP feature. Implement handling of 'XDP_SETUP_XSK_POOL' in .ndo_bpf(). Also, implement functions for selectively stopping only those queues which take part in XDP socket creation. Signed-off-by: Michal Kubiak <[email protected]>
Commit: 7910142
iavf: Implement Tx path for AF_XDP
Implement Tx handling for AF_XDP feature in zero-copy mode. Add '.ndo_xdp_xmit()' and '.ndo_xsk_wakeup()' implementations to support AF_XDP Tx path. Also, add Tx interrupt handling function for zero-copy mode. Signed-off-by: Michal Kubiak <[email protected]>
Commit: af588dd
iavf: Implement AF_XDP RX processing
Implement RX packet processing specific to AF_XDP ZC. All actions except XDP_PASS are supported, the skb path will be implemented in later patches. Signed-off-by: Larysa Zaremba <[email protected]>
Commit: af38da2
iavf: consolidate skb fields processing
For now, filling the skb fields on Rx is somewhat scattered across the RQ polling function. This makes it harder to reuse the code on the XSk Rx path and also sometimes costs some CPU (e.g. doing a lookup for the decoded packet type twice). Make it consistent and do everything in iavf_process_skb_fields(). First of all, get the packet type and decode it. Then, move on to hash, csum and VLAN, the handling of which is moved here too. iavf_receive_skb() then becomes the classic eth_type_trans() + napi_gro_receive() pair. Finally, make the fields-processing function global and the skb-receive function a static inline, in order to call them from a different file later on.

Signed-off-by: Alexander Lobakin <[email protected]>
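The resulting split, sketched; the driver types are real, the field-decoding body is elided:

void iavf_process_skb_fields(struct iavf_ring *rx_ring, struct sk_buff *skb,
			     union iavf_rx_desc *rx_desc)
{
	/* decode the packet type once, then derive hash, csum
	 * and VLAN from the already-parsed ptype
	 */
}

static inline void iavf_receive_skb(struct napi_struct *napi,
				    struct net_device *netdev,
				    struct sk_buff *skb)
{
	skb->protocol = eth_type_trans(skb, netdev);
	napi_gro_receive(napi, skb);
}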
Commit: f057680
iavf: Implement XDP_PASS path in AF_XDP processing
Construct an skb and fill in its fields when AF_XDP is enabled on the ring and the XDP program returns XDP_PASS. (will be fixed up)

Signed-off-by: Larysa Zaremba <[email protected]>
Commit: 7f53f95
iavf: Make request and free traffic irqs symmetric
The existing implementation of the 'iavf_request_traffic_irqs()' function does not request any interrupt for q_vectors that have no Tx or Rx queues assigned to them. However, the function 'iavf_free_traffic_irqs()' releases interrupts for all q_vectors unconditionally. Such an approach may result in a kernel warning about an attempt to release an interrupt that was never requested.

In order to solve that potential issue, make both functions fully symmetric: add the logic for skipping unused q_vectors to 'iavf_free_traffic_irqs()'.

Signed-off-by: Michal Kubiak <[email protected]>
Commit: 3e2fafc
iavf: Do not reset the number of requested queues
When the number of queues is changed by the user, the new queue count is kept in the adapter structure member 'num_req_queues'. That information was always reset to zero just after the queue number request was processed. However, that structure member should always reflect the user's preference regarding the requested queue count, so it should be preserved for future driver reinitializations or for setting up the adapter for an XDP program.

Stop zeroing the number of requested queues and give that value priority during subsequent reinitializations of the adapter, in order to avoid a scenario where the queue count changes automatically, out of the user's control.

Signed-off-by: Michal Kubiak <[email protected]>
Commit: a3e00d6
iavf: Limit number of channels in ethtool when XDP is enabled
When XDP is enabled, our true maximum number of queue pairs can be reduced, up to being cut in half. E.g. for a system with 10 CPUs and 16 queue pairs allowed by the PF, the maximum would normally be 10 queues, but XDP requires 2 queue pairs per channel, so only a maximum of 8 queues can be used when a program is attached. This fact has to be reflected in ethtool.

Signed-off-by: Larysa Zaremba <[email protected]>
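The limit from the paragraph above, as it could be computed for .get_channels(); the helper names are hypothetical, and the 10-CPU/16-QP example yields 10 channels without XDP and 8 with it:

#include <linux/cpumask.h>
#include <linux/minmax.h>

static u32 iavf_max_channels(const struct iavf_adapter *adapter)
{
	/* queue pairs granted by the PF */
	u32 max_qps = adapter->vsi_res->num_queue_pairs;
	u32 combined = min_t(u32, num_online_cpus(), max_qps);

	/* XDP consumes a second, Tx-only queue pair per channel */
	if (iavf_xdp_is_enabled(adapter))	/* hypothetical */
		combined = min_t(u32, combined, max_qps / 2);

	return combined;
}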
Commit: c784562
iavf: lock XDP queue while using in ZC mode
To avoid races between .ndo_xdp_xmit(), normal ZC Tx processing, and XDP_TX in ZC mode when queues are also shared, add locking to the latter two. Locking in .ndo_xdp_xmit() is already present.

Signed-off-by: Larysa Zaremba <[email protected]>
Commit: b7fbed3
iavf: Enable AF_XDP zero-copy feature in netdev
Enable NETDEV_XDP_ACT_XSK_ZEROCOPY feature in netdev structure. Signed-off-by: Michal Kubiak <[email protected]>
Commit: f71cafe