summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
2021-02-18i40e: Add zero-initialization of AQ command structuresMateusz Palczewski
Zero-initialize AQ command data structures to comply with API specifications. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Fixes: f4492db16df8 ("i40e: Add NPAR BW get and set functions") Signed-off-by: Andrzej Sawuła <andrzej.sawula@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-18i40e: Fix memory leak in i40e_probeKeita Suzuki
Struct i40e_veb is allocated in function i40e_setup_pf_switch, and stored to an array field veb inside struct i40e_pf. However when i40e_setup_misc_vector fails, this memory leaks. Fix this by calling exit and teardown functions. Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-18i40e: Fix flow for IPv6 next header (extension header)Slawomir Laba
When a packet contains an IPv6 header with next header which is an extension header and not a protocol one, the kernel function skb_transport_header called with such sk_buff will return a pointer to the extension header and not to the TCP one. The above explained call caused a problem with packet processing for skb with encapsulation for tunnel with I40E_TX_CTX_EXT_IP_IPV6. The extension header was not skipped at all. The ipv6_skip_exthdr function does check if next header of the IPV6 header is an extension header and doesn't modify the l4_proto pointer if it points to a protocol header value so its safe to omit the comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can return value -1. This means that the skipping process failed and there is something wrong with the packet so it will be dropped. Fixes: a3fd9d8876a5 ("i40e/i40evf: Handle IPv6 extension headers in checksum offload") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-15i40e: Fix uninitialized variable mfs_maxColin Ian King
The variable mfs_max is not initialized and is being compared to find the maximum value. Fix this by initializing it to 0. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 90bc8e003be2 ("i40e: Add hardware configuration for software based DCB") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15i40e: Fix incorrect argument in call to ipv6_addr_any()Gustavo A. R. Silva
It seems that the right argument to be passed is &tcp_ip6_spec->ip6dst, not &tcp_ip6_spec->ip6src, when calling function ipv6_addr_any(). Addresses-Coverity-ID: 1501734 ("Copy-paste error") Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ringMaciej Fijalkowski
Output of ixgbe_rx_offset() is based on ethtool's priv flag setting, which when changed, causes PF reset (disables napi, frees irqs, loads different Rx mem model, etc.). This means that within napi its result is constant and there is no reason to call it per each processed frame. Add new 'rx_offset' field to ixgbe_ring that is meant to hold the ixgbe_rx_offset() result and use it within ixgbe_clean_rx_irq(). Furthermore, use it within ixgbe_alloc_mapped_page(). Last but not least, un-inline the function of interest as it lives in .c file so let compiler do the decision about the inlining. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12ice: store the result of ice_rx_offset() onto ice_ringMaciej Fijalkowski
Output of ice_rx_offset() is based on ethtool's priv flag setting, which when changed, causes PF reset (disables napi, frees irqs, loads different Rx mem model, etc.). This means that within napi its result is constant and there is no reason to call it per each processed frame. Add new 'rx_offset' field to ice_ring that is meant to hold the ice_rx_offset() result and use it within ice_clean_rx_irq(). Furthermore, use it within ice_alloc_mapped_page(). Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12i40e: store the result of i40e_rx_offset() onto i40e_ringMaciej Fijalkowski
Output of i40e_rx_offset() is based on ethtool's priv flag setting, which when changed, causes PF reset (disables napi, frees irqs, loads different Rx mem model, etc.). This means that within napi its result is constant and there is no reason to call it per each processed frame. Add new 'rx_offset' field to i40e_ring that is meant to hold the i40e_rx_offset() result and use it within i40e_clean_rx_irq(). Furthermore, use it within i40e_alloc_mapped_page(). Last but not least, un-inline the function of interest so that compiler makes the decision about inlining as it lives in .c file. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12i40e: Simplify the do-while allocation loopBjörn Töpel
Fold the count decrement into the while-statement. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12ice: skip NULL check against XDP prog in ZC pathMaciej Fijalkowski
Whole zero-copy variant of clean Rx IRQ is executed when xsk_pool is attached to rx_ring and it can happen only when XDP program is present on interface. Therefore it is safe to assume that program is always !NULL and there is no need for checking it in ice_run_xdp_zc. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12ice: remove redundant checks in ice_change_mtuMaciej Fijalkowski
dev_validate_mtu checks that mtu value specified by user is not less than min mtu and not greater than max allowed mtu. It is being done before calling the ndo_change_mtu exposed by driver, so remove these redundant checks in ice_change_mtu. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12ice: move skb pointer from rx_buf to rx_ringMaciej Fijalkowski
Similar thing has been done in i40e, as there is no real need for having the sk_buff pointer in each rx_buf. Non-eop frames can be simply handled on that pointer moved upwards to rx_ring. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12ice: simplify ice_run_xdpMaciej Fijalkowski
There's no need for 'result' variable, we can directly return the internal status based on action returned by xdp prog. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12i40e: adjust i40e_is_non_eopMaciej Fijalkowski
i40e_is_non_eop had a leftover comment and unused skb argument which was used for placing the skb onto rx_buf in case when current buffer was non-eop one. This is not relevant anymore as commit e72e56597ba1 ("i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring") pulled the non-complete skb handling out of rx_bufs up to rx_ring. Therefore, let's adjust the function arguments that i40e_is_non_eop takes. Furthermore, since there is already a function responsible for bumping the ntc, make use of that and drop that logic from i40e_is_non_eop so that the scope of this function is limited to what the name actually states. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12i40e: drop misleading function commentsMaciej Fijalkowski
i40e_cleanup_headers has a statement about check against skb being linear or not which is not relevant anymore, so let's remove it. Same case for i40e_can_reuse_rx_page, it references things that are not present there anymore. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12i40e: drop redundant check when setting xdp progMaciej Fijalkowski
Net core handles the case where netdev has no xdp prog attached and current prog is NULL. Therefore, remove such check within i40e_xdp_setup. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: remove the useless value assignment in i40e_clean_adminq_subtaskKaixu Xia
The variable ret is overwritten by the following call i40e_clean_arq_element() and the assignment is useless, so remove it. Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: VLAN field for flow directorPrzemyslaw Patynowski
Allow user to specify VLAN field and add it to flow director. Show VLAN field in "ethtool -n ethx" command. Handle VLAN type and tag field provided by ethtool command. Refactored filter addition, by replacing static arrays with runtime dummy packet creation, which allows specifying VLAN field. Previously, VLAN field was omitted. Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add flow director support for IPv6Przemyslaw Patynowski
Flow director for IPv6 is not supported. 1) Implementation of support for IPv6 flow director. 2) Added handlers for addition of TCP6, UDP6, SCTP6, IPv6. 3) Refactored legacy code to make it more generic. 4) Added packet templates for TCP6, UDP6, SCTP6, IPv6. 5) Added handling of IPv6 source and destination address for flow director. 6) Improved argument passing for source and destination portin TCP6, UDP6 and SCTP6. 7) Added handling of ethtool -n for IPv6, TCP6,UDP6, SCTP6. 8) Used correct bit flag regarding FLEXOFF field of flow director data descriptor. Without this patch, there would be no support for flow director on IPv6, TCP6, UDP6, SCTP6. Tested based on x710 datasheet by using: ethtool -N enp133s0f0 flow-type tcp4 src-port 13 dst-port 37 user-def 0x44142 action 1 ethtool -N enp133s0f0 flow-type tcp6 src-port 13 dst-port 40 user-def 0x44142 action 2 ethtool -N enp133s0f0 flow-type udp4 src-port 20 dst-port 40 user-def 0x44142 action 3 ethtool -N enp133s0f0 flow-type udp6 src-port 25 dst-port 40 user-def 0x44142 action 4 ethtool -N enp133s0f0 flow-type sctp4 src-port 55 dst-port 65 user-def 0x44142 action 5 ethtool -N enp133s0f0 flow-type sctp6 src-port 60 dst-port 40 user-def 0x44142 action 6 ethtool -N enp133s0f0 flow-type ip4 src-ip 1.1.1.1 dst-ip 1.1.1.4 user-def 0x44142 action 7 ethtool -N enp133s0f0 flow-type ip6 src-ip fe80::3efd:feff:fe6f:bbbb dst-ip fe80::3efd:feff:fe6f:aaaa user-def 0x44142 action 8 Then send traffic from client which matches the criteria provided to ethtool. Observe that packets are redirected to user set queues with ethtool -S <interface> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add EEE status getting & setting implementationAleksandr Loktionov
Implement Energy Efficient Ethernet (EEE) status getting & setting. The i40e_get_eee() requesting PHY EEE capabilities from firmware. The i40e_set_eee() function requests PHY EEE capabilities from firmware and sets PHY EEE advertising to full abilities or 0 depending whether EEE is to be enabled or disabled. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add netlink callbacks support for software based DCBArkadiusz Kubalewski
Add callbacks used by software based LLDP agent, which allows to configure DCB feature from userspace. Update copyright dates as appropriate. If LLDP agent is turned off in BIOS, or after setting private flag ("disable-fw-lldp on"). The driver initialized DCB functionality with default values, one traffic class with 100% bandwidth allocated. The new netlink callbacks are required for software LLDP agent, it must be able to acquire current DCB configuration of a network port and apply DCB configuration changes, if required. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add init and default config of software based DCBArkadiusz Kubalewski
Add extra handling on changing the "disable-fw-lldp" private flag to properly initialize software based DCB feature. Add default configuration of DCB functionality when Firmware LLDP agent is turned off, in case of driver probe and device reset on reconfiguration. Update copyright dates as appropriate. Software based DCB is a brand-new feature in i40e driver. Before, DCB was implemented by Firmware LLDP agent only. The agent was responsible for handling incoming DCB-related LLDP frames and applying received DCB configuration to hardware. Default configuration and new initialization flow for software based DCB is required. If LLDP agent is turned off in BIOS, or after setting private flag ("disable-fw-lldp on"). The driver initializes DCB functionality with default values, one traffic class with 100% bandwidth allocated. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-10i40e: Add hardware configuration for software based DCBArkadiusz Kubalewski
Add registers and definitions required for applying DCB related hardware configuration. Add functions responsible for calculating and setting proper hardware configuration values for software based DCB functionality. Add function responsible for invoking Admin Queue command, which results in applying new DCB configuration to the hardware. Update copyright dates as appropriate. Software based DCB is a brand-new feature in i40e driver. Before, DCB was implemented by Firmware LLDP agent only. The agent was responsible for handling incoming DCB-related LLDP frames and applying received DCB configuration to hardware. New communication channel between software and hardware is required for software driver. It must be able to calculate and configure all the registers related for DCB feature. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-09Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2021-02-08 This series contains updates to i40e driver only. Cristian makes improvements to driver XDP path. Avoids writing next-to-clean pointer on every update, removes redundant updates of cleaned_count and buffer info, creates a helper function to consolidate XDP actions and simplifies some of the behavior. Eryk adds messages to inform the user when MTU is larger than supported ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08i40e: Log error for oversized MTU on deviceEryk Rybak
When attempting to link XDP prog with MTU larger than supported, user is not informed why XDP linking fails. Adding proper error message: "MTU too large to enable XDP". Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08i40e: consolidate handling of XDP program actionsCristian Dumitrescu
Consolidate the actions performed on the packet based on the XDP program result into a separate function that is easier to read and maintain. Simplify the i40e_construct_skb_zc function, so that the input xdp buffer is always freed, regardless of whether the output skb is successfully created or not. Simplify the behavior of the i40e_clean_rx_irq_zc function, so that the current packet descriptor is dropped when function i40_construct_skb_zc returns an error as opposed to re-processing the same description on the next invocation. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08i40e: remove the redundant buffer info updatesCristian Dumitrescu
For performance reasons, remove the redundant buffer info updates (*bi = NULL). The buffers ready to be cleaned can easily be tracked based on the ring next-to-clean variable, which is consistently updated. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08i40e: remove unnecessary cleaned_count updatesCristian Dumitrescu
For performance reasons, remove the redundant updates of the cleaned_count variable, as its value can be computed based on the ring next-to-clean variable, which is consistently updated. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08i40e: remove unnecessary memory writes of the next to clean pointerCristian Dumitrescu
For performance reasons, avoid writing the ring next-to-clean pointer value back to memory on every update, as it is not really necessary. Instead, simply read it at initialization into a local copy, update the local copy as necessary and write the local copy back to memory after the last update. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: Improve MSI-X fallback logicTony Nguyen
Currently if the driver is unable to get all the MSI-X vectors it wants, it falls back to the minimum configuration which equates to a single Tx/Rx traffic queue pair. Instead of using the minimum configuration, if given more vectors than the minimum, utilize those vectors for additional traffic queues after accounting for other interrupts. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
2021-02-08ice: Fix trivial error messageMitch Williams
This message indicates an error on close, not open. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: remove unnecessary castsBruce Allan
Casting a void * rvalue in an assignment is unnecessary in C; remove the casts. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: Refactor DCB related variables out of the ice_port_info structChinh T Cao
Refactor the DCB related variables out of the ice_port_info_struct. The goal is to make the ice_port_info struct cleaner. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Co-developed-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: fix writeback enable logicJesse Brandeburg
The writeback enable logic was incorrectly implemented (due to misunderstanding what the side effects of the implementation would be during polling). Fix this logic issue, while implementing a new feature allowing the user to control the writeback frequency using the knobs for controlling interrupt throttling that we already have. Basically if you leave adaptive interrupts enabled, the writeback frequency will be varied even if busy_polling or if napi-poll is in use. If the interrupt rates are set to a fixed value by ethtool -C and adaptive is off, the driver will allow the user-set interrupt rate to guide how frequently the hardware will complete descriptors to the driver. Effectively the user will get a control over the hardware efficiency, allowing the choice between immediate interrupts or delayed up to a maximum of the interrupt rate, even when interrupts are disabled during polling. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: Use PSM clock frequency to calculate RL profilesBen Shelton
The core clock frequency is currently hardcoded at 446 MHz for the RL profile calculations. This causes issues since not all devices use that clock frequency. Read the GLGEN_CLKSTAT_SRC register to determine which PSM clock frequency is selected. This ensures that the rate limiter profile calculations will be correct. Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: create scheduler aggregator node config and move VSIsKiran Patil
Create set scheduler aggregator node and move for VSIs into respective scheduler node. Max children per aggregator node is 64. There are two types of aggregator node(s) created. 1. dedicated node for PF and _CTRL VSIs 2. dedicated node(s) for VFs. As part of reset and rebuild, aggregator nodes are recreated and VSIs are moved to respective aggregator node. Having related VSIs in respective tree avoid starvation between PF and VF w.r.t Tx bandwidth. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Co-developed-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Victor Raj <victor.raj@intel.com> Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: Add initial support framework for LAGDave Ertman
Add the framework and initial implementation for receiving and processing netdev bonding events. This is only the software support and the implementation of the HW offload for bonding support will be coming at a later time. There are some architectural gaps that need to be closed before that happens. Because this is a software only solution that supports in kernel bonding, SR-IOV is not supported with this implementation. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: Remove xsk_buff_pool from VSI structureMichal Swiatkowski
Current implementation of netdev already contains xsk_buff_pools. We no longer have to contain these structures in ice_vsi. Refactor the code to operate on netdev-provided xsk_buff_pools. Move scheduling napi on each queue to a separate function to simplify setup function. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: implement new LLDP filter commandDave Ertman
There is an issue with some NVMs where an already existent LLDP filter is blocking the creation of a filter to allow LLDP packets to be redirected to the default VSI for the interface. This is blocking all LLDP functionality based in the kernel when the FW LLDP agent is disabled (e.g. software based DCBx). Implement the new AQ command to allow adding VSI destinations to existent filters on NVM versions that support the new command. The new lldp_fltr_ctrl AQ command supports Rx filters only, so the code flow for adding filters to disable Tx of control frames will remain intact. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-08ice: log message when trusted VF goes in/out of promisc modeBrett Creeley
Currently there is no message printed on the host when a VF goes in and out of promiscuous mode. This is causing confusion because this is the expected behavior based on i40e. Fix this. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: remove dead codeBruce Allan
The check for a NULL pf pointer is moot since the earlier declaration and assignment of struct device *dev already de-referenced the pointer. Also, the only caller of ice_set_dflt_mib() already ensures pf is not NULL. Cc: Dave Ertman <david.m.ertman@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: use flex_array_size where possibleBruce Allan
Use the flex_array_size() helper with the recently added flexible array members in structures. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: Replace one-element array with flexible-array memberGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct ice_res_tracker, instead of a one-element array and use the struct_size() helper to calculate the size for the allocations. Also, notice that the code below suggests that, currently, two too many bytes are being allocated with devm_kzalloc(), as the total number of entries (pf->irq_tracker->num_entries) for pf->irq_tracker->list[] is _vectors_ and sizeof(*pf->irq_tracker) also includes the size of the one-element array _list_ in struct ice_res_tracker. drivers/net/ethernet/intel/ice/ice_main.c:3511: 3511 /* populate SW interrupts pool with number of OS granted IRQs. */ 3512 pf->num_avail_sw_msix = (u16)vectors; 3513 pf->irq_tracker->num_entries = (u16)vectors; 3514 pf->irq_tracker->end = pf->irq_tracker->num_entries; With this change, the right amount of dynamic memory is now allocated because, contrary to one-element arrays which occupy at least as much space as a single object of the type, flexible-array members don't occupy such space in the containing structure. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Built-tested-by: kernel test robot <lkp@intel.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: display stored UNDI firmware version via devlink infoJacob Keller
Just as we recently added support for other stored firmware flash versions, support display of the stored UNDI Option ROM version via devlink info. To do this, we need to introduce a new ice_get_inactive_orom_ver function. This is a little trickier than with other flash versions. The Option ROM version data was being read from a special "Boot Configuration" block of the NVM Preserved Field Area. This block only contains the *active* Option ROM version data. It is populated when the device firmware finishes updating the Option ROM. This method is ineffective at reading the stored Option ROM version data. Instead of reading from this section of the flash, replace this version extraction with one which locates the Combo Version information from within the Option ROM binary. This data is stored within the Option ROM at a 512 byte offset, in a simple structured format. The structure uses a simple modulo 256 checksum for integrity verification. Scan through the Option ROM to locate the CIVD data section, and extract the Combo Version. Refactor ice_get_orom_ver_info so that it takes the bank select enumeration parameter. Use this to implement ice_get_inactive_orom_ver. Although all ice devices have a Boot Configuration block in the NVM PFA, not all devices have a valid Option ROM. In this case, the old ice_get_orom_ver_info would "succeed" but report a version of all zeros. The new implementation would fail to locate the $CIV section in the Option ROM and report an error. Thus, we must ensure that ice_init_nvm does not fail if ice_get_orom_ver_info fails. Use the new ice_get_inactive_orom_ver to allow reporting the Option ROM versions for a pending update via devlink info. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: display stored netlist versions via devlink infoJacob Keller
Add a function to read the inactive netlist bank for version information. To support this, refactor how we read the netlist version data. Instead of using the firmware AQ interface with a module ID, read from the flash as a flat NVM, using ice_read_flash_module. This change requires a slight adjustment to the offset values used, as reading from the flat NVM includes the type field (which was stripped by firmware previously). Cleanup the macro names and move them to ice_type.h. For clarity in how we calculate the offsets and so that programmers can easily map the offset value to the data sheet, use a wrapper macro to account for the offset adjustments. Use the newly added ice_get_inactive_netlist_ver function to extract the version data from the pending netlist module update. Add the stored variants of "fw.netlist", and "fw.netlist.build" to the info version map array. With this change, we now report the "fw.netlist" and "fw.netlist.build" versions into the stored section of the devlink info report. As with the main NVM module versions, if there is no pending update, we report the currently active values as stored. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: display some stored NVM versions via devlink infoJacob Keller
The devlink info interface supports drivers reporting "stored" versions. These versions indicate the version of an update that has been downloaded to the device, but is not yet active. The code for extracting the NVM version recently changed to enable support for reading from either the active or the inactive bank. Use this to implement ice_get_inactive_nvm_ver, which will read the NVM version data from the inactive section of flash. When reporting the versions via devlink info, first read the device capabilities. Determine if there is a pending flash update, and if so, extract relevant version information from the inactive flash. Store these within the info context structure. When reporting "stored" firmware versions, devlink documentation indicates that we ought to always report a stored value, even if there is no pending update. In this common case, the stored version should match the running version. This means that each stored version should by default fallback to the same value as reported by the running handler. To support this, modify the version structure to have both a "getter" and a "fallback". Modify the control loop so that it will use the "fallback" function if the "getter" function does not report a version. To report versions for which we can read the stored value, use a new "stored()" macro. This macro will insert two entries into the version list. The first entry is the traditional running version. The second is the stored version, implemented with a fallback to the active version. This is a little tricky, but reduces the overall duplication of elements in the entry list, and ensures that running and stored values remain consistent. To avoid some duplication, add a combined() macro that will insert both the running and stored versions into the version entry list. Using this new support, add pending version reporter functions for "fw.psid.api" and "fw.bundle_id". This enables reporting the stored values for some of versions in the NVM module of the flash. Reporting management versions is not implemented by this patch. The active management version is reported to the driver via the AdminQ mailbox during load. Although the version must be in the firmware binary somewhere, accessing this from the inactive firmware is not trivial and has not been implemented in this change. Future changes will introduce support for reading the UNDI Option ROM version and the version associated with the Netlist module. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: introduce function for reading from flash modulesJacob Keller
When reading from the flash memory of the device, the ice driver has two interfaces available to it. First, it can use a mediated interface via firmware that allows specifying a module ID. This allows reading from specific modules of the active flash bank. The second interface available is to perform flat reads. This allows complete access to the entire flash. However, using it requires the software to handle calculating module location and interpret pointer addresses. While most data required is accessible through the convenient first interface, certain flash contents are not. This includes the CSS header information associated with the Option ROM and NVM banks, as well as any access to the "inactive" banks used as scratch space for performing flash updates. In order to access all of the relevant flash contents, software must use the flat reads. Rather than forcing all flows to perform flat read calculations, introduce a new abstraction for reading from the flash: ice_read_flash_module. This function provides an abstraction for reading from either the active or inactive flash bank at the requested module. This interface is very similar to the abstraction provided via firmware, but allows access to additional modules, as well as providing a mechanism to request access to both flash banks. At first glance, it might make sense for this abstraction to allow specifying precisely which bank (1st or 2nd) the caller wishes to read. This is simpler to implement but more difficult to use. In practice, most callers only know whether they want the active bank, or the inactive bank. Rather than force callers to determine for themselves which bank to read from, implement ice_read_flash_module in terms of "active" vs "inactive". This significantly simplifies the implementation at the caller level and is a more useful abstraction over the flash contents. Make use of this new interface to refactor reading of the main NVM version information. Instead of using the firmware's mediated ShadowRAM function, use the ice_read_flash_module abstraction. To do this, notice that most reads of the NVM are going to be in 2-byte word chunks. To simplify using ice_read_flash_module for this case, ice_read_nvm_module is introduced. This is a simple wrapper around ice_read_flash_module which takes the correct pointer address for the NVM bank, and forces the 2-byte word format onto the caller. When reading the NVM versions, some fields are read from the Shadow RAM. The Shadow RAM is the first 64KB of flash memory, and is populated during device load. Most fields are copied from a section within the active NVM bank. In order to read this data from both the active and inactive NVM banks, we need to read not from the first 64KB of flash, but instead from the correct offset into the NVM bank. Introduce ice_read_nvm_sr_copy for this purpose. This function wraps around ice_read_nvm_module and has the same interface as the ice_read_sr_word, with the exception of allowing the caller to specify whether to read the active or inactive flash bank. With this change, it is now trivial to refactor ice_get_nvm_ver_info to read using the software mediated ice_read_flash_module interface instead of relying on the firmware mediated interface. This will be used in the following change to implement support for stored versions in the devlink info report. Additionally, the overall ice_read_flash_module interface will be used and extended to support all three major flash banks, and additionally to support reading the flash image security revision information. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: cache NVM module bank informationJacob Keller
The ice flash contains two copies of each of the NVM, Option ROM, and Netlist modules. Each bank has a pointer word and a size word. In order to correctly read from the active flash bank, the driver must calculate the offset manually. During NVM initialization, read the Shadow RAM control word and determine which bank is active for each NVM module. Additionally, cache the size and pointer values for use in calculating the correct offset. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: introduce context struct for info reportJacob Keller
The ice driver uses an array of structures which link an info name with a function that formats the associated version data into a string. All existing format functions simply format already captured static data from the driver hw structure. Future changes will introduce format functions for reporting the versions of flash sections stored but not yet applied. This type of version data is not stored as a member of the hw structure. This is because (a) it might not yet exist in the case there is no pending flash update, and (b) even if it does, it might change such as if an update is canceled or replaced by a new update before finalizing. We could simply have each format function gather its own data upon being called. However, in some cases the raw binary version data is a combination of multiple different reported fields. Additionally, the current interface doesn't have a way for the function to indicate that the version doesn't exist. Refactor this function interface to take a new ice_info_ctx structure instead of the buffer pointer and length. This context structure allows for future extensions to pre-gather version data that is stored within the context struct instead of the hw struct. Allocate this context structure initially at the start of ice_devlink_info_get. We use dynamic allocation instead of a local stack variable in order to avoid using too much kernel stack once we extend it with additional data structures. Modify the main loop that drives the info reporting so that the version buffer string is always cleared between each format. Explicitly check that the format function actually filled in a version string of non-zero length. If the string is not provided, simply skip this version without reporting an error. This allows for introducing format functions of versions which may or may not be present, such as the version of a pending update that has not yet been activated. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-05ice: create flash_info structure and separate NVM versionJacob Keller
The ice_nvm_info structure has become somewhat of a dumping ground for all of the fields related to flash version. It holds the NVM version and EETRACK id, the OptionROM info structure, the flash size, the ShadowRAM size, and more. A future change is going to add the ability to read the NVM version and EETRACK ID from the inactive NVM bank. To make this simpler, it is useful to have these NVM version info fields extracted to their own structure. Rename ice_nvm_info into ice_flash_info, and create a separate ice_nvm_info structure that will contain the eetrack and NVM map version. Move the netlist_ver structure into ice_flash_info and rename it ice_netlist_info for consistency. Modify the static ice_get_orom_ver_info to take the option rom structure as a pointer. This makes it more obvious what portion of the hw struct is being modified. Do the same for ice_get_netlist_ver_info. Introduce a new ice_get_nvm_ver_info function, which will be similar to ice_get_orom_ver_info and ice_get_netlist_ver_info, used to keep the NVM version extraction code co-located. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>