aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/intel/igb
AgeCommit message (Collapse)Author
2021-07-28igb: Check if num of q_vectors is smaller than max before array accessAleksandr Loktionov
[ Upstream commit 6c19d772618fea40d9681f259368f284a330fd90 ] Ensure that the adapter->q_vector[MAX_Q_VECTORS] array isn't accessed beyond its size. It was fixed by using a local variable num_q_vectors as a limit for loop index, and ensure that num_q_vectors is not bigger than MAX_Q_VECTORS. Fixes: 047e0030f1e6 ("igb: add new data structure for handling interrupts and NAPI") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Reviewed-by: Mateusz Palczewski <mateusz.placzewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-28igb: Fix an error handling path in 'igb_probe()'Christophe JAILLET
[ Upstream commit fea03b1cebd653cd095f2e9a58cfe1c85661c363 ] If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function. Fixes: 40a914fa72ab ("igb: Add support for pci-e Advanced Error Reporting") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-28igb: Fix use-after-free error during resetVinicius Costa Gomes
[ Upstream commit 7b292608db23ccbbfbfa50cdb155d01725d7a52e ] Cleans the next descriptor to watch (next_to_watch) when cleaning the TX ring. Failure to do so can cause invalid memory accesses. If igb_poll() runs while the controller is reset this can lead to the driver try to free a skb that was already freed. (The crash is harder to reproduce with the igb driver, but the same potential problem exists as the code is identical to igc) Fixes: 7cc6fd4c60f2 ("igb: Don't bother clearing Tx buffer_info in igb_clean_tx_ring") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reported-by: Erez Geva <erez.geva.ext@siemens.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21igb: reinit_locked() should be called with rtnl_lockFrancesco Ruggeri
[ Upstream commit 024a8168b749db7a4aa40a5fbdfa04bf7e77c1c0 ] We observed two panics involving races with igb_reset_task. The first panic is caused by this race condition: kworker reboot -f igb_reset_task igb_reinit_locked igb_down napi_synchronize __igb_shutdown igb_clear_interrupt_scheme igb_free_q_vectors igb_free_q_vector adapter->q_vector[v_idx] = NULL; napi_disable Panics trying to access adapter->q_vector[v_idx].napi_state The second panic (a divide error) is caused by this race: kworker reboot -f tx packet igb_reset_task __igb_shutdown rtnl_lock() ... igb_clear_interrupt_scheme igb_free_q_vectors adapter->num_tx_queues = 0 ... rtnl_unlock() rtnl_lock() igb_reinit_locked igb_down igb_up netif_tx_start_all_queues dev_hard_start_xmit igb_xmit_frame igb_tx_queue_mapping Panics on r_idx % adapter->num_tx_queues This commit applies to igb_reset_task the same changes that were applied to ixgbe in commit 2f90b8657ec9 ("ixgbe: this patch adds support for DCB to the kernel and ixgbe driver"), commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in reset subtask"). Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-20igb: Report speed and duplex as unknown when device is runtime suspendedKai-Heng Feng
commit 165ae7a8feb53dc47fb041357e4b253bfc927cf9 upstream. igb device gets runtime suspended when there's no link partner. We can't get correct speed under that state: $ cat /sys/class/net/enp3s0/speed 1000 In addition to that, an error can also be spotted in dmesg: [ 385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost Since device can only be runtime suspended when there's no link partner, we can skip reading register and let the following logic set speed and duplex with correct status. The more generic approach will be wrap get_link_ksettings() with begin() and complete() callbacks. However, for this particular issue, begin() calls igb_runtime_resume() , which tries to rtnl_lock() while the lock is already hold by upper ethtool layer. So let's take this approach until the igb_runtime_resume() no longer needs to hold rtnl_lock. CC: stable <stable@vger.kernel.org> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-05igb: Fix SGMII SFP module discovery for 100FX/LX.Manfred Rudigier
[ Upstream commit 5365ec1aeff5b9f2962a9c9b31d63f9dad7e0e2d ] Changing the link mode should also be done for 100BaseFX SGMII modules, otherwise they just don't work when the default link mode in CTRL_EXT coming from the EEPROM is SERDES. Additionally 100Base-LX SGMII SFP modules are also supported now, which was not the case before. Tested with an i210 using Flexoptix S.1303.2M.G 100FX and S.1303.10.G 100LX SGMII SFP modules. Signed-off-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01igb: shorten maximum PHC timecounter update intervalMiroslav Lichvar
[ Upstream commit 094bf4d0e9657f6ea1ee3d7e07ce3970796949ce ] The timecounter needs to be updated at least once per ~550 seconds in order to avoid a 40-bit SYSTIM timestamp to be misinterpreted as an old timestamp. Since commit 500462a9d ("timers: Switch to a non-cascading wheel"), scheduling of delayed work seems to be less accurate and a requested delay of 540 seconds may actually be longer than 550 seconds. Shorten the delay to 480 seconds to be sure the timecounter is updated in time. This fixes an issue with HW timestamps on 82580/I350/I354 being off by ~1100 seconds for few seconds every ~9 minutes. Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12igb: Fix constant media auto sense switching when no cable is connectedManfred Rudigier
[ Upstream commit 8d5cfd7f76a2414e23c74bb8858af7540365d985 ] At least on the i350 there is an annoying behavior that is maybe also present on 82580 devices, but was probably not noticed yet as MAS is not widely used. If no cable is connected on both fiber/copper ports the media auto sense code will constantly swap between them as part of the watchdog task and produce many unnecessary kernel log messages. The swap code responsible for this behavior (switching to fiber) should not be executed if the current media type is copper and there is no signal detected on the fiber port. In this case we can safely wait until the AUTOSENSE_EN bit is cleared. Signed-off-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-08igb: Fix WARN_ONCE on runtime suspendArvind Sankar
[ Upstream commit dabb8338be533c18f50255cf39ff4f66d4dabdbe ] The runtime_suspend device callbacks are not supposed to save configuration state or change the power state. Commit fb29f76cc566 ("igb: Fix an issue that PME is not enabled during runtime suspend") changed the driver to not save configuration state during runtime suspend, however the driver callback still put the device into a low-power state. This causes a warning in the pci pm core and results in pci_pm_runtime_suspend not calling pci_save_state or pci_finish_runtime_suspend. Fix this by not changing the power state either, leaving that to pci pm core, and make the same change for suspend callback as well. Also move a couple of defines into the appropriate header file instead of inline in the .c file. Fixes: fb29f76cc566 ("igb: Fix an issue that PME is not enabled during runtime suspend") Signed-off-by: Arvind Sankar <niveditas98@gmail.com> Reviewed-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12igb: Fix an issue that PME is not enabled during runtime suspendKai-Heng Feng
[ Upstream commit 1fb3a7a75e2efcc83ef21f2434069cddd6fae6f5 ] I210 ethernet card doesn't wakeup when a cable gets plugged. It's because its PME is not set. Since commit 42eca2302146 ("PCI: Don't touch card regs after runtime suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops calling pci_finish_runtime_suspend(), which enables the PCI PME. To fix the issue, let's not to save PCI states when it's runtime suspend, to let the PCI subsystem enables PME. Fixes: 42eca2302146 ("PCI: Don't touch card regs after runtime suspend D3") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17igb: fix uninitialized variablesYunjian Wang
[ Upstream commit e4c39f7926b4de355f7df75651d75003806aae09 ] This patch fixes the variable 'phy_word' may be used uninitialized. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-08-03igb: Fix queue selection on MAC filters on i210Vinicius Costa Gomes
[ Upstream commit 4dc93fcf0b95dc3fda4db917effae31fbb8ad2a8 ] On the RAH registers there are semantic differences on the meaning of the "queue" parameter for traffic steering depending on the controller model: there is the 82575 meaning, which "queue" means a RX Hardware Queue, and the i350 meaning, where it is a reception pool. The previous behaviour was having no effect for i210 based controllers because the QSEL bit of the RAH register wasn't being set. This patch separates the condition in discrete cases, so the different handling is clearer. Fixes: 83c21335c876 ("igb: improve MAC filter handling") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26igb: Clear TXSTMP when ptp_tx_work() is timeoutDaniel Hua
[ Upstream commit 3a53285228165225a7f76c7d5ff1ddc0213ce0e4 ] Problem description: After ethernet cable connect and disconnect for several iterations on a device with i210, tx timestamp will stop being put into the socket. Steps to reproduce: 1. Setup a device with i210 and wire it to a 802.1AS capable switch ( Extreme Networks Summit x440 is used in our case) 2. Have the gptp daemon running on the device and make sure it is synced with the switch 3. Have the switch disable and enable the port, wait for the device gets resynced with the switch 4. Iterates step 3 until the device is not albe to get resynced 5. Review the log in dmesg and you will see warning message "igb : clearing Tx timestamp hang" Root cause: If ptp_tx_work() gets scheduled just before the port gets disabled, a LINK DOWN event will be processed before ptp_tx_work(), which may cause timeout in ptp_tx_work(). In the timeout logic, the TSYNCTXCTL's TXTT bit (Transmit timestamp valid bit) is not cleared, causing no new timestamp loaded to TXSTMP register. Consequently therefore, no new interrupt is triggerred by TSICR.TXTS bit and no more Tx timestamp send to the socket. Signed-off-by: Daniel Hua <daniel.hua@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26igb: Allow to remove administratively set MAC on VFsCorinna Vinschen
[ Upstream commit 177132df5e45b134c147f419f567a3b56aafaf2b ] Before libvirt modifies the MAC address and vlan tag for an SRIOV VF for use by a virtual machine (either using vfio device assignment or macvtap passthru mode), it saves the current MAC address and vlan tag so that it can reset them to their original value when the guest is done. Libvirt can't leave the VF MAC set to the value used by the now-defunct guest since it may be started again later using a different VF, but it certainly shouldn't just pick any random value, either. So it saves the state of everything prior to using the VF, and resets it to that. The igb driver initializes the MAC addresses of all VFs to 00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK netlink message, also visible in the list of VFs in the output of "ip link show"). But when libvirt attempts to restore the MAC address back to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel responds with "Invalid argument". Forbidding a reset back to the original value leaves the VF MAC at the value set for the now-defunct virtual machine. Especially on a system with NetworkManager enabled, this has very bad consequences, since NetworkManager forces all interfacess to be IFF_UP all the time - if the same virtual machine is restarted using a different VF (or even on a different host), there will be multiple interfaces watching for traffic with the same MAC address. To allow libvirt to revert to the original state, we need a way to remove the administrative set MAC on a VF, to allow normal host operation again, and to reset/overwrite the VF MAC via VF netdev. This patch implements the outlined scenario by allowing to set the VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF. igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0, so it's possible to reset the VF MAC back to the original value via the VF netdev. Note: Recent patches to libvirt allow for a workaround if the NIC isn't capable of resetting the administrative MAC back to all 0, but in theory the NIC should allow resetting the MAC in the first place. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Tested-by: Aaron Brown <arron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03igb: Free IRQs when device is hotpluggedLyude Paul
commit 888f22931478a05bc81ceb7295c626e1292bf0ed upstream. Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon hotplugging my kernel would immediately crash due to igb: [ 680.825801] kernel BUG at drivers/pci/msi.c:352! [ 680.828388] invalid opcode: 0000 [#1] SMP [ 680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb] [ 680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G O 4.15.0-rc3Lyude-Test+ #6 [ 680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017 [ 680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0 [ 680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286 [ 680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c [ 680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178 [ 680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00 [ 680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298 [ 680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000 [ 680.836332] FS: 0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000 [ 680.836817] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0 [ 680.837954] Call Trace: [ 680.838853] pci_disable_msix+0xce/0xf0 [ 680.839616] igb_reset_interrupt_capability+0x5d/0x60 [igb] [ 680.840278] igb_remove+0x9d/0x110 [igb] [ 680.840764] pci_device_remove+0x36/0xb0 [ 680.841279] device_release_driver_internal+0x157/0x220 [ 680.841739] pci_stop_bus_device+0x7d/0xa0 [ 680.842255] pci_stop_bus_device+0x2b/0xa0 [ 680.842722] pci_stop_bus_device+0x3d/0xa0 [ 680.843189] pci_stop_and_remove_bus_device+0xe/0x20 [ 680.843627] trim_stale_devices+0xf3/0x140 [ 680.844086] trim_stale_devices+0x94/0x140 [ 680.844532] trim_stale_devices+0xa6/0x140 [ 680.845031] ? get_slot_status+0x90/0xc0 [ 680.845536] acpiphp_check_bridge.part.5+0xfe/0x140 [ 680.846021] acpiphp_hotplug_notify+0x175/0x200 [ 680.846581] ? free_bridge+0x100/0x100 [ 680.847113] acpi_device_hotplug+0x8a/0x490 [ 680.847535] acpi_hotplug_work_fn+0x1a/0x30 [ 680.848076] process_one_work+0x182/0x3a0 [ 680.848543] worker_thread+0x2e/0x380 [ 680.848963] ? process_one_work+0x3a0/0x3a0 [ 680.849373] kthread+0x111/0x130 [ 680.849776] ? kthread_create_worker_on_cpu+0x50/0x50 [ 680.850188] ret_from_fork+0x1f/0x30 [ 680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b [ 680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0 As it turns out, normally the freeing of IRQs that would fix this is called inside of the scope of __igb_close(). However, since the device is already gone by the point we try to unregister the netdevice from the driver due to a hotplug we end up seeing that the netif isn't present and thus, forget to free any of the device IRQs. So: make sure that if we're in the process of dismantling the netdev, we always allow __igb_close() to be called so that IRQs may be freed normally. Additionally, only allow igb_close() to be called from __igb_close() if it hasn't already been called for the given adapter. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach") Cc: Todd Fujinaka <todd.fujinaka@intel.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-25igb: check memory allocation failureChristophe JAILLET
[ Upstream commit 18eb86362a52f0af933cc0fd5e37027317eb2d1c ] Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30igb: Use smp_rmb rather than read_barrier_dependsBrian King
commit c4cb99185b4cc96c0a1c70104dc21ae14d7e7f28 upstream. The original issue being fixed in this patch was seen with the ixgbe driver, but the same issue exists with igb as well, as the code is very similar. read_barrier_depends is not sufficient to ensure loads following it are not speculatively loaded out of order by the CPU, which can result in stale data being loaded, causing potential system crashes. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-26igb: Fix TX map failure pathJean-Philippe Brucker
When the driver cannot map a TX buffer, instead of rolling back gracefully and retrying later, we currently get a panic: [ 159.885994] igb 0000:00:00.0: TX DMA map failed [ 159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8 ... [ 159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8 Fix the erroneous test that leads to this situation. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: support BCM54616 PHYJohn W Linville
The management port on an Edgecore AS7712-32 switch uses an igb MAC, but it uses a BCM54616 PHY. Without a patch like this, loading the igb module produces dmesg output like this: [ 3.439125] igb: Copyright (c) 2007-2014 Intel Corporation. [ 3.439866] igb: probe of 0000:00:14.0 failed with error -2 Signed-off-by: John W Linville <linville@tuxdriver.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: do not drop PF mailbox lock after read of VF messageGreg Edwards
When the PF receives a mailbox message from the VF, it grabs the mailbox lock, reads the VF message from the mailbox, ACKs the message and drops the lock. While the PF is performing the action for the VF message, nothing prevents another VF message from being posted to the mailbox. The current code handles this condition by just dropping any new VF messages without processing them. This results in a mailbox timeout in the VM for posted messages waiting for an ACK, and the VF is reset by the igbvf_watchdog_task in the VM. Given the right sequence of VF messages and mailbox timeouts, this condition can go on ad infinitum. Modify the PF mailbox read method to take an 'unlock' argument that optionally leaves the mailbox locked by the PF after reading the VF message. This ensures another VF message is not posted to the mailbox until after the PF has completed processing the VF message and written its reply. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: expose mailbox unlock methodGreg Edwards
Add a mailbox unlock method to e1000_mbx_operations, which will be used to unlock the PF/VF mailbox by the PF. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: add argument names to mailbox op function declarationsGreg Edwards
Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: Remove incorrect "unexpected SYS WRAP" log messageCorinna Vinschen
TSAUXC.DisableSystime is never set, so SYSTIM runs into a SYS WRAP every 1100 secs on 80580/i350/i354 (40 bit SYSTIM) and every 35000 secs on 80576 (45 bit SYSTIM). This wrap event sets the TSICR.SysWrap bit unconditionally. However, checking TSIM at interrupt time shows that this event does not actually cause the interrupt. Rather, it's just bycatch while the actual interrupt is caused by, for instance, TSICR.TXTS. The conclusion is that the SYS WRAP is actually expected, so the "unexpected SYS WRAP" message is entirely bogus and just helps to confuse users. Drop it. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: protect TX timestamping from API misuseCliff Spradlin
HW timestamping can only be requested for a packet if the NIC is first setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the igb driver still allowed TX packets to request HW timestamping. In this situation, the _IGB_PTP_TX_IN_PROGRESS flag was set and would never clear. This prevented any future HW timestamping requests to succeed. Fix this by checking that the NIC is configured for HW TX timestamping before accepting a HW TX timestamping request. Signed-off-by: Cliff Spradlin <cspradlin@google.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-08igb: Fix error of RX network flow classificationGangfeng Huang
After add an ethertype filter, if user change the adapter speed several times, the error "ethtool -N: etype filters are all used" is reported by igb driver. In older patch, function igb_nfc_filter_exit() and igb_nfc_filter_restore() is not paried. igb_nfc_filter_restore() exist in igb_up(), but function igb_nfc_filter_exit() is exist in __igb_close(). In the process of speed changing, only igb_nfc_filter_restore() is called, it will take a position of ethertype bitmap. Reproduce steps: Step 1: Add a etype filter by ethtool $ethtool -N eth0 flow-type ether proto 0x88F8 action 1 Step 2: Change the adapter speed to 100M/full duplex $ethtool -s eth0 speed 100 duplex full Step 3: Change the adapter speed to 1000M/full duplex ethtool -s eth0 speed 1000 duplex full Repeat step2 and step3, then dmesg the system log, you can find the error message, add new ethtype filter is also failed. This fixing is move igb_nfc_filter_exit() from __igb_close() to igb_down() to make igb_nfc_filter_restore()/igb_nfc_filter_exit() is paired. Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-07igb: make a few local functions staticColin Ian King
Clean up a few sparse warnings, these following functions can be made static: drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_add_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_del_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_set_vf_mac_filter' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: Remove useless argumentBenjamin Poirier
Given that all callers of igb_update_stats() pass the same two arguments: (adapter, &adapter->stats64), the second argument can be removed. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: check for Tx timestamp timeouts during watchdogJacob Keller
The igb driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an igb_ptp_tx_hang() function similar to the already existing igb_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: add statistic indicating number of skipped Tx timestampsJacob Keller
The igb driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: avoid permanent lock of *_PTP_TX_IN_PROGRESSJacob Keller
The igb driver uses a state bit lock to avoid handling more than one Tx timestamp request at once. This is required because hardware is limited to a single set of registers for Tx timestamps. The state bit lock is not properly cleaned up during igb_xmit_frame_ring() if the transmit fails such as due to DMA or TSO failure. In some hardware this results in blocking timestamps until the service task times out. In other hardware this results in a permanent lock of the timestamp bit because we never receive an interrupt indicating the timestamp occurred, since indeed the packet was never transmitted. Fix this by checking for DMA and TSO errors in igb_xmit_frame_ring() and properly cleaning up after ourselves when these occur. Reported-by: Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: fix race condition with PTP_TX_IN_PROGRESS bitsJacob Keller
Hardware related to the igb driver has a limitation of only handling one Tx timestamp at a time. Thus, the driver uses a state bit lock to enforce that only one timestamp request is honored at a time. Unfortunately this suffers from a simple race condition. The bit lock is not cleared until after skb_tstamp_tx() is called notifying the stack of a new Tx timestamp. Even a well behaved application which sends only one timestamp request at once and waits for a response might wake up and send a new packet before the bit lock is cleared. This results in needlessly dropping some Tx timestamp requests. We can fix this by unlocking the state bit as soon as we read the Timestamp register, as this is the first point at which it is safe to unlock. To avoid issues with the skb pointer, we'll use a copy of the pointer and set the global variable in the driver structure to NULL first. This ensures that the next timestamp request does not modify our local copy of the skb pointer. This ensures that well behaved applications do not accidentally race with the unlock bit. Obviously an application which sends multiple Tx timestamp requests at once will still only timestamp one packet at a time. Unfortunately there is nothing we can do about this. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: mark PM functions as __maybe_unusedArnd Bergmann
The new wake function is only used by the suspend/resume handlers that are defined in inside of an #ifdef, which can cause this harmless warning: drivers/net/ethernet/intel/igb/igb_main.c:7988:13: warning: 'igb_deliver_wake_packet' defined but not used [-Wunused-function] Removing the #ifdef, instead using a __maybe_unused annotation simplifies the code and avoids the warning. Fixes: b90fa8763560 ("igb: Enable reading of wake up packet") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: Explicitly select page 0 at initializationMatwey V Kornilov
The functions igb_read_phy_reg_gs40g/igb_write_phy_reg_gs40g (which were removed in 2a3cdea) explicitly selected the required page at every phy_reg access. Currently, igb_get_phy_id_82575 relays on the fact that page 0 is already selected. The assumption is not fulfilled for my Lex 3I380CW motherboard with integrated dual i211 based gigabit ethernet. This leads to igb initialization failure and network interfaces are not working: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k igb: Copyright (c) 2007-2014 Intel Corporation. igb: probe of 0000:01:00.0 failed with error -2 igb: probe of 0000:02:00.0 failed with error -2 In order to fix it, we explicitly select page 0 before first access to phy registers. See also: https://bugzilla.suse.com/show_bug.cgi?id=1009911 See also: http://www.lex.com.tw/products/pdf/3I380A&3I380CW.pdf Fixes: 2a3cdea ("igb: Remove GS40G specific defines/functions") Cc: <stable@vger.kernel.org> # 4.5+ Signed-off-by: Matwey V Kornilov <matwey@sai.msu.ru> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-05-21net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALLMiroslav Lichvar
Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid filter and update drivers which can timestamp all packets, or which explicitly list unsupported filters instead of using a default case, to handle the filter. CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08scripts/spelling.txt: add regsiter -> register spelling mistakeStephen Boyd
This typo is quite common. Fix it and add it to the spelling file so that checkpatch catches it earlier. Link: http://lkml.kernel.org/r/20170317011131.6881-2-sboyd@codeaurora.org Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-20igb: Enable reading of wake up packetKim Tatt Chuah
Currently, in igb_resume(), igb driver ignores the Wake Up Status (WUS) and Wake Up Packet Memory (WUPM) registers. This patch enables the igb driver to read the WUPM if the controller was woken by a wake up packet that is not more than 128 bytes long (maximum WUPM size), then pass it up the kernel network stack. Signed-off-by: Kim Tatt Chuah <kim.tatt.chuah@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-20igb/igbvf: Add VF MAC filter request capabilitiesYury Kylulin
Add functionality for the VF to request up to 3 additional MAC filters. This is done using existing E1000_VF_SET_MAC_ADDR message, but with additional message info - E1000_VF_MAC_FILTER_CLR to clear all unicast MAC filters previously set for this VF and E1000_VF_MAC_FILTER_ADD to add MAC filter. Additional filters can be added only in case if administrator did not set VF MAC explicitly and allowed to change default MAC to the VF. Due to the limited number of RAR entries reserve at least 3 MAC filters for the PF. If SRIOV is supported by the NIC after this change RAR entries starting from 1 to (RAR MAX ENTRIES - NUM SRIOV VFS) will be used for PF and VF MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-20igb: improve MAC filter handlingYury Kylulin
Using the work which was done for ixgbe driver by Jacob Keller commit 5d7daa35b9eb ("ixgbe: improve mac filter handling") and Alexander Duyck commit 0f079d22834a ("ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses") and out-of-tree igb driver add functionality to manage (add and delete) MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-21igb: use new API ethtool_{get|set}_link_ksettingsPhilippe Reynes
The ethtool API {get|set}_settings is deprecated. We move this driver to new API {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb/ixgbe: Fix typo in igb_build_skb and/or ixgbe_build_skb code commentAlexander Duyck
There was a typo that I had left in the code comments for the igb and ixgbe functions that enabled build_skb support. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Re-add support for build_skb in igbAlexander Duyck
This reverts commit f9d40f6a9921 ("igb: Revert support for build_skb in igb") and adds a few changes to update it to work with the latest version of igb. We are now able to revert the removal of this due to the fact that with the recent changes to the page count and the use of DMA_ATTR_SKIP_CPU_SYNC we can make the pages writable so we should not be invalidating the additional data added when we call build_skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Break out Rx buffer page managementAlexander Duyck
At this point we have 2 to 3 paths that can be taken depending on what Rx modes are enabled. In order to better support that and improve the maintainability I am breaking out the common bits from those paths and making them into their own functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Add support for padding packetAlexander Duyck
With the size of the frame limited we can now write to an offset within the buffer instead of having to write at the very start of the buffer. The advantage to this is that it allows us to leave padding room for things like supporting XDP in the future. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Add support for using order 1 pages to receive large framesAlexander Duyck
This patch adds support for using 3K buffers in order 1 pages the same way we were using 2K buffers in 4K pages. We are reserving 1K of room for now to have space available for future headroom and tailroom when we enable build_skb support. One side effect of this patch is that we can end up using a larger buffer if jumbo frames is enabled. The impact shouldn't be too great, but it could hurt small packet performance for UDP workloads if jumbo frames is enabled as the truesize of frames will be larger. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Add support for ethtool private flag to allow use of legacy RxAlexander Duyck
Since there are potential drawbacks to the new Rx allocation approach I thought it best to add a "chicken bit" so that we can turn the feature off if in the event that a problem is found. It also provides a means of validating the legacy Rx path in the event that we are forced to fall back. At some point in the future when we are convinced we don't need it anymore we might be able to drop the legacy-rx flag. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Use page_address offset from page instead of masking virtual addressAlexander Duyck
Update the handling of page addresses so that we always refer to them using a void pointer, and try to use the consistent name of va indicating we are working with a virtual address. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Only sync size of expected frame in ethtool testingAlexander Duyck
We only need to sync the size of the frame that is read to test. We don't need to sync the entire Rx buffer. This way the testing is more consistent with how we handle things in the receive path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Limit maximum frame Rx based on MTUAlexander Duyck
In order to support the use of build_skb going forward it will be necessary to place a maximum limit on the amount of data we can receive when jumbo frames is not enabled. In order to do this I am adding a new upper limit for receive based on the size of a 2K buffer minus padding. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Don't bother clearing Tx buffer_info in igb_clean_tx_ringAlexander Duyck
In the case of the Tx rings we need to only clear the Tx buffer_info when we are resetting the rings. Ideally we do this when we configure the ring to bring it back up instead of when we are taking it down in order to avoid dirtying pages we don't need to. In addition we don't need to clear the Tx descriptor ring since we will fully repopulate it when we begin transmitting frames and next_to_watch can be cleared to prevent the ring from being cleaned beyond that point instead of needing to touch anything in the Tx descriptor ring. Finally with these changes we can avoid having to reset the skb member of the Tx buffer_info structure in the cleanup path since the skb will always be associated with the first buffer which has next_to_watch set. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Clear Rx buffer_info in configure instead of cleanAlexander Duyck
This change makes it so that instead of going through the entire ring on Rx cleanup we only go through the region that was designated to be cleaned up and stop when we reach the region where new allocations should start. In addition we can avoid having to perform a memset on the Rx buffer_info structures until we are about to start using the ring again. By deferring this we can avoid dirtying the cache any more than we have to which can help to improve the time needed to bring the interface down and then back up again in a reset or suspend/resume cycle. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>