aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/ibm
AgeCommit message (Collapse)Author
2022-04-15ibmvnic: define map_txpool_buf_to_ltb()Sukadev Bhattiprolu
Define a helper to map a given txpool buffer into its corresponding long term buffer (LTB) and offset. Currently there is just one LTB per txpool so the mapping is trivial. When we add support for multiple LTBs per txpool, this helper will be more useful. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15ibmvnic: define map_rxpool_buf_to_ltb()Sukadev Bhattiprolu
Define a helper to map a given rx pool buffer into its corresponding long term buffer (LTB) and offset. Currently there is just one LTB per pool so the mapping is trivial. When we add support for multiple LTBs per pool, this helper will be more useful. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15ibmvnic: rename local variable index to bufidxSukadev Bhattiprolu
The local variable 'index' is heavily used in some functions and is confusing with the presence of other "index" fields like pool->index, ->consumer_index, etc. Rename it to bufidx to better reflect that its the index of a buffer in the pool. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Merge in overtime fixes, no conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18ibmvnic: fix race between xmit and resetSukadev Bhattiprolu
There is a race between reset and the transmit paths that can lead to ibmvnic_xmit() accessing an scrq after it has been freed in the reset path. It can result in a crash like: Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0080000016189f8 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic] LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280 Call Trace: [c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable) [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280 [c000000000c9cfcc] sch_direct_xmit+0xec/0x330 [c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0 [c000000000c00ad4] __dev_queue_xmit+0x394/0x730 [c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding] [c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding] [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280 [c000000000c00ca4] __dev_queue_xmit+0x564/0x730 [c000000000cf97e0] neigh_hh_output+0xd0/0x180 [c000000000cfa69c] ip_finish_output2+0x31c/0x5c0 [c000000000cfd244] __ip_queue_xmit+0x194/0x4f0 [c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0 [c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0 [c000000000d2d984] tcp_retransmit_skb+0x34/0x130 [c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0 [c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330 [c000000000d317bc] tcp_write_timer+0x5c/0x200 [c000000000243270] call_timer_fn+0x50/0x1c0 [c000000000243704] __run_timers.part.0+0x324/0x460 [c000000000243894] run_timer_softirq+0x54/0xa0 [c000000000ea713c] __do_softirq+0x15c/0x3e0 [c000000000166258] __irq_exit_rcu+0x158/0x190 [c000000000166420] irq_exit+0x20/0x40 [c00000000002853c] timer_interrupt+0x14c/0x2b0 [c000000000009a00] decrementer_common_virt+0x210/0x220 --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c The immediate cause of the crash is the access of tx_scrq in the following snippet during a reset, where the tx_scrq can be either NULL or an address that will soon be invalid: ibmvnic_xmit() { ... tx_scrq = adapter->tx_scrq[queue_num]; txq = netdev_get_tx_queue(netdev, queue_num); ind_bufp = &tx_scrq->ind_buf; if (test_bit(0, &adapter->resetting)) { ... } But beyond that, the call to ibmvnic_xmit() itself is not safe during a reset and the reset path attempts to avoid this by stopping the queue in ibmvnic_cleanup(). However just after the queue was stopped, an in-flight ibmvnic_complete_tx() could have restarted the queue even as the reset is progressing. Since the queue was restarted we could get a call to ibmvnic_xmit() which can then access the bad tx_scrq (or other fields). We cannot however simply have ibmvnic_complete_tx() check the ->resetting bit and skip starting the queue. This can race at the "back-end" of a good reset which just restarted the queue but has not cleared the ->resetting bit yet. If we skip restarting the queue due to ->resetting being true, the queue would remain stopped indefinitely potentially leading to transmit timeouts. IOW ->resetting is too broad for this purpose. Instead use a new flag that indicates whether or not the queues are active. Only the open/ reset paths control when the queues are active. ibmvnic_complete_tx() and others wake up the queue only if the queue is marked active. So we will have: A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open() ->resetting = true ->tx_queues_active = false disable tx queues ... ->tx_queues_active = true start tx queues B. Tx interrupt in ibmvnic_complete_tx(): if (->tx_queues_active) netif_wake_subqueue(); To ensure that ->tx_queues_active and state of the queues are consistent, we need a lock which: - must also be taken in the interrupt path (ibmvnic_complete_tx()) - shared across the multiple queues in the adapter (so they don't become serialized) Use rcu_read_lock() and have the reset thread synchronize_rcu() after updating the ->tx_queues_active state. While here, consolidate a few boolean fields in ibmvnic_adapter for better alignment. Based on discussions with Brian King and Dany Madden. Fixes: 7ed5b31f4a66 ("net/ibmvnic: prevent more than one thread from running in reset") Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/batman-adv/hard-interface.c commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check") commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message") https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/ net/smc/af_smc.c commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails") commit 462791bbfa35 ("net/smc: add sysctl interface for SMC") https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25ibmvnic: Allow queueing resets during probeSukadev Bhattiprolu
We currently don't allow queuing resets when adapter is in VNIC_PROBING state - instead we throw away the reset and return EBUSY. The reasoning is probably that during ibmvnic_probe() the ibmvnic_adapter itself is being initialized so performing a reset during this time can lead us to accessing fields in the ibmvnic_adapter that are not fully initialized. A review of the code shows that all the adapter state neede to process a reset is initialized before registering the CRQ so that should no longer be a concern. Further the expectation is that if we do get a reset (transport event) during probe, the do..while() loop in ibmvnic_probe() will handle this by reinitializing the CRQ. While that is true to some extent, it is possible that the reset might occur _after_ the CRQ is registered and CRQ_INIT message was exchanged but _before_ the adapter state is set to VNIC_PROBED. As mentioned above, such a reset will be thrown away. While the client assumes that the adapter is functional, the vnic server will wait for the client to reinit the adapter. This disconnect between the two leaves the adapter down needing manual intervention. Because ibmvnic_probe() has other work to do after initializing the CRQ (such as registering the netdev at a minimum) and because the reset event can occur at any instant after the CRQ is initialized, there will always be a window between initializing the CRQ and considering the adapter ready for resets (ie state == PROBED). So rather than discarding resets during this window, allow queueing them - but only process them after the adapter is fully initialized. To do this, introduce a new completion state ->probe_done and have the reset worker thread wait on this before processing resets. This change brings up two new situations in or just after ibmvnic_probe(). First after one or more resets were queued, we encounter an error and decide to retry the initialization. At that point the queued resets are no longer relevant since we could be talking to a new vnic server. So we must purge/flush the queued resets before restarting the initialization. As a side note, since we are still in the probing stage and we have not registered the netdev, it will not be CHANGE_PARAM reset. Second this change opens up a potential race between the worker thread in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the following sequence of events: 1. Register CRQ 2. Get transport event before CRQ_INIT completes. 3. Tasklet schedules reset: a) add rwi to list b) schedule_work() to start worker thread which runs and waits for ->probe_done. 4. ibmvnic_probe() decides to retry, purges rwi_list 5. Re-register crq and this time rest of probe succeeds - register netdev and complete(->probe_done). 6. Worker thread resumes in __ibmvnic_reset() from 3b. 7. Worker thread sets ->resetting bit 8. ibmvnic_open() comes in, notices ->resetting bit, sets state to IBMVNIC_OPEN and returns early expecting worker thread to finish the open. 9. Worker thread finds rwi_list empty and returns without opening the interface. If this happens, the ->ndo_open() call is effectively lost and the interface remains down. To address this, ensure that ->rwi_list is not empty before setting the ->resetting bit. See also comments in __ibmvnic_reset(). Fixes: 6a2fb0e99f9c ("ibmvnic: driver initialization for kdump/kexec") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: clear fop when retrying probeSukadev Bhattiprolu
Clear ->failover_pending flag that may have been set in the previous pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open() call would be misled into thinking a failover is pending and assuming that the reset worker thread would open the adapter. If this pass of registering the CRQ succeeds (i.e there is no transport event), there wouldn't be a reset worker thread. This would leave the adapter unconfigured and require manual intervention to bring it up during boot. Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: init init_done_rc earlierSukadev Bhattiprolu
We currently initialize the ->init_done completion/return code fields before issuing a CRQ_INIT command. But if we get a transport event soon after registering the CRQ the taskslet may already have recorded the completion and error code. If we initialize here, we might overwrite/ lose that and end up issuing the CRQ_INIT only to timeout later. If that timeout happens during probe, we will leave the adapter in the DOWN state rather than retrying to register/init the CRQ. Initialize the completion before registering the CRQ so we don't lose the notification. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: register netdev after init of adapterSukadev Bhattiprolu
Finish initializing the adapter before registering netdev so state is consistent. Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: complete init_done on transport eventsSukadev Bhattiprolu
If we get a transport event, set the error and mark the init as complete so the attempt to send crq-init or login fail sooner rather than wait for the timeout. Fixes: bbd669a868bb ("ibmvnic: Fix completion structure initialization") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: define flush_reset_queue helperSukadev Bhattiprolu
Define and use a helper to flush the reset queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: initialize rc before completing waitSukadev Bhattiprolu
We should initialize ->init_done_rc before calling complete(). Otherwise the waiting thread may see ->init_done_rc as 0 before we have updated it and may assume that the CRQ was successful. Fixes: 6b278c0cb378 ("ibmvnic delay complete()") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: free reset-work-item when flushingSukadev Bhattiprolu
Fix a tiny memory leak when flushing the reset work queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
tools/testing/selftests/net/mptcp/mptcp_join.sh 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers") 857898eb4b28 ("selftests: mptcp: add missing join check") 6ef84b1517e0 ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions") c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") 09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") 84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr") efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") 3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22ibmvnic: schedule failover only if vioctl failsSukadev Bhattiprolu
If client is unable to initiate a failover reset via H_VIOCTL hcall, then it should schedule a failover reset as a last resort. Otherwise, there is no need to do a last resort. Fixes: 334c42414729 ("ibmvnic: improve failover sysfs entry") Reported-by: Cris Forno <cforno12@outlook.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220221210545.115283-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18net/ibmvnic: Cleanup workaround doing an EOI after partition migrationCédric Le Goater
There were a fair amount of changes to workaround a firmware bug leaving a pending interrupt after migration of the ibmvnic device : commit 2df5c60e198c ("net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode") commit 284f87d2f387 ("Revert "net/ibmvnic: Fix EOI when running in XIVE mode"") commit 11d49ce9f794 ("net/ibmvnic: Fix EOI when running in XIVE mode.") commit f23e0643cd0b ("ibmvnic: Clear pending interrupt after device reset") Here is the final one taking into account the XIVE interrupt mode. Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Cc: Dany Madden <drt@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-08ibmvnic: don't release napi in __ibmvnic_open()Sukadev Bhattiprolu
If __ibmvnic_open() encounters an error such as when setting link state, it calls release_resources() which frees the napi structures needlessly. Instead, have __ibmvnic_open() only clean up the work it did so far (i.e. disable napi and irqs) and leave the rest to the callers. If caller of __ibmvnic_open() is ibmvnic_open(), it should release the resources immediately. If the caller is do_reset() or do_hard_reset(), they will release the resources on the next reset. This fixes following crash that occurred when running the drmgr command several times to add/remove a vnic interface: [102056] ibmvnic 30000003 env3: Disabling rx_scrq[6] irq [102056] ibmvnic 30000003 env3: Disabling rx_scrq[7] irq [102056] ibmvnic 30000003 env3: Replenished 8 pools Kernel attempted to read user page (10) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000010 Faulting instruction address: 0xc000000000a3c840 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries ... CPU: 9 PID: 102056 Comm: kworker/9:2 Kdump: loaded Not tainted 5.16.0-rc5-autotest-g6441998e2e37 #1 Workqueue: events_long __ibmvnic_reset [ibmvnic] NIP: c000000000a3c840 LR: c0080000029b5378 CTR: c000000000a3c820 REGS: c0000000548e37e0 TRAP: 0300 Not tainted (5.16.0-rc5-autotest-g6441998e2e37) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28248484 XER: 00000004 CFAR: c0080000029bdd24 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 GPR00: c0080000029b55d0 c0000000548e3a80 c0000000028f0200 0000000000000000 ... NIP [c000000000a3c840] napi_enable+0x20/0xc0 LR [c0080000029b5378] __ibmvnic_open+0xf0/0x430 [ibmvnic] Call Trace: [c0000000548e3a80] [0000000000000006] 0x6 (unreliable) [c0000000548e3ab0] [c0080000029b55d0] __ibmvnic_open+0x348/0x430 [ibmvnic] [c0000000548e3b40] [c0080000029bcc28] __ibmvnic_reset+0x500/0xdf0 [ibmvnic] [c0000000548e3c60] [c000000000176228] process_one_work+0x288/0x570 [c0000000548e3d00] [c000000000176588] worker_thread+0x78/0x660 [c0000000548e3da0] [c0000000001822f0] kthread+0x1c0/0x1d0 [c0000000548e3e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7d2948f8 792307e0 4e800020 60000000 3c4c01eb 384239e0 f821ffd1 39430010 38a0fff6 e92d1100 f9210028 39200000 <e9030010> f9010020 60420000 e9210020 ---[ end trace 5f8033b08fd27706 ]--- Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220208001918.900602-1-sukadev@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24ibmvnic: remove unused ->wait_capabilitySukadev Bhattiprolu
With previous bug fix, ->wait_capability flag is no longer needed and can be removed. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: don't spin in taskletSukadev Bhattiprolu
ibmvnic_tasklet() continuously spins waiting for responses to all capability requests. It does this to avoid encountering an error during initialization of the vnic. However if there is a bug in the VIOS and we do not receive a response to one or more queries the tasklet ends up spinning continuously leading to hard lock ups. If we fail to receive a message from the VIOS it is reasonable to timeout the login attempt rather than spin indefinitely in the tasklet. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: init ->running_cap_crqs earlySukadev Bhattiprolu
We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should send out the next protocol message type. i.e when we get back responses to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs. Similiary, when we get responses to all the REQUEST_CAPABILITY crqs, we send out the QUERY_IP_OFFLOAD CRQ. We currently increment ->running_cap_crqs as we send out each CRQ and have the ibmvnic_tasklet() send out the next message type, when this running_cap_crqs count drops to 0. This assumes that all the CRQs of the current type were sent out before the count drops to 0. However it is possible that we send out say 6 CRQs, get preempted and receive all the 6 responses before we send out the remaining CRQs. This can result in ->running_cap_crqs count dropping to zero before all messages of the current type were sent and we end up sending the next protocol message too early. Instead initialize the ->running_cap_crqs upfront so the tasklet will only send the next protocol message after all responses are received. Use the cap_reqs local variable to also detect any discrepancy (either now or in future) in the number of capability requests we actually send. Currently only send_query_cap() is affected by this behavior (of sending next message early) since it is called from the worker thread (during reset) and from application thread (during ->ndo_open()) and they can be preempted. send_request_cap() is only called from the tasklet which processes CRQ responses sequentially, is not be affected. But to maintain the existing symmtery with send_query_capability() we update send_request_capability() also. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: Allow extra failures before disablingSukadev Bhattiprolu
If auto-priority-failover (APF) is enabled and there are at least two backing devices of different priorities, some resets like fail-over, change-param etc can cause at least two back to back failovers. (Failover from high priority backing device to lower priority one and then back to the higher priority one if that is still functional). Depending on the timimg of the two failovers it is possible to trigger a "hard" reset and for the hard reset to fail due to failovers. When this occurs, the driver assumes that the network is unstable and disables the VNIC for a 60-second "settling time". This in turn can cause the ethtool command to fail with "No such device" while the vnic automatically recovers a little while later. Given that it's possible to have two back to back failures, allow for extra failures before disabling the vnic for the settling time. Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06ethernet: ibmveth: use default_groups in kobj_typeGreg Kroah-Hartman
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ibmveth sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Cristobal Forno <cforno12@linux.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14ibmvnic: remove unused definesDany Madden
IBMVNIC_STATS_TIMEOUT and IBMVNIC_INIT_FAILED are not used in the driver. Remove them. Suggested-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14ibmvnic: Update driver return codesDany Madden
Update return codes to be more informative. Signed-off-by: Jacob Root <otis@otisroot.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02ibmvnic: drop bad optimization in reuse_tx_pools()Sukadev Bhattiprolu
When trying to decide whether or not reuse existing rx/tx pools we tried to allow a range of values for the pool parameters rather than exact matches. This was intended to reuse the resources for instance when switching between two VIO servers with different default parameters. But this optimization is incomplete and breaks when we try to change the number of queues for instance. The optimization needs to be updated, so drop it for now and simplify the code. Fixes: bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02ibmvnic: drop bad optimization in reuse_rx_pools()Sukadev Bhattiprolu
When trying to decide whether or not reuse existing rx/tx pools we tried to allow a range of values for the pool parameters rather than exact matches. This was intended to reuse the resources for instance when switching between two VIO servers with different default parameters. But this optimization is incomplete and breaks when we try to change the number of queues for instance. The optimization needs to be updated, so drop it for now and simplify the code. Fixes: 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22ethtool: extend ringparam setting/getting API with rx_buf_lenHao Chen
Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17net: annotate accesses to queue->trans_startEric Dumazet
In following patches, dev_watchdog() will no longer stop all queues. It will read queue->trans_start locklessly. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Merge in the fixes we had queued in case there was another -rc. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01ibmvnic: delay complete()Sukadev Bhattiprolu
If we get CRQ_INIT, we set errno to -EIO and first call complete() to notify the waiter. Then we try to schedule a FAILOVER reset. If this occurs while adapter is in PROBING state, ibmvnic_reset() changes the error code to EAGAIN and returns without scheduling the FAILOVER. The purpose of setting error code to EAGAIN is to ask the waiter to retry. But due to the earlier complete() call, the waiter may already have seen the -EIO response and decided not to retry. This can cause intermittent failures when bringing up ibmvnic adapters during boot, specially in in kexec/kdump kernels. Defer the complete() call until after scheduling the reset. Also streamline the error code to EAGAIN. Don't see why we need EIO sometimes. All 3 callers of ibmvnic_reset_init() can handle EAGAIN. Fixes: 17c8705838a5 ("ibmvnic: Return error code if init interrupted by transport event") Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01ibmvnic: Process crqs after enabling interruptsSukadev Bhattiprolu
Soon after registering a CRQ it is possible that we get a fail over or maybe a CRQ_INIT from the VIOS while interrupts were disabled. Look for any such CRQs after enabling interrupts. Otherwise we can intermittently fail to bring up ibmvnic adapters during boot, specially in kexec/kdump kernels. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01ibmvnic: don't stop queue in xmitSukadev Bhattiprolu
If adapter's resetting bit is on, discard the packet but don't stop the transmit queue - instead leave that to the reset code. With this change, it is possible that we may get several calls to ibmvnic_xmit() that simply discard packets and return. But if we stop the queue here, we might end up doing so just after __ibmvnic_open() started the queues (during a hard/soft reset) and before the ->resetting bit was cleared. If that happens, there will be no one to restart queue and transmissions will be blocked indefinitely. This can cause a TIMEOUT reset and with auto priority failover enabled, an unnecessary FAILOVER reset to less favored backing device and then a FAILOVER back to the most favored backing device. If we hit the window repeatedly, we can get stuck in a loop of TIMEOUT, FAILOVER, FAILOVER resets leaving the adapter unusable for extended periods of time. Fixes: 7f5b030830fe ("ibmvnic: Free skb's in cases of failure in transmit") Reported-by: Abdul Haleem <abdhalee@in.ibm.com> Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16ethernet: ibmveth: use ether_addr_to_u64()Jakub Kicinski
We'll want to make netdev->dev_addr const, remove the local helper which is missing a const qualifier on the argument and use ether_addr_to_u64(). Similar story to mlx4. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-14ethernet: ibm/emac: use of_get_ethdev_address() to load dev_addrJakub Kicinski
A straggler I somehow missed in the automated conversion in commit 9ca01b25dfff ("ethernet: use of_get_ethdev_address()"). Use the new helper instead of using netdev->dev_addr directly. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-05ethernet: use eth_hw_addr_set() for dev->addr_len casesJakub Kicinski
Convert all Ethernet drivers from memcpy(... dev->addr_len) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, dev->addr_len) + eth_hw_addr_set(dev, np) In theory addr_len may not be ETH_ALEN, but we don't expect non-Ethernet devices to live under this directory, and only the following cases of setting addr_len exist: - cxgb4 for mgmt device, and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05ethernet: ehea: add missing castJakub Kicinski
We need to cast the pointer, unlike memcpy() eth_hw_addr_set() does not take void *. The driver already casts &port->mac_addr to u8 * in other places. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: a96d317fb1a3 ("ethernet: use eth_hw_addr_set()") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02ethernet: use eth_hw_addr_set() instead of ether_addr_copy()Jakub Kicinski
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set(): @@ expression dev, np; @@ - ether_addr_copy(dev->dev_addr, np) + eth_hw_addr_set(dev, np) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02ethernet: use eth_hw_addr_set()Jakub Kicinski
Convert all Ethernet drivers from memcpy(... ETH_ADDR) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, ETH_ALEN) + eth_hw_addr_set(dev, np) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/phy/bcm7xxx.c d88fd1b546ff ("net: phy: bcm7xxx: Fixed indirect MMD operations") f68d08c437f9 ("net: phy: bcm7xxx: Add EPHY entry for 72165") net/sched/sch_api.c b193e15ac69d ("net: prevent user from passing illegal stab size") 69508d43334e ("net_sched: Use struct_size() and flex_array_size() helpers") Both cases trivial - adjacent code additions. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-27net: ethernet: emac: utilize of_net's of_get_mac_address()Christian Lamparter
of_get_mac_address() reads the same "local-mac-address" property. ... But goes above and beyond by checking the MAC value properly. printk+message seems outdated too, so let's put dev_err in the queue. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27ibmveth: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()Cai Huoqing
Replacing kmalloc/kfree/dma_map_single/dma_unmap_single() with dma_alloc_coherent/dma_free_coherent() helps to reduce code size, and simplify the code, and coherent DMA will not clear the cache every time. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27Revert "ibmvnic: check failover_pending in login response"Desnes A. Nunes do Rosario
This reverts commit d437f5aa23aa2b7bd07cd44b839d7546cc17166f. Code has been duplicated through commit <273c29e944bd> "ibmvnic: check failover_pending in login response" Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts! Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-15ibmvnic: Reuse tx pools when possibleSukadev Bhattiprolu
Rather than releasing the tx pools on every close and reallocating them on open, reuse the tx pools unless the pool parameters (number of pools, size of each pool or size of each buffer in a pool) have changed. If the pool parameters changed, then release the old pools (if any) and allocate new ones. Specifically release tx pools, if: - adapter is removed, - pool parameters change during reset, - we encounter an error when opening the adapter in response to a user request (in ibmvnic_open()). and don't release them: - in __ibmvnic_close() or - on errors in __ibmvnic_open() in the hope that we can reuse them during this or next reset. With these changes reset_tx_pools() can be dropped because its optimization is now included in init_tx_pools() itself. cleanup_tx_pools() releases all the skbs associated with the pool and is called from ibmvnic_cleanup(), which is called on every reset. Since we want to reuse skbs across resets, move cleanup_tx_pools() out of ibmvnic_cleanup() and call it only when user closes the adapter. Add two new adapter fields, ->prev_mtu, ->prev_tx_pool_size to track the previous values and use them to decide whether to reuse or realloc the pools. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Reuse rx pools when possibleSukadev Bhattiprolu
Rather than releasing the rx pools on and reallocating them on every reset, reuse the rx pools unless the pool parameters (number of pools, size of each pool or size of each buffer in a pool) have changed. If the pool parameters changed, then release the old pools (if any) and allocate new ones. Specifically release rx pools, if: - adapter is removed, - pool parameters change during reset, - we encounter an error when opening the adapter in response to a user request (in ibmvnic_open()). and don't release them: - in __ibmvnic_close() or - on errors in __ibmvnic_open() in the hope that we can reuse them on the next reset. With these, reset_rx_pools() can be dropped because its optimzation is now included in init_rx_pools() itself. cleanup_rx_pools() releases all the skbs associated with the pool and is called from ibmvnic_cleanup(), which is called on every reset. Since we want to reuse skbs across resets, move cleanup_rx_pools() out of ibmvnic_cleanup() and call it only when user closes the adapter. Add two new adapter fields, ->prev_rx_buf_sz, ->prev_rx_pool_size to keep track of the previous values and use them to decide whether to reuse or realloc the pools. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Reuse LTB when possibleSukadev Bhattiprolu
Reuse the long term buffer during a reset as long as its size has not changed. If the size has changed, free it and allocate a new one of the appropriate size. When we do this, alloc_long_term_buff() and reset_long_term_buff() become identical. Drop reset_long_term_buff(). Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Use bitmap for LTB map_idsSukadev Bhattiprolu
In a follow-on patch, we will reuse long term buffers when possible. When doing so we have to be careful to properly assign map ids. We can no longer assign them sequentially because a lower map id may be available and we could wrap at 255 and collide with an in-use map id. Instead, use a bitmap to track active map ids and to find a free map id. Don't need to take locks here since the map_id only changes during reset and at that time only the reset worker thread should be using the adapter. Noticed this when analyzing an error Dany Madden ran into with the patch set. Reported-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: init_tx_pools move loop-invariant codeSukadev Bhattiprolu
In init_tx_pools() move some loop-invariant code out of the loop. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>