aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/hyperv/netvsc.c
AgeCommit message (Collapse)Author
2017-09-11hv_netvsc: fix deadlock on hotplugStephen Hemminger
When a virtual device is added dynamically (via host console), then the vmbus sends an offer message for the primary channel. The processing of this message for networking causes the network device to then initialize the sub channels. The problem is that setting up the sub channels needs to wait until the subsequent subchannel offers have been processed. These offers come in on the same ring buffer and work queue as where the primary offer is being processed; leading to a deadlock. This did not happen in older kernels, because the sub channel waiting logic was broken (it wasn't really waiting). The solution is to do the sub channel setup in its own work queue context that is scheduled by the primary channel setup; and then happens later. Fixes: 732e49850c5e ("netvsc: fix race on sub channel creation") Reported-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16vmbus: remove unused vmbus_sendpacket_ctlstephen hemminger
The only usage of vmbus_sendpacket_ctl was by vmbus_sendpacket. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16vmbus: remove unused vmubs_sendpacket_pagebuffer_ctlstephen hemminger
The function vmbus_sendpacket_pagebuffer_ctl was never used directly. Just have vmbus_send_pagebuffer Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: keep track of some non-fatal overload conditionsstephen hemminger
Add ethtool statistics for case where send chimmeny buffer is exhausted and driver has to fall back to doing scatter/gather send. Also, add statistic for case where ring buffer is full and receive completions are delayed. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: allow controlling send/recv buffer sizestephen hemminger
Control the size of the buffer areas via ethtool ring settings. They aren't really traditional hardware rings, but host API breaks receive and send buffer into chunks. The final size of the chunks are controlled by the host. The default value of send and receive buffer area for host DMA is much larger than it needs to be. Experimentation shows that 4M receive and 1M send is sufficient. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: no need to allocate send/receive on numa nodestephen hemminger
The send and receive buffers are both per-device (not per-channel). The associated NUMA node is a property of the CPU which is per-channel therefore it makes no sense to force the receive/send buffer to be allocated on a particular node (since it is a shared resource). Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: don't signal host twice if emptystephen hemminger
When hv_pkt_iter_next() returns NULL, it has already called hv_pkt_iter_close(). Calling it twice can lead to extra host signal. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The UDP offload conflict is dealt with by simply taking what is in net-next where we have removed all of the UFO handling code entirely. The TCP conflict was a case of local variables in a function being removed from both net and net-next. In netvsc we had an assignment right next to where a missing set of u64 stats sync object inits were added. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08netvsc: make sure and unregister datapathstephen hemminger
Go back to switching datapath directly in the notifier callback. Otherwise datapath might not get switched on unregister. No need for calling the NOTIFY_PEERS notifier since that is only for a gratitious ARP/ND packet; but that is not required with Hyper-V because both VF and synthetic NIC have the same MAC address. Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Fixes: 0c195567a8f6 ("netvsc: transparent VF management") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06netvsc: fix race on sub channel creationstephen hemminger
The existing sub channel code did not wait for all the sub-channels to completely initialize. This could lead to race causing crash in napi_netif_del() from bad list. The existing code would send an init message, then wait only for the initial response that the init message was received. It thought it was waiting for sub channels but really the init response did the wakeup. The new code keeps track of the number of open channels and waits until that many are open. Other issues here were: * host might return less sub-channels than was requested. * the new init status is not valid until after init was completed. Fixes: b3e6b82a0099 ("hv_netvsc: Wait for sub-channels to be processed during probe") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-02hyperv: netvsc: Neaten netvsc_send_pkt by using a temporaryJoe Perches
Repeated dereference of nvmsg.msg.v1_msg.send_rndis_pkt can be shortened by using a temporary. Do so. No change in object code. Miscellanea: o Use * const for rpkt and nvchan Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01netvsc: Initialize 64-bit stats seqcountFlorian Fainelli
On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. In commit 6c80f3fc2398 ("netvsc: report per-channel stats in ethtool statistics") netdev_alloc_pcpu_stats() was removed in favor of open-coding the 64-bits statistics, except that u64_stats_init() was missed. Fixes: 6c80f3fc2398 ("netvsc: report per-channel stats in ethtool statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29netvsc: signal host if receive ring is emptiedstephen hemminger
Latency improvement related to NAPI conversion. If all packets are processed from receive ring then need to signal host. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29netvsc: fix error unwind on device setup failurestephen hemminger
If setting receive buffer fails, the error unwind would cause kernel panic because it was not correctly doing RCU and NAPI unwind. RCU'd pointer needs to be reset to NULL, and NAPI needs to be disabled not deleted. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29netvsc: optimize receive completionsstephen hemminger
Optimize how receive completion ring are managed. * Allocate only as many slots as needed for all buffers from host * Allocate before setting up sub channel for better error detection * Don't need to keep copy of initial receive section message * Precompute the watermark for when receive flushing is needed * Replace division with conditional test * Replace atomic per-device variable with per-channel check. * Handle corner case where receive completion send fails if ring buffer to host is full. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29netvsc: remove unnecessary indirection of page_bufferstephen hemminger
The internal API was passing struct hv_page_buffer ** when only simple struct hv_page_buffer * was necessary for passing an array. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29netvsc: don't print pointer value in error messagestephen hemminger
Using %p to print pointer to packet meta-data doesn't give any good info, and exposes kernel memory offsets. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29netvsc: fix warnings reported by lockdepstephen hemminger
This includes a bunch of fixups for issues reported by lockdep. * ethtool routines can assume RTNL * send is done with RCU lock (and BH disable) * avoid refetching internal device struct (netvsc) instead pass it as a parameter. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24netvsc: prefetch the first incoming ring elementstephen hemminger
In interrupt handler, prefetch the first incoming ring element so that it is in cache by the time NAPI poll gets to it. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19netvsc: add rtnl annotations in rndisstephen hemminger
The rndis functions are used when changing device state. Therefore the references from network device to internal state are protected by RTNL mutex. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19netvsc: save pointer to parent netvsc_device in channel tablestephen hemminger
Keep back pointer in the per-channel data structure to avoid any possible RCU related issues when napi poll is called but netvsc_device is in RCU limbo. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19netvsc: need rcu_derefence when accessing internal device infostephen hemminger
The netvsc_device structure should be accessed by rcu_dereference in the send path. Change arguments to netvsc_send() to make this easier to do correctly. Remove no longer needed hv_device_to_netvsc_device. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19netvsc: use ERR_PTR to avoid dereference issuesstephen hemminger
The rndis_filter_device_add function is called both in probe context and RTNL context,and creates the netvsc_device inner structure. It is easier to get the RTNL lock annotation correct if it returns the object directly, rather than implicitly by updating network device private data. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19netvsc: add some rtnl_dereference annotationsstephen hemminger
In a couple places RTNL is held, and the netvsc_device pointer is acquired without annotation. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22hv_netvsc: Fix the carrier state error when data path is offHaiyang Zhang
When the VF NIC is opened, the synthetic NIC's carrier state is set to off. This tells the host to transitions data path to the VF device. But if startup script or user manipulates the admin state of the netvsc device directly for example: # ifconfig eth0 down # ifconfig eth0 up Then the carrier state of the synthetic NIC would be on, even though the data path was still over the VF NIC. This patch sets the carrier state of synthetic NIC with consideration of the related VF state. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09netvsc: fold in get_outbound_net_devicestephen hemminger
No longer need common code to find get_outbound_net_device. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09netvsc: pass net_device to netvsc_init_buf and netvsc_connect_vspstephen hemminger
Don't need to find netvsc_device structure, caller already had it. Also rearrange declarations. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09netvsc: mark error cases as unlikelystephen hemminger
Mark if() statements used for error handling only as unlikely() Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-04netvsc: make sure napi enabled before vmbus_openstephen hemminger
This fixes a race where vmbus callback for new packet arriving could occur before NAPI is initialized. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25netvsc: fix calculation of available send sectionsstephen hemminger
My change (introduced in 4.11) to use find_first_clear_bit incorrectly assumed that the size argument was words, not bits. The effect was only a small limited number of the available send sections were being actually used. This can cause performance loss with some workloads. Since map_words is now used only during initialization, it can be on stack instead of in per-device data. Fixes: b58a185801da ("netvsc: simplify get next send section") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21netvsc: fix use after free on module removalstephen hemminger
The NAPI data structure is embedded in the netvsc_device structure and is freed when device is closed. There is still a reference (in NAPI list) to this which causes a crash in netif_napi_del when device is removed. Fix by managing NAPI instances correctly. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21netvsc: Deal with rescinded channels correctlyK. Y. Srinivasan
We will not be able to send packets over a channel that has been rescinded. Make necessary adjustments so we can properly cleanup even when the channel is rescinded. This issue can be trigerred in the NIC hot-remove path. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09netvsc: use napi_consume_skbstephen hemminger
This allows using deferred skb freeing and with NAPI. And get buffer recycling. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08netvsc: Initialize all channel related state prior to opening the channelK. Y. Srinivasan
Prior to opening the channel we should have all the state setup to handle interrupts. The current code does not do that; fix the bug. This bug can result in faults in the interrupt path. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22netvsc: eliminate unnecessary skb == NULL checksstephen hemminger
Since there already is a special case goto for control messages (skb == NULL) in netvsc_send, there is no need for later checks in same code path. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22netvsc: uses RCU instead of removal flagstephen hemminger
It is cleaner to use RCU protected pointer (nvdev_ctx->nvdev) to indicate device is in removed state, rather than having a separate boolean flag. By using the pointer the context can be checked by static checkers and dynamic lockdep. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22netvsc: use RCU to protect inner device structurestephen hemminger
The netvsc driver has an internal structure (netvsc_device) which is created when device is opened and released when device is closed. And also opened/released when MTU or number of channels change. Since this is referenced in the receive and transmit path, it is safer to use RCU to protect/prevent use after free problems. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22netvsc: fix NAPI performance regressionstephen hemminger
When using NAPI, the single stream performance declined signifcantly because the poll routine was updating host after every burst of packets. This excess signalling caused host throttling. This fix restores the old behavior. Host is only signalled after the ring has been emptied. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-16netvsc: add comments about callback's and NAPIstephen hemminger
Add some short description of how callback's and NAPI interoperate. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-16netvsc: avoid race with callbackstephen hemminger
Change the argument to channel callback from the channel pointer to the internal data structure containing per-channel info. This avoids any possible races when callback happens during initialization and makes IRQ code simpler. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-16netvsc: fix race during initializationstephen hemminger
When device is being setup on boot, there is a small race where network device callback is registered, but the netvsc_device pointer is not set yet. This can cause a NULL ptr dereference if packet arrives during this window. Fixes: 46b4f7f5d1f7 ("netvsc: eliminate per-device outstanding send counter") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/broadcom/genet/bcmgenet.c net/core/sock.c Conflicts were overlapping changes in bcmgenet and the lockdep handling of sockets. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-12netvsc: fix hang on netvsc module removalstephen hemminger
The code in netvsc_device_remove was incorrectly calling napi_disable repeatedly on the same element. This would cause attempts to remove netvsc module to hang. Fixes: 2506b1dc4bbe ("netvsc: implement NAPI") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-12netvsc: need napi scheduled during removalstephen hemminger
Since rndis_halt_device waits until all outstanding sends and receives are completed. Netvsc device needs to still schedule NAPI to see those completions. Fixes: 2506b1dc4bbe ("netvsc: implement NAPI") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-12netvsc: handle select_queue when device is being removedstephen hemminger
Move the send indirection table from the inner device (netvsc) to the network device context. It is possible that netvsc_device is not present (remove in progress). This solves potential use after free issues when packet is being created during MTU change, shutdown, or queue count changes. Fixes: d8e18ee0fa96 ("netvsc: enhance transmit select_queue") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-06netvsc: implement NAPIstephen hemminger
Use NAPI (softirq), to handle receive packets and send completions. Previously this was handled by tasklet. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-06vmbus: introduce in-place packet iteratorstephen hemminger
This is mostly just a refactoring of previous functions (get_pkt_next_raw, put_pkt_raw and commit_rd_index) to make it easier to use for other drivers and NAPI. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-06netvsc: don't overload variable in same functionstephen hemminger
There are two variables named packet in the same function. One is the metadata descriptor from host (vmpacket_descriptor) and the other is the control block in the skb used to hold metadata from send. Change name to avoid possible confusion and bugs. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-22Merge tag 'char-misc-4.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver patchset for 4.11-rc1. Lots of different driver subsystems updated here: rework for the hyperv subsystem to handle new platforms better, mei and w1 and extcon driver updates, as well as a number of other "minor" driver updates. All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (169 commits) goldfish: Sanitize the broken interrupt handler x86/platform/goldfish: Prevent unconditional loading vmbus: replace modulus operation with subtraction vmbus: constify parameters where possible vmbus: expose hv_begin/end_read vmbus: remove conditional locking of vmbus_write vmbus: add direct isr callback mode vmbus: change to per channel tasklet vmbus: put related per-cpu variable together vmbus: callback is in softirq not workqueue binder: Add support for file-descriptor arrays binder: Add support for scatter-gather binder: Add extra size to allocator binder: Refactor binder_transact() binder: Support multiple /dev instances binder: Deal with contexts in debugfs binder: Support multiple context managers binder: Split flat_binder_object auxdisplay: ht16k33: remove private workqueue auxdisplay: ht16k33: rework input device initialization ...