summaryrefslogtreecommitdiffstats
path: root/include/uapi/linux
AgeCommit message (Collapse)Author
2015-06-18netfilter: don't pull include/linux/netfilter.h from netns headersPablo Neira Ayuso
This pulls the full hook netfilter definitions from all those that include net_namespace.h. Instead let's just include the bare minimum required in the new linux/netfilter_defs.h file, and use it from the netfilter netns header files. I also needed to include in.h and in6.h from linux/netfilter.h otherwise we hit this compilation error: In file included from include/linux/netfilter_defs.h:4:0, from include/net/netns/netfilter.h:4, from include/net/net_namespace.h:22, from include/linux/netdevice.h:43, from net/netfilter/nfnetlink_queue_core.c:23: include/uapi/linux/netfilter.h:76:17: error: field ‘in’ has incomplete type struct in_addr in; And also explicit include linux/netfilter.h in several spots. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2015-06-18[media] videodev2.h: fix copy-and-paste error in V4L2_MAP_XFER_FUNC_DEFAULTHans Verkuil
The colorspace argument was compared against a V4L2_XFER_FUNC define instead of against a V4L2_COLORSPACE define, returning the wrong answer. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-18netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flagHarout Hedeshian
xt_socket is useful for matching sockets with IP_TRANSPARENT and taking some action on the matching packets. However, it lacks the ability to match only a small subset of transparent sockets. Suppose there are 2 applications, each with its own set of transparent sockets. The first application wants all matching packets dropped, while the second application wants them forwarded somewhere else. Add the ability to retore the skb->mark from the sk_mark. The mark is only restored if a matching socket is found and the transparent / nowildcard conditions are satisfied. Now the 2 hypothetical applications can differentiate their sockets based on a mark value set with SO_MARK. iptables -t mangle -I PREROUTING -m socket --transparent \ --restore-skmark -j action iptables -t mangle -A action -m mark --mark 10 -j action2 iptables -t mangle -A action -m mark --mark 11 -j action3 Signed-off-by: Harout Hedeshian <harouth@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18netfilter: nfnetlink_queue: add security context informationRoman Kubiak
This patch adds an additional attribute when sending packet information via netlink in netfilter_queue module. It will send additional security context data, so that userspace applications can verify this context against their own security databases. Signed-off-by: Roman Kubiak <r.kubiak@samsung.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-17kvm: remove one useless check extensionTiejun Chen
We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD, kvm_vm_ioctl_check_extension_generic() | + switch (arg) { + ... + #ifdef CONFIG_HAVE_KVM_IRQFD + case KVM_CAP_IRQFD: + #endif + ... + return 1; + ... + } | + kvm_vm_ioctl_check_extension() So its not necessary to check this in arch again, and also fix one typo, s/emlation/emulation. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-06-15sock_diag: implement a get_info handler for inetCraig Gallek
This get_info handler will simply dispatch to the appropriate existing inet protocol handler. This patch also includes a new netlink attribute (INET_DIAG_PROTOCOL). This attribute is currently only used for multicast messages. Without this attribute, there is no way of knowing the IP protocol used by the socket information being broadcast. This attribute is not necessary in the 'dump' variant of this protocol (though it could easily be added) because dump requests are issued for specific family/protocol pairs. Tested: ss -E (note, the -E option has not yet been merged into the upstream version of ss). Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15sock_diag: define destruction multicast groupsCraig Gallek
These groups will contain socket-destruction events for AF_INET/AF_INET6, IPPROTO_TCP/IPPROTO_UDP. Near the end of socket destruction, a check for listeners is performed. In the presence of a listener, rather than completely cleanup the socket, a unit of work will be added to a private work queue which will first broadcast information about the socket and then finish the cleanup operation. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15net/core: Add reading VF statistics through the PF netdeviceEran Ben Elisha
Add ndo_get_vf_stats where the PF retrieves and fills the VFs traffic statistics. We encode the VF stats in a nested manner to allow for future extensions. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15Merge tag 'nfc-next-4.2-1' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz says: ==================== NFC 4.2 pull request This is the NFC pull request for 4.2. - NCI drivers can now define their own handlers for processing proprietary NCI responses and notifications. - NFC vendors can use a dedicated netlink API to send their own proprietary commands, like e.g. all commands needed to implement vendor specific manufacturing tools. - A new generic NCI over UART driver against which any NCI chipset running on top of a serial interface can register. - The st21nfcb driver is renamed to st-nci as it can and will support most of ST Microelectronics NCI chipsets. - The st21nfcb driver can put its CLF in hibernate mode and save significant amount of power. - A few st21nfcb minor fixes. - The NXP NCI driver now supports ACPI enumeration. - The Marvell NCI driver now supports both USB and serial physical interfaces. - The Marvell NCI drivers also supports NCI frames being muxed over HCI. This is a setting that can be defined by a DT property. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15bonding: export slave's partner_oper_port_state via sysfs and netlinkNikolay Aleksandrov
Export the partner_oper_port_state of each port via sysfs and netlink. In 802.3ad mode it is valuable for the user to be able to check the partner_oper state, it is already exported via bond's proc entry. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15bonding: export slave's actor_oper_port_state via sysfs and netlinkNikolay Aleksandrov
Export the actor_oper_port_state of each port via sysfs and netlink. In 802.3ad mode it is valuable for the user to be able to check the actor_oper state, it is already exported via bond's proc entry. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15bpf: introduce current->pid, tgid, uid, gid, comm accessorsAlexei Starovoitov
eBPF programs attached to kprobes need to filter based on current->pid, uid and other fields, so introduce helper functions: u64 bpf_get_current_pid_tgid(void) Return: current->tgid << 32 | current->pid u64 bpf_get_current_uid_gid(void) Return: current_gid << 32 | current_uid bpf_get_current_comm(char *buf, int size_of_buf) stores current->comm into buf They can be used from the programs attached to TC as well to classify packets based on current task fields. Update tracex2 example to print histogram of write syscalls for each process instead of aggregated for all. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15netfilter: nf_tables: attach net_device to basechainPablo Neira Ayuso
The device is part of the hook configuration, so instead of a global configuration per table, set it to each of the basechain that we create. This patch reworks ebddf1a8d78a ("netfilter: nf_tables: allow to bind table to net_device"). Note that this adds a dev_name field in the nft_base_chain structure which is required the netdev notification subscription that follows up in a patch to handle gone net_devices. Suggested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-14netfilter: ipset: Fix coding styles reported by checkpatch.plJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-11Merge tag 'samsung-mach-1' of ↵Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/soc Samsung updates for v4.2 - add failure(exception) handling : of_iomap(), of_find_device_by_node() and kstrdup() - add common poweroff to use PS_HOLD based for all of exynos SoCs - add exnos_get/set_boot_addr() helper - constify platform_device_id and irq_domain_ops - get current parent clock for power domain on/off - use core_initcall to register power domain driver - make exynos_core_restart() less verbose - add support coupled CPUidle for exynos3250 - fix exynos_boot_secondary() return value on timeout - fix clk_enable() in s3c24xx adc - fix missing of_node_put() for power domains * tag 'samsung-mach-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: (301 commits) ARM: EXYNOS: register power domain driver from core_initcall ARM: EXYNOS: use PS_HOLD based poweroff for all supported SoCs ARM: SAMSUNG: Constify platform_device_id ARM: EXYNOS: Constify irq_domain_ops ARM: EXYNOS: add coupled cpuidle support for Exynos3250 ARM: EXYNOS: add exynos_get_boot_addr() helper ARM: EXYNOS: add exynos_set_boot_addr() helper ARM: EXYNOS: make exynos_core_restart() less verbose ARM: EXYNOS: fix exynos_boot_secondary() return value on timeout ARM: EXYNOS: Get current parent clock for power domain on/off ARM: SAMSUNG: fix clk_enable() WARNing in S3C24XX ADC ARM: EXYNOS: Add missing of_node_put() when parsing power domains ARM: EXYNOS: Handle of_find_device_by_node() and kstrdup() failures ARM: EXYNOS: Handle of of_iomap() failure Linux 4.1-rc4 ....
2015-06-11NFC: nci: add generic uart supportVincent Cuissard
Some NFC controller supports UART as host interface. As with SPI, a lot of code can be shared between vendor drivers. This patch add the generic support of UART and provides some extension API for vendor specific needs. This code is strongly inspired by the Bluetooth HCI ldisc implementation. NCI UART vendor drivers will have to register themselves to this layer via nci_uart_register. Underlying tty will have to be configured from user land thanks to an ioctl. Signed-off-by: Vincent Cuissard <cuissard@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-06-11net/ethtool: Add current supported tunable optionsHadar Hen Zion
Add strings array of the current supported tunable options. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11vfio: powerpc/spapr: Support Dynamic DMA windowsAlexey Kardashevskiy
This adds create/remove window ioctls to create and remove DMA windows. sPAPR defines a Dynamic DMA windows capability which allows para-virtualized guests to create additional DMA windows on a PCI bus. The existing linux kernels use this new window to map the entire guest memory and switch to the direct DMA operations saving time on map/unmap requests which would normally happen in a big amounts. This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows. Up to 2 windows are supported now by the hardware and by this driver. This changes VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional information such as a number of supported windows and maximum number levels of TCE tables. DDW is added as a capability, not as a SPAPR TCE IOMMU v2 unique feature as we still want to support v2 on platforms which cannot do DDW for the sake of TCE acceleration in KVM (coming soon). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11vfio: powerpc/spapr: Register memory and define IOMMU v2Alexey Kardashevskiy
The existing implementation accounts the whole DMA window in the locked_vm counter. This is going to be worse with multiple containers and huge DMA windows. Also, real-time accounting would requite additional tracking of accounted pages due to the page size difference - IOMMU uses 4K pages and system uses 4K or 64K pages. Another issue is that actual pages pinning/unpinning happens on every DMA map/unmap request. This does not affect the performance much now as we spend way too much time now on switching context between guest/userspace/host but this will start to matter when we add in-kernel DMA map/unmap acceleration. This introduces a new IOMMU type for SPAPR - VFIO_SPAPR_TCE_v2_IOMMU. New IOMMU deprecates VFIO_IOMMU_ENABLE/VFIO_IOMMU_DISABLE and introduces 2 new ioctls to register/unregister DMA memory - VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY - which receive user space address and size of a memory region which needs to be pinned/unpinned and counted in locked_vm. New IOMMU splits physical pages pinning and TCE table update into 2 different operations. It requires: 1) guest pages to be registered first 2) consequent map/unmap requests to work only with pre-registered memory. For the default single window case this means that the entire guest (instead of 2GB) needs to be pinned before using VFIO. When a huge DMA window is added, no additional pinning will be required, otherwise it would be guest RAM + 2GB. The new memory registration ioctls are not supported by VFIO_SPAPR_TCE_IOMMU. Dynamic DMA window and in-kernel acceleration will require memory to be preregistered in order to work. The accounting is done per the user process. This advertises v2 SPAPR TCE IOMMU and restricts what the userspace can do with v1 or v2 IOMMUs. In order to support memory pre-registration, we need a way to track the use of every registered memory region and only allow unregistration if a region is not in use anymore. So we need a way to tell from what region the just cleared TCE was from. This adds a userspace view of the TCE table into iommu_table struct. It contains userspace address, one per TCE entry. The table is only allocated when the ownership over an IOMMU group is taken which means it is only used from outside of the powernv code (such as VFIO). As v2 IOMMU supports IODA2 and pre-IODA2 IOMMUs (which do not support DDW API), this creates a default DMA window for IODA2 for consistency. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-10serial: stm32-usart: Add STM32 USART DriverMaxime Coquelin
This drivers adds support to the STM32 USART controller, which is a standard serial driver. Tested-by: Chanwoo Choi <cw00.choi@samsung.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Reviewed-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-09[media] DocBook: Change format for enum dmx_output documentationMauro Carvalho Chehab
Use a table for the Demux output. No new information added here. They were all merged inside the table. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] dvb: dmx.h: don't use anonymous enumsMauro Carvalho Chehab
There are several anonymous enums here, used via a typedef. Well, we don't like typedefs on Kernel, so let's de-anonimize those enums. Then, latter, we may be able to get rid of the typedefs, at least from Kernelspace. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] dvb: frontend.h: add a note for the deprecated enums/structsMauro Carvalho Chehab
Let be clear, at the header, about what got deprecated. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] dvb: frontend.h: improve dvb_frontent_parameters commentMauro Carvalho Chehab
The comment for struct dvb_frontend_parameters is weird, as it mixes delivery system name (ATSC) with modulation names (QPSK, QAM, OFDM). Use delivery system names there on the frequency comment, as this is clearer, specially after 2GEN delivery systems. While here, add comments at the union, to make live easier for ones that may try to understand the convention used by the legacy API. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] frontend: Fix a typo at the commentsMauro Carvalho Chehab
The description of struct dtv_stats has a spmall typo: FE_SCALE_DECIBELS instead of FE_SCALE_DECIBEL Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] frontend: move legacy typedefs to the endMauro Carvalho Chehab
Just userspace need those typedefs. So, put it in the compat part of the header. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] frontend: Move legacy API enums/structs to the endMauro Carvalho Chehab
In order to better organize the header file, move the legacy API (DVBv3) support to the end, just before the ioctl definitions. This way, we can use just one #if for all of them. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] dvb: Get rid of typedev usage for enumsMauro Carvalho Chehab
The DVB API was originally defined using typedefs. This is against Kernel CodingStyle, and there's no good usage here. While we can't remove its usage on userspace, we can avoid its usage in Kernelspace. So, let's do it. This patch was generated by this shell script: for j in $(grep typedef include/uapi/linux/dvb/frontend.h |cut -d' ' -f 3); do for i in $(find drivers/media -name '*.[ch]' -type f) $(find drivers/staging/media -name '*.[ch]' -type f); do sed "s,${j}_t,enum $j," <$i >a && mv a $i; done; done While here, make CodingStyle fixes on the affected lines. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> # for drivers/media/firewire/*
2015-06-09[media] DocBook: add xrefs for enum fe_typeMauro Carvalho Chehab
The only enum that was missing xrefs at frontend.h is fe_type. Add xrefs for them. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] DocBook: properly document the delivery systemsMauro Carvalho Chehab
Use a table for the delivery systems. The table is organized by the type (cable, satellite, terrestrial) and shows what standards are not fully implemented. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] DocBook: better document the DVB-S2 rolloff factorMauro Carvalho Chehab
Instead of using a program listing, use a table and make clearer what each define means. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09[media] DocBook: document DVB-S2 pilot in a tableMauro Carvalho Chehab
Putting it into a table allows to comment each possible values, with makes more clear what field means. Also, it allows to do cross-references with the frontend.h. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-09can: cangw: introduce optional uid to reference created routing jobsOliver Hartkopp
Similar to referencing iptables rules by their line number this UID allows to reference created routing jobs, e.g. to alter configured data modifications. The UID is an optional non-zero value which can be provided at routing job creation time. When the UID is set the UID replaces the data modification configuration as job identification attribute e.g. at job removal time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-06-09NFC: netlink: Implement vendor command supportSamuel Ortiz
Vendor commands are passed from userspace through the NFC_CMD_VENDOR netlink command, allowing driver and hardware specific operations implementations like for example RF tuning or production line calibration. Drivers will associate a set of vendor commands to a vendor id, which could typically be an OUI. The netlink kernel implementation will try to match the received vendor id and sub command attributes with the registered ones. When such match is found, the driver defined sub command routine is called. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-06-08Merge 4.1-rc7 into tty-nextGreg Kroah-Hartman
This fixes up a merge issue with the amba-pl011.c driver, and we want the fixes in this branch as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-08Merge 4.1-rc7 into staging-testingGreg Kroah-Hartman
We want the staging tree fixes in here too to help with testing and merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-07perf/x86/intel: Introduce PERF_RECORD_LOST_SAMPLESKan Liang
After enlarging the PEBS interrupt threshold, there may be some mixed up PEBS samples which are discarded by the kernel. This patch makes the kernel emit a PERF_RECORD_LOST_SAMPLES record with the number of possible discarded records when it is impossible to demux the samples. It makes sure the user is not left in the dark about such discards. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1431285195-14269-8-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07perf: add new PERF_SAMPLE_BRANCH_IND_JUMP branch sample typeStephane Eranian
This patch adds a new branch_sample_type flag to enable filtering branch sampling to indirect jumps. The support is subject to hardware or kernel software support on each architecture. Filtering on indirect jump is useful to study the targets of the jump. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: dsahern@gmail.com Cc: jolsa@redhat.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1431637800-31061-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07bpf: allow programs to write to certain skb fieldsAlexei Starovoitov
allow programs read/write skb->mark, tc_index fields and ((struct qdisc_skb_cb *)cb)->data. mark and tc_index are generically useful in TC. cb[0]-cb[4] are primarily used to pass arguments from one program to another called via bpf_tail_call() which can be seen in sockex3_kern.c example. All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs. mark, tc_index are writeable from tc_cls_act only. cb[0]-cb[4] are writeable by both sockets and tc_cls_act. Add verifier tests and improve sample code. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-06inet: add IP_BIND_ADDRESS_NO_PORT to overcome bind(0) limitationsEric Dumazet
When an application needs to force a source IP on an active TCP socket it has to use bind(IP, port=x). As most applications do not want to deal with already used ports, x is often set to 0, meaning the kernel is in charge to find an available port. But kernel does not know yet if this socket is going to be a listener or be connected. It has very limited choices (no full knowledge of final 4-tuple for a connect()) With limited ephemeral port range (about 32K ports), it is very easy to fill the space. This patch adds a new SOL_IP socket option, asking kernel to ignore the 0 port provided by application in bind(IP, port=0) and only remember the given IP address. The port will be automatically chosen at connect() time, in a way that allows sharing a source port as long as the 4-tuples are unique. This new feature is available for both IPv4 and IPv6 (Thanks Neal) Tested: Wrote a test program and checked its behavior on IPv4 and IPv6. strace(1) shows sequences of bind(IP=127.0.0.2, port=0) followed by connect(). Also getsockname() show that the port is still 0 right after bind() but properly allocated after connect(). socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5 setsockopt(5, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0 bind(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, 16) = 0 getsockname(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0 connect(5, {sa_family=AF_INET, sin_port=htons(53174), sin_addr=inet_addr("127.0.0.3")}, 16) = 0 getsockname(5, {sa_family=AF_INET, sin_port=htons(38050), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0 IPv6 test : socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 7 setsockopt(7, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0 bind(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 connect(7, {sa_family=AF_INET6, sin6_port=htons(57300), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 getsockname(7, {sa_family=AF_INET6, sin6_port=htons(60964), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 I was able to bind()/connect() a million concurrent IPv4 sockets, instead of ~32000 before patch. lpaa23:~# ulimit -n 1000010 lpaa23:~# ./bind --connect --num-flows=1000000 & 1000000 sockets lpaa23:~# grep TCP /proc/net/sockstat TCP: inuse 2000063 orphan 0 tw 47 alloc 2000157 mem 66 Check that a given source port is indeed used by many different connections : lpaa23:~# ss -t src :40000 | head -10 State Recv-Q Send-Q Local Address:Port Peer Address:Port ESTAB 0 0 127.0.0.2:40000 127.0.202.33:44983 ESTAB 0 0 127.0.0.2:40000 127.2.27.240:44983 ESTAB 0 0 127.0.0.2:40000 127.2.98.5:44983 ESTAB 0 0 127.0.0.2:40000 127.0.124.196:44983 ESTAB 0 0 127.0.0.2:40000 127.2.139.38:44983 ESTAB 0 0 127.0.0.2:40000 127.1.59.80:44983 ESTAB 0 0 127.0.0.2:40000 127.3.6.228:44983 ESTAB 0 0 127.0.0.2:40000 127.0.38.53:44983 ESTAB 0 0 127.0.0.2:40000 127.1.197.10:44983 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-05NVMe: Automatic namespace rescanKeith Busch
Namespaces may be dynamically allocated and deleted or attached and detached. This has the driver rescan the device for namespace changes after each device reset or namespace change asynchronous event. There could potentially be many detached namespaces that we don't want polluting /dev/ with unusable block handles, so this will delete disks if the namespace is not active as indicated by the response from identify namespace. This also skips adding the disk if no capacity is provisioned to the namespace in the first place. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-05NVMe: add sysfs and ioctl controller resetKeith Busch
We need the ability to perform an nvme controller reset as discussed on the mailing list thread: http://lists.infradead.org/pipermail/linux-nvme/2015-March/001585.html This adds a sysfs entry that when written to will reset perform an NVMe controller reset if the controller was successfully initialized in the first place. This also adds locking around resetting the device in the async probe method so the driver can't schedule two resets. Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Brandon Schultz <brandon.schulz@hgst.com> Cc: David Sariel <david.sariel@pmcs.com> Updated by Jens to: 1) Merge this with the ioctl reset patch from David Sariel. The ioctl path now shares the reset code from the sysfs path. 2) Don't flush work if we fail issuing the reset. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-05KVM: implement multiple address spacesPaolo Bonzini
Only two ioctls have to be modified; the address space id is placed in the higher 16 bits of their slot id argument. As of this patch, no architecture defines more than one address space; x86 will be the first. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05[media] videodev2.h: add support for transfer functionsHans Verkuil
In the past the transfer function was implied by the colorspace. However, it is an independent entity in its own right. Add support for explicitly choosing the transfer function. This change will allow us to represent linear RGB (as is used by openGL), and it will make it easier to work with decoded video material since most codecs store the transfer function as a separate property as well. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-06-05virtgpu: include linux/types.h to avoid warning.Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-04mpls: Add definition for IPPROTO_MPLSTom Herbert
Add uapi define for MPLS over IP. Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04KVM: x86: API changes for SMM supportPaolo Bonzini
This patch includes changes to the external API for SMM support. Userspace can predicate the availability of the new fields and ioctls on a new capability, KVM_CAP_X86_SMM, which is added at the end of the patch series. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-03bpf: introduce bpf_clone_redirect() helperAlexei Starovoitov
Allow eBPF programs attached to classifier/actions to call bpf_clone_redirect(skb, ifindex, flags) helper which will mirror or redirect the packet by dynamic ifindex selection from within the program to a target device either at ingress or at egress. Can be used for various scenarios, for example, to load balance skbs into veths, split parts of the traffic to local taps, etc. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04Merge branch 'virtio-gpu-drm-next' of git://git.kraxel.org/linux into drm-nextDave Airlie
Yay, thanks to Gerd for pull this together. * 'virtio-gpu-drm-next' of git://git.kraxel.org/linux: Add MAINTAINERS entry for virtio-gpu. Add virtio gpu driver. drm_vblank_get: don't WARN_ON in case vblanks are not initialized break kconfig dependency loop
2015-06-04Merge tag 'v4.1-rc6' into drm-nextDave Airlie
Linux 4.1-rc6 backmerge 4.1-rc6 as some of the later pull reqs are based on newer bases and I'd prefer to do the fixup myself.