summaryrefslogtreecommitdiffstats
path: root/include/net/netfilter
AgeCommit message (Collapse)Author
2015-01-19netfilter: nf_tables: validate hooks in NAT expressionsPablo Neira Ayuso
The user can crash the kernel if it uses any of the existing NAT expressions from the wrong hook, so add some code to validate this when loading the rule. This patch introduces nft_chain_validate_hooks() which is based on an existing function in the bridge version of the reject expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-12-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ...
2014-12-08Merge branch 'iov_iter' into for-nextAl Viro
2014-11-27netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one modulePablo Neira Ayuso
This resolves linking problems with CONFIG_IPV6=n: net/built-in.o: In function `redirect_tg6': xt_REDIRECT.c:(.text+0x6d021): undefined reference to `nf_nat_redirect_ipv6' Reported-by: Andreas Ruprecht <rupran@einserver.de> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-27netfilter: nf_tables_bridge: export nft_reject_ip*hdr_validate functionsAlvaro Neira
This patch exports the functions nft_reject_iphdr_validate and nft_reject_ip6hdr_validate to use it in follow up patches. These functions check if the IPv4/IPv6 header is correct. Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-27netfilter: conntrack: avoid zeroing timerFlorian Westphal
add a __nfct_init_offset annotation member to struct nf_conn to make it clear which members are covered by the memset when the conntrack is allocated. This avoids zeroing timer_list and ct_net; both are already inited explicitly. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== netfilter/ipvs updates for net-next The following patchset contains Netfilter updates for your net-next tree, this includes the NAT redirection support for nf_tables, the cgroup support for nft meta and conntrack zone support for the connlimit match. Coming after those, a bunch of sparse warning fixes, missing netns bits and cleanups. More specifically, they are: 1) Prepare IPv4 and IPv6 NAT redirect code to use it from nf_tables, patches from Arturo Borrero. 2) Introduce the nf_tables redir expression, from Arturo Borrero. 3) Remove an unnecessary assignment in ip_vs_xmit/__ip_vs_get_out_rt(). Patch from Alex Gartrell. 4) Add nft_log_dereference() macro to the nf_log infrastructure, patch from Marcelo Leitner. 5) Add some extra validation when registering logger families, also from Marcelo. 6) Some spelling cleanups from stephen hemminger. 7) Fix sparse warning in nf_logger_find_get(). 8) Add cgroup support to nf_tables meta, patch from Ana Rey. 9) A Kconfig fix for the new redir expression and fix sparse warnings in the new redir expression. 10) Fix several sparse warnings in the netfilter tree, from Florian Westphal. 11) Reduce verbosity when OOM in nfnetlink_log. User can basically do nothing when this situation occurs. 12) Add conntrack zone support to xt_connlimit, again from Florian. 13) Add netnamespace support to the h323 conntrack helper, contributed by Vasily Averin. 14) Remove unnecessary nul-pointer checks before free_percpu() and module_put(), from Markus Elfring. 15) Use pr_fmt in nfnetlink_log, again patch from Marcelo Leitner. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12netfilter: nf_tables: restore synchronous object release from commit/abortPablo Neira Ayuso
The existing xtables matches and targets, when used from nft_compat, may sleep from the destroy path, ie. when removing rules. Since the objects are released via call_rcu from softirq context, this results in lockdep splats and possible lockups that may be hard to reproduce. Patrick also indicated that delayed object release via call_rcu can cause us problems in the ordering of event notifications when anonymous sets are in place. So, this patch restores the synchronous object release from the commit and abort paths. This includes a call to synchronize_rcu() to make sure that no packets are walking on the objects that are going to be released. This is slowier though, but it's simple and it resolves the aforementioned problems. This is a partial revert of c7c32e7 ("netfilter: nf_tables: defer all object release via rcu") that was introduced in 3.16 to speed up interaction with userspace. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-05netfilter: Convert print_tuple functions to return voidJoe Perches
Since adding a new function to seq_file (seq_has_overflowed()) there isn't any value for functions called from seq_show to return anything. Remove the int returns of the various print_tuple/<foo>_print_tuple functions. Link: http://lkml.kernel.org/p/f2e8cf8df433a197daa62cbaf124c900c708edc7.1412031505.git.joe@perches.com Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-05netfilter: Remove return values for print_conntrack callbacksSteven Rostedt (Red Hat)
The seq_printf() and friends are having their return values removed. The print_conntrack() returns the result of seq_printf(), which is meaningless when seq_printf() returns void. Might as well remove the return values of print_conntrack() as well. Link: http://lkml.kernel.org/r/20141029220107.465008329@goodmis.org Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-31netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functionsPablo Neira Ayuso
That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip6_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_ip6hdr_put(): to build the IPv6 header. * nf_reject_ip6_tcphdr_put(): to build the TCP header. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-31netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functionsPablo Neira Ayuso
That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_iphdr_put(): to build the IPv4 header. * nf_reject_ip_tcphdr_put(): to build the TCP header. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-27netfilter: nf_tables: add new expression nft_redirArturo Borrero
This new expression provides NAT in the redirect flavour, which is to redirect packets to local machine. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-27netfilter: refactor NAT redirect IPv6 code to use it from nf_tablesArturo Borrero
This patch refactors the IPv6 code so it can be usable both from xt and nf_tables. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-27netfilter: refactor NAT redirect IPv4 to use it from nf_tablesArturo Borrero
This patch refactors the IPv4 code so it can be usable both from xt and nf_tables. A similar patch follows-up to handle IPv6. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains netfilter fixes for your net tree, they are: 1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules. 2) Restrict nat and masq expressions to the nat chain type. Otherwise, users may crash their kernel if they attach a nat/masq rule to a non nat chain. 3) Fix hook validation in nft_compat when non-base chains are used. Basically, initialize hook_mask to zero. 4) Make sure you use match/targets in nft_compat from the right chain type. The existing validation relies on the table name which can be avoided by 5) Better netlink attribute validation in nft_nat. This expression has to reject the configuration when no address and proto configurations are specified. 6) Interpret NFTA_NAT_REG_*_MAX if only if NFTA_NAT_REG_*_MIN is set. Yet another sanity check to reject incorrect configurations from userspace. 7) Conditional NAT attribute dumping depending on the existing configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15Merge branch 'for-3.18-consistent-ops' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_*() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...
2014-10-13netfilter: nf_tables: restrict nat/masq expressions to nat chain typePablo Neira Ayuso
This adds the missing validation code to avoid the use of nat/masq from non-nat chains. The validation assumes two possible configuration scenarios: 1) Use of nat from base chain that is not of nat type. Reject this configuration from the nft_*_init() path of the expression. 2) Use of nat from non-base chain. In this case, we have to wait until the non-base chain is referenced by at least one base chain via jump/goto. This is resolved from the nft_*_validate() path which is called from nf_tables_check_loops(). The user gets an -EOPNOTSUPP in both cases. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-07netfilter: kill nf_send_reset6() from include/net/netfilter/ipv6/nf_reject.hPablo Neira Ayuso
nf_send_reset6() now resides in net/ipv6/netfilter/nf_reject_ipv6.c Fixes: c8d7b98 ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Eric Dumazet <edumazet@google.com>
2014-10-02netfilter: explicit module dependency between br_netfilter and physdevPablo Neira Ayuso
You can use physdev to match the physical interface enslaved to the bridge device. This information is stored in skb->nf_bridge and it is set up by br_netfilter. So, this is only available when iptables is used from the bridge netfilter path. Since 34666d4 ("netfilter: bridge: move br_netfilter out of the core"), the br_netfilter code is modular. To reduce the impact of this change, we can autoload the br_netfilter if the physdev match is used since we assume that the users need br_netfilter in place. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: move nf_send_resetX() code to nf_reject_ipvX modulesPablo Neira Ayuso
Move nf_send_reset() and nf_send_reset6() to nf_reject_ipv4 and nf_reject_ipv6 respectively. This code is shared by x_tables and nf_tables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: nft_reject: introduce icmp code abstraction for inet and bridgePablo Neira Ayuso
This patch introduces the NFT_REJECT_ICMPX_UNREACH type which provides an abstraction to the ICMP and ICMPv6 codes that you can use from the inet and bridge tables, they are: * NFT_REJECT_ICMPX_NO_ROUTE: no route to host - network unreachable * NFT_REJECT_ICMPX_PORT_UNREACH: port unreachable * NFT_REJECT_ICMPX_HOST_UNREACH: host unreachable * NFT_REJECT_ICMPX_ADMIN_PROHIBITED: administratevely prohibited You can still use the specific codes when restricting the rule to match the corresponding layer 3 protocol. I decided to not overload the existing NFT_REJECT_ICMP_UNREACH to have different semantics depending on the table family and to allow the user to specify ICMP family specific codes if they restrict it to the corresponding family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-29netfilter: nf_tables: store and dump set policyArturo Borrero
We want to know in which cases the user explicitly sets the policy options. In that case, we also want to dump back the info. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-26netfilter: bridge: move br_netfilter out of the corePablo Neira Ayuso
Jesper reported that br_netfilter always registers the hooks since this is part of the bridge core. This harms performance for people that don't need this. This patch modularizes br_netfilter so it can be rmmod'ed, thus, the hooks can be unregistered. I think the bridge netfilter should have been a separated module since the beginning, Patrick agreed on that. Note that this is breaking compatibility for users that expect that bridge netfilter is going to be available after explicitly 'modprobe bridge' or via automatic load through brctl. However, the damage can be easily undone by modprobing br_netfilter. The bridge core also spots a message to provide a clue to people that didn't notice that this has been deprecated. On top of that, the plan is that nftables will not rely on this software layer, but integrate the connection tracking into the bridge layer to enable stateful filtering and NAT, which is was bridge netfilter users seem to require. This patch still keeps the fake_dst_ops in the bridge core, since this is required by when the bridge port is initialized. So we can safely modprobe/rmmod br_netfilter anytime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Florian Westphal <fw@strlen.de>
2014-09-11netfilter: fix compilation of masquerading without IP_NF_TARGET_MASQUERADEPablo Neira Ayuso
CONFIG_NF_NAT_MASQUERADE_IPV6=m # CONFIG_IP6_NF_TARGET_MASQUERADE is not set results in: net/ipv6/netfilter/nf_nat_masquerade_ipv6.c: In function ‘nf_nat_masquerade_ipv6’: net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:41:14: error: ‘struct nf_conn_nat’ has no member named ‘masq_index’ nfct_nat(ct)->masq_index = out->ifindex; ^ net/ipv6/netfilter/nf_nat_masquerade_ipv6.c: In function ‘device_cmp’: net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:61:12: error: ‘const struct nf_conn_nat’ has no member named ‘masq_index’ return nat->masq_index == (int)(long)ifindex; ^ net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:62:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ make[3]: *** [net/ipv6/netfilter/nf_nat_masquerade_ipv6.o] Error 1 Fix this by using the new NF_NAT_MASQUERADE_IPV4 and _IPV6 symbols in include/net/netfilter/nf_nat.h. Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09netfilter: nf_tables: add new nft_masq expressionArturo Borrero
The nft_masq expression is intended to perform NAT in the masquerade flavour. We decided to have the masquerade functionality in a separated expression other than nft_nat. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09netfilter: nf_nat: generalize IPv6 masquerading support for nf_tablesArturo Borrero
Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv6; a similar patch exists to handle IPv4. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09netfilter: nf_nat: generalize IPv4 masquerading support for nf_tablesArturo Borrero
Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv4; a similar patch follows to handle IPv6. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09netfilter: nat: move specific NAT IPv6 to corePablo Neira Ayuso
Move the specific NAT IPv6 core functions that are called from the hooks from ip6table_nat.c to nf_nat_l3proto_ipv6.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. This also renames nf_nat_ipv6_fn to nft_nat_ipv6_fn in net/ipv6/netfilter/nft_chain_nat_ipv6.c to avoid a compilation breakage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-02netfilter: nat: move specific NAT IPv4 to corePablo Neira Ayuso
Move the specific NAT IPv4 core functions that are called from the hooks from iptable_nat.c to nf_nat_l3proto_ipv4.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-08-26net: Replace get_cpu_var through this_cpu_ptrChristoph Lameter
Replace uses of get_cpu_var for address calculation through this_cpu_ptr. Cc: netdev@vger.kernel.org Cc: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/infiniband/hw/cxgb4/device.c The cxgb4 conflict was simply overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains updates for your net-next tree, they are: 1) Use kvfree() helper function from x_tables, from Eric Dumazet. 2) Remove extra timer from the conntrack ecache extension, use a workqueue instead to redeliver lost events to userspace instead, from Florian Westphal. 3) Removal of the ulog targets for ebtables and iptables. The nflog infrastructure superseded this almost 9 years ago, time to get rid of this code. 4) Replace the list of loggers by an array now that we can only have two possible non-overlapping logger flavours, ie. kernel ring buffer and netlink logging. 5) Move Eric Dumazet's log buffer code to nf_log to reuse it from all of the supported per-family loggers. 6) Consolidate nf_log_packet() as an unified interface for packet logging. After this patch, if the struct nf_loginfo is available, it explicitly selects the logger that is used. 7) Move ip and ip6 logging code from xt_LOG to the corresponding per-family loggers. Thus, x_tables and nf_tables share the same code for packet logging. 8) Add generic ARP packet logger, which is used by nf_tables. The format aims to be consistent with the output of xt_LOG. 9) Add generic bridge packet logger. Again, this is used by nf_tables and it routes the packets to the real family loggers. As a result, we get consistent logging format for the bridge family. The ebt_log logging code has been intentionally left in place not to break backward compatibility since the logging output differs from xt_LOG. 10) Update nft_log to explicitly request the required family logger when needed. 11) Finish nft_log so it supports arp, ip, ip6, bridge and inet families. Allowing selection between netlink and kernel buffer ring logging. 12) Several fixes coming after the netfilter core logging changes spotted by robots. 13) Use IS_ENABLED() macros whenever possible in the netfilter tree, from Duan Jiong. 14) Removal of a couple of unnecessary branch before kfree, from Fabian Frederick. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-14netfilter: nf_tables: 64bit stats need some extra synchronizationEric Dumazet
Use generic u64_stats_sync infrastructure to get proper 64bit stats, even on 32bit arches, at no extra cost for 64bit arches. Without this fix, 32bit arches can have some wrong counters at the time the carry is propagated into upper word. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27netfilter: bridge: add generic packet loggerPablo Neira Ayuso
This adds the generic plain text packet loggger for bridged packets. It routes the logging message to the real protocol packet logger. I decided not to refactor the ebt_log code for two reasons: 1) The ebt_log output is not consistent with the IPv4 and IPv6 Netfilter packet loggers. The output is different for no good reason and it adds redundant code to handle packet logging. 2) To avoid breaking backward compatibility for applications outthere that are parsing the specific ebt_log output, the ebt_log output has been left as is. So only nftables will use the new consistent logging format for logged bridged packets. More decisions coming in this patch: 1) This also removes ebt_log as default logger for bridged packets. Thus, nf_log_packet() routes packet to this new packet logger instead. This doesn't break backward compatibility since nf_log_packet() is not used to log packets in plain text format from anywhere in the ebtables/netfilter bridge code. 2) The new bridge packet logger also performs a lazy request to register the real IPv4, ARP and IPv6 netfilter packet loggers. If the real protocol logger is no available (not compiled or the module is not available in the system, not packet logging happens. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27netfilter: log: nf_log_packet() as real unified interfacePablo Neira Ayuso
Before this patch, the nf_loginfo parameter specified the logging configuration in case the specified default logger was loaded. This patch updates the semantics of the nf_loginfo parameter in nf_log_packet() which now indicates the logger that you explicitly want to use. Thus, nf_log_packet() is exposed as an unified interface which internally routes the log message to the corresponding logger type by family. The module dependencies are expressed by the new nf_logger_find_get() and nf_logger_put() functions which bump the logger module refcount. Thus, you can not remove logger modules that are used by rules anymore. Another important effect of this change is that the family specific module is only loaded when required. Therefore, xt_LOG and nft_log will just trigger the autoload of the nf_log_{ip,ip6} modules according to the family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c filesPablo Neira Ayuso
The plain text logging is currently embedded into the xt_LOG target. In order to be able to use the plain text logging from nft_log, as a first step, this patch moves the family specific code to the following files and Kconfig symbols: 1) net/ipv4/netfilter/nf_log_ip.c: CONFIG_NF_LOG_IPV4 2) net/ipv6/netfilter/nf_log_ip6.c: CONFIG_NF_LOG_IPV6 3) net/netfilter/nf_log_common.c: CONFIG_NF_LOG_COMMON These new modules will be required by xt_LOG and nft_log. This patch is based on original patch from Arturo Borrero Gonzalez. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-25netfilter: nf_log: move log buffering to core loggingPablo Neira Ayuso
This patch moves Eric Dumazet's log buffer implementation from the xt_log.h header file to the core net/netfilter/nf_log.c. This also includes the renaming of the structure and functions to avoid possible undesired namespace clashes. This change allows us to use it from the arp and bridge packet logging implementation in follow up patches.
2014-06-25netfilter: nf_log: use an array of loggers instead of listPablo Neira Ayuso
Now that legacy ulog targets are not available anymore in the tree, we can have up to two possible loggers: 1) The plain text logging via kernel logging ring. 2) The nfnetlink_log infrastructure which delivers log messages to userspace. This patch replaces the list of loggers by an array of two pointers per family for each possible logger and it also introduces a new field to the nf_logger structure which indicates the position in the logger array (based on the logger type). This prepares a follow up patch that consolidates the nf_log_packet() interface by allowing to specify the logger as parameter. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-25netfilter: conntrack: remove timer from ecache extensionFlorian Westphal
This brings the (per-conntrack) ecache extension back to 24 bytes in size (was 152 byte on x86_64 with lockdep on). When event delivery fails, re-delivery is attempted via work queue. Redelivery is attempted at least every 0.1 seconds, but can happen more frequently if userspace is not congested. The nf_ct_release_dying_list() function is removed. With this patch, ownership of the to-be-redelivered conntracks (on-dying-list-with-DYING-bit not yet set) is with the work queue, which will release the references once event is out. Joint work with Pablo Neira Ayuso. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16netfilter: nf_tables: use u32 for chain use counterPablo Neira Ayuso
Since 4fefee5 ("netfilter: nf_tables: allow to delete several objects from a batch"), every new rule bumps the chain use counter. However, this is limited to 16 bits, which means that it will overrun after 2^16 rules. Use a u32 chain counter and check for overflows (just like we do for table objects). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next This small patchset contains three accumulated Netfilter/IPVS updates, they are: 1) Refactorize common NAT code by encapsulating it into a helper function, similarly to what we do in other conntrack extensions, from Florian Westphal. 2) A minor format string mismatch fix for IPVS, from Masanari Iida. 3) Add quota support to the netfilter accounting infrastructure, now you can add quotas to accounting objects via the nfnetlink interface and use them from iptables. You can also listen to quota notifications from userspace. This enhancement from Mathieu Poirier. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftablesDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/nftables updates for net-next The following patchset contains Netfilter/nftables updates for net-next, most relevantly they are: 1) Add set element update notification via netlink, from Arturo Borrero. 2) Put all object updates in one single message batch that is sent to kernel-space. Before this patch only rules where included in the batch. This series also introduces the generic transaction infrastructure so updates to all objects (tables, chains, rules and sets) are applied in an all-or-nothing fashion, these series from me. 3) Defer release of objects via call_rcu to reduce the time required to commit changes. The assumption is that all objects are destroyed in reverse order to ensure that dependencies betweem them are fulfilled (ie. rules and sets are destroyed first, then chains, and finally tables). 4) Allow to match by bridge port name, from Tomasz Bursztyka. This series include two patches to prepare this new feature. 5) Implement the proper set selection based on the characteristics of the data. The new infrastructure also allows you to specify your preferences in terms of memory and computational complexity so the underlying set type is also selected according to your needs, from Patrick McHardy. 6) Several cleanup patches for nft expressions, including one minor possible compilation breakage due to missing mark support, also from Patrick. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-19netfilter: nf_tables: defer all object release via rcuPablo Neira Ayuso
Now that all objects are released in the reverse order via the transaction infrastructure, we can enqueue the release via call_rcu to save one synchronize_rcu. For small rule-sets loaded via nft -f, it now takes around 50ms less here. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: remove skb and nlh from context structurePablo Neira Ayuso
Instead of caching the original skbuff that contains the netlink messages, this stores the netlink message sequence number, the netlink portID and the report flag. This helps to prepare the introduction of the object release via call_rcu. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle elementsPablo Neira Ayuso
Leave the set content in consistent state if we fail to load the batch. Use the new generic transaction infrastructure to achieve this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle tablePablo Neira Ayuso
This patch speeds up rule-set updates and it also provides a way to revert updates and leave things in consistent state in case that the batch needs to be aborted. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle chainPablo Neira Ayuso
This patch speeds up rule-set updates and it also introduces a way to revert chain updates if the batch is aborted. The idea is to store the changes in the transaction to apply that in the commit step. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle setsPablo Neira Ayuso
This patch reworks the nf_tables API so set updates are included in the same batch that contains rule updates. This speeds up rule-set updates since we skip a dialog of four messages between kernel and user-space (two on each direction), from: 1) create the set and send netlink message to the kernel 2) process the response from the kernel that contains the allocated name. 3) add the set elements and send netlink message to the kernel. 4) process the response from the kernel (to check for errors). To: 1) add the set to the batch. 2) add the set elements to the batch. 3) add the rule that points to the set. 4) send batch to the kernel. This also introduces an internal set ID (NFTA_SET_ID) that is unique in the batch so set elements and rules can refer to new sets. Backward compatibility has been only retained in userspace, this means that new nft versions can talk to the kernel both in the new and the old fashion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: add message type to transactionsPablo Neira Ayuso
The patch adds message type to the transaction to simplify the commit the and abort routines. Yet another step forward in the generalisation of the transaction infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>