aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/netfilter
AgeCommit message (Collapse)Author
2016-12-06netfilter: x_tables: pack percpu counter allocationsFlorian Westphal
instead of allocating each xt_counter individually, allocate 4k chunks and then use these for counter allocation requests. This should speed up rule evaluation by increasing data locality, also speeds up ruleset loading because we reduce calls to the percpu allocator. As Eric points out we can't use PAGE_SIZE, page_allocator would fail on arches with 64k page size. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06netfilter: x_tables: pass xt_counters struct to counter allocatorFlorian Westphal
Keeps some noise away from a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06netfilter: x_tables: pass xt_counters struct instead of packet counterFlorian Westphal
On SMP we overload the packet counter (unsigned long) to contain percpu offset. Hide this from callers and pass xt_counters address instead. Preparation patch to allocate the percpu counters in page-sized batch chunks. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-04netfilter: conntrack: built-in support for DCCPDavide Caratti
CONFIG_NF_CT_PROTO_DCCP is no more a tristate. When set to y, connection tracking support for DCCP protocol is built-in into nf_conntrack.ko. footprint test: $ ls -l net/netfilter/nf_conntrack{_proto_dccp,}.ko \ net/ipv4/netfilter/nf_conntrack_ipv4.ko \ net/ipv6/netfilter/nf_conntrack_ipv6.ko (builtin)|| dccp | ipv4 | ipv6 | nf_conntrack ---------++--------+--------+--------+-------------- none || 469140 | 828755 | 828676 | 6141434 DCCP || - | 830566 | 829935 | 6533526 Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-10netfilter: ipset: Count non-static extension memory for userspaceJozsef Kadlecsik
Non-static (i.e. comment) extension was not counted into the memory size. A new internal counter is introduced for this. In the case of the hash types the sizes of the arrays are counted there as well so that we can avoid to scan the whole set when just the header data is requested. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Add element count to all set types headerJozsef Kadlecsik
It is better to list the set elements for all set types, thus the header information is uniform. Element counts are therefore added to the bitmap and list types. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Regroup ip_set_put_extensions and add externJozsef Kadlecsik
Cleanup: group ip_set_put_extensions and ip_set_get_extensions together and add missing extern. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Split extensions into separate filesJozsef Kadlecsik
Cleanup to separate all extensions into individual files. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Use kmalloc() in comment extension helperJozsef Kadlecsik
Allocate memory with kmalloc() rather than kzalloc(): the string is immediately initialized so it is unnecessary to zero out the allocated memory area. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Improve skbinfo get/init helpersJozsef Kadlecsik
Use struct ip_set_skbinfo in struct ip_set_ext instead of open coded fields and assign structure members in get/init helpers instead of copying members one by one. Explicitly note that struct ip_set_skbinfo must be padded to prevent non-aligned access in the extension blob. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Headers file cleanupJozsef Kadlecsik
Group counter helper functions together. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Mark some helper args as const.Jozsef Kadlecsik
Mark some of the helpers arguments as const. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Remove extra whitespaces in ip_set.hJozsef Kadlecsik
Remove unnecessary whitespaces. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-03netfilter: x_tables: move hook state into xt_action_param structurePablo Neira Ayuso
Place pointer to hook state in xt_action_param structure instead of copying the fields that we need. After this change xt_action_param fits into one cacheline. This patch also adds a set of new wrapper functions to fetch relevant hook state structure fields. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-12netfilter: conntrack: remove packet hotpath statsFlorian Westphal
These counters sit in hot path and do show up in perf, this is especially true for 'found' and 'searched' which get incremented for every packet processed. Information like searched=212030105 new=623431 found=333613 delete=623327 does not seem too helpful nowadays: - on busy systems found and searched will overflow every few hours (these are 32bit integers), other more busy ones every few days. - for debugging there are better methods, such as iptables' trace target, the conntrack log sysctls. Nowadays we also have perf tool. This removes packet path stat counters except those that are expected to be 0 (or close to 0) on a normal system, e.g. 'insert_failed' (race happened) or 'invalid' (proto tracker rejects). The insert stat is retained for the ctnetlink case. The found stat is retained for the tuple-is-taken check when NAT has to determine if it needs to pick a different source address. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07netfilter: gre: Use consistent GRE and PTTP header structure instead of the ↵Gao Feng
ones defined by netfilter There are two existing strutures which defines the GRE and PPTP header. So use these two structures instead of the ones defined by netfilter to keep consitent with other codes. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07netfilter: gre: Use consistent GRE_* macros instead of ones defined by ↵Gao Feng
netfilter. There are already some GRE_* macros in kernel, so it is unnecessary to define these macros. And remove some useless macros Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-18netfilter: nfnetlink_acct: report overquota to the right netnsLiping Zhang
We should report the over quota message to the right net namespace instead of the init netns. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-18netfilter: x_tables: speed up jump target validationFlorian Westphal
The dummy ruleset I used to test the original validation change was broken, most rules were unreachable and were not tested by mark_source_chains(). In some cases rulesets that used to load in a few seconds now require several minutes. sample ruleset that shows the behaviour: echo "*filter" for i in $(seq 0 100000);do printf ":chain_%06x - [0:0]\n" $i done for i in $(seq 0 100000);do printf -- "-A INPUT -j chain_%06x\n" $i printf -- "-A INPUT -j chain_%06x\n" $i printf -- "-A INPUT -j chain_%06x\n" $i done echo COMMIT [ pipe result into iptables-restore ] This ruleset will be about 74mbyte in size, with ~500k searches though all 500k[1] rule entries. iptables-restore will take forever (gave up after 10 minutes) Instead of always searching the entire blob for a match, fill an array with the start offsets of every single ipt_entry struct, then do a binary search to check if the jump target is present or not. After this change ruleset restore times get again close to what one gets when reverting 36472341017529e (~3 seconds on my workstation). [1] every user-defined rule gets an implicit RETURN, so we get 300k jumps + 100k userchains + 100k returns -> 500k rule entries Fixes: 36472341017529e ("netfilter: x_tables: validate targets of jumps") Reported-by: Jeff Wu <wujiafu@gmail.com> Tested-by: Jeff Wu <wujiafu@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-03netfilter: Convert FWINV<[foo]> macros and uses to NF_INVFJoe Perches
netfilter uses multiple FWINV #defines with identical form that hide a specific structure variable and dereference it with a invflags member. $ git grep "#define FWINV" include/linux/netfilter_bridge/ebtables.h:#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg)) net/bridge/netfilter/ebtables.c:#define FWINV2(bool, invflg) ((bool) ^ !!(e->invflags & invflg)) net/ipv4/netfilter/arp_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg))) net/ipv4/netfilter/ip_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg))) net/ipv6/netfilter/ip6_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ip6info->invflags & (invflg))) net/netfilter/xt_tcpudp.c:#define FWINVTCP(bool, invflg) ((bool) ^ !!(tcpinfo->invflags & (invflg))) Consolidate these macros into a single NF_INVF macro. Miscellanea: o Neaten the alignment around these uses o A few lines are > 80 columns for intelligibility Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-29netfilter: fix IS_ERR_VALUE usagePablo Neira Ayuso
This is a forward-port of the original patch from Andrzej Hajda, he said: "IS_ERR_VALUE should be used only with unsigned long type. Otherwise it can work incorrectly. To achieve this function xt_percpu_counter_alloc is modified to return unsigned long, and its result is assigned to temporary variable to perform error checking, before assigning to .pcnt field. The patch follows conclusion from discussion on LKML [1][2]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2120927 [2]: http://permalink.gmane.org/gmane.linux.kernel/2150581" Original patch from Andrzej is here: http://patchwork.ozlabs.org/patch/582970/ This patch has clashed with input validation fixes for x_tables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, mostly from Florian Westphal to sort out the lack of sufficient validation in x_tables and connlabel preparation patches to add nf_tables support. They are: 1) Ensure we don't go over the ruleset blob boundaries in mark_source_chains(). 2) Validate that target jumps land on an existing xt_entry. This extra sanitization comes with a performance penalty when loading the ruleset. 3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables. 4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables. 5) Make sure the minimal possible target size in x_tables. 6) Similar to #3, add xt_compat_check_entry_offsets() for compat code. 7) Check that standard target size is valid. 8) More sanitization to ensure that the target_offset field is correct. 9) Add xt_check_entry_match() to validate that matches are well-formed. 10-12) Three patch to reduce the number of parameters in translate_compat_table() for {arp,ip,ip6}tables by using a container structure. 13) No need to return value from xt_compat_match_from_user(), so make it void. 14) Consolidate translate_table() so it can be used by compat code too. 15) Remove obsolete check for compat code, so we keep consistent with what was already removed in the native layout code (back in 2007). 16) Get rid of target jump validation from mark_source_chains(), obsoleted by #2. 17) Introduce xt_copy_counters_from_user() to consolidate counter copying, and use it from {arp,ip,ip6}tables. 18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump functions. 19) Move nf_connlabel_match() to xt_connlabel. 20) Skip event notification if connlabel did not change. 21) Update of nf_connlabels_get() to make the upcoming nft connlabel support easier. 23) Remove spinlock to read protocol state field in conntrack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_net64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. The temporary function nla_put_be64_32bit() is removed in this patch. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14netfilter: x_tables: introduce and use xt_copy_counters_from_userFlorian Westphal
The three variants use same copy&pasted code, condense this into a helper and use that. Make sure info.name is 0-terminated. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14netfilter: x_tables: xt_compat_match_from_user doesn't need a retvalFlorian Westphal
Always returned 0. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14netfilter: x_tables: check for bogus target offsetFlorian Westphal
We're currently asserting that targetoff + targetsize <= nextoff. Extend it to also check that targetoff is >= sizeof(xt_entry). Since this is generic code, add an argument pointing to the start of the match/target, we can then derive the base structure size from the delta. We also need the e->elems pointer in a followup change to validate matches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14netfilter: x_tables: add compat version of xt_check_entry_offsetsFlorian Westphal
32bit rulesets have different layout and alignment requirements, so once more integrity checks get added to xt_check_entry_offsets it will reject well-formed 32bit rulesets. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14netfilter: x_tables: add and use xt_check_entry_offsetsFlorian Westphal
Currently arp/ip and ip6tables each implement a short helper to check that the target offset is large enough to hold one xt_entry_target struct and that t->u.target_size fits within the current rule. Unfortunately these checks are not sufficient. To avoid adding new tests to all of ip/ip6/arptables move the current checks into a helper, then extend this helper in followup patches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28netfilter: ipset: fix race condition in ipset save, swap and deleteVishwanath Pai
This fix adds a new reference counter (ref_netlink) for the struct ip_set. The other reference counter (ref) can be swapped out by ip_set_swap and we need a separate counter to keep track of references for netlink events like dump. Using the same ref counter for dump causes a race condition which can be demonstrated by the following script: ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \ counters ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset save & ipset swap hash_ip3 hash_ip2 ipset destroy hash_ip3 /* will crash the machine */ Swap will exchange the values of ref so destroy will see ref = 0 instead of ref = 1. With this fix in place swap will not succeed because ipset save still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink). Both delete and swap will error out if ref_netlink != 0 on the set. Note: The changes to *_head functions is because previously we would increment ref whenever we called these functions, we don't do that anymore. Reviewed-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02netfilter: xtables: don't hook tables by defaultFlorian Westphal
delay hook registration until the table is being requested inside a namespace. Historically, a particular table (iptables mangle, ip6tables filter, etc) was registered on module load. When netns support was added to iptables only the ip/ip6tables ruleset was made namespace aware, not the actual hook points. This means f.e. that when ipt_filter table/module is loaded on a system, then each namespace on that system has an (empty) iptables filter ruleset. In other words, if a namespace sends a packet, such skb is 'caught' by netfilter machinery and fed to hooking points for that table (i.e. INPUT, FORWARD, etc). Thanks to Eric Biederman, hooks are no longer global, but per namespace. This means that we can avoid allocation of empty ruleset in a namespace and defer hook registration until we need the functionality. We register a tables hook entry points ONLY in the initial namespace. When an iptables get/setockopt is issued inside a given namespace, we check if the table is found in the per-namespace list. If not, we attempt to find it in the initial namespace, and, if found, create an empty default table in the requesting namespace and register the needed hooks. Hook points are destroyed only once namespace is deleted, there is no 'usage count' (it makes no sense since there is no 'remove table' operation in xtables api). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-02-18nfnetlink: remove nfnetlink_alloc_skbFlorian Westphal
Following mmapped netlink removal this code can be simplified by removing the alloc wrapper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-28netfilter: nfnetlink: pass down netns pointer to commit() and abort() callbacksPablo Neira Ayuso
Adapt callsites to avoid recurrent lookup of the netns pointer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-28netfilter: nfnetlink: pass down netns pointer to call() and call_rcu()Pablo Neira Ayuso
Adapt callsites to avoid recurrent lookup of the netns pointer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains the first batch of Netfilter updates for the upcoming 4.5 kernel. This batch contains userspace netfilter header compilation fixes, support for packet mangling in nf_tables, the new tracing infrastructure for nf_tables and cgroup2 support for iptables. More specifically, they are: 1) Two patches to include dependencies in our netfilter userspace headers to resolve compilation problems, from Mikko Rapeli. 2) Four comestic cleanup patches for the ebtables codebase, from Ian Morris. 3) Remove duplicate include in the netfilter reject infrastructure, from Stephen Hemminger. 4) Two patches to simplify the netfilter defragmentation code for IPv6, patch from Florian Westphal. 5) Fix root ownership of /proc/net netfilter for unpriviledged net namespaces, from Philip Whineray. 6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal. 7) Add mangling support to our nf_tables payload expression, from Patrick McHardy. 8) Introduce a new netlink-based tracing infrastructure for nf_tables, from Florian Westphal. 9) Change setter functions in nfnetlink_log to be void, from Rami Rosen. 10) Add netns support to the cttimeout infrastructure. 11) Add cgroup2 support to iptables, from Tejun Heo. 12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian. 13) Add support for mangling pkttype in the nf_tables meta expression, also from Florian. BTW, I need that you pull net into net-next, I have another batch that requires changes that I don't yet see in net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-10netfilter: nfnetlink: avoid recurrent netns lookups in call_batchPablo Neira Ayuso
Pass the net pointer to the call_batch callback functions so we can skip recurrent lookups. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-11-23netfilter: nf_ct_sctp: move ip_ct_sctp away from UAPIMarcelo Ricardo Leitner
ip_ct_sctp is an internal structure, embedded by the union nf_conntrack_proto to store sctp-specific information at conntrack entries. It has no business with UAPI. This patch moves it from UAPI to a saner place, together with similar structs for other protocols. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-11-07netfilter: ipset: Fix extension alignmentJozsef Kadlecsik
The data extensions in ipset lacked the proper memory alignment and thus could lead to kernel crash on several architectures. Therefore the structures have been reorganized and alignment attributes added where needed. The patch was tested on armv7h by Gerhard Wiesinger and on x86_64, sparc64 by Jozsef Kadlecsik. Reported-by: Gerhard Wiesinger <lists@wiesinger.com> Tested-by: Gerhard Wiesinger <lists@wiesinger.com> Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-10-09net/nfnetlink: lockdep_nfnl_is_held can be booleanYaowei Bai
This patch makes lockdep_nfnl_is_held return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-18netfilter: x_tables: Pass struct net in xt_action_paramEric W. Biederman
As xt_action_param lives on the stack this does not bloat any persistent data structures. This is a first step in making netfilter code that needs to know which network namespace it is executing in simpler. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-02netfilter: nf_conntrack: make nf_ct_zone_dflt built-inDaniel Borkmann
Fengguang reported, that some randconfig generated the following linker issue with nf_ct_zone_dflt object involved: [...] CC init/version.o LD init/built-in.o net/built-in.o: In function `ipv4_conntrack_defrag': nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt' net/built-in.o: In function `ipv6_defrag': nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt' make: *** [vmlinux] Error 1 Given that configurations exist where we have a built-in part, which is accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user() and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in area when netfilter is configured in general. Therefore, split the more generic parts into a common header under include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in section that already holds parts related to CONFIG_NF_CONNTRACK in the netfilter core. This fixes the issue on my side. Fixes: 308ac9143ee2 ("netfilter: nf_conntrack: push zone object into functions") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07netfilter: nfacct: per network namespace supportAndreas Schultz
- Move the nfnl_acct_list into the network namespace, initialize and destroy it per namespace - Keep track of refcnt on nfacct objects, the old logic does not longer work with a per namespace list - Adjust xt_nfacct to pass the namespace when registring objects Signed-off-by: Andreas Schultz <aschultz@tpip.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-15netfilter: add and use jump label for xt_teeFlorian Westphal
Don't bother testing if we need to switch to alternate stack unless TEE target is used. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-15netfilter: xtables: don't save/restore jumpstack offsetFlorian Westphal
In most cases there is no reentrancy into ip/ip6tables. For skbs sent by REJECT or SYNPROXY targets, there is one level of reentrancy, but its not relevant as those targets issue an absolute verdict, i.e. the jumpstack can be clobbered since its not used after the target issues absolute verdict (ACCEPT, DROP, STOLEN, etc). So the only special case where it is relevant is the TEE target, which returns XT_CONTINUE. This patch changes ip(6)_do_table to always use the jump stack starting from 0. When we detect we're operating on an skb sent via TEE (percpu nf_skb_duplicated is 1) we switch to an alternate stack to leave the original one alone. Since there is no TEE support for arptables, it doesn't need to test if tee is active. The jump stack overflow tests are no longer needed as well -- since ->stacksize is the largest call depth we cannot exceed it. A much better alternative to the external jumpstack would be to just declare a jumps[32] stack on the local stack frame, but that would mean we'd have to reject iptables rulesets that used to work before. Another alternative would be to start rejecting rulesets with a larger call depth, e.g. 1000 -- in this case it would be feasible to allocate the entire stack in the percpu area which would avoid one dereference. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18netfilter: xtables: fix warnings on 32bit platformsFlorian Westphal
On 32bit archs gcc complains due to cast from void* to u64. Add intermediate casts to long to silence these warnings. include/linux/netfilter/x_tables.h:376:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] include/linux/netfilter/x_tables.h:384:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] include/linux/netfilter/x_tables.h:391:23: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] include/linux/netfilter/x_tables.h:400:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Fixes: 71ae0dff02d756e ("netfilter: xtables: use percpu rule counters") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18netfilter: x_tables: align per cpu xt_counterEric Dumazet
Let's force a 16 bytes alignment on xt_counter percpu allocations, so that bytes and packets sit in same cache line. xt_counter being exported to user space, we cannot add __align(16) on the structure itself. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15netfilter: x_tables: remove XT_TABLE_INFO_SZ and a dereference.Eric Dumazet
After Florian patches, there is no need for XT_TABLE_INFO_SZ anymore : Only one copy of table is kept, instead of one copy per cpu. We also can avoid a dereference if we put table data right after xt_table_info. It reduces register pressure and helps compiler. Then, we attempt a kmalloc() if total size is under order-3 allocation, to reduce TLB pressure, as in many cases, rules fit in 32 KB. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-14netfilter: ipset: Fix coding styles reported by checkpatch.plJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Prepare the ipset core to use RCU at set levelJozsef Kadlecsik
Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking accordingly. Convert the comment extension into an rcu-avare object. Also, simplify the timeout routines. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Fix parallel resizing and listing of the same setJozsef Kadlecsik
When elements added to a hash:* type of set and resizing triggered, parallel listing could start to list the original set (before resizing) and "continue" with listing the new set. Fix it by references and using the original hash table for listing. Therefore the destroying of the original hash table may happen from the resizing or listing functions. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Fix cidr handling for hash:*net* typesJozsef Kadlecsik
Commit "Simplify cidr handling for hash:*net* types" broke the cidr handling for the hash:*net* types when the sets were used by the SET target: entries with invalid cidr values were added to the sets. Reported by Jonathan Johnson. Testsuite entry is added to verify the fix. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>