aboutsummaryrefslogtreecommitdiffstats
path: root/net/sched/act_bpf.c
AgeCommit message (Collapse)Author
2015-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: arch/s390/net/bpf_jit_comp.c drivers/net/ethernet/ti/netcp_ethss.c net/bridge/br_multicast.c net/ipv4/ip_fragment.c All four conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29act_bpf: fix memory leaks when replacing bpf programsDaniel Borkmann
We currently trigger multiple memory leaks when replacing bpf actions, besides others: comm "tc", pid 1909, jiffies 4294851310 (age 1602.796s) hex dump (first 32 bytes): 01 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 ................ 18 b0 98 6d 00 88 ff ff 00 00 00 00 00 00 00 00 ...m............ backtrace: [<ffffffff817e623e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8120a22d>] __vmalloc_node_range+0x1bd/0x2c0 [<ffffffff8120a37a>] __vmalloc+0x4a/0x50 [<ffffffff811a8d0a>] bpf_prog_alloc+0x3a/0xa0 [<ffffffff816c0684>] bpf_prog_create+0x44/0xa0 [<ffffffffa09ba4eb>] tcf_bpf_init+0x28b/0x3c0 [act_bpf] [<ffffffff816d7001>] tcf_action_init_1+0x191/0x1b0 [<ffffffff816d70a2>] tcf_action_init+0x82/0xf0 [<ffffffff816d4d12>] tcf_exts_validate+0xb2/0xc0 [<ffffffffa09b5838>] cls_bpf_modify_existing+0x98/0x340 [cls_bpf] [<ffffffffa09b5cd6>] cls_bpf_change+0x1a6/0x274 [cls_bpf] [<ffffffff816d56e5>] tc_ctl_tfilter+0x335/0x910 [<ffffffff816b9145>] rtnetlink_rcv_msg+0x95/0x240 [<ffffffff816df34f>] netlink_rcv_skb+0xaf/0xc0 [<ffffffff816b909e>] rtnetlink_rcv+0x2e/0x40 [<ffffffff816deaaf>] netlink_unicast+0xef/0x1b0 Issue is that the old content from tcf_bpf is allocated and needs to be released when we replace it. We seem to do that since the beginning of act_bpf on the filter and insns, later on the name as well. Example test case, after patch: # FOO="1,6 0 0 4294967295," # BAR="1,6 0 0 4294967294," # tc actions add action bpf bytecode "$FOO" index 2 # tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 2 ref 1 bind 0 # tc actions replace action bpf bytecode "$BAR" index 2 # tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe index 2 ref 1 bind 0 # tc actions replace action bpf bytecode "$FOO" index 2 # tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 2 ref 1 bind 0 # tc actions del action bpf index 2 [...] # echo "scan" > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak | grep "comm \"tc\"" | wc -l 0 Fixes: d23b8ad8ab23 ("tc: add BPF based action") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/bridge/br_mdb.c br_mdb.c conflict was a function call being removed to fix a bug in 'net' but whose signature was changed in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-15tc: act_bpf: fix memory leakAlexei Starovoitov
prog->bpf_ops is populated when act_bpf is used with classic BPF and prog->bpf_name is optionally used with extended BPF. Fix memory leak when act_bpf is released. Fixes: d23b8ad8ab23 ("tc: add BPF based action") Fixes: a8cb5f556b56 ("act_bpf: add initial eBPF support for actions") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net: sched: add percpu stats to actionsEric Dumazet
Reuse existing percpu infrastructure John Fastabend added for qdisc. This patch adds a new cpustats parameter to tcf_hash_create() and all actions pass false, meaning this patch should have no effect yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07bpf: make programs see skb->data == L2 for ingress and egressAlexei Starovoitov
eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data. For ingress L2 header is already pulled, whereas for egress it's present. This is known to program writers which are currently forced to use BPF_LL_OFF workaround. Since programs don't change skb internal pointers it is safe to do pull/push right around invocation of the program and earlier taps and later pt->func() will not be affected. Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick around run_filter/BPF_PROG_RUN even if skb_shared. This fix finally allows programs to use optimized LD_ABS/IND instructions without BPF_LL_OFF for higher performance. tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o w/o JIT w/JIT before 20.5 23.6 Mpps after 21.8 26.6 Mpps Old programs with BPF_LL_OFF will still work as-is. We can now undo most of the earlier workaround commit: a166151cbe33 ("bpf: fix bpf helpers to use skb->mac_header relative offsets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-16bpf: fix bpf helpers to use skb->mac_header relative offsetsAlexei Starovoitov
For the short-term solution, lets fix bpf helper functions to use skb->mac_header relative offsets instead of skb->data in order to get the same eBPF programs with cls_bpf and act_bpf work on ingress and egress qdisc path. We need to ensure that mac_header is set before calling into programs. This is effectively the first option from below referenced discussion. More long term solution for LD_ABS|LD_IND instructions will be more intrusive but also more beneficial than this, and implemented later as it's too risky at this point in time. I.e., we plan to look into the option of moving skb_pull() out of eth_type_trans() and into netif_receive_skb() as has been suggested as second option. Meanwhile, this solution ensures ingress can be used with eBPF, too, and that we won't run into ABI troubles later. For dealing with negative offsets inside eBPF helper functions, we've implemented bpf_skb_clone_unwritable() to test for unwriteable headers. Reference: http://thread.gmane.org/gmane.linux.network/359129/focus=359694 Fixes: 608cd71a9c7c ("tc: bpf: generalize pedit action") Fixes: 91bc4822c3d6 ("tc: bpf: add checksum helpers") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20act_bpf: add initial eBPF support for actionsDaniel Borkmann
This work extends the "classic" BPF programmable tc action by extending its scope also to native eBPF code! Together with commit e2e9b6541dd4 ("cls_bpf: add initial eBPF support for programmable classifiers") this adds the facility to implement fully flexible classifier and actions for tc that can be implemented in a C subset in user space, "safely" loaded into the kernel, and being run in native speed when JITed. Also, since eBPF maps can be shared between eBPF programs, it offers the possibility that cls_bpf and act_bpf can share data 1) between themselves and 2) between user space applications. That means that, f.e. customized runtime statistics can be collected in user space, but also more importantly classifier and action behaviour could be altered based on map input from the user space application. For the remaining details on the workflow and integration, see the cls_bpf commit e2e9b6541dd4. Preliminary iproute2 part can be found under [1]. [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-act Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17act_bpf: allow non-default TC_ACT opcodes as BPF exec outcomeDaniel Borkmann
Revisiting commit d23b8ad8ab23 ("tc: add BPF based action") with regards to eBPF support, I was thinking that it might be better to improve return semantics from a BPF program invoked through BPF_PROG_RUN(). Currently, in case filter_res is 0, we overwrite the default action opcode with TC_ACT_SHOT. A default action opcode configured through tc's m_bpf can be: TC_ACT_RECLASSIFY, TC_ACT_PIPE, TC_ACT_SHOT, TC_ACT_UNSPEC, TC_ACT_OK. In cls_bpf, we have the possibility to overwrite the default class associated with the classifier in case filter_res is _not_ 0xffffffff (-1). That allows us to fold multiple [e]BPF programs into a single one, where they would otherwise need to be defined as a separate classifier with its own classid, needlessly redoing parsing work, etc. Similarly, we could do better in act_bpf: Since above TC_ACT* opcodes are exported to UAPI anyway, we reuse them for return-code-to-tc-opcode mapping, where we would allow above possibilities. Thus, like in cls_bpf, a filter_res of 0xffffffff (-1) means that the configured _default_ action is used. Any unkown return code from the BPF program would fail in tcf_bpf() with TC_ACT_UNSPEC. Should we one day want to make use of TC_ACT_STOLEN or TC_ACT_QUEUED, which both have the same semantics, we have the option to either use that as a default action (filter_res of 0xffffffff) or non-default BPF return code. All that will allow us to transparently use tcf_bpf() for both BPF flavours. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-26net: act_bpf: fix size mismatch on filter preparationDaniel Borkmann
Similarly as in cls_bpf, also this code needs to reject mismatches. Reference: http://article.gmane.org/gmane.linux.network/347406 Fixes: d23b8ad8ab23 ("tc: add BPF based action") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-17tc: add BPF based actionJiri Pirko
This action provides a possibility to exec custom BPF code. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>