summaryrefslogtreecommitdiffstats
path: root/net/mptcp/pm_netlink.c
AgeCommit message (Collapse)Author
2022-02-19mptcp: fix race in incoming ADD_ADDR option processingPaolo Abeni
If an MPTCP endpoint received multiple consecutive incoming ADD_ADDR options, mptcp_pm_add_addr_received() can overwrite the current remote address value after the PM lock is released in mptcp_pm_nl_add_addr_received() and before such address is echoed. Fix the issue caching the remote address value a little earlier and always using the cached value after releasing the PM lock. Fixes: f7efc7771eac ("mptcp: drop argument port from mptcp_pm_announce_addr") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19mptcp: fix race in overlapping signal eventsPaolo Abeni
After commit a88c9e496937 ("mptcp: do not block subflows creation on errors"), if a signal address races with a failing subflow creation, the subflow creation failure control path can trigger the selection of the next address to be announced while the current announced is still pending. The above will cause the unintended suppression of the ADD_ADDR announce. Fix the issue skipping the to-be-suppressed announce before it will mark an endpoint as already used. The relevant announce will be triggered again when the current one will complete. Fixes: a88c9e496937 ("mptcp: do not block subflows creation on errors") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09mptcp: netlink: process IPv6 addrs in creating listening socketsKishen Maloor
This change updates mptcp_pm_nl_create_listen_socket() to create listening sockets bound to IPv6 addresses (where IPv6 is supported). Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Acked-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-20mptcp: fix removing ids bitmap settingGeliang Tang
In mptcp_pm_nl_rm_addr_or_subflow(), the bit of rm_list->ids[i] in the id_avail_bitmap should be set, not rm_list->ids[1]. This patch fixed it. Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-20mptcp: fix msk traversal in mptcp_nl_cmd_set_flags()Paolo Abeni
The MPTCP endpoint list is under RCU protection, guarded by the pernet spinlock. mptcp_nl_cmd_set_flags() traverses the list without acquiring the spin-lock nor under the RCU critical section. This change addresses the issue performing the lookup and the endpoint update under the pernet spinlock. Fixes: 0f9f696a502e ("mptcp: add set_flags command in PM netlink") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07mptcp: cleanup MPJ subflow list handlingPaolo Abeni
We can simplify the join list handling leveraging the mptcp_release_cb(): if we can acquire the msk socket lock at mptcp_finish_join time, move the new subflow directly into the conn_list, otherwise place it on join_list and let the release_cb process such list. Since pending MPJ connection are now always processed in a timely way, we can avoid flushing the join list every time we have to process all the current subflows. Additionally we can now use the mptcp data lock to protect the join_list, removing the additional spin lock. Finally, the MPJ handshake is now always finalized under the msk socket lock, we can drop the additional synchronization between mptcp_finish_join() and mptcp_close(). Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07mptcp: do not block subflows creation on errorsPaolo Abeni
If the MPTCP configuration allows for multiple subflows creation, and the first additional subflows never reach the fully established status - e.g. due to packets drop or reset - the in kernel path manager do not move to the next subflow. This patch introduces a new PM helper to cope with MPJ subflow creation failure and delay and hook it where appropriate. Such helper triggers additional subflow creation, as needed and updates the PM subflow counter, if the current one is closing. Additionally start all the needed additional subflows as soon as the MPTCP socket is fully established, so we don't have to cope with slow MPJ handshake blocking the next subflow creation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07mptcp: keep track of local endpoint still available for each mskPaolo Abeni
Include into the path manager status a bitmap tracking the list of local endpoints still available - not yet used - for the relevant mptcp socket. Keep such map updated at endpoint creation/deletion time, so that we can easily skip already used endpoint at local address selection time. The endpoint used by the initial subflow is lazyly accounted at subflow creation time: the usage bitmap is be up2date before endpoint selection and we avoid such unneeded task in some relevant scenarios - e.g. busy servers accepting incoming subflows but not creating any additional ones nor annuncing additional addresses. Overall this allows for fair local endpoints usage in case of subflow failure. As a side effect, this patch also enforces that each endpoint is used at most once for each mptcp connection. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07mptcp: fix per socket endpoint accountingPaolo Abeni
Since full-mesh endpoint support, the reception of a single ADD_ADDR option can cause multiple subflows creation. When such option is accepted we increment 'add_addr_accepted' by one. When we received a paired RM_ADDR option, we deleted all the relevant subflows, decrementing 'add_addr_accepted' by one for each of them. We have a similar issue for 'local_addr_used' Fix them moving the pm endpoint accounting outside the subflow traversal. Fixes: 1a0d6136c5f0 ("mptcp: local addresses fullmesh") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-17mptcp: clean up harmless false expressionsJean Sacren
entry->addr.id is u8 with a range from 0 to 255 and MAX_ADDR_ID is 255. We should drop both false expressions of (entry->addr.id > MAX_ADDR_ID). We should also remove the obsolete parentheses in the first if branch. Use U8_MAX for MAX_ADDR_ID and add a comment to show the link to mptcp_addr_info.id as suggested by Mr. Matthieu Baerts. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-13mptcp: never allow the PM to close a listener subflowPaolo Abeni
Currently, when deleting an endpoint the netlink PM treverses all the local MPTCP sockets, regardless of their status. If an MPTCP listener socket is bound to the IP matching the delete endpoint, the listener TCP socket will be closed. That is unexpected, the PM should only affect data subflows. Additionally, syzbot was able to trigger a NULL ptr dereference due to the above: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 1 PID: 6550 Comm: syz-executor122 Not tainted 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__lock_acquire+0xd7d/0x54a0 kernel/locking/lockdep.c:4897 Code: 0f 0e 41 be 01 00 00 00 0f 86 c8 00 00 00 89 05 69 cc 0f 0e e9 bd 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <80> 3c 02 00 0f 85 f3 2f 00 00 48 81 3b 20 75 17 8f 0f 84 52 f3 ff RSP: 0018:ffffc90001f2f818 EFLAGS: 00010016 RAX: dffffc0000000000 RBX: 0000000000000018 RCX: 0000000000000000 RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000001 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 000000000000000a R12: 0000000000000000 R13: ffff88801b98d700 R14: 0000000000000000 R15: 0000000000000001 FS: 00007f177cd3d700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f177cd1b268 CR3: 000000001dd55000 CR4: 0000000000350ee0 Call Trace: <TASK> lock_acquire kernel/locking/lockdep.c:5637 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5602 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:162 finish_wait+0xc0/0x270 kernel/sched/wait.c:400 inet_csk_wait_for_connect net/ipv4/inet_connection_sock.c:464 [inline] inet_csk_accept+0x7de/0x9d0 net/ipv4/inet_connection_sock.c:497 mptcp_accept+0xe5/0x500 net/mptcp/protocol.c:2865 inet_accept+0xe4/0x7b0 net/ipv4/af_inet.c:739 mptcp_stream_accept+0x2e7/0x10e0 net/mptcp/protocol.c:3345 do_accept+0x382/0x510 net/socket.c:1773 __sys_accept4_file+0x7e/0xe0 net/socket.c:1816 __sys_accept4+0xb0/0x100 net/socket.c:1846 __do_sys_accept net/socket.c:1864 [inline] __se_sys_accept net/socket.c:1861 [inline] __x64_sys_accept+0x71/0xb0 net/socket.c:1861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f177cd8b8e9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f177cd3d308 EFLAGS: 00000246 ORIG_RAX: 000000000000002b RAX: ffffffffffffffda RBX: 00007f177ce13408 RCX: 00007f177cd8b8e9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007f177ce13400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f177ce1340c R13: 00007f177cde1004 R14: 6d705f706374706d R15: 0000000000022000 </TASK> Fix the issue explicitly skipping MPTCP socket in TCP_LISTEN status. Reported-and-tested-by: syzbot+e4d843bb96a9431e6331@syzkaller.appspotmail.com Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Fixes: 740d798e8767 ("mptcp: remove id 0 address") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/ebc7594cdd420d241fb2172ddb8542ba64717657.1639238695.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: allow changing the "backup" bit by endpoint idDavide Caratti
a non-zero 'id' is sufficient to identify MPTCP endpoints: allow changing the value of 'backup' bit by simply specifying the endpoint id. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/158 Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-16mptcp: Make mptcp_pm_nl_mp_prio_send_ack() staticMat Martineau
This function is only used within pm_netlink.c now. Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16mptcp: increase default max additional subflows to 2Paolo Abeni
The current default does not allowing additional subflows, mostly as a safety restriction to avoid uncontrolled resource consumption on busy servers. Still the system admin and/or the application have to opt-in to MPTCP explicitly. After that, they need to change (increase) the default maximum number of additional subflows. Let set that to reasonable default, and make end-users life easier. Additionally we need to update some self-tests accordingly. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24mptcp: allow changing the 'backup' bit when no sockets are openDavide Caratti
current Linux refuses to change the 'backup' bit of MPTCP endpoints, i.e. using MPTCP_PM_CMD_SET_FLAGS, unless it finds (at least) one subflow that matches the endpoint address. There is no reason for that, so we can just ignore the return value of mptcp_nl_addr_backup(). In this way, endpoints can reconfigure their 'backup' flag even if no MPTCP sockets are open (or more generally, in case the MP_PRIO message is not sent out). Fixes: 0f9f696a502e ("mptcp: add set_flags command in PM netlink") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-03mptcp: Only send extra TCP acks in eligible socket statesMat Martineau
Recent changes exposed a bug where specifically-timed requests to the path manager netlink API could trigger a divide-by-zero in __tcp_select_window(), as syzkaller does: divide error: 0000 [#1] SMP KASAN NOPTI CPU: 0 PID: 9667 Comm: syz-executor.0 Not tainted 5.14.0-rc6+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:__tcp_select_window+0x509/0xa60 net/ipv4/tcp_output.c:3016 Code: 44 89 ff e8 c9 29 e9 fd 45 39 e7 0f 8d 20 ff ff ff e8 db 28 e9 fd 44 89 e3 e9 13 ff ff ff e8 ce 28 e9 fd 44 89 e0 44 89 e3 99 <f7> 7c 24 04 29 d3 e9 fc fe ff ff e8 b7 28 e9 fd 44 89 f1 48 89 ea RSP: 0018:ffff888031ccf020 EFLAGS: 00010216 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000040000 RDX: 0000000000000000 RSI: ffff88811532c080 RDI: 0000000000000002 RBP: 0000000000000000 R08: ffffffff835807c2 R09: 0000000000000000 R10: 0000000000000004 R11: ffffed1020b92441 R12: 0000000000000000 R13: 1ffff11006399e08 R14: 0000000000000000 R15: 0000000000000000 FS: 00007fa4c8344700(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2f424000 CR3: 000000003e4e2003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: tcp_select_window net/ipv4/tcp_output.c:264 [inline] __tcp_transmit_skb+0xc00/0x37a0 net/ipv4/tcp_output.c:1351 __tcp_send_ack.part.0+0x3ec/0x760 net/ipv4/tcp_output.c:3972 __tcp_send_ack net/ipv4/tcp_output.c:3978 [inline] tcp_send_ack+0x7d/0xa0 net/ipv4/tcp_output.c:3978 mptcp_pm_nl_addr_send_ack+0x1ab/0x380 net/mptcp/pm_netlink.c:654 mptcp_pm_remove_addr+0x161/0x200 net/mptcp/pm.c:58 mptcp_nl_remove_id_zero_address+0x197/0x460 net/mptcp/pm_netlink.c:1328 mptcp_nl_cmd_del_addr+0x98b/0xd40 net/mptcp/pm_netlink.c:1359 genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731 genl_family_rcv_msg net/netlink/genetlink.c:775 [inline] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792 netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504 genl_rcv+0x24/0x40 net/netlink/genetlink.c:803 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0x14e/0x190 net/socket.c:724 ____sys_sendmsg+0x709/0x870 net/socket.c:2403 ___sys_sendmsg+0xff/0x170 net/socket.c:2457 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae mptcp_pm_nl_addr_send_ack() was attempting to send a TCP ACK on the first subflow in the MPTCP socket's connection list without validating that the subflow was in a suitable connection state. To address this, always validate subflow state when sending extra ACKs on subflows for address advertisement or subflow priority change. Fixes: 84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/229 Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: remove MPTCP_ADD_ADDR_IPV6 and MPTCP_ADD_ADDR_PORTYonglong Li
MPTCP_ADD_ADDR_IPV6 and MPTCP_ADD_ADDR_PORT are not necessary, we can get these info from pm.local or pm.remote. Drop mptcp_pm_should_add_signal_ipv6 and mptcp_pm_should_add_signal_port too. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: make MPTCP_ADD_ADDR_SIGNAL and MPTCP_ADD_ADDR_ECHO separateYonglong Li
Use MPTCP_ADD_ADDR_SIGNAL only for the action of sending ADD_ADDR, and use MPTCP_ADD_ADDR_ECHO only for the action of sending ADD_ADDR echo. Use msk->pm.local to save the announced ADD_ADDR address only, and reuse msk->pm.remote to save the announced ADD_ADDR_ECHO address. To prepare for the next patch. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/ptp/Kconfig: 55c8fca1dae1 ("ptp_pch: Restore dependency on PCI") e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19mptcp: fix memory leak on address flushPaolo Abeni
The endpoint cleanup path is prone to a memory leak, as reported by syzkaller: BUG: memory leak unreferenced object 0xffff88810680ea00 (size 64): comm "syz-executor.6", pid 6191, jiffies 4295756280 (age 24.138s) hex dump (first 32 bytes): 58 75 7d 3c 80 88 ff ff 22 01 00 00 00 00 ad de Xu}<...."....... 01 00 02 00 00 00 00 00 ac 1e 00 07 00 00 00 00 ................ backtrace: [<0000000072a9f72a>] kmalloc include/linux/slab.h:591 [inline] [<0000000072a9f72a>] mptcp_nl_cmd_add_addr+0x287/0x9f0 net/mptcp/pm_netlink.c:1170 [<00000000f6e931bf>] genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731 [<00000000f1504a2c>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline] [<00000000f1504a2c>] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792 [<0000000097e76f6a>] netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504 [<00000000ceefa2b8>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:803 [<000000008ff91aec>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] [<000000008ff91aec>] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340 [<0000000041682c35>] netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929 [<00000000df3aa8e7>] sock_sendmsg_nosec net/socket.c:704 [inline] [<00000000df3aa8e7>] sock_sendmsg+0x14e/0x190 net/socket.c:724 [<000000002154c54c>] ____sys_sendmsg+0x709/0x870 net/socket.c:2403 [<000000001aab01d7>] ___sys_sendmsg+0xff/0x170 net/socket.c:2457 [<00000000fa3b1446>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486 [<00000000db2ee9c7>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000db2ee9c7>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<000000005873517d>] entry_SYSCALL_64_after_hwframe+0x44/0xae We should not require an allocation to cleanup stuff. Rework the code a bit so that the additional RCU work is no more needed. Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18mptcp: local addresses fullmeshGeliang Tang
In mptcp_pm_nl_add_addr_received(), fill a temporary allocate array of all local address corresponding to the fullmesh endpoint. If such array is empty, keep the current behavior. Elsewhere loop on such array and create a subflow for each local address towards the given remote address Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18mptcp: remote addresses fullmeshGeliang Tang
This patch added and managed a new per endpoint flag, named MPTCP_PM_ADDR_FLAG_FULLMESH. In mptcp_pm_create_subflow_or_signal_addr(), if such flag is set, instead of: remote_address((struct sock_common *)sk, &remote); fill a temporary allocated array of all known remote address. After releaseing the pm lock loop on such array and create a subflow for each remote address from the given local. Note that the we could still use an array even for non 'fullmesh' endpoint: with a single entry corresponding to the primary MPC subflow remote address. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18mptcp: drop flags and ifindex argumentsGeliang Tang
This patch added a new helper mptcp_pm_get_flags_and_ifindex_by_id(), and used it in __mptcp_subflow_connect() to get the flags and ifindex values. Then the two arguments flags and ifindex of __mptcp_subflow_connect() can be dropped. Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14mptcp: add mibs for stale subflows processingPaolo Abeni
This allows monitoring exceptional events like active backup scenarios. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14mptcp: faster active backup recoveryPaolo Abeni
The msk can use backup subflows to transmit in-sequence data only if there are no other active subflow. On active backup scenario, the MPTCP connection can do forward progress only due to MPTCP retransmissions - rtx can pick backup subflows. This patch introduces a new flag flow MPTCP subflows: if the underlying TCP connection made no progresses for long time, and there are other less problematic subflows available, the given subflow become stale. Stale subflows are not considered active: if all non backup subflows become stale, the MPTCP scheduler can pick backup subflows for plain transmissions. Stale subflows can return in active state, as soon as any reply from the peer is observed. Active backup scenarios can now leverage the available b/w with no restrinction. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03mptcp: drop unused rcu member in mptcp_pm_addr_entryGeliang Tang
kfree_rcu() had been removed from pm_netlink.c, so this rcu field in struct mptcp_pm_addr_entry became useless. Let's drop it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20210802231914.54709-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-22mptcp: add deny_join_id0 in mptcp_options_receivedGeliang Tang
This patch added a new flag named deny_join_id0 in struct mptcp_options_received. Set it when MP_CAPABLE with the flag MPTCP_CAP_DENYJOIN_ID0 is received. Also add a new flag remote_deny_join_id0 in struct mptcp_pm_data. When the flag deny_join_id0 is set, set this remote_deny_join_id0 flag. In mptcp_pm_create_subflow_or_signal_addr, if the remote_deny_join_id0 flag is set, and the remote address id is zero, stop this connection. Suggested-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22mptcp: use fast lock for subflows when possiblePaolo Abeni
There are a bunch of callsite where the ssk socket lock is acquired using the full-blown version eligible for the fast variant. Let's move to the latter. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-28mptcp: make sure flag signal is set when add addr with portJianguo Wu
When add address with port, it is mean to create a listening socket, and send an ADD_ADDR to remote, so it must have flag signal set, add this check in mptcp_pm_parse_addr(). Fixes: a77e9179c7651 ("mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink") Acked-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: remove redundant initialization in pm_nl_init_net()Jianguo Wu
Memory of struct pm_nl_pernet{} is allocated by kzalloc() in setup_net()->ops_init(), so it's no need to reset counters and zero bitmap in pm_nl_init_net(). Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-25mptcp: validate 'id' when stopping the ADD_ADDR retransmit timerDavide Caratti
when Linux receives an echo-ed ADD_ADDR, it checks the IP address against the list of "announced" addresses. In case of a positive match, the timer that handles retransmissions is stopped regardless of the 'Address Id' in the received packet: this behaviour does not comply with RFC8684 3.4.1. Fix it by validating the 'Address Id' in received echo-ed ADD_ADDRs. Tested using packetdrill, with the following captured output: unpatched kernel: Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0 In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 3013740213], length 0 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0 In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 3013740213], length 0 ^^^ retransmission is stopped here, but 'Address Id' is 90 patched kernel: Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0 In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 1672384568], length 0 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0 In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 1672384568], length 0 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0 In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 198.51.100.2,mptcp dss ack 1672384568], length 0 ^^^ retransmission is stopped here, only when both 'Address Id' and 'IP Address' match Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07mptcp: move flags and ifindex out of mptcp_addr_infoGeliang Tang
This patch moved the flags and ifindex fields from struct mptcp_addr_info to struct mptcp_pm_addr_entry. Add the flags and ifindex values as two new parameters to __mptcp_subflow_connect. In mptcp_pm_create_subflow_or_signal_addr, pass the local address entry's flags and ifindex fields to __mptcp_subflow_connect. In mptcp_pm_nl_add_addr_received, just pass two zeros to it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02mptcp: add mptcp reset option supportFlorian Westphal
The MPTCP reset option allows to carry a mptcp-specific error code that provides more information on the nature of a connection reset. Reset option data received gets stored in the subflow context so it can be sent to userspace via the 'subflow closed' netlink event. When a subflow is closed, the desired error code that should be sent to the peer is also placed in the subflow context structure. If a reset is sent before subflow establishment could complete, e.g. on HMAC failure during an MP_JOIN operation, the mptcp skb extension is used to store the reset information. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mptcp: remove id 0 addressGeliang Tang
This patch added a new function mptcp_nl_remove_id_zero_address to remove the id 0 address. In this function, traverse all the existing msk sockets to find the msk matched the input IP address. Then fill the removing list with id 0, and pass it to mptcp_pm_remove_addr and mptcp_pm_remove_subflow. Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mptcp: unify RM_ADDR and RM_SUBFLOW receivingGeliang Tang
There are some duplicate code in mptcp_pm_nl_rm_addr_received and mptcp_pm_nl_rm_subflow_received. This patch unifies them into a new function named mptcp_pm_nl_rm_addr_or_subflow. In it, use the input parameter rm_type to identify it's now removing an address or a subflow. Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mptcp: remove all subflows involving id 0 addressGeliang Tang
There's only one subflow involving the non-zero id address, but there may be multi subflows involving the id 0 address. Here's an example: local_id=0, remote_id=0 local_id=1, remote_id=0 local_id=0, remote_id=1 If the removing address id is 0, all the subflows involving the id 0 address need to be removed. In mptcp_pm_nl_rm_addr_received/mptcp_pm_nl_rm_subflow_received, the "break" prevents the iteration to the next subflow, so this patch dropped them. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: rename mptcp_pm_nl_add_addr_send_ackGeliang Tang
Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: send ack for rm_addrGeliang Tang
This patch changes the sending ACK conditions for the ADD_ADDR, send an ACK packet for RM_ADDR too. In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send the ACK packet. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: drop useless addr_signal clearGeliang Tang
msk->pm.addr_signal is cleared in mptcp_pm_add_addr_signal, no need to clear it in mptcp_pm_nl_add_addr_send_ack again. Drop it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: export lookup_anno_list_by_saddrGeliang Tang
This patch exported the static function lookup_anno_list_by_saddr, and renamed it to mptcp_lookup_anno_list_by_saddr. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: move to next addr when timeoutGeliang Tang
This patch called mptcp_pm_subflow_established to move to the next address when an ADD_ADDR has been retransmitted the maximum number of times. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: skip connecting the connected addressGeliang Tang
This patch added a new helper named lookup_subflow_by_daddr to find whether the destination address is in the msk's conn_list. In mptcp_pm_nl_add_addr_received, use lookup_subflow_by_daddr to check whether the announced address is already connected. If it is, skip connecting this address and send out the echo. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: drop argument port from mptcp_pm_announce_addrGeliang Tang
Drop the redundant argument 'port' from mptcp_pm_announce_addr, use the port field of another argument 'addr' instead. Fixes: 0f5c9e3f079f ("mptcp: add port parameter for mptcp_pm_announce_addr") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12mptcp: remove a list of addrs when flushingGeliang Tang
This patch invoked mptcp_nl_remove_addrs_list to remove a list of addresses when the netlink flushes addresses, instead of using mptcp_nl_remove_subflow_and_signal_addr to remove them one by one. And dropped the unused parameter net in __flush_addrs too. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12mptcp: remove multi addresses and subflows in PMGeliang Tang
This patch implemented the function to remove a list of addresses and subflows, named mptcp_nl_remove_addrs_list, which had a input parameter rm_list as the removing addresses list. In mptcp_nl_remove_addrs_list, traverse all the existing msk sockets to invoke mptcp_pm_remove_addrs_and_subflows to remove a list of addresses for each msk socket. In mptcp_pm_remove_addrs_and_subflows, traverse all the addresses in the removing addresses list, to find whether this address is in the conn_list or anno_list. If it is, put the address ID into the removing address list or the removing subflow list, and pass the two lists to mptcp_pm_remove_addr and mptcp_pm_remove_subflow. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12mptcp: remove multi subflows in PMGeliang Tang
This patch dealt with removing multi subflows in PM: In mptcp_pm_remove_subflow, changed the input parameter local_id as an list of removing address ids, and passed the list to mptcp_pm_nl_rm_subflow_received. In mptcp_pm_nl_rm_subflow_received, iterated each address id from the received ids list. Then shut down and closed each address id's subsocket. In mptcp_nl_remove_subflow_and_signal_addr, put the single address id into an ids list, and passed it to mptcp_pm_remove_subflow. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12mptcp: remove multi addresses in PMGeliang Tang
This patch dropped the member rm_id of struct mptcp_pm_data. Use rm_list_rx in mptcp_pm_nl_rm_addr_received instead of using rm_id. In mptcp_pm_nl_rm_addr_received, iterated each address id from pm.rm_list_rx, then shut down and closed each address id's subsocket. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12mptcp: add rm_list_tx in mptcp_pm_dataGeliang Tang
This patch added a new member rm_list_tx for struct mptcp_pm_data as the removing address list on the outgoing direction. Initialize its nr field to zero in mptcp_pm_data_init. In mptcp_pm_remove_anno_addr, put the single address id into an removing list, and passed it to mptcp_pm_remove_addr. In mptcp_pm_remove_addr, save the input rm_list to rm_list_tx in struct mptcp_pm_data. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15mptcp: add local addr info in mptcp_infoGeliang Tang
Add mptcpi_local_addr_used and mptcpi_local_addr_max in struct mptcp_info. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>