summaryrefslogtreecommitdiffstats
path: root/Documentation/networking/ip-sysctl.rst
AgeCommit message (Collapse)Author
2021-04-14doc: move seg6_flowlabel to seg6-sysctl.rstNicolas Dichtel
Let's have all seg6 sysctl at the same place. Fixes: a6dc6670cd7e ("ipv6: sr: Add documentation for seg_flowlabel sysctl") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
2021-02-11tcp: fix tcp_rmem documentationEric Dumazet
tcp_rmem[1] has been changed to 131072, we should update the documentation to reflect this. Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Zhibin Liu <zhibinliu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09Documentation: networking: ip-sysctl: Document src_valid_mark sysctlJay Vosburgh
Provide documentation for src_valid_mark sysctl, which was added in commit 28f6aeea3f12 ("net: restore ip source validation"). Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08IPv6: Extend 'fib_notify_on_flag_change' sysctlAmit Cohen
Add the value '2' to 'fib_notify_on_flag_change' to allow sending notifications only for failed route installation. Separate value is added for such notifications because there are less of them, so they do not impact performance and some users will find them more important. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08IPv4: Extend 'fib_notify_on_flag_change' sysctlAmit Cohen
Add the value '2' to 'fib_notify_on_flag_change' to allow sending notifications only for failed route installation. Separate value is added for such notifications because there are less of them, so they do not impact performance and some users will find them more important. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-02net: ipv6: Emit notification when fib hardware flags are changedAmit Cohen
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. It is also possible for a route already installed in hardware to change its action and therefore its flags. For example, a host route that is trapping packets can be "promoted" to perform decapsulation following the installation of an IPinIP/VXLAN tunnel. Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed. The aim is to provide an indication to user-space (e.g., routing daemons) about the state of the route in hardware. Introduce a sysctl that controls this behavior. Keep the default value at 0 (i.e., do not emit notifications) for several reasons: - Multiple RTM_NEWROUTE notification per-route might confuse existing routing daemons. - Convergence reasons in routing daemons. - The extra notifications will negatively impact the insertion rate. - Not all users are interested in these notifications. Move fib6_info_hw_flags_set() to C file because it is no longer a short function. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02net: ipv4: Emit notification when fib hardware flags are changedAmit Cohen
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. It is also possible for a route already installed in hardware to change its action and therefore its flags. For example, a host route that is trapping packets can be "promoted" to perform decapsulation following the installation of an IPinIP/VXLAN tunnel. Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed. The aim is to provide an indication to user-space (e.g., routing daemons) about the state of the route in hardware. Introduce a sysctl that controls this behavior. Keep the default value at 0 (i.e., do not emit notifications) for several reasons: - Multiple RTM_NEWROUTE notification per-route might confuse existing routing daemons. - Convergence reasons in routing daemons. - The extra notifications will negatively impact the insertion rate. - Not all users are interested in these notifications. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01docs: networking: swap words in icmp_errors_use_inbound_ifaddr docVincent Bernat
Signed-off-by: Vincent Bernat <vincent@bernat.ch> Link: https://lore.kernel.org/r/20210130190518.854806-1-vincent@bernat.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/can/dev.c b552766c872f ("can: dev: prevent potential information leak in can_fill_info()") 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir") 0a042c6ec991 ("can: dev: move netlink related code into seperate file") Code move. drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 57ac4a31c483 ("net/mlx5e: Correctly handle changing the number of queues when the interface is down") 214baf22870c ("net/mlx5e: Support HTB offload") Adjacent code changes net/switchdev/switchdev.c 20776b465c0c ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP") ffb68fc58e96 ("net: switchdev: remove the transaction structure from port object notifiers") bae33f2b5afe ("net: switchdev: remove the transaction structure from port attributes") Transaction parameter gets dropped otherwise keep the fix. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26net: allow user to set metric on default route learned via Router AdvertisementPraveen Chaudhary
For IPv4, default route is learned via DHCPv4 and user is allowed to change metric using config etc/network/interfaces. But for IPv6, default route can be learned via RA, for which, currently a fixed metric value 1024 is used. Ideally, user should be able to configure metric on default route for IPv6 similar to IPv4. This patch adds sysctl for the same. Logs: For IPv4: Config in etc/network/interfaces: auto eth0 iface eth0 inet dhcp metric 4261413864 IPv4 Kernel Route Table: $ ip route list default via 172.21.47.1 dev eth0 metric 4261413864 FRR Table, if a static route is configured: [In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.] Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP, T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP, > - selected route, * - FIB route S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03 K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m i.e. User can prefer Default Router learned via Routing Protocol in IPv4. Similar behavior is not possible for IPv6, without this fix. After fix [for IPv6]: sudo sysctl -w net.ipv6.conf.eth0.net.ipv6.conf.eth0.ra_defrtr_metric=1996489705 IP monitor: [When IPv6 RA is received] default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high Kernel IPv6 routing table $ ip -6 route list default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high FRR Table, if a static route is configured: [In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.] Codes: K - kernel route, C - connected, S - static, R - RIPng, O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP, > - selected route, * - FIB route S>* ::/0 [20/0] is directly connected, eth0, 00:00:06 K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m If the metric is changed later, the effect will be seen only when next IPv6 RA is received, because the default route must be fully controlled by RA msg. Below metric is changed from 1996489705 to 1996489704. $ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704 net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704 IP monitor: [On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric] Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com> Signed-off-by: Zhenggen Xu <zxu@linkedin.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-23doc: networking: ip-sysctl: Document conf/all/disable_ipv6 and ↵Pali Rohár
conf/default/disable_ipv6 This patch adds documentation for sysctl conf/all/disable_ipv6 and conf/default/disable_ipv6 settings which is currently missing. Signed-off-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20210121150244.20483-1-pali@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11net: evaluate net.ipvX.conf.all.ignore_routes_with_linkdownVincent Bernat
Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl ignores a route whose interface is down. It is provided as a per-interface sysctl. However, while a "all" variant is exposed, it was a noop since it was never evaluated. We use the usual "or" logic for this kind of sysctls. Tested with: ip link add type veth # veth0 + veth1 ip link add type veth # veth1 + veth2 ip link set up dev veth0 ip link set up dev veth1 # link-status paired with veth0 ip link set up dev veth2 ip link set up dev veth3 # link-status paired with veth2 # First available path ip -4 addr add 203.0.113.${uts#H}/24 dev veth0 ip -6 addr add 2001:db8:1::${uts#H}/64 dev veth0 # Second available path ip -4 addr add 192.0.2.${uts#H}/24 dev veth2 ip -6 addr add 2001:db8:2::${uts#H}/64 dev veth2 # More specific route through first path ip -4 route add 198.51.100.0/25 via 203.0.113.254 # via veth0 ip -6 route add 2001:db8:3::/56 via 2001:db8:1::ff # via veth0 # Less specific route through second path ip -4 route add 198.51.100.0/24 via 192.0.2.254 # via veth2 ip -6 route add 2001:db8:3::/48 via 2001:db8:2::ff # via veth2 # H1: enable on "all" # H2: enable on "veth0" for v in ipv4 ipv6; do case $uts in H1) sysctl -qw net.${v}.conf.all.ignore_routes_with_linkdown=1 ;; H2) sysctl -qw net.${v}.conf.veth0.ignore_routes_with_linkdown=1 ;; esac done set -xe # When veth0 is up, best route is through veth0 ip -o route get 198.51.100.1 | grep -Fw veth0 ip -o route get 2001:db8:3::1 | grep -Fw veth0 # When veth0 is down, best route should be through veth2 on H1/H2, # but on veth0 on H2 ip link set down dev veth1 # down veth0 ip route show [ $uts != H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth0 [ $uts != H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth0 [ $uts = H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth2 [ $uts = H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth2 Without this patch, the two last lines would fail on H1 (the one using the "all" sysctl). With the patch, everything succeeds as expected. Also document the sysctl in `ip-sysctl.rst`. Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") Signed-off-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-30sctp: enable udp tunneling socksXin Long
This patch is to enable udp tunneling socks by calling sctp_udp_sock_start() in sctp_ctrlsock_init(), and sctp_udp_sock_stop() in sctp_ctrlsock_exit(). Also add sysctl udp_port to allow changing the listening sock's port by users. Wit this patch, the whole sctp over udp feature can be enabled and used. v1->v2: - Also update ctl_sock udp_port in proc_sctp_do_udp_port() where netns udp_port gets changed. v2->v3: - Call htons() when setting sk udp_port from netns udp_port. v3->v4: - Not call sctp_udp_sock_start() when new_value is 0. - Add udp_port entry in ip-sysctl.rst. v4->v5: - Not call sctp_udp_sock_start/stop() in sctp_ctrlsock_init/exit(). - Improve the description of udp_port in ip-sysctl.rst. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-30sctp: add encap_port for netns sock asoc and transportXin Long
encap_port is added as per netns/sock/assoc/transport, and the latter one's encap_port inherits the former one's by default. The transport's encap_port value would mostly decide if one packet should go out with udp encapsulated or not. This patch also allows users to set netns' encap_port by sysctl. v1->v2: - Change to define encap_port as __be16 for sctp_sock, asoc and transport. v2->v3: - No change. v3->v4: - Add 'encap_port' entry in ip-sysctl.rst. v4->v5: - Improve the description of encap_port in ip-sysctl.rst. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-16icmp: randomize the global rate limiterEric Dumazet
Keyu Man reported that the ICMP rate limiter could be used by attackers to get useful signal. Details will be provided in an upcoming academic publication. Our solution is to add some noise, so that the attackers no longer can get help from the predictable token bucket limiter. Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-04Documentation: networking: ip-sysctl: drop doubled wordRandy Dunlap
Drop the doubled word "that". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06ipv6: Implement draft-ietf-6man-rfc4941bisFernando Gont
Implement the upcoming rev of RFC4941 (IPv6 temporary addresses): https://tools.ietf.org/html/draft-ietf-6man-rfc4941bis-09 * Reduces the default Valid Lifetime to 2 days The number of extra addresses employed when Valid Lifetime was 7 days exacerbated the stress caused on network elements/devices. Additionally, the motivation for temporary addresses is indeed privacy and reduced exposure. With a default Valid Lifetime of 7 days, an address that becomes revealed by active communication is reachable and exposed for one whole week. The only use case for a Valid Lifetime of 7 days could be some application that is expecting to have long lived connections. But if you want to have a long lived connections, you shouldn't be using a temporary address in the first place. Additionally, in the era of mobile devices, general applications should nevertheless be prepared and robust to address changes (e.g. nodes swap wifi <-> 4G, etc.) * Employs different IIDs for different prefixes To avoid network activity correlation among addresses configured for different prefixes * Uses a simpler algorithm for IID generation No need to store "history" anywhere Signed-off-by: Fernando Gont <fgont@si6networks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30tcp: add hrtimer slack to sack compressionEric Dumazet
Add a sysctl to control hrtimer slack, default of 100 usec. This gives the opportunity to reduce system overhead, and help very short RTT flows. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30docs: networking: convert tcp-thin.txt to ReSTMauro Carvalho Chehab
Not much to be done here: - add SPDX header; - adjust identation, whitespaces and blank lines where needed; - add to networking/index.rst. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-28docs: networking: convert ip-sysctl.txt to ReSTMauro Carvalho Chehab
- add SPDX header; - adjust titles and chapters, adding proper markups; - mark code blocks and literals as such; - mark lists as such; - mark tables as such; - use footnote markup; - adjust identation, whitespaces and blank lines; - add to networking/index.rst. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>