aboutsummaryrefslogtreecommitdiffstats
path: root/include/net/tls.h
AgeCommit message (Collapse)Author
2020-12-08net/tls: Protect from calling tls_dev_del for TLS RX twiceMaxim Mikityanskiy
[ Upstream commit 025cc2fb6a4e84e9a0552c0017dcd1c24b7ac7da ] tls_device_offload_cleanup_rx doesn't clear tls_ctx->netdev after calling tls_dev_del if TLX TX offload is also enabled. Clearing tls_ctx->netdev gets postponed until tls_device_gc_task. It leaves a time frame when tls_device_down may get called and call tls_dev_del for RX one extra time, confusing the driver, which may lead to a crash. This patch corrects this racy behavior by adding a flag to prevent tls_device_down from calling tls_dev_del the second time. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20201125221810.69870-1-saeedm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-28net/tls: make sure offload also gets the keys wipedJakub Kicinski
[ Upstream commit acd3e96d53a24d219f720ed4012b62723ae05da1 ] Commit 86029d10af18 ("tls: zero the crypto information from tls_context before freeing") added memzero_explicit() calls to clear the key material before freeing struct tls_context, but it missed tls_device.c has its own way of freeing this structure. Replace the missing free. Fixes: 86029d10af18 ("tls: zero the crypto information from tls_context before freeing") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11net/tls: replace the sleeping lock around RX resync with a bit lockJakub Kicinski
[ Upstream commit e52972c11d6b1262964db96d65934196db621685 ] Commit 38030d7cb779 ("net/tls: avoid NULL-deref on resync during device removal") tried to fix a potential NULL-dereference by taking the context rwsem. Unfortunately the RX resync may get called from soft IRQ, so we can't use the rwsem to protect from the device disappearing. Because we are guaranteed there can be only one resync at a time (it's called from strparser) use a bit to indicate resync is busy and make device removal wait for the bit to get cleared. Note that there is a leftover "flags" field in struct tls_context already. Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27net/tls: prevent bad memory access in tls_is_sk_tx_device_offloaded()Jakub Kicinski
[ Upstream commit b4f47f3848eb70986f75d06112af7b48b7f5f462 ] Unlike '&&' operator, the '&' does not have short-circuit evaluation semantics. IOW both sides of the operator always get evaluated. Fix the wrong operator in tls_is_sk_tx_device_offloaded(), which would lead to out-of-bounds access for for non-full sockets. Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-13tls: zero the crypto information from tls_context before freeingSabrina Dubroca
This contains key material in crypto_send_aes_gcm_128 and crypto_recv_aes_gcm_128. Introduce union tls_crypto_context, and replace the two identical unions directly embedded in struct tls_context with it. We can then use this union to clean up the memory in the new tls_ctx_free() function. Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-13net/tls: Combined memory allocation for decryption requestVakul Garg
For preparing decryption request, several memory chunks are required (aead_req, sgin, sgout, iv, aad). For submitting the decrypt request to an accelerator, it is required that the buffers which are read by the accelerator must be dma-able and not come from stack. The buffers for aad and iv can be separately kmalloced each, but it is inefficient. This patch does a combined allocation for preparing decryption request and then segments into aead_req || sgin || sgout || iv || aad. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16tls: Add rx inline crypto offloadBoris Pismenny
This patch completes the generic infrastructure to offload TLS crypto to a network device. It enables the kernel to skip decryption and authentication of some skbs marked as decrypted by the NIC. In the fast path, all packets received are decrypted by the NIC and the performance is comparable to plain TCP. This infrastructure doesn't require a TCP offload engine. Instead, the NIC only decrypts packets that contain the expected TCP sequence number. Out-Of-Order TCP packets are provided unmodified. As a result, at the worst case a received TLS record consists of both plaintext and ciphertext packets. These partially decrypted records must be reencrypted, only to be decrypted. The notable differences between SW KTLS Rx and this offload are as follows: 1. Partial decryption - Software must handle the case of a TLS record that was only partially decrypted by HW. This can happen due to packet reordering. 2. Resynchronization - tls_read_size calls the device driver to resynchronize HW after HW lost track of TLS record framing in the TCP stream. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16tls: Split tls_sw_release_resources_rxBoris Pismenny
This patch splits tls_sw_release_resources_rx into two functions one which releases all inner software tls structures and another that also frees the containing structure. In TLS_DEVICE we will need to release the software structures without freeeing the containing structure, which contains other information. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16tls: Split decrypt_skb to two functionsBoris Pismenny
Previously, decrypt_skb also updated the TLS context. Now, decrypt_skb only decrypts the payload using the current context, while decrypt_skb_update also updates the state. Later, in the tls_device Rx flow, we will use decrypt_skb directly. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16tls: Refactor tls_offload variable namesBoris Pismenny
For symmetry, we rename tls_offload_context to tls_offload_context_tx before we add tls_offload_context_rx. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLLLinus Torvalds
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-11tls: fix NULL pointer dereference on pollDaniel Borkmann
While hacking on kTLS, I ran into the following panic from an unprivileged netserver / netperf TCP session: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 800000037f378067 P4D 800000037f378067 PUD 3c0e61067 PMD 0 Oops: 0010 [#1] SMP KASAN PTI CPU: 1 PID: 2289 Comm: netserver Not tainted 4.17.0+ #139 Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016 RIP: 0010: (null) Code: Bad RIP value. RSP: 0018:ffff88036abcf740 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff88036f5f6800 RCX: 1ffff1006debed26 RDX: ffff88036abcf920 RSI: ffff8803cb1a4f00 RDI: ffff8803c258c280 RBP: ffff8803c258c280 R08: ffff8803c258c280 R09: ffffed006f559d48 R10: ffff88037aacea43 R11: ffffed006f559d49 R12: ffff8803c258c280 R13: ffff8803cb1a4f20 R14: 00000000000000db R15: ffffffffc168a350 FS: 00007f7e631f4700(0000) GS:ffff8803d1c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000003ccf64005 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? tls_sw_poll+0xa4/0x160 [tls] ? sock_poll+0x20a/0x680 ? do_select+0x77b/0x11a0 ? poll_schedule_timeout.constprop.12+0x130/0x130 ? pick_link+0xb00/0xb00 ? read_word_at_a_time+0x13/0x20 ? vfs_poll+0x270/0x270 ? deref_stack_reg+0xad/0xe0 ? __read_once_size_nocheck.constprop.6+0x10/0x10 [...] Debugging further, it turns out that calling into ctx->sk_poll() is invalid since sk_poll itself is NULL which was saved from the original TCP socket in order for tls_sw_poll() to invoke it. Looks like the recent conversion from poll to poll_mask callback started in 152524231023 ("net: add support for ->poll_mask in proto_ops") missed to eventually convert kTLS, too: TCP's ->poll was converted over to the ->poll_mask in commit 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask") and therefore kTLS wrongly saved the ->poll old one which is now NULL. Convert kTLS over to use ->poll_mask instead. Also instead of POLLIN | POLLRDNORM use the proper EPOLLIN | EPOLLRDNORM bits as the case in tcp_poll_mask() as well that is mangled here. Fixes: 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Watson <davejwatson@fb.com> Tested-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17tls: don't use stack memory in a scatterlistMatt Mullins
scatterlist code expects virt_to_page() to work, which fails with CONFIG_VMAP_STACK=y. Fixes: c46234ebb4d1e ("tls: RX path for ktls") Signed-off-by: Matt Mullins <mmullins@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Overlapping changes in selftests Makefile. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-01net/tls: Don't recursively call push_record during tls_write_space callbacksDave Watson
It is reported that in some cases, write_space may be called in do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again: [ 660.468802] ? do_tcp_sendpages+0x8d/0x580 [ 660.468826] ? tls_push_sg+0x74/0x130 [tls] [ 660.468852] ? tls_push_record+0x24a/0x390 [tls] [ 660.468880] ? tls_write_space+0x6a/0x80 [tls] ... tls_push_sg already does a loop over all sending sg's, so ignore any tls_write_space notifications until we are done sending. We then have to call the previous write_space to wake up poll() waiters after we are done with the send loop. Reported-by: Andre Tomt <andre@tomt.net> Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-01net/tls: Add generic NIC offload infrastructureIlya Lesokhin
This patch adds a generic infrastructure to offload TLS crypto to a network device. It enables the kernel TLS socket to skip encryption and authentication operations on the transmit side of the data path. Leaving those computationally expensive operations to the NIC. The NIC offload infrastructure builds TLS records and pushes them to the TCP layer just like the SW KTLS implementation and using the same API. TCP segmentation is mostly unaffected. Currently the only exception is that we prevent mixed SKBs where only part of the payload requires offload. In the future we are likely to add a similar restriction following a change cipher spec record. The notable differences between SW KTLS and NIC offloaded TLS implementations are as follows: 1. The offloaded implementation builds "plaintext TLS record", those records contain plaintext instead of ciphertext and place holder bytes instead of authentication tags. 2. The offloaded implementation maintains a mapping from TCP sequence number to TLS records. Thus given a TCP SKB sent from a NIC offloaded TLS socket, we can use the tls NIC offload infrastructure to obtain enough context to encrypt the payload of the SKB. A TLS record is released when the last byte of the record is ack'ed, this is done through the new icsk_clean_acked callback. The infrastructure should be extendable to support various NIC offload implementations. However it is currently written with the implementation below in mind: The NIC assumes that packets from each offloaded stream are sent as plaintext and in-order. It keeps track of the TLS records in the TCP stream. When a packet marked for offload is transmitted, the NIC encrypts the payload in-place and puts authentication tags in the relevant place holders. The responsibility for handling out-of-order packets (i.e. TCP retransmission, qdisc drops) falls on the netdev driver. The netdev driver keeps track of the expected TCP SN from the NIC's perspective. If the next packet to transmit matches the expected TCP SN, the driver advances the expected TCP SN, and transmits the packet with TLS offload indication. If the next packet to transmit does not match the expected TCP SN. The driver calls the TLS layer to obtain the TLS record that includes the TCP of the packet for transmission. Using this TLS record, the driver posts a work entry on the transmit queue to reconstruct the NIC TLS state required for the offload of the out-of-order packet. It updates the expected TCP SN accordingly and transmits the now in-order packet. The same queue is used for packet transmission and TLS context reconstruction to avoid the need for flushing the transmit queue before issuing the context reconstruction request. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-01net/tls: Split conf to rx + txBoris Pismenny
In TLS inline crypto, we can have one direction in software and another in hardware. Thus, we split the TLS configuration to separate structures for receive and transmit. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31tls: support for Inline tls recordAtul Gupta
Facility to register Inline TLS drivers to net/tls. Setup TLS_HW_RECORD prot to listen on offload device. Cases handled - Inline TLS device exists, setup prot for TLS_HW_RECORD - Atleast one Inline TLS exists, sets TLS_HW_RECORD. - If non-inline device establish connection, move to TLS_SW_TX Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23tls: RX path for ktlsDave Watson
Add rx path for tls software implementation. recvmsg, splice_read, and poll implemented. An additional sockopt TLS_RX is added, with the same interface as TLS_TX. Either TLX_RX or TLX_TX may be provided separately, or together (with two different setsockopt calls with appropriate keys). Control messages are passed via CMSG in a similar way to transmit. If no cmsg buffer is passed, then only application data records will be passed to userspace, and EIO is returned for other types of alerts. EBADMSG is passed for decryption errors, and EMSGSIZE is passed for framing too big, and EBADMSG for framing too small (matching openssl semantics). EINVAL is returned for TLS versions that do not match the original setsockopt call. All are unrecoverable. strparser is used to parse TLS framing. Decryption is done directly in to userspace buffers if they are large enough to support it, otherwise sk_cow_data is called (similar to ipsec), and buffers are decrypted in place and copied. splice_read always decrypts in place, since no buffers are provided to decrypt in to. sk_poll is overridden, and only returns POLLIN if a full TLS message is received. Otherwise we wait for strparser to finish reading a full frame. Actual decryption is only done during recvmsg or splice_read calls. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23tls: Refactor variable namesDave Watson
Several config variables are prefixed with tx, drop the prefix since these will be used for both tx and rx. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23tls: Pass error code explicitly to tls_err_abortDave Watson
Pass EBADMSG explicitly to tls_err_abort. Receive path will pass additional codes - EMSGSIZE if framing is larger than max TLS record size, EINVAL if TLS version mismatch. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23tls: Move cipher info to a separate structDave Watson
Separate tx crypto parameters to a separate cipher_context struct. The same parameters will be used for rx using the same struct. tls_advance_record_sn is modified to only take the cipher info. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-31tls: Add support for encryption using async offload acceleratorVakul Garg
Async crypto accelerators (e.g. drivers/crypto/caam) support offloading GCM operation. If they are enabled, crypto_aead_encrypt() return error code -EINPROGRESS. In this case tls_do_encryption() needs to wait on a completion till the time the response for crypto offload request is received. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15net/tls: Fix inverted error codes to avoid endless loopr.hering@avm.de
sendfile() calls can hang endless with using Kernel TLS if a socket error occurs. Socket error codes must be inverted by Kernel TLS before returning because they are stored with positive sign. If returned non-inverted they are interpreted as number of bytes sent, causing endless looping of the splice mechanic behind sendfile(). Signed-off-by: Robert Hering <r.hering@avm.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15uapi: fix linux/tls.h userspace compilation errorDmitry V. Levin
Move inclusion of a private kernel header <net/tcp.h> from uapi/linux/tls.h to its only user - net/tls.h, to fix the following linux/tls.h userspace compilation error: /usr/include/linux/tls.h:41:21: fatal error: net/tcp.h: No such file or directory As to this point uapi/linux/tls.h was totaly unusuable for userspace, cleanup this header file further by moving other redundant includes to net/tls.h. Fixes: 3c4d7559159b ("tls: kernel TLS support") Cc: <stable@vger.kernel.org> # v4.13+ Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-14tls: Move tls_make_aad to header to allow sharingIlya Lesokhin
move tls_make_aad as it is going to be reused by the device offload code and rx path. Remove unused recv parameter. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-14tls: Fix TLS ulp context leak, when TLS_TX setsockopt is not used.Ilya Lesokhin
Previously the TLS ulp context would leak if we attached a TLS ulp to a socket but did not use the TLS_TX setsockopt, or did use it but it failed. This patch solves the issue by overriding prot[TLS_BASE_TX].close and fixing tls_sk_proto_close to work properly when its called with ctx->tx_conf == TLS_BASE_TX. This patch also removes ctx->free_resources as we can use ctx->tx_conf to obtain the relevant information. Fixes: 3c4d7559159b ('tls: kernel TLS support') Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-14tls: Add function to update the TLS socket configurationIlya Lesokhin
The tx configuration is now stored in ctx->tx_conf. And sk->sk_prot is updated trough a function This will simplify things when we add rx and support for different possible tx and rx cross configurations. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15tls: kernel TLS supportDave Watson
Software implementation of transport layer security, implemented using ULP infrastructure. tcp proto_ops are replaced with tls equivalents of sendmsg and sendpage. Only symmetric crypto is done in the kernel, keys are passed by setsockopt after the handshake is complete. All control messages are supported via CMSG data - the actual symmetric encryption is the same, just the message type needs to be passed separately. For user API, please see Documentation patch. Pieces that can be shared between hw and sw implementation are in tls_main.c Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>