.. _kernel_tls:

==========
Kernel TLS
==========

Overview
========

Transport Layer Security (TLS) is an Upper Layer Protocol (ULP) that runs over
TCP. TLS provides end-to-end data integrity and confidentiality.

User interface
==============

Creating a TLS connection
-------------------------

First create a new TCP socket and set the TLS ULP.

.. code-block:: c

  sock = socket(AF_INET, SOCK_STREAM, 0);
  setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));

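The two calls above ignore errors. The following is a minimal sketch of the
same step with error checking, not a definitive implementation. The helper
name ``ktls_attach_ulp`` is hypothetical, and the fallback value for
``TCP_ULP`` is only there for older libc headers. As an assumption about
typical failure modes, ``ENOENT`` usually indicates that the ``tls`` module
is not available, and ``ENOTCONN`` that the socket is not yet connected.

.. code-block:: c

  #include <errno.h>
  #include <netinet/in.h>
  #include <netinet/tcp.h>
  #include <sys/socket.h>

  #ifndef TCP_ULP
  #define TCP_ULP 31      /* value from linux/tcp.h, for older libc headers */
  #endif

  /*
   * Hypothetical helper: attach the "tls" ULP to a TCP socket.
   * IPPROTO_TCP has the same value as SOL_TCP.
   * Returns 0 on success or a negative errno value on failure.
   */
  static int ktls_attach_ulp(int sock)
  {
          if (setsockopt(sock, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
                  return -errno;
          return 0;
  }

A caller that gets a negative return value can keep using the userspace
record layer on the same socket.
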
Setting the TLS ULP allows us to set/get TLS socket options. Currently
only the symmetric encryption is handled in the kernel.  After the TLS
handshake is complete, we have all the parameters required to move the
data-path to the kernel. There are separate socket options for moving
the transmit and the receive paths into the kernel.

.. code-block:: c

  /* From linux/tls.h */
  struct tls_crypto_info {
          unsigned short version;
          unsigned short cipher_type;
  };

  struct tls12_crypto_info_aes_gcm_128 {
          struct tls_crypto_info info;
          unsigned char iv[TLS_CIPHER_AES_GCM_128_IV_SIZE];
          unsigned char key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
          unsigned char salt[TLS_CIPHER_AES_GCM_128_SALT_SIZE];
          unsigned char rec_seq[TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE];
  };

  struct tls12_crypto_info_aes_gcm_128 crypto_info;

  crypto_info.info.version = TLS_1_2_VERSION;
  crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128;
  memcpy(crypto_info.iv, iv_write, TLS_CIPHER_AES_GCM_128_IV_SIZE);
  memcpy(crypto_info.rec_seq, seq_number_write,
         TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
  memcpy(crypto_info.key, cipher_key_write, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
  memcpy(crypto_info.salt, implicit_iv_write, TLS_CIPHER_AES_GCM_128_SALT_SIZE);

  setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));

Transmit and receive are set separately, but the setup is the same, using either
TLS_TX or TLS_RX.

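Since both directions use the same structure, the setup can be wrapped in one
function that takes the direction as a parameter. This is a sketch under
stated assumptions: the helper name ``ktls_set_keys`` and its parameter names
are hypothetical, the key material is expected to come from the userspace
handshake, and the fallback value for ``SOL_TLS`` is only for older libc
headers.

.. code-block:: c

  #include <errno.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <linux/tls.h>

  #ifndef SOL_TLS
  #define SOL_TLS 282     /* value from linux/socket.h, for older libc headers */
  #endif

  /*
   * Hypothetical helper: install TLS 1.2 AES-128-GCM parameters for one
   * direction. direction is TLS_TX or TLS_RX.
   * Returns 0 on success or a negative errno value on failure.
   */
  static int ktls_set_keys(int sock, int direction,
                           const unsigned char *key, const unsigned char *salt,
                           const unsigned char *iv, const unsigned char *rec_seq)
  {
          struct tls12_crypto_info_aes_gcm_128 ci;

          memset(&ci, 0, sizeof(ci));
          ci.info.version = TLS_1_2_VERSION;
          ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
          memcpy(ci.key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
          memcpy(ci.salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
          memcpy(ci.iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
          memcpy(ci.rec_seq, rec_seq, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);

          if (setsockopt(sock, SOL_TLS, direction, &ci, sizeof(ci)) < 0)
                  return -errno;
          return 0;
  }

The function would be called once with ``TLS_TX`` and the write keys, and once
with ``TLS_RX`` and the read keys.
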
Sending TLS application data
----------------------------

After setting the TLS_TX socket option, all application data sent over this
socket is encrypted using TLS and the parameters provided in the socket option.
For example, we can send an encrypted hello world record as follows:

.. code-block:: c

  const char *msg = "hello world\n";
  send(sock, msg, strlen(msg), 0);

Where possible, send() data is encrypted directly from the userspace buffer
provided into the encrypted kernel send buffer.

The sendfile system call sends the file's data in TLS records of the maximum
length (2^14 bytes).

.. code-block:: c

  struct stat st;
  off_t offset = 0;

  file = open(filename, O_RDONLY);
  fstat(file, &st);
  sendfile(sock, file, &offset, st.st_size);

TLS records are created and sent after each send() call, unless
MSG_MORE is passed.  MSG_MORE will delay creation of a record until
MSG_MORE is not passed, or the maximum record size is reached.

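As an illustration of the MSG_MORE behaviour, the sketch below queues a header
with MSG_MORE and then sends the body without it, so that on a kTLS socket
both pieces can be placed in the same record. The helper name
``send_header_and_body`` is hypothetical, and the code assumes a blocking
socket and does not retry short writes.

.. code-block:: c

  #include <errno.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  /*
   * Hypothetical helper: send hdr followed by body.
   * Returns the total number of bytes accepted by the kernel, or a
   * negative errno value if either send() fails.
   */
  static ssize_t send_header_and_body(int sock,
                                      const void *hdr, size_t hdr_len,
                                      const void *body, size_t body_len)
  {
          ssize_t a, b;

          /* MSG_MORE: more data follows, do not close the record yet. */
          a = send(sock, hdr, hdr_len, MSG_MORE);
          if (a < 0)
                  return -errno;

          /* No MSG_MORE: the pending data can now be turned into a record. */
          b = send(sock, body, body_len, 0);
          if (b < 0)
                  return -errno;

          return a + b;
  }

If the combined data exceeds the maximum record size, more than one record is
produced regardless of the flag.
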
The kernel will need to allocate a buffer for the encrypted data.
This buffer is allocated at the time send() is called, such that
either the entire send() call will return -ENOMEM (or block waiting
for memory), or the encryption will always succeed.  If send() returns
-ENOMEM and some data was left on the socket buffer from a previous
call using MSG_MORE, the MSG_MORE data is left on the socket buffer.

Receiving TLS application data
------------------------------

After setting the TLS_RX socket option, data returned by all recv family
socket calls is decrypted using the TLS parameters provided.  A full TLS
record must be received before decryption can happen.

.. code-block:: c

  char buffer[16384];
  recv(sock, buffer, sizeof(buffer), 0);

Received data is decrypted directly into the user buffer if it is
large enough, and no additional allocations occur.  If the userspace
buffer is too small, data is decrypted in the kernel and copied to
userspace.

``EINVAL`` is returned if the TLS version in the received message does not
match the version passed in setsockopt.

``EMSGSIZE`` is returned if the received message is too big.

``EBADMSG`` is returned if decryption failed for any other reason.

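A receiver may want to tell these three cases apart when logging. The sketch
below maps the errno values listed above to short descriptions. The helper
names ``ktls_rx_strerror`` and ``ktls_recv_checked`` are hypothetical, and
the mapping only restates the list above: ``EINVAL`` in particular can also
have unrelated causes, such as invalid arguments.

.. code-block:: c

  #include <errno.h>
  #include <stddef.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  /* Hypothetical helper: describe a receive error on a TLS_RX socket. */
  static const char *ktls_rx_strerror(int err)
  {
          switch (err) {
          case EINVAL:
                  return "version mismatch";
          case EMSGSIZE:
                  return "record too big";
          case EBADMSG:
                  return "decryption failed";
          default:
                  return "other error";
          }
  }

  /*
   * Hypothetical helper: receive into buf. On failure, store a description
   * in *why and return a negative errno value; on success, set *why to NULL
   * and return the number of bytes received.
   */
  static ssize_t ktls_recv_checked(int sock, void *buf, size_t len,
                                   const char **why)
  {
          ssize_t n = recv(sock, buf, len, 0);

          if (n < 0) {
                  int err = errno;

                  *why = ktls_rx_strerror(err);
                  return -err;
          }
          *why = NULL;
          return n;
  }
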
Sending TLS control messages
----------------------------

Other than application data, TLS has control messages such as alert
messages (record type 21) and handshake messages (record type 22), etc.
These messages can be sent over the socket by providing the TLS record type
via a CMSG. For example, the following function sends @data of @length bytes
using a record of type @record_type.

.. code-block:: c

  /* send TLS control message using record_type */
  static int ktls_send_ctrl_message(int sock, unsigned char record_type,
                                    void *data, size_t length)
  {
        struct msghdr msg = {0};
        int cmsg_len = sizeof(record_type);
        struct cmsghdr *cmsg;
        char buf[CMSG_SPACE(cmsg_len)];
        struct iovec msg_iov;   /* Vector of data to send/receive into.  */

        msg.msg_control = buf;
        msg.msg_controllen = sizeof(buf);
        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_TLS;
        cmsg->cmsg_type = TLS_SET_RECORD_TYPE;
        cmsg->cmsg_len = CMSG_LEN(cmsg_len);
        *CMSG_DATA(cmsg) = record_type;
        msg.msg_controllen = cmsg->cmsg_len;

        msg_iov.iov_base = data;
        msg_iov.iov_len = length;
        msg.msg_iov = &msg_iov;
        msg.msg_iovlen = 1;

        return sendmsg(sock, &msg, 0);
  }

Control message data should be provided unencrypted, and will be
encrypted by the kernel.

Receiving TLS control messages
------------------------------

TLS control messages are passed in the userspace buffer, with the message
type passed via cmsg.  If no cmsg buffer is provided, an error is
returned if a control message is received.  Data messages may be
received without a cmsg buffer set.

.. code-block:: c

  char buffer[16384];
  char cmsg_buf[CMSG_SPACE(sizeof(unsigned char))];
  struct msghdr msg = {0};
  msg.msg_control = cmsg_buf;
  msg.msg_controllen = sizeof(cmsg_buf);

  struct iovec msg_iov;
  msg_iov.iov_base = buffer;
  msg_iov.iov_len = sizeof(buffer);

  msg.msg_iov = &msg_iov;
  msg.msg_iovlen = 1;

  int ret = recvmsg(sock, &msg, 0 /* flags */);

  struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
  if (cmsg && cmsg->cmsg_level == SOL_TLS &&
      cmsg->cmsg_type == TLS_GET_RECORD_TYPE) {
      int record_type = *((unsigned char *)CMSG_DATA(cmsg));
      // Do something with record_type, and control message data in
      // buffer.
      //
      // Note that record_type may be == to application data (23).
  } else {
      // Buffer contains application data.
  }

recv will never return data from mixed types of TLS records.

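Because a single receive never mixes record types, a small wrapper can return
the payload together with its record type and let the caller dispatch on it.
The following is a sketch, not a definitive implementation: the helper name
``ktls_recv_record`` is hypothetical, the fallback value for ``SOL_TLS`` is
only for older libc headers, and a missing cmsg is treated as application
data (23).

.. code-block:: c

  #include <errno.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <sys/types.h>
  #include <sys/uio.h>
  #include <linux/tls.h>

  #ifndef SOL_TLS
  #define SOL_TLS 282     /* value from linux/socket.h, for older libc headers */
  #endif

  /*
   * Hypothetical helper: receive data from one record type into buf and
   * store that type in *record_type.
   * Returns the number of bytes received or a negative errno value.
   */
  static ssize_t ktls_recv_record(int sock, void *buf, size_t len,
                                  unsigned char *record_type)
  {
          union {
                  struct cmsghdr hdr;     /* forces cmsghdr alignment */
                  char buf[CMSG_SPACE(sizeof(unsigned char))];
          } control;
          struct iovec iov = { .iov_base = buf, .iov_len = len };
          struct msghdr msg;
          struct cmsghdr *cmsg;
          ssize_t n;

          memset(&msg, 0, sizeof(msg));
          msg.msg_iov = &iov;
          msg.msg_iovlen = 1;
          msg.msg_control = control.buf;
          msg.msg_controllen = sizeof(control.buf);

          n = recvmsg(sock, &msg, 0);
          if (n < 0)
                  return -errno;

          *record_type = 23;      /* default: application data */
          for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
                  if (cmsg->cmsg_level == SOL_TLS &&
                      cmsg->cmsg_type == TLS_GET_RECORD_TYPE)
                          *record_type = *(unsigned char *)CMSG_DATA(cmsg);
          }
          return n;
  }

A caller would typically pass record types 21 (alert) and 22 (handshake) to
its userspace TLS library and treat type 23 as payload.
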
Integrating into a userspace TLS library
----------------------------------------

At a high level, the kernel TLS ULP is a replacement for the record
layer of a userspace TLS library.

A patchset to OpenSSL to use ktls as the record layer is
`here <https://github.com/Mellanox/openssl/commits/tls_rx2>`_.

`An example <https://github.com/ktls/af_ktls-tool/commits/RX>`_
shows how to call send directly after a handshake using gnutls.
Since it doesn't implement a full record layer, control
messages are not supported.

Statistics
==========

The TLS implementation exposes the following per-namespace statistics
(``/proc/net/tls_stat``):

- ``TlsCurrTxSw``, ``TlsCurrRxSw`` -
  number of TX and RX sessions currently installed where host handles
  cryptography

- ``TlsCurrTxDevice``, ``TlsCurrRxDevice`` -
  number of TX and RX sessions currently installed where NIC handles
  cryptography

- ``TlsTxSw``, ``TlsRxSw`` -
  number of TX and RX sessions opened with host cryptography

- ``TlsTxDevice``, ``TlsRxDevice`` -
  number of TX and RX sessions opened with NIC cryptography

- ``TlsDecryptError`` -
  record decryption failed (e.g. due to incorrect authentication tag)

- ``TlsDeviceRxResync`` -
  number of RX resyncs sent to NICs handling cryptography
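
The counters listed above can be read programmatically. The sketch below looks
up one counter by name; the helper name ``tls_stat_get`` is hypothetical, and
the code assumes that each line of the file holds a counter name followed by
whitespace and an unsigned decimal value.

.. code-block:: c

  #include <stdio.h>
  #include <string.h>

  /*
   * Hypothetical helper: find the counter called name in the file at path.
   * Returns 0 and stores the value in *value if found, -1 otherwise.
   */
  static int tls_stat_get(const char *path, const char *name,
                          unsigned long *value)
  {
          char key[64];
          unsigned long val;
          int found = -1;
          FILE *f = fopen(path, "r");

          if (!f)
                  return -1;

          /* Assumed line format: "<name> <value>". */
          while (fscanf(f, "%63s %lu", key, &val) == 2) {
                  if (strcmp(key, name) == 0) {
                          *value = val;
                          found = 0;
                          break;
                  }
          }
          fclose(f);
          return found;
  }

For example, ``tls_stat_get("/proc/net/tls_stat", "TlsDecryptError", &v)``
would fetch the decryption error counter for the current network namespace.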