| 3d1b37c7 | 04-Aug-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm64: SHA-224/SHA-256 using ARMv8-A cryptographic extensions
Import SHA-2 assembly code from the Linux kernel (linaro contribution). Enabled with CFG_CRYPTO_SHA256_ARM64_CE=y, set by default on HiKey. Performance gains compared to the C implementation are as follows (sha-perf results for SHA-256 on HiKey in MiB/s):
 Size | Accelerated?
(KiB) |   No     Yes
------+--------------
    1 | 11.4    18.3
    2 | 16.8    35.6
    4 | 21.8    66.8
    8 | 25.7   118.9
   16 | 28.3   195.5
   32 | 29.7   289.7
   64 | 30.5   383.3
  128 | 30.9   456.9
  256 | 31.2   505.3
  384 | 31.2   520.7
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| 06c5ab4d | 03-Aug-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm: update SHA-256 32-bit CE implementation to process multiple blocks
Adjust the 32-bit ARMv8 Crypto Extensions version of the SHA-256 "compress" function to accept multiple blocks of input data. Rename a couple of files in preparation for the 64-bit implementation which will follow, and for consistency with SHA-1.
Performance with various buffer sizes was measured on HiKey with sha-perf. Values are in MiB/s; column 'n' means no acceleration, 'y (before)' is the parent commit's accelerated code, and 'y (after)' is this commit.
 Size |    CFG_CRYPTO_SHA256_ARM32_CE=?
(KiB) |   n   | y (before) | y (after)
------+-------+------------+-----------
    1 |  17.8 |       31.9 |      36.3
    2 |  22.9 |       52.1 |      67.4
    4 |  26.9 |       78.9 |     117.5
    8 |  29.4 |      105.2 |     188.4
   16 |  30.9 |      125.3 |     268.5
   32 |  31.7 |      139.4 |     341.7
   64 |  32.1 |      147.8 |     401.4
  128 |  32.4 |      152.4 |     438.7
  256 |  32.5 |      154.8 |     460.6
  384 |  32.5 |      155.4 |     467.0
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| 23900b59 | 03-Aug-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm: update SHA-1 32-bit CE implementation to process multiple blocks
The assembly code in sha1_armv8a_ce_a32.S is updated so that sha1_ce_transform() can process multiple blocks of data in a single call. Performance is significantly improved, and the code is unified with the 64-bit implementation.
Hashing throughput (MiB/s) reported by sha-perf on HiKey:
 Size |       CFG_CRYPTO_SHA1_ARM32_CE=?
(KiB) |   n   | y (parent) | y (this commit)
------+-------+------------+----------------
    1 |  18.8 |       32.6 |            37.2
    2 |  24.9 |       53.8 |            68.7
    4 |  30.1 |       80.1 |           121.7
    8 |  33.6 |      106.0 |           198.1
   16 |  35.6 |      126.3 |           284.4
   32 |  36.7 |      140.3 |           365.1
   64 |  37.3 |      149.0 |           430.0
  128 |  37.6 |      153.6 |           471.9
  256 |  37.8 |      156.0 |           496.1
  384 |  37.8 |      156.6 |           505.1
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| de51851c | 10-Jul-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm64: SHA-1 using ARMv8-A cryptographic extensions
- LibTomCrypt: add a new macro, HASH_PROCESS_NBLOCKS, similar to HASH_PROCESS but accepting a function that digests n blocks of data, not just 1.
- Import sha1_ce_transform() from the Linux kernel (Linaro contribution), which implements the main SHA-1 transform in assembler using the ARMv8 cryptographic extensions.
- Acceleration is enabled by setting CFG_CRYPTO_SHA1_ARM64_CE=y (this is the default when PLATFORM=hikey).
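The benefit of an n-block transform can be illustrated with a minimal sketch. This is not the HASH_PROCESS_NBLOCKS macro itself: the names toy_hash, toy_compress_nblocks and toy_update are hypothetical, and the "compression" is a toy accumulator standing in for a real transform such as sha1_ce_transform(). It only shows how an update routine can pass every whole block to the transform in a single call instead of calling it once per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64u /* SHA-1 and SHA-256 both use 64-byte blocks */

/* Hypothetical hash state; 'acc' stands in for the real digest words. */
struct toy_hash {
	uint32_t acc;
	unsigned int calls;      /* number of transform calls made */
	size_t buflen;           /* bytes pending in buf */
	uint8_t buf[BLOCK_SIZE];
};

/* Toy stand-in for an n-block transform (assumption, not real SHA code). */
static void toy_compress_nblocks(struct toy_hash *h, const uint8_t *data,
				 size_t nblocks)
{
	size_t i;

	for (i = 0; i < nblocks * BLOCK_SIZE; i++)
		h->acc = h->acc * 31u + data[i];
	h->calls++;
}

static void toy_update(struct toy_hash *h, const uint8_t *in, size_t len)
{
	assert(h->buflen < BLOCK_SIZE);

	/* Top up a partially filled buffer first. */
	if (h->buflen) {
		size_t n = BLOCK_SIZE - h->buflen;

		if (n > len)
			n = len;
		memcpy(h->buf + h->buflen, in, n);
		h->buflen += n;
		in += n;
		len -= n;
		if (h->buflen == BLOCK_SIZE) {
			toy_compress_nblocks(h, h->buf, 1);
			h->buflen = 0;
		}
	}

	/* Hand all remaining whole blocks to the transform in one call. */
	if (len >= BLOCK_SIZE) {
		size_t nblocks = len / BLOCK_SIZE;

		toy_compress_nblocks(h, in, nblocks);
		in += nblocks * BLOCK_SIZE;
		len -= nblocks * BLOCK_SIZE;
	}

	/* Keep the tail for the next update. */
	if (len) {
		memcpy(h->buf, in, len);
		h->buflen = len;
	}
}
```

With a single-block interface the same input would cost one call per 64 bytes, so the per-call overhead would be paid once per block rather than once per update.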
Performance was compared to the plain C version using sha-perf (https://github.com/linaro-swg/sha-perf.git). Average hashing speed on HiKey is (MiB/s):
 Size | Accelerated?
(KiB) |   No     Yes
------+--------------
    1 | 12.3    18.4
    2 | 18.6    35.6
    4 | 24.9    66.9
    8 | 29.9   118.0
   16 | 33.3   192.9
   32 | 35.3   282.6
   64 | 36.4   369.6
  128 | 37.0   436.4
  256 | 37.3   479.9
  384 | 37.4   494.4
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| 4051a2a1 | 25-Jul-2015 |
Peng Fan <van.freenix@gmail.com> |
arm: mm: v7 panic when device's va conflicts with TA address space
If mm->va is smaller than 32M, then mm->va will conflict with user TA address space. This mapping will be overridden/hidden later when a user TA is loaded since these low addresses are used as TA virtual address space.
Some SoCs have devices at low addresses, so we need to map at least those devices at a virtual address which isn't the same as the physical.
TODO: support mapping devices at a virtual address which isn't the same as the physical address.
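The check described above can be sketched as follows. This is an illustration only: TA_VA_LIMIT, va_conflicts_with_ta_space() and check_device_va() are hypothetical names, the 32M limit is taken from the commit message, and assert() stands in for the panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper bound of the user TA virtual address space (32M per the message). */
#define TA_VA_LIMIT (32u * 1024u * 1024u)

/* True if a device mapping at 'va' would be hidden by a loaded user TA. */
static bool va_conflicts_with_ta_space(uintptr_t va)
{
	return va < TA_VA_LIMIT;
}

/* Stand-in for the panic: abort on a conflicting device mapping. */
static void check_device_va(uintptr_t va)
{
	assert(!va_conflicts_with_ta_space(va));
}
```

Failing early in this way turns a mapping that would silently disappear later into an immediate, visible error at initialization.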
Signed-off-by: Peng Fan <van.freenix@gmail.com> Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU platform)
|
| c65f865d | 23-Jul-2015 |
Peng Fan <van.freenix@gmail.com> |
arm: mm: lpae use XLAT_ENTRY_SIZE to replace sizeof(uint64_t)
Use XLAT_ENTRY_SIZE to replace sizeof(uint64_t). XLAT_ENTRY_SIZE is better than sizeof(uint64_t), although they have the same value.
Signed-off-by: Peng Fan <van.freenix@gmail.com> Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU platform)
|
| f2a8bde3 | 25-Jul-2015 |
Peng Fan <van.freenix@gmail.com> |
arm: mm: lpae clear mmu table when initialization
Clear the tables during initialization to avoid junk data, which may crash the system when setting ttbrx.
For ARMv7 non-LPAE, commit 'bc4de3134468a4b1760e6fd5cf09377bf7a7e7c3' fixed an issue where setting ttbr0 crashed the system because of junk data in the table.
This patch fixes the same issue for LPAE.
Signed-off-by: Peng Fan <van.freenix@gmail.com> Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU platform)
|
| a50aa518 | 23-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
Fix des3_cbc_mac in case of 112bits key
In DES3, a 112-bit key is made of two 56-bit keys. DES3 can be run using only two 56-bit keys, with the third key being equal to the first.
Fix #408
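The two-key rule above can be sketched in C. This is a hypothetical helper, not the function changed by this commit: des3_expand_key() and DES_KEY_BYTES are names chosen for illustration, and each 56-bit DES key is assumed to be stored in 8 bytes (56 key bits plus 8 parity bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DES_KEY_BYTES 8u /* 56 key bits plus 8 parity bits (assumption) */

/*
 * Expand a two-key (16-byte) or three-key (24-byte) DES3 key into the
 * 24-byte K1 | K2 | K3 form. Returns 0 on success, -1 on a bad length.
 */
static int des3_expand_key(const uint8_t *key, size_t len, uint8_t out[24])
{
	assert(key && out);

	if (len == 3 * DES_KEY_BYTES) {
		memcpy(out, key, len);
		return 0;
	}
	if (len != 2 * DES_KEY_BYTES)
		return -1;

	/* Two-key variant: K3 is a copy of K1. */
	memcpy(out, key, 2 * DES_KEY_BYTES);
	memcpy(out + 2 * DES_KEY_BYTES, key, DES_KEY_BYTES);
	return 0;
}
```

Expanding the key up front lets the rest of a CBC-MAC routine treat both key sizes identically.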
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| 1962351e | 22-Jul-2015 |
Jens Wiklander <jens.wiklander@linaro.org> |
libmpa: optimize size in mpa_get_str()
Save 4098 bytes of unpageable memory by removing option to group hex numbers in mpa_get_str().
Note API change in libmpa, dropping groupsize parameter for mpa_get_str()
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Jens Wiklander <jens.wiklander@linaro.org> (QEMU) Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| 60fc60b3 | 22-Jul-2015 |
Jens Wiklander <jens.wiklander@linaro.org> |
core: optimize size with const crypto_ops
Optimize size of unpaged data by making crypto_ops const.
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| 5df97482 | 21-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
Add Post-Actions on acipher crypto algorithms
In order to check that all temporary variables, used in acipher computation, are correctly released, tee_ltc_acipher_postactions() has been added. It raises an assert in case some temporary variables have not been released.
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| e374ac3b | 21-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
Remove ECC self-test TA
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU) Signed-off-by: Pascal Brand <pascal.brand@st.com> |
| 6d914f61 | 17-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
ECC: optimize the pool of temporary variables
ECC uses a lot of temporary variables (80). These variables are taken from a static pool, each sized for the maximum key size supported in OP-TEE: 4096 bits, times 2 to allow for multiplication in temporary computations.
With the introduction of the ability to get temporary variables of a given size, this patch optimizes the use of these variables for ECC.
Thanks to this patch, the number of temporary variables is back to 50, and the emulated esram size (QEMU / FVP / HiKey) is back to 200KB.
Note that further optimization can be performed, for ECC and also for other algorithms (RSA,...).
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU platform) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| df6be4e1 | 17-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
mpa: allocator for temporary variables
Temporary variables, in math, are taken from a pool. Each variable has the maximum size, that is 4096 * 2 bits, in order to accommodate operations that overflow.
However, in most cases, such as ECC, such a big variable is not necessary.
This patch introduces an allocator to get temporary variables of a given size, which is an enabler to reduce the amount of memory required for temporary variables.
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| 543d7e74 | 16-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
ECC: ECDH at GP level
The following key derivation algorithms of the GlobalPlatform Internal Core API v1.1 are implemented:
- TEE_ALG_ECDH_P192
- TEE_ALG_ECDH_P224
- TEE_ALG_ECDH_P256
- TEE_ALG_ECDH_P384
- TEE_ALG_ECDH_P521
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Cedric Chaumont <cedric.chaumont@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU platform) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| 00d2e232 | 17-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
Fix Key-Pair Parts for Operation Modes
Table 6-6 "Key-Pair Parts for Operation Modes" of Internal Core API v1.1 shows that the public key is used for encrypt / verify, but that a key pair can be supplied anyway, in which case only the public key part is used.
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Cedric Chaumont <cedric.chaumont@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU platform) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| bd8e4ba7 | 09-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
Remove temporary traces
Reviewed-by: Cedric Chaumont <cedric.chaumont@linaro.org> Tested-by: Cedric Chaumont <cedric.chaumont@linaro.org> (STM boards) Tested-by: Cedric Chaumont <cedric.chaumont@linaro.org> (ARM Juno board) Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| c988227a | 15-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
ECC: ECDSA at GP level
Reviewed-by: Cedric Chaumont <cedric.chaumont@linaro.org> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| b64d6909 | 02-Jul-2015 |
Cedric Chaumont <cedric.chaumont@st.com> |
GP11 : Time functions fix/panic reason
Signed-off-by: Cedric Chaumont <cedric.chaumont@st.com> Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Cedric Chaumont <cedric.chaumont@linaro.org> (STM boards) Tested-by: Cedric Chaumont <cedric.chaumont@linaro.org> (ARM Juno board)
|
| bf494894 | 02-Jul-2015 |
Pascal Brand <pascal.brand@st.com> |
ECC: DH implementation and self tests
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Tested-by: Jerome Forissier <jerome.forissier@linaro.org> (HiKey 32 & 64-bit) Tested-by: Pascal Brand <pascal.brand@linaro.org> (QEMU) Signed-off-by: Pascal Brand <pascal.brand@st.com>
|
| 1d8052f0 | 02-Jul-2015 |
SY Chiu <sy.chiu@linaro.org> |
SE API: Use tee_svc_copy_kaddr_to_user32() to avoid buffer overflow
Note: the buffer overflow is expected to occur only with a 64-bit kernel and a 32-bit TA, but the SE API can only be tested on QEMU, which cannot host a 64-bit kernel for now. Thus, the test just makes sure that the change doesn't break the SE API implementation.
Signed-off-by: SY Chiu <sy.chiu@linaro.org> Tested-by: SY Chiu <sy.chiu@linaro.org> (QEMU+jcardsim) Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
|
| a75f2e14 | 07-Jul-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
Build for PLATFORM=vexpress-qemu_virt by default
Also, for STM platforms, set CROSS_COMPILE=arm-linux-gnueabihf- by default (which is a more standard prefix for the 32-bit compiler).
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|
| 12e66b6f | 02-Jul-2015 |
Cedric Chaumont <cedric.chaumont@st.com> |
GP11 : Asymmetric functions fix/panic reason
Signed-off-by: Cedric Chaumont <cedric.chaumont@st.com> Reviewed-by: Joakim Bech <joakim.bech@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Tested-by: Cedric Chaumont <cedric.chaumont@linaro.org> (STM boards) Tested-by: Cedric Chaumont <cedric.chaumont@linaro.org> (ARM Juno board)
|
| e1d75590 | 26-Jun-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm64: AES XTS using ARMv8-A cryptographic extensions
This completes the work started with commit: 7e8f94166c6f ("arm64: AES using ARMv8-A cryptographic extensions").
The ltc_cipher_descriptor structure of LibTomCrypt is updated to include pointers to accelerated XTS routines, which can handle multiple blocks of data. The actual processing is done in assembly by ce_aes_xts_encrypt() and ce_aes_xts_decrypt().
aes-perf results on HiKey are now on par with other AES modes. In the table below, XTS is non-accelerated (CFG_CRYPTO_AES_ARM64_CE=n), XTS+ is commit 7e8f94166c6f, and XTS++ is this commit.
Average encryption speed (MiB/s):
 Size |        Mode
(KiB) |  XTS   XTS+   XTS++
------+---------------------
    1 |  9.2   13.0    21.3
    2 | 11.7   18.3    41.4
    4 | 13.6   23.0    78.3
    8 | 14.7   26.3   141.4
   16 | 15.4   28.4   236.6
   32 | 15.8   29.6   362.2
   64 | 16.0   30.3   495.3
  128 | 16.1   30.6   605.8
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
|
| b418dfe6 | 07-Jul-2015 |
Xiaoqiang Du <xiaoqiang.du@linaro.org> |
arm32 core_mmu_v7.c: bugfix map_page_memarea()
Fixes the problem that some page entries cannot be mapped in map_page_memarea().
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Pascal Brand <pascal.brand@linaro.org> Tested-by: Pascal Brand <pascal.brand@linaro.org> (STM platform) Signed-off-by: Xiaoqiang Du <xiaoqiang.du@linaro.org>
|