| 3d1b37c7 | 04-Aug-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm64: SHA-224/SHA-256 using ARMv8-A cryptographic extensions
Import SHA-2 assembly code from the Linux kernel (Linaro contribution). Enabled with CFG_CRYPTO_SHA256_ARM64_CE=y, which is set by default on HiKey. Performance gains compared to the C implementation are as follows (sha-perf results for SHA-256 on HiKey, in MiB/s):
 Size | Accelerated?
(KiB) |    No    Yes
------+-------------
    1 |  11.4   18.3
    2 |  16.8   35.6
    4 |  21.8   66.8
    8 |  25.7  118.9
   16 |  28.3  195.5
   32 |  29.7  289.7
   64 |  30.5  383.3
  128 |  30.9  456.9
  256 |  31.2  505.3
  384 |  31.2  520.7
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
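The gain over the C implementation is easier to read as a ratio. The following is a minimal sketch that divides the accelerated throughput by the non-accelerated throughput for each buffer size; the figures are copied from the SHA-256 results in this commit message, while the names `C_IMPL`, `CE_IMPL` and `speedups` are illustrative and not part of the source tree.

```python
# Throughput in MiB/s, copied from the sha-perf results in this commit
# message (SHA-256 on HiKey). Keys are buffer sizes in KiB.
# The dictionary and function names are illustrative only.
C_IMPL = {1: 11.4, 2: 16.8, 4: 21.8, 8: 25.7, 16: 28.3,
          32: 29.7, 64: 30.5, 128: 30.9, 256: 31.2, 384: 31.2}
CE_IMPL = {1: 18.3, 2: 35.6, 4: 66.8, 8: 118.9, 16: 195.5,
           32: 289.7, 64: 383.3, 128: 456.9, 256: 505.3, 384: 520.7}


def speedups(baseline, accelerated):
    """Return {size_kib: accelerated / baseline} for sizes present in both."""
    return {size: accelerated[size] / baseline[size]
            for size in baseline if size in accelerated}


if __name__ == "__main__":
    for size, ratio in speedups(C_IMPL, CE_IMPL).items():
        print(f"{size:>4} KiB: {ratio:5.1f}x")
```

The ratio grows with the buffer size, from well under 2x at 1 KiB to more than 16x at 384 KiB.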
|
| 06c5ab4d | 03-Aug-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm: update SHA-256 32-bit CE implementation to process multiple blocks
Adjust the 32-bit ARMv8 Crypto Extensions version of the SHA-256 "compress" function to accept multiple blocks of input data. Rename a couple of files in preparation for the 64-bit implementation which will follow, and for consistency with SHA-1.
Performance with various buffer sizes was measured on HiKey with sha-perf. Values are in MiB/s; column 'n' means no acceleration, 'y (before)' is the parent commit's accelerated code, and 'y (after)' is this commit.
 Size | CFG_CRYPTO_SHA256_ARM32_CE=?
(KiB) |   n   | y (before) | y (after)
------+-------+------------+-----------
    1 |  17.8 |       31.9 |      36.3
    2 |  22.9 |       52.1 |      67.4
    4 |  26.9 |       78.9 |     117.5
    8 |  29.4 |      105.2 |     188.4
   16 |  30.9 |      125.3 |     268.5
   32 |  31.7 |      139.4 |     341.7
   64 |  32.1 |      147.8 |     401.4
  128 |  32.4 |      152.4 |     438.7
  256 |  32.5 |      154.8 |     460.6
  384 |  32.5 |      155.4 |     467.0
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
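To illustrate what "accept multiple blocks" changes for a caller, here is a rough model, not the actual OP-TEE code. It buffers input and hands complete 64-byte SHA-256 blocks to a compress callback, either one block per call or all complete blocks in a single call. The class `BlockFeeder` and the callback signature `compress(data, nblocks)` are hypothetical stand-ins for the assembly routine; that the gain comes from fewer calls into it is an assumption of this sketch, which the commit message itself does not state.

```python
BLOCK_SIZE = 64  # SHA-256 consumes input in 64-byte blocks


class BlockFeeder:
    """Buffers input and passes whole blocks to a compress callback.

    compress(data, nblocks) is a hypothetical stand-in for the assembly
    routine; it is not the real function signature.
    """

    def __init__(self, compress, multi_block=True):
        self._compress = compress
        self._multi_block = multi_block
        self._pending = b""  # bytes not yet forming a complete block
        self.calls = 0       # number of times compress was invoked

    def update(self, data: bytes) -> None:
        self._pending += data
        nblocks = len(self._pending) // BLOCK_SIZE
        if nblocks == 0:
            return
        whole = self._pending[:nblocks * BLOCK_SIZE]
        self._pending = self._pending[nblocks * BLOCK_SIZE:]
        if self._multi_block:
            # A single call covers every complete block.
            self._compress(whole, nblocks)
            self.calls += 1
        else:
            # One call per 64-byte block.
            for off in range(0, len(whole), BLOCK_SIZE):
                self._compress(whole[off:off + BLOCK_SIZE], 1)
                self.calls += 1


if __name__ == "__main__":
    for multi in (False, True):
        feeder = BlockFeeder(lambda data, nblocks: None, multi_block=multi)
        feeder.update(bytes(1024))  # 1 KiB = 16 blocks
        print(f"multi_block={multi}: {feeder.calls} call(s)")
```

For a 1 KiB buffer the single-block variant invokes the callback 16 times and the multi-block variant once.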
|
| 23900b59 | 03-Aug-2015 |
Jerome Forissier <jerome.forissier@linaro.org> |
arm: update SHA-1 32-bit CE implementation to process multiple blocks
The assembly code in sha1_armv8a_ce_a32.S is updated so that sha1_ce_transform() can process multiple blocks of data in a single call. Performance is significantly improved, and the code is unified with the 64-bit implementation.
Hashing throughput (MiB/s) reported by sha-perf on HiKey:
 Size | CFG_CRYPTO_SHA1_ARM32_CE=?
(KiB) |   n   | y (parent) | y (this commit)
------+-------+------------+----------------
    1 |  18.8 |       32.6 |            37.2
    2 |  24.9 |       53.8 |            68.7
    4 |  30.1 |       80.1 |           121.7
    8 |  33.6 |      106.0 |           198.1
   16 |  35.6 |      126.3 |           284.4
   32 |  36.7 |      140.3 |           365.1
   64 |  37.3 |      149.0 |           430.0
  128 |  37.6 |      153.6 |           471.9
  256 |  37.8 |      156.0 |           496.1
  384 |  37.8 |      156.6 |           505.1
Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Pascal Brand <pascal.brand@linaro.org>
|