-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Hacl_AES_128_GCM_NI and Hacl_AES_128_GCM_M32 #418
base: dev
Are you sure you want to change the base?
Conversation
We require contributors to sign our Contributor License Agreement https://github.com/cryspen/hacl/blob/main/CLA.md ensuring that the contribution can be licensed under Apache 2.0 and MIT. In order for us to review and merge your code, please mention @cryspen/core in a comment below to get yourself added. |
We require contributors to sign our Contributor License Agreement https://github.com/cryspen/hacl/blob/main/CLA.md ensuring that the contribution can be licensed under Apache 2.0 and MIT. In order for us to review and merge your code, please mention @cryspen/core in a comment below to get yourself added. |
Pull Request Test Coverage Report for Build 5808703668
💛 - Coveralls |
Thanks! I'll try to look at this today or tomorrow. |
MSVC specific tests fail along with some other tests, I'm going to investigate what triggers that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks really good already!
I left some comments mostly on the build and structure. I'll do another pass in the coming days on the actual code.
cpu-features/src/cpu-features.c
Outdated
@@ -186,15 +218,37 @@ hacl_init_cpu_features() | |||
_sse42 = (ecx & ECX_SSE4_2) != 0; | |||
#endif | |||
|
|||
#if defined(CPU_FEATURES_LINUX) && defined(CPU_FEATURES_ARM64) && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has only been used for tests so far and is hence not entirely complete.
If we want to use it for the AES code we should add the Windows bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I don't have arm64 machine with Windows on it neither have access to such instance. I'd like to know any possible options to achieve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine not doing Windows on Arm, but we should use your new AESGCM implementation on Windows when we can't use the evercrypt version. So we should make sure that the feature detection here works on Windows x86/x64. Alternatively we'll always use the portable implementation on Windows, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this the default execution flow already, hacl_init_cpu_features()
doesn't do any configuration on Windows which means the global variables are left with the initial value 0
. It flows down ultimately to the portable implementation.
At first, I thought you would mean to implement feature detection on Windows ARM. It makes sense to have it works at least for x86 which is more important. I will see what I can do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. You can use the __cpuid
function on Windows (https://learn.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex?view=msvc-170)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__cpuid
function is a good choice to avoid having assembly file that utilizes cpuid
instruction since inline assembly isn't supported for MSVC x64. We still need to have inline assembly of cpuid for other compilers on Windows, fortunately we can use cpuid()
that already exist for Linux but I don't understand why this function preserves ebx
register for 32-bit version https://github.com/cryspen/hacl-packages/blob/main/cpu-features/src/cpu-features.c#L65 and not do that for x64 version. I disassembled x64 version of the function built with clang and the generated machine code does preserve rbx
register since it's specified in the input operands https://github.com/cryspen/hacl-packages/blob/main/cpu-features/src/cpu-features.c#L51
Technically, both versions are supposed to work properly I just think it's good to have identical scheme for both if possible. 32-bit scheme seems more reasonable to me because I'm not sure why the compiler would preserve input operand of inline assembly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change #elif defined(__GNUC__)
to #elif defined(__linux__)
mamonet@4b5f3ed#diff-fff0c10580c2c65492a930241a62b51ac2cc56c8f0ff4ea59f5a42e38d43bd62L30
Let me know if it implies other conditions I didn't cover
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work.
I am having difficulty reproducing opam build error, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adressing the comments.
A few more thoughts.
cpu-features/src/cpu-features.c
Outdated
@@ -186,15 +218,37 @@ hacl_init_cpu_features() | |||
_sse42 = (ecx & ECX_SSE4_2) != 0; | |||
#endif | |||
|
|||
#if defined(CPU_FEATURES_LINUX) && defined(CPU_FEATURES_ARM64) && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine not doing Windows on Arm, but we should use your new AESGCM implementation on Windows when we can't use the evercrypt version. So we should make sure that the feature detection here works on Windows x86/x64. Alternatively we'll always use the portable implementation on Windows, right?
CMakeLists.txt
Outdated
@@ -284,6 +321,37 @@ if(TOOLCHAIN_CAN_COMPILE_VALE) | |||
set(HACL_CAN_COMPILE_VALE 1) | |||
endif() | |||
|
|||
if(TOOLCHAIN_CAN_COMPILE_AESNI_PCLMUL AND TOOLCHAIN_CAN_COMPILE_VEC128 AND ((ARCHITECTURE MATCHES intel AND TOOLCHAIN_CAN_COMPILE_VEC256) OR HACL_TARGET_ARCHITECTURE MATCHES ${HACL_ARCHITECTURE_ARM64})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we break this line? It's a little long and hard to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
benchmarks/aesgcm.cc
Outdated
bytes ciphertext(state.range(0) + 16, 0); | ||
|
||
for (auto _ : state) { | ||
Lib_IntVector_Intrinsics_vec128 *ctx = (Lib_IntVector_Intrinsics_vec128 *)KRML_HOST_CALLOC((uint32_t)352U, sizeof (uint8_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There needs to be an alloc function such that the caller/user doesn't need to know the magic numbers here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't think of a convenient way to add such function. I don't seem to be able to alloc/dealloc memory buffers at Low*-level, these functionalities are specific to this library which only utilizes NI and CT64 function from inside EverCrypt functions that the user are supposed to use instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can look at other alloc functions like
hacl-packages/src/Hacl_Hash_SHA3.c
Line 175 in 81303b8
Hacl_Streaming_Keccak_state *Hacl_Streaming_Keccak_malloc(Spec_Hash_Definitions_hash_alg a) |
hacl-packages/src/Hacl_Hash_Blake2s_128.c
Line 480 in 81303b8
Lib_IntVector_Intrinsics_vec128 *Hacl_Blake2s_128_blake2s_malloc(void) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented malloc/free for AES_GCM context on hacl side mamonet/hacl-star@c9880f1
I will update it for this library once the issue of verifying changes of EverCrypt.AEAD is solved.
Lib_IntVector_Intrinsics_vec128 *ctx = (Lib_IntVector_Intrinsics_vec128 *)KRML_HOST_CALLOC((uint32_t)352U, sizeof (uint8_t)); | ||
Hacl_AES_128_GCM_NI_aes128_gcm_init(ctx, key.data()); | ||
Hacl_AES_128_GCM_NI_aes128_gcm_encrypt(ctx, plaintext.size(), ciphertext.data(), plaintext.data(), 0, NULL, nonce.size(), nonce.data()); | ||
KRML_HOST_FREE(ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to the alloc, do we need a free function here.
src/Hacl_AES_128_BitSlice.c
Outdated
|
||
#include "internal/Hacl_Lib.h" | ||
|
||
typedef struct __uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type needs a better name :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a KaRaMeL artifact.
We can try to rename things in Low* to make this look a bit better, but this may need an upstream change in KaRaMeL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can pick a name for this, by writing
let meaningful_name = uint64 & uint64 & ... & uint64
(or whatever the structural type is, on the right-hand side)
this has been done for a while in HACL to improve code quality... hit me up if it doesn't work or if you need a different kind of fix
src/Hacl_AES_128_BitSlice.c
Outdated
} | ||
__uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t; | ||
|
||
static __uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t_uint64_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
The ocaml build seems to fail. |
I updated the PR with the following changes
New implementations have passed Wycheproof tests successfully. GHASH performance measured on intel i5-10300H, built with clang -O3
|
The imported C files of AESGCM are generated from this branch https://github.com/mamonet/hacl-star/tree/pnmadelaine_aes that has the latest changes committed here. PreComp implementation of ghash is removed and replaced with CT64 which is ~2.5x faster than the former. Also, hardware-accelerated implementation and CT64 share the same algorithm of 128x128->256 multiplication phase and identical 256->128 reduction phase that increases the shared code base and remove a considerable amount of distinctive definitions, properties, lemmas, and pre-computed tables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get the remaining comments resolved, then we can merge this.
cpu-features/src/cpu-features.c
Outdated
@@ -186,15 +218,37 @@ hacl_init_cpu_features() | |||
_sse42 = (ecx & ECX_SSE4_2) != 0; | |||
#endif | |||
|
|||
#if defined(CPU_FEATURES_LINUX) && defined(CPU_FEATURES_ARM64) && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. You can use the __cpuid
function on Windows (https://learn.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex?view=msvc-170)
#include "internal/Hacl_Spec.h" | ||
#include "config.h" | ||
#include "hacl-cpu-features.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, we need a strategy for handling these changes.
I'm ok with landing this for now. But we need to resolve this quickly on the hacl-star side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jonathan has a clue to solve this issue, we are going to go over this tomorrow. I will keep you posted once an update comes out.
Rust and ocaml appear to be missing the hacl CPU detection
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get another review you need re-request review.
Just a few comments here.
cpu-features/src/cpu-features.c
Outdated
@@ -186,15 +218,37 @@ hacl_init_cpu_features() | |||
_sse42 = (ecx & ECX_SSE4_2) != 0; | |||
#endif | |||
|
|||
#if defined(CPU_FEATURES_LINUX) && defined(CPU_FEATURES_ARM64) && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work.
benchmarks/aesgcm.cc
Outdated
bytes ciphertext(state.range(0) + 16, 0); | ||
|
||
for (auto _ : state) { | ||
Lib_IntVector_Intrinsics_vec128 *ctx = (Lib_IntVector_Intrinsics_vec128 *)KRML_HOST_CALLOC((uint32_t)352U, sizeof (uint8_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can look at other alloc functions like
hacl-packages/src/Hacl_Hash_SHA3.c
Line 175 in 81303b8
Hacl_Streaming_Keccak_state *Hacl_Streaming_Keccak_malloc(Spec_Hash_Definitions_hash_alg a) |
hacl-packages/src/Hacl_Hash_Blake2s_128.c
Line 480 in 81303b8
Lib_IntVector_Intrinsics_vec128 *Hacl_Blake2s_128_blake2s_malloc(void) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Mamone,
Thanks for this PR! I'm very excited to see support for AES-GCM improving in HACL.
All of the initial modules seem good. My main question revolves around the EverCrypt layer... did you edit those files by hand? If yes, I can help you make sure the changes are in the original F* source and generate correct C code, including suitable ifdef guards and runtime tests.
I'm generally available on the Everest slack... let me know how I can help!
Cheers,
Jonathan
include/msvc/libintvector.h
Outdated
#define Lib_IntVector_Intrinsics_vec128_shuffle32(x0, x1, x2, x3, x4) \ | ||
(_mm_shuffle_epi32(x0, _MM_SHUFFLE(x1,x2,x3,x4))) | ||
((uint32x4_t){((uint32x4_t)x0)[x1],((uint32x4_t)x0)[x2],((uint32x4_t)x0)[x3],((uint32x4_t)x0)[x4]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why bother casting x0 hre to uint32x4_t since vec128 is already defined to be that type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Thanks!
I apologize for the late response, I only got notification for your main comment and just noticed the code comments.
|
||
void Hacl_Gf128_CT64_gcm_update_blocks(uint64_t *ctx, uint32_t len, uint8_t *text); | ||
|
||
extern void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you probably inadvertently used a let f = g
binding in HACL* -- I would suggest adding the entire list of function arguments to to avoid generating an extern... hit me up on slack if this doesn't make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Module 'Hacl.Gf128.CT64' is forked from 'Hacl.Gf128.NI' the latter seems to handle the case properly https://github.com/mamonet/hacl-star/blob/pnmadelaine_aes/code/gf128/Hacl.Gf128.NI.fst#L39-L41 but I don't understand the difference between the two modules.
#define Spec_Cipher_Expansion_AESNI_PCLMUL_AES128 3 | ||
#define Spec_Cipher_Expansion_AESNI_PCLMUL_AES256 4 | ||
#define Spec_Cipher_Expansion_CT64_AES128 5 | ||
#define Spec_Cipher_Expansion_CT64_AES256 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see the corresponding source code change here: hacl-star/hacl-star@pnmadelaine_aes...mamonet:hacl-star:pnmadelaine_aes is this handwritten?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I didn't know the F* source of this module resides in providers/evercrypt
directory. However, I can tell reflecting the changes in hacl side is a little bit challenging, hopefully we can address this tomorrow.
@@ -26,8 +26,15 @@ | |||
#include "EverCrypt_AEAD.h" | |||
|
|||
#include "internal/Vale.h" | |||
#ifdef HACL_CAN_COMPILE_AESNI_PCLMUL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the majority of the changes in this file are hand-written... is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there's no reason to hand-write the changes other than not noticing the F* source file.
@franziskuskiefer I discussed the issue of checking runtime cpu features inside Evercrypt module with Jonathan yesterday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. Let me first summarise my understanding of your comment
You're saying that the new feature detection will only be for ARMv8, and thus won't give us much over using the portable when not on x86 with AES-NI.
We'll still want to use the ARMv8 version directly without evercrypt. We just don't expose it in evercrypt.
That's fine with me and will simplify the evercrypt integration significantly.
This patch adds the following changes: