Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support 64-byte seed form of ML-KEM private keys #1985

Open
bifurcation opened this issue Nov 8, 2024 · 7 comments · May be fixed by #1994
Open

Support 64-byte seed form of ML-KEM private keys #1985

bifurcation opened this issue Nov 8, 2024 · 7 comments · May be fixed by #1994
Labels
enhancement New feature or request help wanted Asking for support from non-core team

Comments

@bifurcation
Copy link

Describe the bug
There is increasing community consensus that ML-KEM private keys should be stored in the form of 64-byte seeds (i.e., the inputs to ML-KEM.KeyGen_internal()), not in the expanded form described in FIPS 203. See, for example:

https://words.filippo.io/dispatches/ml-kem-seeds/
https://datatracker.ietf.org/meeting/121/materials/slides-121-pquip-fips-issues-with-deploying-ml-kem-and-ml-dsa-04

It is impossible to implement protocols that are defined in terms of seeds. The OQS_KEM_ml_kem_XXX_keypair() method produces an expanded private key, and OQS_KEM_ml_kem_XXX_decaps() consumes an expanded private key.

The minimal possible fix here would be to split out a method that creates an expanded private key from a seed. That way an app could mostly just use the current API, except that when generating a completely fresh key, they would have to fill the seed with random data. To even that out, you could make a parallel seed-only API, either by refactoring the existing interface or creating a parallel one. Anything short of removing the current API of course, will leave the risk that a caller will use expanded keys, with the associated risk of corruption.

If we can agree on an approach here, I would be happy to send a PR.

To Reproduce
N/A

Expected behavior
N/A

Screenshots
N/A

Environment (please complete the following information):

  • liboqs version: main branch

Additional context
N/A

@baentsch
Copy link
Member

baentsch commented Nov 9, 2024

Thanks for that proposal and offer to contribute. Can I ask how you'd want to go about this? Would you want to provide patches to the upstream code liboqs pulls in? Are you willing to maintain them in the face of upstream code changes? Either way, this proposal goes against two of the most core liboqs design principles, though:

  1. Algorithm independence: all algs behave identically under the same API
  2. No algorithm maintenance: all core crypto code is developed and maintained in a responsible manner by upstream sources specialized in this, not by liboqs.

If we drop 1) adding algorithm-specific APIs integration of liboqs to other applications becomes much harder and more support-intensive for the OQS team ("which algorithm are you using which API with?").
If we drop 2), liboqs becomes a totally different project by taking ownership of the implementations of specific algorithms. I'm not convinced the project has the right people for this on board.

Thus, what about the suggestion that you propose a general change of (private) key format to the upstreams that liboqs pulls its code from? If those change their logic to generate/return/process private keys in the "seed" format "under the hood", liboqs can be maintained in its current form.

@bifurcation
Copy link
Author

Thanks for the clarification @baentsch. I didn't have the context that liboqs isn't owning these implementations.

It actually looks like no upstream changes are needed, because the required API is already supported by the underlying implementation. If I understand correctly, The coins argument to pqcrystals_kyberXXX_ref_keypair_derand() is what I have been referring to as the "seed". So we should be able to respect principle (2) with no problem.

Principle (1) then seems like it would guide towards the "parallel seed-only API" approach. We would create another set of algorithm identifiers (say, ML-KEM-XXX-seedonly, following the current), which would use the derand API for key generation and decapsulation (encap would be identical to current ML-KEM).

Does that seem acceptable?

@baentsch
Copy link
Member

Thanks for these explanations, @bifurcation . This proposal looks very good, if I get it right: It would allow liboqs to retain its (KEM) API (keygen/encaps/decaps), require no patches to the upstream code (but use different functions/params) and require downstream code just to change algorithm name (and happily make use of shorter private keys), right? That really sounds too good to be true :-) So I guess to judge that on merit, allow me to come back to your original offer

I would be happy to send a PR

--> looking forward to reviewing that. It will be unique in that it actually creates 2 implementations out of 1 upstream when running copy_from_upstream, so it may not be quite as straightforward as we think....

@baentsch baentsch added enhancement New feature or request help wanted Asking for support from non-core team labels Nov 11, 2024
@bhess
Copy link
Member

bhess commented Nov 11, 2024

Thanks @bifurcation for this proposal!

Principle (1) then seems like it would guide towards the "parallel seed-only API" approach. We would create another set of algorithm identifiers (say, ML-KEM-XXX-seedonly, following the current), which would use the derand API for key generation and decapsulation (encap would be identical to current ML-KEM).

I like the idea to have separate algorithm identifiers to support the seed-only variant.
Do I understand the proposal correctly that the OQS_KEM_keypair won't be changed and in this case just returns the 64-byte private key as secret_key? While encaps would be identical to current ML-KEM, it seems that decaps would need to be changed because the upstream API expects the expanded private key as input. With the current upstream implementation this could be solved by calling pqcrystals_kyberXXX_ref_keypair_derand as part of decaps.

Related to this - as a heads-up - there is ongoing discussion in the PQCP ML-KEM implementation pq-code-package/tsc#4 on extending the upstream API. PQCP is a candidate to replace the pq-crystals upstream in liboqs in the future. Some of the goals are to efficiently accommodate aspects like seed-only representation of private keys, handing expanded keys and supporting key validation. Some of the adaptations there would help us here.

@baentsch
Copy link
Member

it seems that decaps would need to be changed because the upstream API expects the expanded private key as input. With the current upstream implementation this could be solved by calling pqcrystals_kyberXXX_ref_keypair_derand as part of decaps

This is exactly how I understand the proposal by @bifurcation : Same OQS API calling a different pqcrystals API for the new key type.

Some of the goals are to efficiently accommodate aspects like seed-only representation of private keys, handing expanded keys and supporting key validation. Some of the adaptations there would help us here

That now begs the question: What upstreams does OQS keep supporting and (by/until) when? It surely would be great if PQCP would provide API, support and quality warranties OQS currently doesn't get from any upstream (and accordingly, cannot provide such qualities on to users of liboqs). Maybe worth while a separate discussion thread @SWilson4 @dstebila ?

@dstebila
Copy link
Member

This discussion seems related to a few other discussions in liboqs and PQ Code Package: pq-code-package/tsc#4 and #1877.

@baentsch
Copy link
Member

Thanks for the pointer to the PQCP discussion, @dstebila. Do you also read it such that PQCP may decide to support only one SK representation option (e.g., as per NIST guidance)? If so, I see a problem triggered by the reminder of @SWilson4 as to the original mission of OQS, namely to support research -- and as such to enable all options all algorithms permit: If PQCP were to only support one variant, where does OQS get the other from? Keep using a different upstream just for that seems inefficient / unnecessary work.

@bifurcation bifurcation linked a pull request Nov 13, 2024 that will close this issue
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Asking for support from non-core team
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

4 participants