Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BFT-457: Remove bn254 #123

Merged
merged 1 commit into from
Jun 12, 2024
Merged

BFT-457: Remove bn254 #123

merged 1 commit into from
Jun 12, 2024

Conversation

aakoshh
Copy link
Contributor

@aakoshh aakoshh commented Jun 10, 2024

What ❔

Removes the bn254 crypto module. It's a separate PR so the code is easy to restore, should we need this module again.

Why ❔

Because #121 introduces ECDSA for attester signatures and bn254 will not be used for anything any more.

@aakoshh aakoshh mentioned this pull request Jun 11, 2024
8 tasks
@aakoshh aakoshh marked this pull request as ready for review June 11, 2024 11:05
@aakoshh aakoshh requested a review from brunoffranca as a code owner June 11, 2024 11:05
Copy link
Member

@brunoffranca brunoffranca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Let's merge after the other PR.

Base automatically changed from 457-ecdsa to main June 12, 2024 15:40
@brunoffranca brunoffranca requested a review from pompon0 as a code owner June 12, 2024 15:40
brunoffranca pushed a commit that referenced this pull request Jun 12, 2024
## What ❔

Adds `secp256k1` keys and signatures and uses it in the `attester` role
instead of `bn254`.
- [x] Add `secp256k1::{SecretKey, PublicKey, Signature,
AggregateSignature}`
- [x] Replace `bn254` with `secp256k1` in the `attester`
- [x] Write unit tests for serialization roundtrips
- [x] Write unit tests for signing and verifying messages
- [x] Get some test vectors for Ethereum to ensure that the signatures
are compatible with the Solidity semantics (e.g. shifting the recovery
ID by 27; see
[ECDSA.sol](https://github.com/OpenZeppelin/openzeppelin-contracts/blob/master/contracts/utils/cryptography/ECDSA.sol)
for reference).
- [x] Investigate failing genesis and other tests
- [x] Remove `bn254` - deferred to
#123
- [x] Investigate failing Twins test

### Caveats

#### Inefficient aggregate signature verification

The aggregate signature is just a list of bytes, but it mimics the API
of the `bn256` version which really supported aggregation and was
insensitive to message and key ordering during verification. To fulfil
that expectation it has to match every message against every signature
until a matching pair is found, which can result in repeated futile
signature verifications.

The API also supports every member signing different messages, which was
utilised to aggregate the signatures of validators into a quorum
certificate even when they all signed different justifications. In
practice we have attesters signing the same batch, so we could get rid
of this option. If we did, then then aggregate signature in the batch
could become a `Vec<(PublicKey, Signature)>`, which would be less costly
to verify in Solidity than what is currently happening in Rust (matching
every signatory against every possible signature), although a final
Solidity payload transformation can be done by the L1 batch submitter as
well.

On a similar note, the seralized signature aims to be compatible with
Solidity by shifting the recovery ID by 27, but this could also be
deferred for later, to keep this closer to being standard.

#### Change in protobuf

~~I did not rename the `bn254` field in the protobuf format to
`secp256k1`, just copied the same comment that the field name is wrong
from the validator role proto file where we have a similar situation.
I'll leave it up to you if breaking the JSON schema is worse than
misleading names.~~

As requested, a new field has been added. 

### Unrelated test failures

I encountered some test failures that seemingly have nothing to do with
my changes. I reckon the reason is that a different number of calls are
made to `rng`, which affects subsequently random values. For example
during the generation of `Genesis` the new attester keys generation, as
a side effect, changes the leader schedule which is generated a few
lines down from it, which then causes a test to fail due to the value of
the schedule, not the attesters. Notably this happened to tests which
have no repetition, ie. they only exercise one random sample.

We discussed once that testing with a deterministic but slowly changing
random seed could surface such test failures on CI, perhaps on a nightly
run, so as not to disrupt people's workflow. This was considered
undesireable as it 1) might provide a false sense of confidence to some
test authors that coverage will increase with time, enticing them to
write lower quality tests and 2) annoy fellow developers when it fails
on CI. Nevertheless this is an example that even with a static seed,
unrelated test failures can pop up as we evolve the code. One could even
argue that the static seed gives test authors a false sense of "mission
accomplished".

#### Twins 

One of the tests that failed was the twins network. It generated a
scenario in which node A could not gossip a block 98 to node B, and node
B got blocked indefinitely waiting for block 98 to appear in its store.
Since the setup dictated that every node has only 1 gossip peer, no
other node tried to send block 98 to B (or vice versa B would not have
tried to get it from anybody else). For some reason A didn't have the
block and possibly was also blocked waiting for it; it didn't seem to
participate actively in the rounds. I'm not sure why nobody gossiped the
block to A, but what is sure is that the current mock gossip mechanism
in the test would be insufficient to keep things going even if they did,
because a gossip from A to B is only triggered once, when A tries to
send a message to B.

As a seemingly sensible workaround I increased the number of gossip
peers to be larger than the amount of twins in the network, ie. more
than the faulty replicas, which seems like a good policy anyway.

It works now but we could potentially turn the gossip to be more
independent from messaging than it currently is. Or a simpler
alternative would be to asynchronously wait until the source has the
block, and then send it to the target, rather than give up immediately.

## Why ❔

So that we can send batch signatures to L1 and verify them in smart
contracts, where the Secp256k1 curve is readily supported by a
precompile.
Copy link
Contributor

@pompon0 pompon0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need my review, please merge main into this branch first.

@aakoshh aakoshh merged commit 21deed0 into main Jun 12, 2024
5 checks passed
@aakoshh aakoshh deleted the 457-rm-bn254 branch June 12, 2024 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants