feat: Integrate op-node with op-signer for block payload signing #103

mininny · 2024-10-02T21:43:41Z

Description

This PR introduces how we can integrate op-node with op-signer when signing the block payload for sequencer gossiping.

tynes · 2024-10-02T23:48:22Z

This is awesome! Great design doc

zhwrd · 2024-10-15T22:45:45Z

security/op-node-op-signer-integration.md

+# Risks & Uncertainties
+
+1. Because op-signer will sign any payloads sent to its RPC, we must ensure that only trusted entities can access the RPC of op-signer. Otherwise, an attacker might request op-signer to sign malicious block payloads, and use its signature to propagate invalid block data to peers in the network. To prevent this, we must ensure that we have safe mechanisms in place to only allow the connection between trusted entities. 
+2. As we create secure connection between the op-signer and op-node (whether by mTLS or other methods), we must ensure that the time for signing the payload is quick enough. It must be always faster than the block build time, and quick enough for good gossiping across the network.


Think we need to double check a few other things related to performance, capacity and availability since the signing frequency will be increased a lot from what we do with batcher, proposer & challenger today. Also none of those services have as strict uptime requirements as a sequencer does.

What is the downstream rate limit / quota for signing requests to HSM backed key in KMS? Do we expect to be within that quota?

What is the impact to the sequencing if we have an op-signer or KMS outage. Should we have a fallback mechanism?

What is the acceptable max latency for signing requests?

The rate limit of hsm is about 50 QPS for assymetric signing, so it's enough for signing every 2 seconds per block.

The HSM has uptime SLA as 99.95%-99.99%: https://cloud.google.com/kms/sla?hl=en. AWS provides better uptime with 99.999%.

I couldn't find any concrete information about the latency of HSM signing request, but I guess we could do some testings and figure out. I think the acceptable max latency is 2 seconds since we have to sign every new block, and I assume mTLS latency + HSM signing operation to be less than 2 seconds at the most.

As you pointed out, the availability of the op-signer and KMS integration is very important as it is directly connected with the sequencer's ability to propagate new l2 blocks. And it is true that HSM is the single point of failure here. In order to mitigate this, we could consider using multi-region HSM as well as multi-provider KMS by integrating with other cloud providers like AWS.

However, I think it's also worth noting that this integration is optional and only crucial when the operator of the sequencer and signer is different. For crucial chains, we should still use the fixed secret key on the sequencer as the main code path.

alfonso-op

Thanks @mininny, great doc. I left some comments.

I agree with Zach there are a few other performance and availability considerations we'll want to expand on.

alfonso-op · 2024-10-16T12:23:41Z

security/op-node-op-signer-integration.md

+
+### Considerations
+
+Above signing structure is a custom scheme, and does not align with other more official signing schemes like the ones introduced by EIP-712 or ERC-191, so we can’t use something like `eth_sign` or `eth_signTypedData`. 


It seems desirable to try and align on a more standard signing scheme. Provided it doesn't add considerable complexity / additional engineering effort, I recommend we implement against a standard and preserve backwards compatibility for older versions.

I agree on using standard signing scheme and adding backwards compatibility to existing scheme when possible. I did look at other schemes like eth_sign and eth_signTypedData, but they require more payload and parsing as they contain ethereum-related information necessary for signing.

I think this rpc is specific to the op-node's p2p signing behavior, and will most likely just be handled by the op-signer(or may beother optimism-related clients in the future), rather than being used in general clients like geth, etc. It'd be best to minimize the payload size to keep the signature latency as minimal possible, and that would require optimism-specific rpc.

Rather, what do you think about changing the rpc name from eth_signBlockPayload to something like optimism_signBlockPayload to remove the impression that this operation is related to ethereum when it's not?

+1 one introducing it into the optimism RPC namespace, or a new opsigner namespace, rather than the ethereum namespace.

Signing with eth_signTypedData over typed data would make sense, but that API also has a few flaws (more complicated schema decoding and communication, on every signing request). We can look into unifying that in a future upgrade, if desired. Perhaps optimism_signBlockPayload can become a front-end, wrapping eth_signTypedData and applying the schema, type checks and logging needed for block signing.

I updated the design doc and the implementation to use the new opsigner namespace: ethereum-optimism/infra@dda416c. And left unifying the signing scheme to a possible future work in the docs

+1 to not adding to the eth namespace. It doesn't look like neither ethereum-optimism/infra#59 in the signer client nor ethereum-optimism/optimism#12325 docs were updated to the new name.

@trianglesphere Oh you're right, the code were updated but not the descriptions. I just updated the description for those PRs to the new opsigner_signBlockPayload

alfonso-op · 2024-10-16T12:35:48Z

security/op-node-op-signer-integration.md

+```
+This means we are sending more data to the op-signer, but may be more secure since the op-signer controls all of the logic for signing the payload. 
+
+# Risks & Uncertainties


Another potential risk to keep in mind is ensuring we enforce the invariant that only the active sequencer is able to successfully sign blocks without odd race conditions (e.g., during sequencer leadership transfer).

To prevent such issues, does the design require any further integration / communication with op-conductor, or will this be guaranteed with the existing op-node implementation?

alfonso-op · 2024-10-16T12:36:57Z

security/op-node-op-signer-integration.md

+for each signing request, op-signer will look at the requesting server's domain, and look at this yaml file for a matching config. Then, it will look for the gcp kms key that matches the `key` name.
+
+
+## Security


Just a note that we'll want to conduct an FMA as part of this project.

Of course! I will write it before we deploy it :) thanks

Good callout, @alfonso-op, I will mention this in the FMA of SiaB project!

alfonso-op · 2024-10-16T12:38:05Z

security/op-node-op-signer-integration.md

+
+The signing of block payload in op-node is a custom signing structure where the signing hash is: `[32]byte keccak256Hash([32]byte domain, [32]byte chainId, [32]byte keccak256(payloadBytes))` (see op-node/p2p/signer.go). This signing hash is signed with the private key, to produce a ECDSA signature of 65 bytes. 
+
+# Proposed Solution


Given the security-sensitive changes involved, these changes will require governance approval. We'll want to factor this into planning and timelines cc @taempark

I think the scope this design doc is just adding more functionality to the op-signer. So I'm not sure if we'd need the governance approval for this feature. But, I think mentioning this to the FMA seem to be necessary.

One point I've thought about the governance approval is relevant is when we hand over the sequencer fully. I briefly discussed this with @mslipper, but that seem to not need the governance approval as well. But the decision of the conversation was let's double-check with @ben-chain just in case.

raffaele-oplabs · 2024-10-16T13:24:28Z

security/op-node-op-signer-integration.md

+
+Thankfully, op-signer already has mTLS support, as integrated in op-batcher and op-proposer. 
+
+Here, users can specify


'user can specify': does this refers to op-signer configuration service?

Yes, the tls certificate has to be set in both the op-signer / op-node service. Perhaps the wording users can is little misleading, as it's more of a must

raffaele-oplabs · 2024-10-16T13:38:08Z

security/op-node-op-signer-integration.md

+
+# Risks & Uncertainties
+
+1. Because op-signer will sign any payloads sent to its RPC, we must ensure that only trusted entities can access the RPC of op-signer. Otherwise, an attacker might request op-signer to sign malicious block payloads, and use its signature to propagate invalid block data to peers in the network. To prevent this, we must ensure that we have safe mechanisms in place to only allow the connection between trusted entities. 


I have read the code of op-signer and it technically does not manage authorization as far as I see.
Any authenticated user ( the one that have a valid client certificate) will be able to access the key with the same name as the one embedded in the certificate.
This may work in the assumption that any client own the key with the same name, and any client can perform all operations with that key.
This may be a valid assumption, I just want to make sure this is something we are aware in the moment we manage the certificates signing. As I was not able to see if there is a revocation list, the whole system security depends on how well the KMS for the client certificate is managed, and that there is an explicit written procedure that spell out the assumption that the authorization works with "DNSNames[0]" = access to a specific key+ all the functionalities of op-signer.
On top of that op-signer at the moment does not log network informations for auditing purposes like origin ip, or certificate signature as far as I can see.

Yes, I agree that we have an assumption that client with the specific DNS name + correct mTLS certificate will get access to all of the functionalities of op-signer using that specific key.

I'm all in for adding logs to op-signer for log network information and incoming request information.

Given that we safely set the mTLS certificate pair on the op-node and op-signer, what other security measures do you think we need to apply here?

Before proceeding I think we should look at this more carefully. Or add a blocker to this before we are ready to deploy. At the moment, for how this is implemented, is possible for anyone able to deploy in the same kluster, to obtain a valid certificate with the characteristic mentioned.
@alfonso-op how could we manage this? As part of the MFA mentioned above?

@raffaele-oplabs yeah I suggest we cover as part of the FMA. We should allocate some extra time, since we know there is some additional security hardening we will have to implement here.

We could start with request logs and alerts, but should brainstorm other potential approaches.

security/op-node-op-signer-integration.md

protolambda · 2024-10-16T21:47:11Z

security/op-node-op-signer-integration.md

+## New RPC: eth_signBlockPayload
+
+new eth-domain RPC (or under some other namespace)
+
+```go
+func eth_signBlockPayload(signingHash [32]byte) (sig *[65]byte, err error) 
+// where signingHash is keccak256Hash([32]byte domain, [32]byte chainId, [32]byte keccak256(payloadBytes))
+```


I think it's better if the 3 attributes are communicated over RPC fully. That way the op-signer server side can log the request, like it does with transaction signing requests, to clarify what is being signed.

Additionally, the chainID should be checked, as well as the signingHash, to not accidentally sign things not intended to be signed. E.g. when reusing the same sequencer key for multiple chains.

In a later protocol upgrade we might also want to structure it even more, to bind the block-number to the domain, or add one more layer of hashing to the payloadBytes part (like a head and a payload-body). That way the signing request can be more expressive, while summarizing the body content, to not pass through the full payload content which might otherwise slow things down.

protolambda · 2024-10-16T21:54:06Z

security/op-node-op-signer-integration.md

+
+### Considerations
+
+Above signing structure is a custom scheme, and does not align with other more official signing schemes like the ones introduced by EIP-712 or ERC-191, so we can’t use something like `eth_sign` or `eth_signTypedData`. 


+1 one introducing it into the optimism RPC namespace, or a new opsigner namespace, rather than the ethereum namespace.

Signing with eth_signTypedData over typed data would make sense, but that API also has a few flaws (more complicated schema decoding and communication, on every signing request). We can look into unifying that in a future upgrade, if desired. Perhaps optimism_signBlockPayload can become a front-end, wrapping eth_signTypedData and applying the schema, type checks and logging needed for block signing.

protolambda · 2024-10-16T21:56:57Z

security/op-node-op-signer-integration.md

+	payloadHash [32]byte `json:"payloadHash"`
+}
+```
+This means we are sending more data to the op-signer, but may be more secure since the op-signer controls all of the logic for signing the payload. 


I think op-signer being configured with expected parameters and checking the RPC method arguments match is a better trade-off. That way you can detect misuse of the API, rather than producing a signature for the assumed domain and finding it's invalid much later when first doing a verification of the produced signature.

Perhaps the RPC should also include the "sender address", so we can assert that matches the expected address. And then it can be used as an intent, in case this same op-signer RPC server can sign for multiple different p2p signer keys.

Thanks for the comment! then I'll modify the design doc and code to send the domain, chainId, payloadBytesHash and senderAddress, and add validation for these parameters. Then we could add configurable parameter for chainId and senderAddress when spinning up the op-signer, and validate the incoming rpc requests to match these configurations :)

And maybe we should bundle them in a struct, rather than specify them as individual function arguments. This will make it easier to extend the input in the future, by adding optional struct fields.

This is updated in the design doc as well as the op-node pr here! ethereum-optimism/optimism@12c4820

Is the purpose of the sender address to verify if the request is valid in an expected restricted environment? I mean, if it has the security purpose, attackers can easily change the address of a request.

The address should not be trusted, but it should help specify which account is meant to be used, if there are multiple options. The eth signing API supports a from-address, so it makes sense to do the same here.

feat: op-node op-signer integration

bf4c4bb

mininny force-pushed the feat/op-node-op-signer-integration branch from 71302c4 to bf4c4bb Compare October 2, 2024 21:59

mininny mentioned this pull request Oct 4, 2024

op-signer, op-node: Integrate op-node with op-signer for block payload signing ethereum-optimism/optimism#12325

Open

Add section about usage of op-signer

4d2a8b6

mininny mentioned this pull request Oct 7, 2024

feat: Integrate op-node with op-signer for block payload signing ethereum-optimism/infra#59

Open

zhwrd reviewed Oct 15, 2024

View reviewed changes

raffaele-oplabs self-assigned this Oct 16, 2024

alfonso-op reviewed Oct 16, 2024

View reviewed changes

raffaele-oplabs reviewed Oct 16, 2024

View reviewed changes

protolambda reviewed Oct 16, 2024

View reviewed changes

Apply review comments

0266c3c

mininny force-pushed the feat/op-node-op-signer-integration branch from 5ac3619 to 0266c3c Compare October 18, 2024 22:07

protolambda approved these changes Oct 23, 2024

View reviewed changes

mininny merged commit 4e9938a into ethereum-optimism:main Oct 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Integrate op-node with op-signer for block payload signing #103

feat: Integrate op-node with op-signer for block payload signing #103

mininny commented Oct 2, 2024

tynes commented Oct 2, 2024

zhwrd Oct 15, 2024 •

edited

Loading

mininny Oct 16, 2024

alfonso-op left a comment

alfonso-op Oct 16, 2024

mininny Oct 16, 2024

protolambda Oct 16, 2024

mininny Oct 18, 2024

trianglesphere Oct 23, 2024

mininny Oct 23, 2024

alfonso-op Oct 16, 2024

alfonso-op Oct 16, 2024

mininny Oct 16, 2024

taempark Oct 16, 2024

alfonso-op Oct 16, 2024

taempark Oct 17, 2024

raffaele-oplabs Oct 16, 2024

mininny Oct 16, 2024

raffaele-oplabs Oct 16, 2024

mininny Oct 16, 2024

raffaele-oplabs Oct 21, 2024

alfonso-op Oct 21, 2024

protolambda Oct 16, 2024

protolambda Oct 16, 2024

protolambda Oct 16, 2024

mininny Oct 17, 2024

protolambda Oct 17, 2024

mininny Oct 18, 2024

ImTei Oct 21, 2024 •

edited

Loading

protolambda Oct 23, 2024


		### Considerations

		Above signing structure is a custom scheme, and does not align with other more official signing schemes like the ones introduced by EIP-712 or ERC-191, so we can’t use something like `eth_sign` or `eth_signTypedData`.

		for each signing request, op-signer will look at the requesting server's domain, and look at this yaml file for a matching config. Then, it will look for the gcp kms key that matches the `key` name.


		## Security


		The signing of block payload in op-node is a custom signing structure where the signing hash is: `[32]byte keccak256Hash([32]byte domain, [32]byte chainId, [32]byte keccak256(payloadBytes))` (see op-node/p2p/signer.go). This signing hash is signed with the private key, to produce a ECDSA signature of 65 bytes.

		# Proposed Solution


		Thankfully, op-signer already has mTLS support, as integrated in op-batcher and op-proposer.

		Here, users can specify


		# Risks & Uncertainties

		1. Because op-signer will sign any payloads sent to its RPC, we must ensure that only trusted entities can access the RPC of op-signer. Otherwise, an attacker might request op-signer to sign malicious block payloads, and use its signature to propagate invalid block data to peers in the network. To prevent this, we must ensure that we have safe mechanisms in place to only allow the connection between trusted entities.

feat: Integrate op-node with op-signer for block payload signing #103

feat: Integrate op-node with op-signer for block payload signing #103

Conversation

mininny commented Oct 2, 2024

tynes commented Oct 2, 2024

zhwrd Oct 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alfonso-op left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ImTei Oct 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zhwrd Oct 15, 2024 •

edited

Loading

ImTei Oct 21, 2024 •

edited

Loading