CDK Data distribution #172
vcastellm started this conversation in Architecture
CDK Data distribution
Background
CDK is not aware of the chain data; the sequencer is responsible for storing it as the single source of truth.
cdk-erigon implements the Datastreamer protocol to transmit this data to downstream clients, which is how the data gets distributed.
This mechanism works for the FEP path using cdk-erigon, but in PP we are opening the door to any vanilla EVM client and to arbitrary external State Transition Functions (games, databases, non-EVM chains), so we cannot rely on each of those systems to implement its own chain data distribution mechanism.
In this proposal, we want to introduce a solution for distributing the chain data on the CDK side, standardized and tailor-made for CDK, so that this complexity is abstracted away from other systems and they don't need to solve the issue themselves.
Goals & Non-goals
Proposal
Networking architecture
The proposal is to use a P2P mechanism to favor decentralization of the chain data between CDK nodes.
We propose a modular approach for the data distribution, in which different backend adaptors can be developed to transmit the data, abstracting the data from the transport. This facilitates future data backends that support consensus engines like CometBFT, allowing trustless decentralization of the sequencing.
Internally, cdk will handle the data serialized and deserialized as protobuf types for efficiency, and will provide the data to backend adaptors through an interface.
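To make the separation between data and transport concrete, here is a minimal sketch of what such an interface could look like. Everything in it is hypothetical: the names `ChainDataChunk`, `DataBackend`, `LoopbackBackend` and `relay` are illustrative, the `payload` bytes stand in for a protobuf-encoded message, and the in-memory backend is only a placeholder for a real transport.

```rust
use std::collections::VecDeque;

/// Hypothetical unit of chain data. `payload` stands in for the
/// protobuf-encoded bytes that cdk would handle internally.
#[derive(Debug, Clone, PartialEq)]
pub struct ChainDataChunk {
    pub sequence: u64,
    pub payload: Vec<u8>,
}

/// Hypothetical transport-agnostic interface offered to backend adaptors.
pub trait DataBackend {
    /// Hand a chunk to the transport.
    fn publish(&mut self, chunk: ChainDataChunk) -> Result<(), String>;
    /// Take the next chunk received from the transport, if any.
    fn next_chunk(&mut self) -> Option<ChainDataChunk>;
}

/// In-memory placeholder backend: whatever is published can be read back.
#[derive(Default)]
pub struct LoopbackBackend {
    queue: VecDeque<ChainDataChunk>,
}

impl DataBackend for LoopbackBackend {
    fn publish(&mut self, chunk: ChainDataChunk) -> Result<(), String> {
        // Reject chunks that do not advance the sequence of the queued data.
        if let Some(last) = self.queue.back() {
            if chunk.sequence <= last.sequence {
                return Err(format!(
                    "sequence {} does not follow {}",
                    chunk.sequence, last.sequence
                ));
            }
        }
        self.queue.push_back(chunk);
        Ok(())
    }

    fn next_chunk(&mut self) -> Option<ChainDataChunk> {
        self.queue.pop_front()
    }
}

/// Written only against the trait, so it works with any backend.
/// Returns the number of chunks moved.
pub fn relay(from: &mut dyn DataBackend, to: &mut dyn DataBackend) -> Result<usize, String> {
    let mut moved = 0;
    while let Some(chunk) = from.next_chunk() {
        to.publish(chunk)?;
        moved += 1;
    }
    Ok(moved)
}
```

Because `relay` depends only on the trait, swapping the placeholder for a different transport would not change the calling code, which is the property the modular approach is after.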
The first proposed adaptor aims to be not overly complex, efficient, and trusted but decentralized:
Revisiting past attempts to solve this issue: in Edge we had a system that worked quite well, using a combination of Libp2p for node discovery and gRPC for data transfer. Edge was both efficient and decentralized in distributing the chain data.
Libp2p is the P2P solution of choice for modern decentralized node applications, and gRPC is a state-of-the-art RPC protocol that uses HTTP/2 and supports bidirectional streaming of data.
This implementation would be ported to the cdk Rust code; both libraries exist for Rust and are well supported.
Integration with Execution Clients
Different execution clients may already have their own data distribution system, so the CDK will need to implement "Adaptors" for these protocols in order to read the data from them, distribute it between the CDK nodes, and feed the data back to the Execution Clients.
Another possibility is that the data system in the EC runs in parallel. That is not an issue for CDK, but a read-only adaptor will still need to be in place.
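The two integration modes described above (read and feed back, or read-only) could be sketched as a single adaptor interface. This is an illustration under assumptions, not a proposed API: `BlockData`, `AdaptorError`, `ExecutionClientAdaptor`, `InMemoryClient` and `sync` are hypothetical names, and the in-memory client stands in for a real EC. A read-only adaptor rejects writes.

```rust
/// Hypothetical block record exchanged between an EC and CDK.
#[derive(Debug, Clone, PartialEq)]
pub struct BlockData {
    pub number: u64,
    pub bytes: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum AdaptorError {
    /// The adaptor only reads from the EC and cannot feed data back.
    ReadOnly,
    /// The block does not extend the EC's current chain.
    OutOfOrder { expected: u64, got: u64 },
}

/// Hypothetical interface between CDK and an execution client.
pub trait ExecutionClientAdaptor {
    /// Read all blocks with number >= `from`.
    fn read_from(&self, from: u64) -> Vec<BlockData>;
    /// Feed one block back into the execution client.
    fn feed(&mut self, block: BlockData) -> Result<(), AdaptorError>;
}

/// In-memory stand-in for an execution client (block numbers start at 0).
#[derive(Default)]
pub struct InMemoryClient {
    blocks: Vec<BlockData>,
    read_only: bool,
}

impl InMemoryClient {
    /// Models an EC that runs its own data system in parallel.
    pub fn new_read_only(blocks: Vec<BlockData>) -> Self {
        Self { blocks, read_only: true }
    }
}

impl ExecutionClientAdaptor for InMemoryClient {
    fn read_from(&self, from: u64) -> Vec<BlockData> {
        self.blocks.iter().filter(|b| b.number >= from).cloned().collect()
    }

    fn feed(&mut self, block: BlockData) -> Result<(), AdaptorError> {
        if self.read_only {
            return Err(AdaptorError::ReadOnly);
        }
        // The next accepted block must directly follow the last stored one.
        let expected = self.blocks.last().map_or(0, |b| b.number + 1);
        if block.number != expected {
            return Err(AdaptorError::OutOfOrder { expected, got: block.number });
        }
        self.blocks.push(block);
        Ok(())
    }
}

/// Copy blocks from `source` into `target`, starting at block `from`.
pub fn sync(
    source: &dyn ExecutionClientAdaptor,
    target: &mut dyn ExecutionClientAdaptor,
    from: u64,
) -> Result<usize, AdaptorError> {
    let blocks = source.read_from(from);
    let count = blocks.len();
    for block in blocks {
        target.feed(block)?;
    }
    Ok(count)
}
```

In this sketch the read-only case is a runtime error rather than a separate trait; splitting reading and feeding into two traits would be an equally plausible design.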
Example use cases
Vanilla EVM clients
Reth, Erigon, and geth use devp2p for data replication.
On the sequencer, CDK can have an RPC adaptor that reads data from the EC using RPC. Running as a sidecar, CDK will distribute the data to other CDK nodes and feed it to the EC using a method accepted by the EC.
Arbitrary application
For example, a game can expose an API to read and write data to the application. We can create a CDK adaptor that uses this interface and converts the data to the internal CDK format; the data will then be replicated to other CDK nodes and used to sync other instances of that application locally.
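As a rough illustration of the conversion step in such an adaptor, the sketch below maps a made-up game event to a generic record and back. `GameMove`, `CdkRecord`, `to_record` and `from_record` are hypothetical, and the fixed 8-byte big-endian layout is only a stand-in for whatever internal (protobuf) format CDK ends up using.

```rust
/// Hypothetical application-level event exposed by the game's API.
#[derive(Debug, Clone, PartialEq)]
pub struct GameMove {
    pub player: u32,
    pub x: u16,
    pub y: u16,
}

/// Hypothetical internal record; the payload is opaque to the transport.
#[derive(Debug, Clone, PartialEq)]
pub struct CdkRecord {
    pub index: u64,
    pub payload: Vec<u8>,
}

/// Adaptor, write side: encode a game move as 8 big-endian bytes.
pub fn to_record(index: u64, m: &GameMove) -> CdkRecord {
    let mut payload = Vec::with_capacity(8);
    payload.extend_from_slice(&m.player.to_be_bytes());
    payload.extend_from_slice(&m.x.to_be_bytes());
    payload.extend_from_slice(&m.y.to_be_bytes());
    CdkRecord { index, payload }
}

/// Adaptor, read side: decode a replicated record on another node.
/// Returns `None` if the payload does not have the expected length.
pub fn from_record(record: &CdkRecord) -> Option<GameMove> {
    let p = &record.payload;
    if p.len() != 8 {
        return None;
    }
    Some(GameMove {
        player: u32::from_be_bytes([p[0], p[1], p[2], p[3]]),
        x: u16::from_be_bytes([p[4], p[5]]),
        y: u16::from_be_bytes([p[6], p[7]]),
    })
}
```

A node receiving the record would call `from_record` and apply the resulting move through the game's write API, so the local instance ends up in the same state as the source.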