feat: Implementing Redis-Raft commands to build a cluster. #136

Merged
merged 76 commits into from
Mar 8, 2024

@KKorpse KKorpse commented Jan 20, 2024

Three commands that must be implemented to build a cluster

  • Start a node: RAFT.CLUSTER INIT <id>
// init server1 
redis-server \
    --port 5001 --dbfilename raft1.rdb \
    --loadmodule <path-to>/redisraft.so \
    --raft.log-filename raftlog1.db \
    --raft.addr localhost:5001

// init server2 
redis-server \
    --port 5002 --dbfilename raft2.rdb \
    --loadmodule <path-to>/redisraft.so \
    --raft.log-filename raftlog2.db \
    --raft.addr localhost:5002

// init server3
redis-server \
    --port 5003 --dbfilename raft3.rdb \
    --loadmodule <path-to>/redisraft.so \
    --raft.log-filename raftlog3.db \
    --raft.addr localhost:5003
    
// connect server1
redis-cli -p 5001 raft.cluster init
  • Join a node: RAFT.CLUSTER JOIN [addr:port]
redis-cli -p 5002 RAFT.CLUSTER JOIN localhost:5001
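  • RAFT.CLUSTER JOIN works by sending a RAFT.NODE ADD command to the target node (see Raft for the details), so that command must be implemented as well: RAFT.NODE ADD [id] [address:port]
    # Implementation of the join command in RedisRaft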
    ret = redisAsyncCommand(ConnGetRedisCtx(conn), handleNodeAddResponse, conn,
                            "RAFT.NODE ADD %d %s:%u", rr->config.id,
                            rr->config.addr.host, rr->config.addr.port);
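
A minimal sketch of how the joining node could build the same request text in C++. `BuildNodeAddCommand` is a hypothetical helper introduced only for illustration; it is not part of pikiwidb or RedisRaft, and it only mirrors the format string shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the PR): formats the request that a joining
// node sends to the target node, following "RAFT.NODE ADD %d %s:%u".
std::string BuildNodeAddCommand(int node_id, const std::string& host, uint16_t port) {
  assert(!host.empty());  // an empty host would produce an unusable address
  return "RAFT.NODE ADD " + std::to_string(node_id) + " " + host + ":" +
         std::to_string(port);
}
```

For example, `BuildNodeAddCommand(2, "localhost", 5002)` yields `RAFT.NODE ADD 2 localhost:5002`.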

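Main changes

Build issue:

I rebased my current commits directly onto Pan Lei's branch, but the pikiwidb executable does not yet seem to link against the braft library, so further changes are needed before it builds successfully.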

TARGET_LINK_LIBRARIES(pikiwidb net; dl; leveldb; fmt; pikiwidb-folly; storage; rocksdb;)

Four new files are introduced:

src/cmd_raft.cc
src/cmd_raft.h
  • Implementation of the Raft commands
src/raft.h
src/raft.cc
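  • Implementation of the braft state machine class, which serves as the interface between pikiwidb and braft
  • When the join command executes, it temporarily stores some information about the client that sent the command so that the client can be replied to asynchronously; the details are explained below.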

Command implementation

RAFT.CLUSTER INIT <id>

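  • Initialization currently uses config.ip as this machine's address, but that address appears to default to 127.0.0.1, which will cause problems in a cluster and needs to be changed.
  • The port cannot be the same as pikiwidb's; config.port + 10 is used for now.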
butil::Status PRaft::Init(std::string& cluster_id) {
  assert(cluster_id.size() == RAFT_DBID_LEN);
  this->_dbid = cluster_id;

  // FIXME: g_config.ip defaults to 127.0.0.1, which may not work in a cluster.
  auto raw_addr = g_config.ip + ":" + std::to_string(g_config.port + RAFT_PORT_OFFSET);
  butil::EndPoint addr(butil::my_ip(), g_config.port + RAFT_PORT_OFFSET);
  // ...
}

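RAFT.NODE ADD [id] [address:port] adds a node to the cluster.

  • If the current node is the Leader, it performs the membership change.
  • If the current node is not the Leader, the reply should report the current Leader's address; this is not implemented yet.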
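
The two cases can be sketched as follows. This is only an illustration under stated assumptions: `NodeState`, `HandleNodeAdd`, and the error reply texts are hypothetical and not taken from the PR, and inserting into a set stands in for the real membership change that would go through braft.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified view of a node's Raft state (not the PRaft class).
struct NodeState {
  bool is_leader = false;
  std::string leader_addr;      // "ip:port" of the known leader, may be empty
  std::set<std::string> peers;  // current cluster members as "ip:port"
};

// Returns the reply text for RAFT.NODE ADD. The error wording is an assumption.
std::string HandleNodeAdd(NodeState& state, const std::string& new_peer) {
  assert(!new_peer.empty());
  if (!state.is_leader) {
    // Non-leader: do not change membership, point the caller at the leader.
    if (state.leader_addr.empty()) {
      return "-ERR no leader known";
    }
    return "-ERR not leader, leader is " + state.leader_addr;
  }
  // Leader: this insert stands in for the real membership change.
  state.peers.insert(new_peer);
  return "+OK";
}
```

Returning the leader address in the error lets the joining node retry against the right node without a separate discovery step.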

RAFT.CLUSTER JOIN [addr:port]

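This is a rather troublesome command, because it has to send an ADD request to the Leader.

Following the existing master-replica implementation, a connection is established when the command is received. Before connecting, the current client pointer and related information are saved in PRaft so that the client can be replied to later.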

    auto on_new_conn = [](TcpConnection* obj) {
      if (g_pikiwidb) {
        g_pikiwidb->OnNewConnection(obj);
      }
    };
    auto fail_cb = [&](EventLoop* loop, const char* peer_ip, int port) {
      PRAFT.OnJoinCmdConnectionFailed(loop, peer_ip, port);
    };

    auto loop = EventLoop::Self();
    auto peer_ip = GetIpFromEndPoint(addr);
    auto port = GetPortFromEndPoint(addr);
    PRAFT.GetJoinCtx().Set(client, peer_ip, port);
    loop->Connect(peer_ip.c_str(), port, on_new_conn, fail_cb);

The current thread does not reply to the client yet, so the pending message is cleared.

    client->Clear();

Once the connection is established, execution reaches g_pikiwidb->OnNewConnection(obj):

void PikiwiDB::OnNewConnection(pikiwidb::TcpConnection* obj) {
  // ...
  client->OnConnect();
  // ...
}

Finally, OnConnect() calls PRaft's interface to send the ADD request to the leader node.

void PClient::OnConnect() {
  // ...
  } else if (isJoinCmdTarget()) {
    SetName("ClusterJoinCmdConnection");
    PRAFT.SendNodeAddRequest();
  // ...
}

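When pikiwidb receives a message, it checks whether the current connection was opened by the Join command. If so, it processes the corresponding reply, retrieves from PRaft the client that originally sent the join command, and sends the reply to that client. Only then is the command flow complete.

  • The logic of ProcessClusterJoinCmdResponse() is not implemented yet.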
int PClient::handlePacket(const char* start, int bytes) {
  // ...
  if (isJoinCmdTarget()) {
    // Process the packet in one pass.
    return PRAFT.ProcessClusterJoinCmdResponse(start, bytes);
  }
  // ...
}
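
Since ProcessClusterJoinCmdResponse() is still unimplemented, here is one possible shape for its parsing step, offered only as a sketch. It assumes the leader answers with a single RESP simple string (`+...\r\n`) or error (`-...\r\n`) line; `ParseJoinReply`, `JoinReply`, and `ParsedJoinReply` are hypothetical names, not part of the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class JoinReply { kIncomplete, kOk, kError, kUnexpected };

struct ParsedJoinReply {
  JoinReply kind = JoinReply::kIncomplete;
  int consumed = 0;  // bytes consumed from the buffer, 0 if more data is needed
  std::string text;  // line content without the type byte and the CRLF
};

// Parses one RESP line from [start, start + bytes).
ParsedJoinReply ParseJoinReply(const char* start, int bytes) {
  assert(start != nullptr && bytes >= 0);
  ParsedJoinReply out;
  const std::string buf(start, static_cast<std::size_t>(bytes));
  const std::size_t end = buf.find("\r\n");
  if (end == std::string::npos) {
    return out;  // no full line yet, wait for more data
  }
  out.consumed = static_cast<int>(end) + 2;
  if (end == 0) {
    out.kind = JoinReply::kUnexpected;  // empty line, no type byte
    return out;
  }
  out.text = buf.substr(1, end - 1);
  if (buf[0] == '+') {
    out.kind = JoinReply::kOk;
  } else if (buf[0] == '-') {
    out.kind = JoinReply::kError;
  } else {
    out.kind = JoinReply::kUnexpected;
  }
  return out;
}
```

The `consumed` field matches the convention of returning the number of handled bytes from handlePacket(); the `text` field is what would be forwarded to the client that issued the join command.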

panlei-coder and others added 20 commits January 9, 2024 21:13
* feat: import braft

* fix: fix braft.cmake

* fix:fix import braft cmake

* fix: fix fetchcontent_populate

* fix: fix find gflags brpc braft cmake

* add independent code and fix brpc

* fix: delete unnecessary comments

* fix: fix fetch content to ExternalProject_Add

* fix: add counter.pb.h and counter.pb.cc

---------

Co-authored-by: century <[email protected]>
…ation#134)

* feat: import braft

* fix: fix braft.cmake

* fix:fix import braft cmake

* fix: fix fetchcontent_populate

* fix: fix find gflags brpc braft cmake

* add independent code and fix brpc

* fix: delete unnecessary comments

* fix: fix fetch content to ExternalProject_Add

* fix: add counter.pb.h and counter.pb.cc

* fix: fixed a bug where the compiler could not find glog

---------

Co-authored-by: century <[email protected]>
When implementing, I just pulled in the braft library directly. gotta roll back those changes I made to the cmake file for that.
@github-actions github-actions bot added the ✏️ Feature New feature or request label Jan 20, 2024
longfar-ncy commented Jan 21, 2024

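One question to discuss: putting cmd_raft.h here is certainly fine, but is it more appropriate for raft.h/cc to live here or in the storage layer? on_apply is actually a write to rocksdb, so I personally feel the storage layer might be a better fit. (Just my personal opinion.)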

KKorpse commented Jan 21, 2024

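One question to discuss: putting cmd_raft.h here is certainly fine, but is it more appropriate for raft.h/cc to live here or in the storage layer? on_apply is actually a write to rocksdb, so I personally feel the storage layer might be a better fit. (Just my personal opinion.)

The storage layer does seem a better fit; at the moment the only logic is the upper layer calling into raft.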

src/praft/praft.h (outdated)
namespace pikiwidb {

#define RAFT_DBID_LEN 32
#define RAFT_PORT_OFFSET 10

Pass this in through the configuration file; do not use a fixed offset.


Changed it to a configurable offset.
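
A sketch of what a configurable offset might look like. `RaftNetConfig`, its field names, its default values, and `MakeRaftListenAddr` are assumptions made for illustration and do not reflect the actual pikiwidb configuration code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical configuration fields; names and defaults are placeholders.
struct RaftNetConfig {
  std::string ip = "127.0.0.1";
  uint16_t port = 9221;            // service port
  uint16_t raft_port_offset = 10;  // raft port = port + raft_port_offset
};

// Writes "ip:raft_port" to *out. Returns false when the offset is zero (the
// raft port would collide with the service port) or the sum is not a valid
// TCP port.
bool MakeRaftListenAddr(const RaftNetConfig& cfg, std::string* out) {
  assert(out != nullptr);
  const uint32_t raft_port = static_cast<uint32_t>(cfg.port) + cfg.raft_port_offset;
  if (cfg.raft_port_offset == 0 || raft_port > 65535) {
    return false;
  }
  *out = cfg.ip + ":" + std::to_string(raft_port);
  return true;
}
```

Validating the sum when the configuration is loaded surfaces a bad offset at startup rather than when the raft endpoint fails to bind.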

src/praft/praft.h (outdated)
PClient* client_ = nullptr;
std::string peer_ip_;
int port_ = 0;
std::mutex mtx_;

Put the lock above the members it protects.


OK.
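
For illustration, a join context with the mutex declared ahead of the members it guards might look like the sketch below. Only `Set(client, peer_ip, port)` and the four member names come from the snippets in this PR; `Client` is a stand-in for PClient, and `Matches` and `Take` are hypothetical accessors.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Stand-in for PClient so the sketch is self-contained.
struct Client {
  std::string name;
};

class JoinCtx {
 public:
  // Remembers which client issued the join command and which peer was dialed.
  void Set(Client* client, const std::string& peer_ip, int port) {
    assert(client != nullptr);
    std::lock_guard<std::mutex> lock(mtx_);
    client_ = client;
    peer_ip_ = peer_ip;
    port_ = port;
  }

  // True if a join is pending and the connection targets the stored peer.
  bool Matches(const std::string& peer_ip, int port) {
    std::lock_guard<std::mutex> lock(mtx_);
    return client_ != nullptr && peer_ip == peer_ip_ && port == port_;
  }

  // Hands the stored client back once and resets the context.
  Client* Take() {
    std::lock_guard<std::mutex> lock(mtx_);
    Client* client = client_;
    client_ = nullptr;
    peer_ip_.clear();
    port_ = 0;
    return client;
  }

 private:
  std::mutex mtx_;  // protects every member declared below it
  Client* client_ = nullptr;
  std::string peer_ip_;
  int port_ = 0;
};
```

Declaring the mutex first makes the protected set visible at a glance, and `Take` returning the client only once keeps a late or duplicate reply from being delivered twice.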

@dingxiaoshuai123 dingxiaoshuai123 deleted the branch OpenAtomFoundation:import-braft March 6, 2024 08:53
@panlei-coder panlei-coder reopened this Mar 6, 2024
@dingxiaoshuai123 dingxiaoshuai123 merged commit 65e299c into OpenAtomFoundation:import-braft Mar 8, 2024
2 checks passed
dingxiaoshuai123 added a commit to dingxiaoshuai123/pikiwidb that referenced this pull request Mar 25, 2024
…oundation#136)

* feat: Implementing Redis-Raft commands to build a cluster

Co-authored-by: panlei-coder <[email protected]>
Co-authored-by: century <[email protected]>
Co-authored-by: panlei-coder <[email protected]>
Co-authored-by: alexstocks <[email protected]>
Co-authored-by: dingxiaoshuai123 <[email protected]>
Co-authored-by: longfar <[email protected]>
@Mixficsol Mixficsol mentioned this pull request Apr 15, 2024
24 tasks
AlexStocks added a commit that referenced this pull request May 6, 2024
* feat:import braft (#130)

* feat:  Implementing Redis-Raft commands to build a cluster. (#136)

Co-authored-by: panlei-coder <[email protected]>
Co-authored-by: century <[email protected]>
Co-authored-by: panlei-coder <[email protected]>
Co-authored-by: alexstocks <[email protected]>
Co-authored-by: dingxiaoshuai123 <[email protected]>
Co-authored-by: longfar <[email protected]>

7 participants