pd: add pd microservices section #16257

Closed
wants to merge 9 commits into from

Conversation

rleungx
Member

@rleungx rleungx commented Jan 23, 2024

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions (in Chinese).

  • master (the latest development version)
  • v7.6 (TiDB 7.6 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.4 (TiDB 7.4 versions)
  • v7.3 (TiDB 7.3 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)
  • v5.3 (TiDB 5.3 versions)
  • v5.2 (TiDB 5.2 versions)
  • v5.1 (TiDB 5.1 versions)
  • v5.0 (TiDB 5.0 versions)

What is the related PR or file link(s)?

  • This PR is translated from:
  • Other reference link(s):

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch


ti-chi-bot bot commented Jan 23, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. missing-translation-status This PR does not have translation status info. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 23, 2024
Contributor

@lhy1024 lhy1024 left a comment

LGTM

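## Limitations

1. The TSO microservice does not currently support dynamic start and stop; enabling or disabling it requires restarting the PD cluster.
2. Only TiDB connects directly to the TSO microservice through service discovery. Other components obtain timestamps through request forwarding: their requests go to PD, which forwards them to the TSO microservice.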
Contributor

Do we need to introduce the concept of an API server?

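`mode` supports three options, as follows:

- `api`: If microservices are enabled, PD itself starts in api mode by default. Once started, it no longer provides TSO allocation, so the TSO microservice must also be deployed in the cluster. The original scheduling functionality is provided dynamically, depending on whether the scheduling microservice is deployed.
- `tso`: If mode is `tso`, the TSO microservice is enabled and provides timestamp allocation. This requires PD to be started in api mode.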
Contributor

Do we need to point out that the TSO service cannot be changed online?

Member Author

I'll add that.

Member Author

It's already covered in the Limitations section.


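In addition, because the scheduling microservice is enabled dynamically through service discovery, PD by default continues to provide scheduling if the scheduling microservice goes down. However, the scheduling microservice and PD are allowed to run different versions, and the scheduling logic may differ between versions. A switch is therefore provided to stop PD from providing scheduling in this case, which prevents abrupt changes in scheduling logic. To turn the fallback off, run `pd-ctl config set enable-scheduling-fallback false`. The parameter defaults to true.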
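
For reviewers, the fallback behavior described above can be sketched as a small decision function. This is a minimal Python sketch, not PD's implementation: the function name `scheduling_provider`, its parameters, and the returned labels are hypothetical. Only the `enable-scheduling-fallback` setting and its default of true come from the text.

```python
def scheduling_provider(microservice_alive: bool,
                        enable_scheduling_fallback: bool = True) -> str:
    """Return a label for the component that serves scheduling.

    Hypothetical model of the documented behavior:
    - while the scheduling microservice is alive, it serves scheduling;
    - if it is down and fallback is enabled (the default), PD takes over;
    - if it is down and fallback is disabled, nothing serves scheduling.
    """
    if microservice_alive:
        return "scheduling-microservice"
    if enable_scheduling_fallback:
        return "pd"
    return "none"


if __name__ == "__main__":
    # Default setting: PD takes over when the microservice is down.
    print(scheduling_provider(microservice_alive=False))
    # Fallback disabled: PD does not take over.
    print(scheduling_provider(microservice_alive=False,
                              enable_scheduling_fallback=False))
```

The third outcome is the trade-off the paragraph describes: disabling the fallback avoids a switch between scheduling logic from different versions, at the cost of having no scheduling while the microservice is down.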

## Usage
Contributor

Should we provide an example or best practices?

Member Author

Let's wait for the operator side to be merged first.

@qiancai qiancai added the v8.0 This PR/issue applies to TiDB v8.0. label Feb 2, 2024
Signed-off-by: Ryan Leung <[email protected]>
@qiancai qiancai self-assigned this Feb 21, 2024
@qiancai qiancai self-requested a review February 21, 2024 09:12
@qiancai qiancai added the translation/doing This PR’s assignee is translating this PR. label Feb 21, 2024
@ti-chi-bot ti-chi-bot bot removed the missing-translation-status This PR does not have translation status info. label Feb 21, 2024
Signed-off-by: Ryan Leung <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>
TOC.md Outdated Show resolved Hide resolved

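# Use the PD microservice mode

Starting from v8.0.0, PD supports the microservice mode. This mode lets you split PD itself into multiple components for deployment, specifically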
Collaborator

@qiancai qiancai Feb 27, 2024

Suggested change
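Starting from v8.0.0, PD supports the microservice mode. This mode lets you split PD itself into multiple components for deployment, specifically
Starting from v8.0.0, PD supports the microservice mode. This mode splits PD's timestamp allocation and cluster scheduling functions into the following microservices, which are deployed separately. This decouples them from PD's routing function and lets PD focus on routing services for metadata.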

Comment on lines 10 to 11
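- TSO microservice: provides monotonically increasing timestamp allocation for the entire cluster
- Scheduling microservice: provides scheduling for the entire cluster, including but not limited to load balancing, hot spots, replica repair, and replica placement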
Collaborator

Suggested change
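- TSO microservice: provides monotonically increasing timestamp allocation for the entire cluster
- Scheduling microservice: provides scheduling for the entire cluster, including but not limited to load balancing, hot spots, replica repair, and replica placement
- TSO microservice: provides monotonically increasing timestamp allocation for the entire cluster
- Scheduling microservice: provides scheduling for the entire cluster, including but not limited to load balancing, hot spot handling, replica repair, and replica placement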

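- TSO microservice: provides monotonically increasing timestamp allocation for the entire cluster
- Scheduling microservice: provides scheduling for the entire cluster, including but not limited to load balancing, hot spots, replica repair, and replica placement

Each microservice is deployed as an independent process. If the specified number of replicas is greater than 1, a primary-standby disaster recovery mode is provided.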
Collaborator

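A question here: does "if the specified number of replicas is greater than 1" refer to replicas of the microservice? And who provides the "primary-standby disaster recovery mode"?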

Comment on lines 48 to 51
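1. Are all clusters suitable for deployment in microservice mode?

No. Microservices mainly address the degradation in service quality that occurs when PD becomes a bottleneck. If the bottleneck is not in PD, there is no need to enable them: microservices increase the number of components and raise operational costs.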

Collaborator

Suggested change
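1. Are all clusters suitable for deployment in microservice mode?
No. Microservices mainly address the degradation in service quality that occurs when PD becomes a bottleneck. If the bottleneck is not in PD, there is no need to enable them: microservices increase the number of components and raise operational costs.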

Comment on lines 19 to 21
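- Avoids long-tail latency or jitter in TSO allocation caused by excessive pressure on the PD cluster
- Avoids the entire cluster becoming unavailable because of a scheduling module failure
- Delays the onset of PD's own single-point bottleneck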
Collaborator

Suggested change
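- Avoids long-tail latency or jitter in TSO allocation caused by excessive pressure on the PD cluster
- Avoids the entire cluster becoming unavailable because of a scheduling module failure
- Delays the onset of PD's own single-point bottleneck
- Long-tail latency or jitter in TSO allocation caused by excessive pressure on the PD cluster
- The entire cluster becoming unavailable because of a scheduling module failure
- PD's own single-point bottleneck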


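## Use cases

PD microservices decouple important modules, such as TSO allocation and scheduling, from PD's own routing function, so that PD can focus on routing services for metadata. With this feature: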
Collaborator

Suggested change
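PD microservices decouple important modules, such as TSO allocation and scheduling, from PD's own routing function, so that PD can focus on routing services for metadata. With this feature:
PD microservices are typically used to address performance bottlenecks in PD and improve PD service quality. With this feature, you can avoid the following problems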


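1. The TSO microservice does not currently support dynamic start and stop; enabling or disabling it requires restarting the PD cluster.
2. Only TiDB connects directly to the TSO microservice through service discovery. Other components obtain timestamps through request forwarding: their requests go to PD, which forwards them to the TSO microservice.
3. Microservices are currently incompatible with the dr-auto sync feature.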
Collaborator

Suggested change
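3. Microservices are currently incompatible with the dr-auto sync feature.
3. Microservices are currently incompatible with the [synchronous deployment mode (DR Auto-Sync) ](/two-data-centers-in-one-city-deployment.md#简介) feature.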


## Usage

Currently, deployment is supported only through tidb-operator.
Collaborator

@qiancai qiancai Feb 27, 2024

Suggested change
Currently, deployment is supported only through tidb-operator.
Currently, PD microservices can be deployed only through TiDB Operator.

Contributor

@HuSharp HuSharp Feb 27, 2024

Do we need to cover playground?

Member Author

I don't think we need to for now.

Comment on lines 52 to 54
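2. What counts as reaching the PD bottleneck?

Provided the cluster itself is healthy, the `TiDB - PD server TSO handle time` item on the PD monitoring dashboard shows a clear increase in latency, or the `Heartbeat - TiKV side heartbeat statistics` item shows a large number of pending entries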
Collaborator

Suggested change
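2. What counts as reaching the PD bottleneck?
Provided the cluster itself is healthy, the `TiDB - PD server TSO handle time` item on the PD monitoring dashboard shows a clear increase in latency, or the `Heartbeat - TiKV side heartbeat statistics` item shows a large number of pending entries
- How do I determine whether PD has reached a performance bottleneck?
Provided the cluster itself is healthy, check the monitoring metrics on the Grafana PD dashboard. If the `TiDB - PD server TSO handle time` metric shows a clear increase in latency, or the `Heartbeat - TiKV side heartbeat statistics` metric shows a large number of pending entries, PD has reached a performance bottleneck.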


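Currently, deployment is supported only through tidb-operator.

If microservices are enabled, PD no longer provides TSO allocation after it starts, so the TSO microservice must be deployed in the cluster. In addition, if the scheduling microservice is deployed in the cluster, it provides scheduling; otherwise, PD provides scheduling.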
Collaborator

@qiancai qiancai Feb 27, 2024

Suggested change
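If microservices are enabled, PD no longer provides TSO allocation after it starts, so the TSO microservice must be deployed in the cluster. In addition, if the scheduling microservice is deployed in the cluster, it provides scheduling; otherwise, PD provides scheduling.
After you enable microservices and restart PD, PD no longer provides TSO allocation, so the TSO microservice must be deployed in the cluster. In addition, if the scheduling microservice is deployed in the cluster, it provides scheduling; otherwise, PD provides scheduling.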

@@ -0,0 +1,58 @@
---
title: Use PD microservices to improve service quality
summary: Describes how to enable the PD microservice mode to improve service quality
Collaborator

Suggested change
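summary: Describes how to enable the PD microservice mode to improve service quality
summary: Describes how to enable the PD microservice mode to improve service quality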


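> **Note:**
>
> Because the scheduling microservice supports dynamic switching, that is, PD continues to provide scheduling after the scheduling microservice process shuts down, the scheduling logic may change if the scheduling microservice and PD use different binary versions. To prevent this, you can run `pd-ctl config set enable-scheduling-fallback false` to stop PD from providing scheduling after the scheduling microservice process shuts down. The parameter defaults to true.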
Collaborator

@qiancai qiancai Feb 27, 2024

Suggested change
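> Because the scheduling microservice supports dynamic switching, that is, PD continues to provide scheduling after the scheduling microservice process shuts down, the scheduling logic may change if the scheduling microservice and PD use different binary versions. To prevent this, you can run `pd-ctl config set enable-scheduling-fallback false` to stop PD from providing scheduling after the scheduling microservice process shuts down. The parameter defaults to true.
> The Scheduling microservice supports dynamic switching: if the Scheduling microservice process shuts down, PD continues to provide scheduling by default. If the Scheduling microservice and PD use different binary versions, you can run `pd-ctl config set enable-scheduling-fallback false` to stop PD from providing scheduling after the Scheduling microservice process shuts down, which prevents changes in scheduling logic. The parameter defaults to `true`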

pd-microservices.md Show resolved Hide resolved
Signed-off-by: Ryan Leung <[email protected]>
Comment on lines 31 to 34
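1. The TSO microservice does not currently support dynamic start and stop; enabling or disabling it requires restarting the PD cluster.
2. Only TiDB connects directly to the TSO microservice through service discovery. Other components obtain timestamps through request forwarding: their requests go to PD, which forwards them to the TSO microservice.
3. Microservices are currently incompatible with the [synchronous deployment mode (DR Auto-Sync) ](/two-data-centers-in-one-city-deployment.md#简介) feature.
4. Incompatible with the TiDB system variable `tidb_enable_tso_follower_proxy`.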
Collaborator

@qiancai qiancai Feb 27, 2024

Suggested change
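1. The TSO microservice does not currently support dynamic start and stop; enabling or disabling it requires restarting the PD cluster.
2. Only TiDB connects directly to the TSO microservice through service discovery. Other components obtain timestamps through request forwarding: their requests go to PD, which forwards them to the TSO microservice.
3. Microservices are currently incompatible with the [synchronous deployment mode (DR Auto-Sync) ](/two-data-centers-in-one-city-deployment.md#简介) feature.
4. Incompatible with the TiDB system variable `tidb_enable_tso_follower_proxy`.
- The TSO microservice does not currently support dynamic start and stop; after enabling or disabling the TSO microservice, you need to restart the PD cluster.
- Only TiDB Server can connect directly to the TSO microservice through service discovery. Other components must send their requests through PD, which forwards them to the TSO microservice to obtain timestamps.
- Microservices are currently incompatible with the [synchronous deployment mode (DR Auto-Sync)](/two-data-centers-in-one-city-deployment.md#简介) feature.
- Incompatible with the TiDB system variable [`tidb_enable_tso_follower_proxy`](/system-variables.md#tidb_enable_tso_follower_proxy-从-v530-版本开始引入).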
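
The request-routing limitation in the list above can be illustrated with a minimal Python sketch. It is an illustration only, not how PD or its clients are implemented: the function `tso_request_path` and the component labels are hypothetical names chosen for this example.

```python
from typing import List


def tso_request_path(component: str) -> List[str]:
    """Return the hops a timestamp request takes, per the limitation above.

    Hypothetical model: TiDB reaches the TSO microservice directly through
    service discovery; every other component sends its request to PD, which
    forwards it to the TSO microservice.
    """
    if component.lower() == "tidb":
        return ["tidb", "tso-microservice"]
    return [component.lower(), "pd", "tso-microservice"]


if __name__ == "__main__":
    for name in ("tidb", "tikv"):
        print(" -> ".join(tso_request_path(name)))
```

The extra hop through PD for components other than TiDB is why timestamp traffic from those components still passes through PD under this limitation.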

Signed-off-by: Ryan Leung <[email protected]>
@rleungx rleungx removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 28, 2024
@rleungx rleungx marked this pull request as ready for review February 29, 2024 10:08
@qiancai qiancai changed the title Add pd microservices section pd: add pd microservices section Mar 6, 2024
pd-microservices.md Outdated Show resolved Hide resolved

ti-chi-bot bot commented Mar 8, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from qiancai, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


ti-chi-bot bot commented Mar 8, 2024

@rleungx: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-verify 5e20124 link true /test pull-verify

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 11, 2024
github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 15, 2024
github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 18, 2024
github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 18, 2024
github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 20, 2024
github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 21, 2024
github-actions bot pushed a commit to qiancai/pingcap-docsite-preview that referenced this pull request Mar 21, 2024
@qiancai
Collaborator

qiancai commented Mar 25, 2024

Close this PR as its content has been moved to #15975

@qiancai qiancai closed this Mar 25, 2024
Labels
size/M Denotes a PR that changes 30-99 lines, ignoring generated files. translation/doing This PR’s assignee is translating this PR. v8.0 This PR/issue applies to TiDB v8.0.
4 participants