PD may panic when tidb_enable_tso_follower_proxy is enabled and PD is restarted #8950

Closed
okJiang opened this issue Dec 26, 2024 · 1 comment · Fixed by #8951
Labels
affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. impact/panic report/customer Customers have encountered this bug. severity/major type/bug The issue is confirmed as a bug.

Comments


okJiang commented Dec 26, 2024

Bug Report

What did you do?

  1. Enable tidb_enable_tso_follower_proxy.
  2. Run many TiDB instances (40+) under very high write traffic.
  3. Restart PD. If the restarted node is the PD leader, the panic is more likely to be triggered.

What did you expect to see?

PD restarts successfully.

What did you see instead?

PD panics.

pd_log.log

What version of PD are you using (pd-server -V)?

@okJiang okJiang added the type/bug The issue is confirmed as a bug. label Dec 26, 2024

okJiang commented Dec 26, 2024

Analysis:

  1. The PD server goes down.
  2. The client deletes the stale stream.
  3. The PD server comes back up.
  4. The client has just called:
    func (c *Cli) updateConnectionCtxs(ctx context.Context, connectionCtxs *sync.Map) bool {
  5. The client discovers that the new etcd server is up:
    func (c *Cli) getAllTSOStreamBuilders() map[string]tsoStreamBuilder {
  6. The client tries to create a new stream to the PD server:
    log.Info("[tso] try to create tso stream", zap.String("addr", addr))
    cctx, cancel := context.WithCancel(ctx)
    // Do not proxy the leader client.
    if addr != leaderAddr {
        log.Info("[tso] use follower to forward tso stream to do the proxy",
            zap.String("addr", addr))
        cctx = grpcutil.BuildForwardContext(cctx, forwardedHost)
    }
    // Create the TSO stream.
    stream, err := tsoStreamBuilder.build(cctx, cancel, c.option.Timeout)
  7. At this point, the PD server has not finished initializing, so tsoDispatcher is still nil when the forwarded request reaches:
    s.tsoDispatcher.DispatchRequest(ctx, tsoRequest, s.pdProtoFactory, doneCh, errCh, s.tsoPrimaryWatcher)
  8. As a result, PD panics with a nil pointer dereference.

@okJiang okJiang added affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. and removed may-affects-5.4 may-affects-6.1 may-affects-6.5 may-affects-7.1 may-affects-7.5 may-affects-8.1 may-affects-8.5 labels Dec 27, 2024
@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Dec 31, 2024
@ti-chi-bot ti-chi-bot bot closed this as completed in #8951 Jan 2, 2025
ti-chi-bot bot added a commit that referenced this issue Jan 2, 2025
close #8950

Signed-off-by: okJiang <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
@okJiang okJiang added affects-8.5 This bug affects the 8.5.x(LTS) versions. and removed affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. labels Jan 3, 2025
@ti-chi-bot ti-chi-bot bot added affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. and removed affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-5.4 This bug affects the 5.4.x(LTS) versions. labels Jan 3, 2025
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Jan 3, 2025