
[BUG] panic: runtime error: slice bounds out of range [:-1] during rolling update of backend pods with heavy load #687

Open
MrBlaise opened this issue Dec 18, 2024 · 0 comments


MrBlaise commented Dec 18, 2024

I am using the latest chart, 1.43.0 (HAProxy version 3.1.0-f2b9791).

I believe the root cause of this issue lies in the client-native library, for which I opened an issue in that repo: haproxytech/client-native#113. However, it probably manifests most often for people using the Kubernetes ingress controller, due to the nature of its reload mechanism.

I have an application in Kubernetes served by the HAProxy ingress that receives a lot of traffic. Under that load, whenever we do a rolling update there is a chance (the higher the load, the higher the occurrence) of the ingress controller exiting due to a segmentation fault.

Stack trace with logs:

[NOTICE]   (69) : Reloading HAProxy
[NOTICE]   (69) : Initializing new worker (4635)
[NOTICE]   (4635) : haproxy version is 3.1.0-f2b9791
[WARNING]  (4635) : frontend 'http' has no 'bind' directive. Please declare it as a backend if this was intended.
[WARNING]  (4635) : frontend 'https' has no 'bind' directive. Please declare it as a backend if this was intended.
[NOTICE]   (69) : Loading success.
[WARNING]  (4615) : Proxy healthz stopped (cumulated conns: FE: 8, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-6666-grpc-6666 stopped (cumulated conns: FE: 2, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-80-grpc-80 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-rbe-tls-6669-grpc-rbe-tls-6669 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-tls-443-grpc-tls-443 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy stats stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy https stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy build-cache_build-cache-api-6666 stopped (cumulated conns: FE: 0, BE: 2).
[WARNING]  (4615) : Proxy build-cache_build-cache-api-80 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy build-cache_build-cache-api-tls-443 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system_default-local-service_http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system_prometheus_http stopped (cumulated conns: FE: 0, BE: 0).
[NOTICE]   (69) : haproxy version is 3.1.0-f2b9791
[ALERT]    (69) : Current worker (4635) exited with code 139 (Segmentation fault)
[WARNING]  (69) : A worker process unexpectedly died and this can only be explained by a bug in haproxy or its dependencies.
Please check that you are running an up to date and maintained version of haproxy and open a bug report.
[ALERT]    (69) : exit-on-failure: killing every processes with SIGTERM
HAProxy version 3.1.0-f2b9791 2024/11/26 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-3.1.0.html
Running on: Linux 6.1.0-27-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.115-1 (2024-11-01) x86_64
[WARNING]  (69) : Former worker (4615) exited with code 143 (Terminated)
[WARNING]  (69) : All workers exited. Exiting... (139)
panic: runtime error: slice bounds out of range [:-1]

goroutine 444 [running]:
github.com/haproxytech/client-native/v5/runtime.(*client).Reload(0x14?)
	/go/pkg/mod/github.com/haproxytech/client-native/[email protected]/runtime/runtime_client.go:166 +0x2ba
github.com/haproxytech/kubernetes-ingress/pkg/haproxy/process.(*s6Control).Service(0xc001dec008, {0x263f416?, 0x8?})
	/src/pkg/haproxy/process/s6-overlay.go:61 +0xd5
github.com/haproxytech/kubernetes-ingress/pkg/controller.(*HAProxyController).updateHAProxy(0xc000de7808)
	/src/pkg/controller/controller.go:204 +0xa58
github.com/haproxytech/kubernetes-ingress/pkg/controller.(*HAProxyController).SyncData(0xc000de7808)
	/src/pkg/controller/monitor.go:38 +0x5b2
github.com/haproxytech/kubernetes-ingress/pkg/controller.(*HAProxyController).Start(0xc000de7808)
	/src/pkg/controller/controller.go:100 +0x209
created by main.main in goroutine 1
	/src/main.go:164 +0xe65
Ingress Controller exited with fatal code 2, taking down the S6 supervision tree

I assume this is some kind of race condition: HAProxy is busy with the heavy work of routing traffic, and sometimes the reload call hits an edge case where the output variable comes back empty, so the output[:len(output)-1] slice panics with a negative index:

Code from the client-native library:

	output, err := c.runtime.ExecuteMaster("reload")
	if err != nil {
		return "", fmt.Errorf("cannot reload: %w", err)
	}
	parts := strings.SplitN(output, "\n--\n", 2)
	if len(parts) == 1 {
		// No startup logs. This happens when HAProxy is compiled without USE_SHM_OPEN.
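		// BUG (this report): when output is the empty string,
		// len(output)-1 == -1 and the slice below panics with
		// "slice bounds out of range [:-1]".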
		status = output[:len(output)-1]
	} else {
		status, logs = parts[0], parts[1]
	}
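
For what it's worth, here is a minimal sketch of the kind of defensive fix I have in mind (my suggestion, not an actual upstream patch; it reuses the variable names from the excerpt above): trim the trailing newline instead of slicing, so an empty output cannot produce a negative index.

	output, err := c.runtime.ExecuteMaster("reload")
	if err != nil {
		return "", fmt.Errorf("cannot reload: %w", err)
	}
	parts := strings.SplitN(output, "\n--\n", 2)
	if len(parts) == 1 {
		// No startup logs (HAProxy compiled without USE_SHM_OPEN),
		// or an empty response when the reload races a dying worker.
		// TrimSuffix is a no-op on an empty string, so it cannot panic.
		status = strings.TrimSuffix(output, "\n")
	} else {
		status, logs = parts[0], parts[1]
	}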