
[BUG] panic: runtime error: slice bounds out of range [:-1] during rolling update of backend pods with heavy load #687

Open
MrBlaise opened this issue Dec 18, 2024 · 0 comments


MrBlaise commented Dec 18, 2024

I am using the latest chart, 1.43.0 (HAProxy version 3.1.0-f2b9791).

I believe the root cause of this issue lies in the client-native library, for which I opened an issue in that repo: haproxytech/client-native#113. However, it probably manifests most often for people using the Kubernetes ingress controller, due to the nature of its reload mechanism.

I have an application in Kubernetes served by the HAProxy ingress that receives a lot of traffic. Under that load, whenever we do a rolling update there is a chance (the higher the load, the higher the occurrence) of the ingress controller exiting due to a segmentation fault.

Stack trace with logs:

[NOTICE]   (69) : Reloading HAProxy
[NOTICE]   (69) : Initializing new worker (4635)
[NOTICE]   (4635) : haproxy version is 3.1.0-f2b9791
[WARNING]  (4635) : frontend 'http' has no 'bind' directive. Please declare it as a backend if this was intended.
[WARNING]  (4635) : frontend 'https' has no 'bind' directive. Please declare it as a backend if this was intended.
[NOTICE]   (69) : Loading success.
[WARNING]  (4615) : Proxy healthz stopped (cumulated conns: FE: 8, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-6666-grpc-6666 stopped (cumulated conns: FE: 2, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-80-grpc-80 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-rbe-tls-6669-grpc-rbe-tls-6669 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system-grpc-tls-443-grpc-tls-443 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy stats stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy https stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy build-cache_build-cache-api-6666 stopped (cumulated conns: FE: 0, BE: 2).
[WARNING]  (4615) : Proxy build-cache_build-cache-api-80 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy build-cache_build-cache-api-tls-443 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system_default-local-service_http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (4615) : Proxy kube-system_prometheus_http stopped (cumulated conns: FE: 0, BE: 0).
[NOTICE]   (69) : haproxy version is 3.1.0-f2b9791
[ALERT]    (69) : Current worker (4635) exited with code 139 (Segmentation fault)
[WARNING]  (69) : A worker process unexpectedly died and this can only be explained by a bug in haproxy or its dependencies.
Please check that you are running an up to date and maintained version of haproxy and open a bug report.
[ALERT]    (69) : exit-on-failure: killing every processes with SIGTERM
HAProxy version 3.1.0-f2b9791 2024/11/26 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-3.1.0.html
Running on: Linux 6.1.0-27-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.115-1 (2024-11-01) x86_64
[WARNING]  (69) : Former worker (4615) exited with code 143 (Terminated)
[WARNING]  (69) : All workers exited. Exiting... (139)
panic: runtime error: slice bounds out of range [:-1]

goroutine 444 [running]:
github.com/haproxytech/client-native/v5/runtime.(*client).Reload(0x14?)
	/go/pkg/mod/github.com/haproxytech/client-native/[email protected]/runtime/runtime_client.go:166 +0x2ba
github.com/haproxytech/kubernetes-ingress/pkg/haproxy/process.(*s6Control).Service(0xc001dec008, {0x263f416?, 0x8?})
	/src/pkg/haproxy/process/s6-overlay.go:61 +0xd5
github.com/haproxytech/kubernetes-ingress/pkg/controller.(*HAProxyController).updateHAProxy(0xc000de7808)
	/src/pkg/controller/controller.go:204 +0xa58
github.com/haproxytech/kubernetes-ingress/pkg/controller.(*HAProxyController).SyncData(0xc000de7808)
	/src/pkg/controller/monitor.go:38 +0x5b2
github.com/haproxytech/kubernetes-ingress/pkg/controller.(*HAProxyController).Start(0xc000de7808)
	/src/pkg/controller/controller.go:100 +0x209
created by main.main in goroutine 1
	/src/main.go:164 +0xe65
Ingress Controller exited with fatal code 2, taking down the S6 supervision tree

I assume this is some kind of race condition: HAProxy is busy with the heavy work of routing traffic, and sometimes the reload call hits an edge case where the output variable comes back empty, so the output[:len(output)-1] slice panics with a negative index:

Code from the client-native library:

	output, err := c.runtime.ExecuteMaster("reload")
	if err != nil {
		return "", fmt.Errorf("cannot reload: %w", err)
	}
	parts := strings.SplitN(output, "\n--\n", 2)
	if len(parts) == 1 {
		// No startup logs. This happens when HAProxy is compiled without USE_SHM_OPEN.
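		// BUG (this report): when output is the empty string,
		// len(output)-1 == -1 and the slice below panics with
		// "slice bounds out of range [:-1]".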
		status = output[:len(output)-1]
	} else {
		status, logs = parts[0], parts[1]
	}
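
For what it's worth, here is a minimal sketch of the kind of defensive fix I have in mind (my suggestion, not an actual upstream patch; it reuses the variable names from the excerpt above): trim the trailing newline instead of slicing, so an empty output cannot produce a negative index.

	output, err := c.runtime.ExecuteMaster("reload")
	if err != nil {
		return "", fmt.Errorf("cannot reload: %w", err)
	}
	parts := strings.SplitN(output, "\n--\n", 2)
	if len(parts) == 1 {
		// No startup logs (HAProxy compiled without USE_SHM_OPEN),
		// or an empty response when the reload races a dying worker.
		// TrimSuffix is a no-op on an empty string, so it cannot panic.
		status = strings.TrimSuffix(output, "\n")
	} else {
		status, logs = parts[0], parts[1]
	}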