fix zone aware alertmanager http idle timeout #9851

Open
wants to merge 1 commit into
base: main
1 change: 1 addition & 0 deletions operations/helm/charts/mimir-distributed/CHANGELOG.md
@@ -35,6 +35,7 @@ Entries should include a reference to the Pull Request that introduced the chang
* [ENHANCEMENT] helm: add `enabled` field for admin-api, compactor, distributor, gateway, ingester, querier, query-frontend and store-gateway components. This helps when deploying the GEM federation-frontend on its own. #9734
* [BUGFIX] Fix PVC template in AlertManager to not show diff in ArgoCD. #9774
* [BUGFIX] Fix how `fullnameOverride` is reflected in generated manifests. #9564
* [BUGFIX] Alertmanager: Set `-server.http-idle-timeout` to avoid EOF errors in the ruler, including for zone-aware Alertmanager. #9851

## 5.5.1
* [BUGFIX] Fix incorrect use of topology spread constraints in `GrafanaAgent` CRD of metamonitoring. #9669
@@ -147,11 +147,11 @@ spec:
{{- if .Values.alertmanager.zoneAwareReplication.enabled }}
- "-alertmanager.sharding-ring.instance-availability-zone=zone-default"
{{- end }}
{{- end }}
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
{{- end }}
{{- range $key, $value := .Values.alertmanager.extraArgs }}
- "-{{ $key }}={{ $value }}"
{{- end }}
@@ -87,6 +87,10 @@ spec:
- "-config.expand-env=true"
- "-config.file=/etc/mimir/mimir.yaml"
- "-alertmanager.sharding-ring.instance-availability-zone=zone-a"
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
volumeMounts:
- name: config
mountPath: /etc/mimir
@@ -214,6 +218,10 @@ spec:
- "-config.expand-env=true"
- "-config.file=/etc/mimir/mimir.yaml"
- "-alertmanager.sharding-ring.instance-availability-zone=zone-b"
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
volumeMounts:
- name: config
mountPath: /etc/mimir
@@ -341,6 +349,10 @@ spec:
- "-config.expand-env=true"
- "-config.file=/etc/mimir/mimir.yaml"
- "-alertmanager.sharding-ring.instance-availability-zone=zone-c"
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
volumeMounts:
- name: config
mountPath: /etc/mimir
@@ -110,6 +110,10 @@ spec:
- "-config.expand-env=true"
- "-config.file=/etc/mimir/mimir.yaml"
- "-alertmanager.sharding-ring.instance-availability-zone=zone-a"
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
volumeMounts:
- name: config
mountPath: /etc/mimir
@@ -262,6 +266,10 @@ spec:
- "-config.expand-env=true"
- "-config.file=/etc/mimir/mimir.yaml"
- "-alertmanager.sharding-ring.instance-availability-zone=zone-b"
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
volumeMounts:
- name: config
mountPath: /etc/mimir
@@ -414,6 +422,10 @@ spec:
- "-config.expand-env=true"
- "-config.file=/etc/mimir/mimir.yaml"
- "-alertmanager.sharding-ring.instance-availability-zone=zone-c"
# Prometheus HTTP client used to send alerts has a hard-coded idle
# timeout of 5 minutes, therefore the server timeout for Alertmanager
# needs to be higher to avoid connections being closed abruptly.
- "-server.http-idle-timeout=6m"
volumeMounts:
- name: config
mountPath: /etc/mimir