In section 5.1.17. Kubernetes Pod not healthy of this page, the description is "Pod has been in a non-ready state for longer than 15 minutes." and the rule below it is:
expr: min_over_time(sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown|Failed"})[15m:1m]) > 0
But I think the correct rule is:
expr: sum_over_time(sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown|Failed"})[15m:1m]) == 15
I agree.
The current rule unfortunately also fires when a freshly (re-)deployed pod takes longer than 1 minute to become ready: the subquery [15m:1m] then contains only a single sample, for that one minute, with value 1, and that alone is enough to satisfy min_over_time(...) > 0.
The proposed rule ensures that the pod has existed for the full 15 minutes and prevents the alert from firing prematurely.
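To make the difference concrete, here is a rough Python sketch of how the two expressions behave over one subquery window. It is only a model, not Prometheus itself: the helper names and the three example pods are hypothetical, `None` stands for a step where the pod's series did not exist yet, and it assumes a full [15m:1m] window yields exactly 15 samples (the real count can vary with the Prometheus version and step alignment).

```python
from typing import Optional, Sequence

# One entry per subquery step; None means "no sample at that step"
# (for example, the pod did not exist yet). 1 = pod in Pending/Unknown/Failed.
Sample = Optional[int]


def min_over_time_fires(window: Sequence[Sample]) -> bool:
    """Model of: min_over_time(...[15m:1m]) > 0 (missing steps are ignored)."""
    present = [v for v in window if v is not None]
    return bool(present) and min(present) > 0


def sum_over_time_fires(window: Sequence[Sample], expected: int = 15) -> bool:
    """Model of: sum_over_time(...[15m:1m]) == 15 (needs every step to be 1)."""
    present = [v for v in window if v is not None]
    return bool(present) and sum(present) == expected


# Hypothetical pods, oldest sample first.
fresh_pod = [None] * 13 + [1, 1]    # created 2 minutes ago, still Pending
stuck_pod = [1] * 15                # not ready for the whole 15 minutes
recovered_pod = [1] * 5 + [0] * 10  # was Pending, has been Running since

for name, window in [("fresh", fresh_pod), ("stuck", stuck_pod), ("recovered", recovered_pod)]:
    print(name, min_over_time_fires(window), sum_over_time_fires(window))
```

Under these assumptions only the fresh pod separates the two rules: the min_over_time version fires on it, while the sum_over_time version waits until all 15 steps report a non-ready phase.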