chore: bump the alert threshold for sustained elevated cpu #55

Merged
merged 1 commit into from
Feb 23, 2024

thegentlemanphysicist
Contributor

After the Keycloak upgrade we noticed slightly elevated CPU usage during normal day-to-day use. This raises the sustained elevated CPU alert threshold from 0.15 to 0.20 cores to reduce false alarms.
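
For reviewers, a rough sketch of what the changed resource might look like in the module. Only the resource address, `name`, and the new `metric` expression are taken from the plan output; the `trigger_after_minutes` value is a placeholder assumption, and the other attributes and the nested block the plan reports as unchanged are omitted because their values are not shown.

```hcl
# Hypothetical sketch, not the actual module source.
resource "sysdig_monitor_alert_metric" "prod_keycloak_cpu_usage_sustained" {
  name = "[GOLD CUST PROD] Keycloak - Sustained Elevated CPU"

  # Threshold raised from 0.15 to 0.20 cores to reduce false alarms
  # after the Keycloak upgrade.
  metric = "max(avg(sysdig_container_cpu_cores_used)) >= 0.20"

  # Assumed placeholder; the real value is hidden in the plan output.
  trigger_after_minutes = 60

  # Remaining unchanged attributes (scope, severity, notification settings,
  # and so on) are omitted here.
}
```

The plan output should confirm that `metric` is the only attribute updated in place.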

Terraform Format and Style 🖌: success

Terraform Initialization ⚙️: success

Terraform Plan 📖: success

Show Plan
module.c6af30-team.sysdig_monitor_alert_metric.prod_db_pods_low: Refreshing state... [id=10777732]
module.c6af30-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_usage_high: Refreshing state... [id=10777725]
module.c6af30-team.sysdig_monitor_alert_metric.prod_db_pods_high: Refreshing state... [id=10777727]
module.c6af30-team.sysdig_monitor_dashboard.pv_usage: Refreshing state... [id=296849]
module.c6af30-team.sysdig_monitor_alert_metric.prod_keycloak_pods_high: Refreshing state... [id=10777724]
module.c6af30-team.sysdig_monitor_alert_metric.prod_keycloak_pods_low: Refreshing state... [id=10777730]
module.c6af30-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_spike_high: Refreshing state... [id=10777728]
module.c6af30-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_usage_med: Refreshing state... [id=10777726]
module.c6af30-team.sysdig_monitor_alert_promql.prod_sso_db_pv_gt_60: Refreshing state... [id=10777731]
module.c6af30-team.sysdig_monitor_dashboard.pv_overall: Refreshing state... [id=296847]
module.c6af30-team.sysdig_monitor_alert_metric.prod_db_pod_restarts_gte_1: Refreshing state... [id=10777729]
module.c6af30-team.sysdig_monitor_dashboard.pods_cpu: Refreshing state... [id=296848]
module.c6af30-team.sysdig_monitor_alert_metric.prod_keycloak_pods_med: Refreshing state... [id=10777734]
module.c6af30-team.sysdig_monitor_alert_promql.prod_sso_db_pv_gt_80: Refreshing state... [id=10777733]
module.eb75ad-team.sysdig_monitor_alert_promql.prod_kc_disk_log_pv_usage_sixty: Refreshing state... [id=15959118]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_log_pv_med: Refreshing state... [id=9921922]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_db_pods_low: Refreshing state... [id=9921900]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_spike_high: Refreshing state... [id=9921904]
module.eb75ad-team.sysdig_monitor_alert_metric.test_dr_pod: Refreshing state... [id=15328794]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_usage_sustained: Refreshing state... [id=15961444]
module.eb75ad-team.sysdig_monitor_dashboard.pv_overall: Refreshing state... [id=363087]
module.eb75ad-team.sysdig_monitor_alert_metric.dev_dr_pod: Refreshing state... [id=15328793]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_pods_high: Refreshing state... [id=9921902]
module.eb75ad-team.sysdig_monitor_dashboard.pods_cpu: Refreshing state... [id=363086]
module.eb75ad-team.sysdig_monitor_alert_metric.dev_backup_storage_pv_usage_gt_med: Refreshing state... [id=16074248]
module.eb75ad-team.sysdig_monitor_alert_downtime.test_dr_pod_downtime: Refreshing state... [id=15346484]
module.eb75ad-team.sysdig_monitor_dashboard.general_pod_performance: Refreshing state... [id=404905]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_db_pod_restarts_gte_1: Refreshing state... [id=9921905]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_pods_low: Refreshing state... [id=9921901]
module.eb75ad-team.sysdig_monitor_dashboard.pv_usage: Refreshing state... [id=363085]
module.eb75ad-team.sysdig_monitor_alert_downtime.dev_dr_pod_downtime: Refreshing state... [id=15338912]
module.eb75ad-team.sysdig_monitor_alert_promql.dev_db_pv_usage_seventyfive: Refreshing state... [id=14625866]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_usage_high: Refreshing state... [id=9921898]
module.eb75ad-team.sysdig_monitor_alert_promql.prod_db_pv_usage_low: Refreshing state... [id=9921935]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_db_pods_high: Refreshing state... [id=9921906]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_dr_pod: Refreshing state... [id=15328795]
module.eb75ad-team.sysdig_monitor_alert_promql.prod_minio_pvc_storage_low: Refreshing state... [id=14080753]
module.eb75ad-team.sysdig_monitor_alert_promql.dev_db_pv_usage_ninety: Refreshing state... [id=14625871]
module.eb75ad-team.sysdig_monitor_alert_promql.test_db_pv_usage_seventyfive: Refreshing state... [id=14625869]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_usage_med: Refreshing state... [id=9921903]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_pods_med: Refreshing state... [id=9921897]
module.eb75ad-team.sysdig_monitor_alert_promql.test_kc_disk_log_pv_usage_sixty: Refreshing state... [id=15959116]
module.eb75ad-team.sysdig_monitor_alert_promql.test_db_pv_usage_ninety: Refreshing state... [id=14625872]
module.eb75ad-team.sysdig_monitor_alert_metric.prod_backup_storage_pv_usage_gt_med: Refreshing state... [id=9921933]
module.eb75ad-team.sysdig_monitor_alert_downtime.prod_dr_pod_downtime: Refreshing state... [id=15346483]
module.eb75ad-team.sysdig_monitor_alert_promql.prod_db_pv_usage_med: Refreshing state... [id=9921934]
module.eb75ad-team.sysdig_monitor_alert_metric.test_backup_storage_pv_usage_gt_med: Refreshing state... [id=16074249]
module.eb75ad-team.sysdig_monitor_alert_promql.dev_kc_disk_log_pv_usage_sixty: Refreshing state... [id=15959117]
module.e4ca1d-team.sysdig_monitor_dashboard.pods_cpu: Refreshing state... [id=405694]
module.e4ca1d-team.sysdig_monitor_dashboard.general_pod_performance: Refreshing state... [id=405697]
module.e4ca1d-team.sysdig_monitor_dashboard.pv_overall: Refreshing state... [id=405696]
module.e4ca1d-team.sysdig_monitor_dashboard.pv_usage: Refreshing state... [id=405695]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the
last "terraform apply":

  # module.eb75ad-team.sysdig_monitor_alert_downtime.dev_dr_pod_downtime has changed
  ~ resource "sysdig_monitor_alert_downtime" "dev_dr_pod_downtime" {
        id                    = "15338912"
        name                  = "[GoldDR] Dev Switchover Downtime Alert"
      ~ version               = 4 -> 6
        # (9 unchanged attributes hidden)
    }

  # module.eb75ad-team.sysdig_monitor_alert_downtime.test_dr_pod_downtime has changed
  ~ resource "sysdig_monitor_alert_downtime" "test_dr_pod_downtime" {
        id                    = "15346484"
        name                  = "[GoldDR] Test Switchover Downtime Alert"
      ~ version               = 3 -> 5
        # (9 unchanged attributes hidden)
    }

  # module.eb75ad-team.sysdig_monitor_alert_metric.prod_backup_storage_pv_usage_gt_med has changed
  ~ resource "sysdig_monitor_alert_metric" "prod_backup_storage_pv_usage_gt_med" {
        id                    = "9921933"
        name                  = "[GOLD CUST PROD] DB Backup - storage 80%"
      ~ version               = 4 -> 5
        # (8 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

  # module.eb75ad-team.sysdig_monitor_alert_promql.dev_db_pv_usage_seventyfive has changed
  ~ resource "sysdig_monitor_alert_promql" "dev_db_pv_usage_seventyfive" {
        id                    = "14625866"
        name                  = "[GOLD CUST DEV] SSO DB PV over 75%"
      ~ version               = 3 -> 7
        # (7 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }


Unless you have made equivalent changes to your configuration, or ignored the
relevant attributes using ignore_changes, the following plan may include
actions to undo or respond to these changes.

─────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.eb75ad-team.sysdig_monitor_alert_metric.prod_keycloak_cpu_usage_sustained will be updated in-place
  ~ resource "sysdig_monitor_alert_metric" "prod_keycloak_cpu_usage_sustained" {
        id                    = "15961444"
      ~ metric                = "max(avg(sysdig_container_cpu_cores_used)) >= 0.15" -> "max(avg(sysdig_container_cpu_cores_used)) >= 0.20"
        name                  = "[GOLD CUST PROD] Keycloak - Sustained Elevated CPU"
        # (9 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't
guarantee to take exactly these actions if you run "terraform apply" now.

Pusher: @thegentlemanphysicist, Action: pull_request

@thegentlemanphysicist thegentlemanphysicist merged commit 212d6b4 into main Feb 23, 2024
3 checks passed
@jlangy jlangy deleted the bumpCPUThreshold branch March 20, 2024 22:24
2 participants