support fallback configuration for KEDA autoscaling #9846

hobbsh · 2024-11-06T22:53:57Z

What this PR does

Adds support for a fallback configuration in the KEDA autoscaling settings of the mimir-distributed Helm chart, so that if the metrics endpoint used for scaling becomes unavailable, the ScaledObject falls back to the configured replica count.
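For illustration, here is a minimal sketch of what a rendered KEDA ScaledObject with a fallback section could look like. The `fallback.failureThreshold` and `fallback.replicas` fields are part of the KEDA ScaledObject spec; the object name, target name, replica counts, Prometheus address, query, and threshold are all placeholder assumptions, and the Helm values keys that produce this block are not shown here.

```yaml
# Sketch only: names, counts, address, and query are hypothetical placeholders.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-distributor          # hypothetical name
spec:
  scaleTargetRef:
    name: example-distributor        # hypothetical Deployment name
  minReplicaCount: 2                 # assumed value
  maxReplicaCount: 10                # assumed value
  fallback:
    failureThreshold: 3              # consecutive failed metric fetches before fallback applies
    replicas: 6                      # replica count to use while the metrics source is unavailable
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.example.svc:9090   # hypothetical endpoint
        query: sum(rate(example_requests_total[5m]))        # hypothetical query
        threshold: "100"                                     # assumed value
```

With a configuration along these lines, KEDA would set the target to the `fallback.replicas` count once the scaler has failed `failureThreshold` times in a row, rather than leaving the workload at whatever size it had when metrics were lost.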

Which issue(s) this PR fixes or relates to

n/a

Checklist

Tests updated.
Documentation added.
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
about-versioning.md updated with experimental features.

CLAassistant · 2024-11-06T22:54:06Z

All committers have signed the CLA.

jhesketh · 2024-11-11T03:19:06Z

Thank you for the contribution!

This will need an entry in operations/helm/charts/mimir-distributed/CHANGELOG.md.

I'm also curious about your opinion on whether a fallback is preferable to just maintaining the current number of replicas, given that a deployment is also protected by minReplicas. Would it not be better to maintain the current replicas and alert when KEDA isn't able to get the metrics? Or is the intention generally to have the fallback end up scaling up a deployment?

hobbsh · 2024-11-11T17:15:07Z

First off, thank you for the engagement!

Would it not be better to maintain the current replicas and alert when KEDA isn't able to get the metrics?

In my experience, if the distributors aren't scaling, they will quickly OOM, and then we get into a catch-22 where we need to scale but can't because metrics can't be retrieved. KEDA created fallback for exactly this reason, so I would much prefer to use it rather than rely on minReplicas, which would add overhead and cost. We use fallback for some internal services and it works well, so it would be great to have this option for Mimir. So yes, the intention is to have fallback scale up the deployment if we have a metrics blip. Obviously, it would be great to have the autoscaling metrics coming from an external source, but we are quite resource-strapped and that's not the easiest option.

This will need an entry in operations/helm/charts/mimir-distributed/CHANGELOG.md

Done, thanks!

support fallback configuration for KEDA autoscaling

061230b

hobbsh requested a review from a team as a code owner November 6, 2024 22:53

add space to values for linter

a9c0617
