Make the SPC ready status reflect its capacity #1108
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: metlos. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Nice job 👍
I like the new test simplification without the InitializeCounters requirement.
I have a few comments/questions, mostly related to the unit tests.
controllers/spaceprovisionerconfig/spaceprovisionerconfig_controller.go
The readiness reason will reflect the situation better in that case.
…o member-info-in-spc-status-2
/retest
controllers/spaceprovisionerconfig/spaceprovisionerconfig_controller.go
```go
	}
	spc.Status.ConsumedCapacity = cc

	capacityCondition := r.determineCapacityReadyState(spc)
```
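For orientation, here is a minimal sketch of what such a capacity-readiness check could compute, assuming the SPC status carries the consumed space count and per-node memory usage while the spec carries the corresponding limits. All names below are illustrative, not the PR's actual implementation:

```go
package capacity

// hasEnoughCapacity is an illustrative helper, not the PR's actual
// determineCapacityReadyState implementation. It reports whether neither the
// space count nor any node's memory usage has reached its configured limit.
// A zero limit is treated here as "no limit".
func hasEnoughCapacity(spaceCount, maxSpaces int, memoryUsagePercentPerNode map[string]int, maxMemoryPercent int) bool {
	if maxSpaces > 0 && spaceCount >= maxSpaces {
		return false
	}
	if maxMemoryPercent > 0 {
		for _, usage := range memoryUsagePercentPerNode {
			if usage >= maxMemoryPercent {
				return false
			}
		}
	}
	return true
}
```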
should we have a message or different reasons in order to be able to spot which threshold was exceeded? I mean, the SpaceProvisionerConfigInsufficientCapacityReason alone doesn't say whether the memory usage is higher than the threshold or the space count exceeded the configured maximum. I think it could be useful to spot that.
If both thresholds are exceeded, we could have that information in the message.
I considered that but since both the limits and the usage are in the same CR, it's a matter of just glancing at the data to see what's wrong. IMHO, if we wanted to be this detailed, we should add separate conditions for different aspects of the readiness computation.
So far, a single condition was enough for us, so I didn't want to overdo the error reporting when all the information is already in the CR.
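If the reporting were ever made more detailed along the lines suggested above, one option would be to keep the single reason but list the exceeded thresholds in the condition message. This is only a sketch using the generic metav1.Condition type; the operator's own condition struct and reason constants would be used in practice, and the string values here are placeholders:

```go
package capacity

import (
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// insufficientCapacityCondition illustrates the idea discussed above: keep a
// single "insufficient capacity" reason but spell out in the message which of
// the thresholds were exceeded.
func insufficientCapacityCondition(spaceCountExceeded, memoryExceeded bool) metav1.Condition {
	var causes []string
	if spaceCountExceeded {
		causes = append(causes, "the space count reached the configured maximum")
	}
	if memoryExceeded {
		causes = append(causes, "the memory usage exceeded the configured threshold")
	}
	return metav1.Condition{
		Type:    "Ready",
		Status:  metav1.ConditionFalse,
		Reason:  "InsufficientCapacity",
		Message: strings.Join(causes, " and "),
	}
}
```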
Co-authored-by: Francisc Munteanu <[email protected]>
…o member-info-in-spc-status-2
Co-authored-by: Francisc Munteanu <[email protected]>
Codecov Report. Attention: Patch coverage is
Additional details and impacted files:

| Coverage Diff | master | #1108 | +/- |
|---|---|---|---|
| Coverage | 79.44% | 79.50% | +0.05% |
| Files | 78 | 78 | |
| Lines | 7785 | 7924 | +139 |
| Hits | 6185 | 6300 | +115 |
| Misses | 1422 | 1442 | +20 |
| Partials | 178 | 182 | +4 |
pkg/capacity/manager.go
```go
func DefaultClusterManager(namespace string, cl runtimeclient.Client) *ClusterManager {
	return NewClusterManager(namespace, cl, nil)
}

func NewClusterManager(namespace string, cl runtimeclient.Client, getSpaceCount SpaceCountGetter) *ClusterManager {
	return &ClusterManager{
		namespace:     namespace,
		client:        cl,
		getSpaceCount: getSpaceCount,
	}
}
```
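The optional getter lets the manager prefer a fresh, cached space count over the potentially stale value recorded in the SPC status. Below is a sketch of that fallback, assuming SpaceCountGetter has roughly the shape shown; the actual type is defined in pkg/capacity and may differ:

```go
package capacity

// SpaceCountGetter is assumed here to return the cached space count for a
// cluster and whether a cached value was available; the real type in
// pkg/capacity may be defined differently.
type SpaceCountGetter func(clusterName string) (int, bool)

// effectiveSpaceCount prefers the fresh count from the cache and only falls
// back to the count recorded in the SPC status when no cached value exists.
// This reduces the chance of over-committing spaces to a cluster while the
// SPC reconciliation is lagging behind.
func effectiveSpaceCount(getSpaceCount SpaceCountGetter, clusterName string, countFromSPCStatus int) int {
	if getSpaceCount != nil {
		if fresh, ok := getSpaceCount(clusterName); ok {
			return fresh
		}
	}
	return countFromSPCStatus
}
```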
I think we discussed that it would make more sense to have the capacity manager changes in a separate PR, so that we could promote the changes in an isolated way.
ah, true. It's been so long that I completely forgot about this.
This simplifies the logic in the controller and doesn't increase the complexity in the controller tests.
to be less surprising.
a test package and simplify how the SpaceCountGetter is obtained.
…o member-info-in-spc-status-2
Quality Gate passed.
@metlos: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
I forgot that this needs to be done in stages to avoid the possibility of making the clusters unavailable for space placement for a short period of time. I'm closing this PR in favor of #1109, which puts the consumed capacity info into SPCs and will be followed up by another PR that will contain the changes in the capacity manager.
Make the SPC ready status reflect the ability to place spaces on the corresponding cluster. The capacity manager is simplified to take this fact into account, even though it still needs to re-check the space count from the cache to reduce the chance of placing spaces on full clusters just because the reconciliation of the SPC hasn't happened yet.
At one point during the implementation, I based the capacity computation completely on the capacity info contained in the SPC. It then became apparent that this would not work, because it would increase the chance of over-committing spaces to the clusters due to the delay between the increased space count and its manifestation in the SPC's status.
Therefore, the capacity manager has been updated to take into account the "fresh" space count from the cache.
Another consequence of this "retrofit" is that many of the tests were simplified to not mock the ToolchainStatus and instead rely completely on the info in the SPCs. This doesn't fully correspond to the runtime state, but IMHO it is sufficient in the affected unit tests because they merely use this information, they don't test it.
Paired PRs: