-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ROX-27073: add cluster migration e2e test #2127
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johannes94 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
2c4d4d4
to
d7b23d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several nits, nothing major - the flow of the e2e test makes sense and is easy enough to read with the comments you provided throughout
@@ -15,6 +15,10 @@ | |||
emailsender-manifests.yaml | |||
central-chart/ | |||
pids-port-forward | |||
# temp files created by multicluster e2e tests | |||
cluster-list.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do these come from / why can we not clean them up after the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found them in deploy.sh - is there no way to have a cleanup.sh equivalent? Or better to leave the artifacts in case debugging is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those files would be created and not cleaned up. So you can use them for debugging already, this is just the .gitignore to prevent those files from being commited and pushed to the repo.
@@ -0,0 +1,35 @@ | |||
package dns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would waiting for "deleting" state instead of "deprovisioning" rely on the FM's cleanup of the routes instead of manually cleaning them up here? (sorry if missing something, not used to looking at the e2e tests as much)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure about this. But I think FM would only start deleting the route records on deleting
state.
We trigger a deletion and wait for deletion at the end of this test, which usually cleans up the records.
This logic is intended as an additional measure on test failures. Because on failures we might be stuck somewhere in the usual reconciliation loop which can lead to a situation where the deletion lifecycle logic doesn't get executed. In a case like this we don't want to leak those resources.
@@ -32,8 +31,8 @@ var ( | |||
dnsEnabled bool | |||
routesEnabled bool | |||
route53Client *route53.Route53 | |||
waitTimeout = getWaitTimeout() | |||
extendedWaitTimeout = getWaitTimeout() * 3 | |||
waitTimeout = testutil.GetWaitTimeout() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a meta-comment - adding the testutil.
in a separate PR if possible would've made reviewing just the cluster migration part easier. It's fast enough to skim by but next time would recommend splitting into 2 PRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought about splitting PRs as well along the way, decided to not do it since I only wanted very few changes. Then it grew over time to what you see now :) . I'll split it next time.
Expect(*recordSet.Name).To(Equal(domain)) | ||
Expect(*record.Value).To(Equal(reencryptIngress.RouterCanonicalHostname)) // TODO use route specific ingress instead of comparing with reencryptIngress for all cases | ||
} | ||
testutil.AssertDNSMatchesRouter(dnsRecordsLoader.CentralDomainNames, recordSets, &reencryptIngress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
const ( | ||
dpCloudProvider = "standalone" | ||
dpRegion = "standalone" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - a unique value from dpCloudProvider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean dpCloudProvider should have a different value than dpRegion?
Unfortunately that's not so easy, because that's what we have for the cluster configurations all other tests do it like this as well.
func assertClusterAssignment(expectedClusterID string, centralID string, adminAPI fleetmanager.AdminAPI) { | ||
var clusterAssignment string | ||
Eventually(func() (err error) { | ||
// assert the cluster ID outside the Eventually, since once we have a non-empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? https://onsi.github.io/gomega/#eventually
Eventually checks that an assertion eventually passes. Eventually blocks when called and attempts an assertion periodically until it passes or a timeout occurs.
It shouldn't keep polling once the condition passes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah unless the comments means "Once you get a value it will not change, so then do the assert" - instead of e.g. checking the value and, if the value is there but wrong, then getting stuck in polling loop even though you know it will not change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly this is what the comment means, I will try to express this more clearly.
for _, central := range centralList.Items { | ||
if central.Id == centralID { | ||
tenantExists = true | ||
clusterID = central.ClusterId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break?
e2e/testutil/testutil.go
Outdated
} | ||
|
||
// ObtainCentralRequest queries fleet-manager public API for the CentralRequest with id and stores in in the given pointer | ||
func ObtainCentralRequest(ctx context.Context, client *fleetmanager.Client, id string, request *public.CentralRequest) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - would use Get or Fetch instead of Obtain, if possible. Just a bit more consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Its called that way because the private function I copied to testuitl
was called like that before.
I changed it.
/retest |
Description
testutils
packageChecklist (Definition of Done)
Test manual
ROX-12345: ...
Test manual