genSignedCert causes helm-operator to do no-op upgrades in a loop #87

Open
domq opened this issue Mar 8, 2024 · 2 comments

domq commented Mar 8, 2024

What I attempted: install Cilium on OpenShift 4.13.32, according to the instructions

What I expected would happen: the cilium-olm operator would do its thing, and then go sit tight in the background.

What I observed instead: `watch helm ls -A` shows the REVISION of the cilium Helm release going up roughly once every 7 seconds.


Diffing two successive outputs of `oc -n cilium get secret -o yaml` shows that the `tls.crt` and `tls.key` entries of `secret/hubble-server-certs` and `secret/hubble-relay-client-certs` change each time, along with some sequence numbers and Helm's release fields.
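
To narrow down which fields churn between revisions, two saved snapshots of a Secret can be compared programmatically. This is a minimal sketch, assuming the snapshots were saved as JSON (for example with `oc -n cilium get secret hubble-server-certs -o json`); the function name and the sample data are illustrative, not taken from a real cluster.

```python
def changed_data_keys(before: dict, after: dict) -> list[str]:
    """Return the sorted keys under `data` whose values differ between two Secret snapshots."""
    old = before.get("data", {})
    new = after.get("data", {})
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Hypothetical snapshots; real ones would come from json.load() on the saved files.
rev1 = {"data": {"example-unchanged": "QUFB", "tls.crt": "Q0VSVDE=", "tls.key": "S0VZMQ=="}}
rev2 = {"data": {"example-unchanged": "QUFB", "tls.crt": "Q0VSVDI=", "tls.key": "S0VZMg=="}}

print(changed_data_keys(rev1, rev2))
```

With the sample data above, only `tls.crt` and `tls.key` are reported, which mirrors the pattern seen in the real diff.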

Setting `hubble.tls.auto.method` to `certmanager` stops the upgrade loop.
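
For reference, here is a sketch of what that setting might look like when expressed through the `CiliumConfig` object. It builds the manifest as a Python dict and prints it as JSON, which `oc apply -f -` accepts. Several details are assumptions: the `cilium.io/v1alpha1` API version and the `cilium` name and namespace follow the Cilium OpenShift instructions as I understand them, and the issuer reference (`ca-issuer`) is a placeholder that must point at an issuer that actually exists in your cluster.

```python
import json

# Helm values to place under CiliumConfig.spec. The issuer name is a
# placeholder (assumption): point it at an issuer that exists in your cluster.
hubble_tls_values = {
    "hubble": {
        "tls": {
            "auto": {
                "enabled": True,
                "method": "certmanager",
                "certManagerIssuerRef": {
                    "group": "cert-manager.io",
                    "kind": "ClusterIssuer",
                    "name": "ca-issuer",
                },
            }
        }
    }
}

# Assumed object identity; check it against your installed CiliumConfig.
cilium_config = {
    "apiVersion": "cilium.io/v1alpha1",
    "kind": "CiliumConfig",
    "metadata": {"name": "cilium", "namespace": "cilium"},
    "spec": hubble_tls_values,
}

manifest = json.dumps(cilium_config, indent=2)
print(manifest)
```

In practice you would probably merge the `hubble` block into your existing `spec` rather than apply this object as-is, since the object above carries no other Helm values.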

domq commented Mar 8, 2024

The key insight for the workaround outlined above was found here.

domq pushed a commit to epfl-si/sddc-ocp that referenced this issue Mar 8, 2024
Works around isovalent/olm-for-cilium#87 using wisdom from operator-framework/operator-sdk#1069 (comment)

As it turns out, generating a random certificate in a tight compare-and-reconcile loop (that doesn't back off) is a bad idea, #WHOWOULDHAVETHUNK.

- Setting `hubble.tls.auto.method = certmanager` results in an idempotent Helm chart, and therefore breaks the loop.
- As stated in the [official documentation](https://docs.cilium.io/en/stable/installation/k8s-install-openshift-okd/) (⌘F for “You can set any custom Helm values”), we can do that from the `CiliumConfig`'s `spec`; which also explains why the schema thereof (`oc explain CiliumConfig.spec`) is so loosely defined.
- Of course, now we need to install cert-manager; which is why this is a [stopgap] and not a [fix]. (The only damage is that there will be no Hubble until we install it.)
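
The reasoning above can be illustrated with a toy model of a compare-and-reconcile loop. This is a sketch under stated assumptions, not helm-operator's actual code: the two `render_*` functions are hypothetical stand-ins for a chart that mints a certificate at render time versus one that only references a certificate managed elsewhere, and `count_upgrades` is an invented helper.

```python
import secrets


def render_with_random_cert(values: dict) -> dict:
    """Stand-in for a template that mints a fresh certificate on every render."""
    return {"replicas": values["replicas"], "tls.crt": secrets.token_hex(16)}


def render_with_external_cert(values: dict) -> dict:
    """Stand-in for a template that only references a certificate managed elsewhere."""
    return {"replicas": values["replicas"], "tls.secretName": "hubble-server-certs"}


def count_upgrades(render, values: dict, passes: int) -> int:
    """Run `passes` reconcile passes; upgrade whenever the render differs from what is deployed."""
    deployed = render(values)  # initial install
    upgrades = 0
    for _ in range(passes):
        desired = render(values)
        if desired != deployed:
            deployed = desired  # "upgrade" to the newly rendered manifest
            upgrades += 1
    return upgrades


values = {"replicas": 1}
print(count_upgrades(render_with_random_cert, values, 10))
print(count_upgrades(render_with_external_cert, values, 10))
```

The non-idempotent render triggers an upgrade on every pass even though the input values never change, while the idempotent one settles immediately after the install.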
domq pushed a commit to epfl-si/sddc-ocp that referenced this issue Mar 21, 2024
domq pushed a commit to epfl-si/sddc-ocp that referenced this issue Mar 22, 2024
domq pushed a commit to epfl-si/sddc-ocp that referenced this issue Apr 4, 2024
domq pushed a commit to epfl-si/sddc-ocp that referenced this issue Apr 16, 2024
domq pushed a commit to epfl-si/sddc-ocp that referenced this issue Apr 23, 2024

davtex commented May 28, 2024

I am facing the same issue on OpenShift 4.14 and Cilium 1.15.1 with cilium-apiserver enabled and default TLS settings. The operator seems to generate new apiserver certificates with each Helm run, which puts it into an endless reconciliation loop. I am at Helm iteration 1670 after a couple of hours. This makes the OLM pod consume 1 CPU and generate a massive amount of logs with debug enabled, and it keeps changing the generated secret with each run.
