-
Hi @tman24, would you mind providing the installation debug logs?
-
We've noticed that recent FCOS releases are giving us trouble on boot. Could you try 34.20210626.3.1? See https://getfedora.org/en/coreos/download?tab=metal_virtualized&stream=stable and replace the version in the URL.
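A minimal sketch of the "replace version in the URL" step. The build-server URL pattern and the `fcos_metal_url` helper are assumptions for illustration, not something stated in this thread; check the result against the link you get from the download page before using it.

```python
def fcos_metal_url(version: str, stream: str = "stable", arch: str = "x86_64") -> str:
    """Build a download URL for a specific FCOS metal image.

    ASSUMPTION: the URL layout below mirrors what the download page links to;
    verify it against a real link, since the pattern is not taken from this thread.
    """
    base = "https://builds.coreos.fedoraproject.org/prod/streams"
    image = f"fedora-coreos-{version}-metal.{arch}.raw.xz"
    return f"{base}/{stream}/builds/{version}/{arch}/{image}"


if __name__ == "__main__":
    # Version suggested in the comment above.
    print(fcos_metal_url("34.20210626.3.1"))
```

The point is only that the version string appears twice in the path, so both occurrences need to change together.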
-
Just to confirm: reverting to 34.20210626.3.1 fixed the problem. After the first boot, the network stack came up OK and configuration continued. As this working version is part of the current stable stream, there's probably not much reason to change at this time.
-
Well, the problems go on. I'm following the inext.io install guide, and while the bootstrap and master nodes have deployed, bootstrap doesn't seem to be coming up properly, which is preventing the workers from deploying too!

journalctl -b -f -u release-image.service -u bootkube.service
Sep 02 15:24:40 bootstrap bootkube.sh[386903]: Starting temporary bootstrap control plane...

Yes, bootstrap-pod.yaml does exist in the specified location. Should it not? Everything on the services node checks out. The install-config.yaml is pretty simple:

apiVersion: v1
compute:
controlPlane:
networking:
platform:
fips: false
pullSecret: '{"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}'

openshift-install 4.7.0-0.okd-2021-08-22-163618

The services node is CentOS 8.4; all others are FCOS 34. Initial deployment should not be this hard! Thanks
-
I've just rebuilt bootstrap again, and it seemed to get a bit further, but it is now stuck in this loop, which occurs directly after cluster-bootstrap is called.

Sep 02 17:20:30 bootstrap bootkube.sh[19404]: [#4011] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": dial tcp [::1]:6443: connect: connection refused

Also, I'm not sure whether this cert warning is part of the problem:

7:39:30Z is after 2021-09-01T10:53:12Z

For some reason, nothing on bootstrap is listening on 6443, although I do have some containers running:

Running kube-apiserver 6 2ffbcdf1f3c6f
-
Any feedback? I'm pretty stuck right now. Why would the bootstrap node be trying to talk to itself on 6443 when it isn't even listening on that port? What could cause the 'connection refused' message?
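On the 'connection refused' part: a TCP connection is refused when the host is reachable but no process is accepting connections on that port, which matches "nothing is listening on 6443". A small sketch to tell that case apart from a timeout or routing problem; `probe_tcp` is a hypothetical helper, not part of any OKD tooling.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP endpoint as 'open', 'refused', or 'unreachable'."""
    try:
        # A completed handshake means some process is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is accepting connections on this port.
        return "refused"
    except OSError:
        # Timeouts, name resolution failures, no route to host, and so on.
        return "unreachable"


if __name__ == "__main__":
    # Host and port taken from the bootkube log line quoted earlier in the thread.
    print(probe_tcp("localhost", 6443))
```

Run on the bootstrap node, a result of "refused" would point at the API server container not serving yet (or crashing), rather than at the network.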
-
FCOS: 34.20210626.3.1
-
Are you working with old ignition files? The certs expire after 24 hours.
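A rough sketch of that check, assuming the "is after 2021-09-01T10:53:12Z" warning quoted earlier refers to a certificate's expiry time and that the ignition files carry certs valid for 24 hours from generation. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as '2021-09-01T10:53:12Z' as an aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def cert_expired(not_after: str, now: datetime) -> bool:
    """True if 'now' is later than the certificate's expiry time."""
    return now > parse_utc(not_after)


def ignition_stale(created: datetime, now: datetime,
                   lifetime: timedelta = timedelta(hours=24)) -> bool:
    """True if the ignition files are older than the assumed 24-hour cert lifetime."""
    return now - created > lifetime


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Expiry time copied from the warning quoted in the thread.
    print("cert expired:", cert_expired("2021-09-01T10:53:12Z", now))
```

If the ignition files were generated more than a day before the bootstrap node was (re)built, regenerating them with openshift-install and redeploying is the usual way out.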
-
OKD first-timer here. I'm following a bare metal 4.7 install guide and have got to the point where the bootstrap node is up and running, and I'm now in the process of setting up the control plane nodes. Bootstrap and control plane nodes are all CoreOS 34.20210808.3.0; the services node is CentOS 8.4.
I've created the ignition files and am now deploying the main nodes. All nodes get an IP address via static DHCP reservations. Bootstrap deployed OK, but the control plane nodes are giving me a problem. I've edited the CoreOS boot line with the correct parameters on control-plane-1, and it downloads and installs the CoreOS image OK (so networking is working, and during this phase it responds to ICMP correctly on its statically reserved IP address). After the write-to-disk step, though, it reboots, and from then on it's as if it has no network. I just get a looping message saying it can't connect to https://apt-int....:22623/config/master. The URL works fine (tested from another machine), but I no longer get any ICMP response from the node, so either the network stack didn't come up or there's a bug or problem somewhere. Other than that, I've no idea.
It's a bit of a showstopper right now, and any advice is appreciated.