Hello
Sorry in advance if this is obvious, but I have been testing LINSTOR/DRBD on Talos for 2 weeks now and have had a lot of issues with piraeus-operator (v2.6.0) on my Talos (1.7.6) cluster. Namely:
Random quorum loss on volumes (replicas stuck in connecting(<nodeID>)/Unconnected(<nodeID>)), with no errors in drbdadm.
Volumes that randomly get stuck in "Terminating" when I delete them, with the events on the PV stating: Warning VolumeFailedDelete 4m42s linstor.csi.linbit.com_linstor-csi-controller-84674bd55b-4kd2n_20cbf029-05ef-4869-bf72-b9782a25f513 (combined from similar events): rpc error: code = Internal desc = failed to delete volume: Message: 'Resource 'pvc-a423738b-8249-48ae-8a57-a708f87c98e5' is still in use.'; Cause: 'Resource is mounted/in use.'; Details: 'Node: n2, Resource: pvc-a423738b-8249-48ae-8a57-a708f87c98e5'; Correction: 'Un-mount resource 'pvc-a423738b-8249-48ae-8a57-a708f87c98e5' on the node 'n2'.'; Reports: '[66EEEB28-00000-000019]'. I can share the error reports if you want.
Because I was the only one seeing these I assumed I must have made a mistake somewhere in my config.
After investigating for a week, I found out that when you install mainline Talos (currently 1.7.6) and set up the drbd kernel module, you get version 9.2.8. The DRBD version currently packaged with piraeus-operator is 9.2.11. This is easy to miss: you have to go look for it in the image tags. Talos also does not push extension updates to older versions of Talos, so you need to wait for a new Talos release to get the latest module version.
I now assume this is the reason why I am seeing all these issues. But before I open a PR on the piraeus-operator repo, in the Talos section, to add warnings to the documentation so that other people who are new to this don't hit the same issue as me, I need to confirm this:
Can running DRBD 9.2.11 userland against a DRBD 9.2.8 kernel module cause these kinds of issues?
I am pretty new to DRBD, so apologies if the answer is obvious. Either way, if the answer is yes, then I would suggest adding a small printk("drbd kernel version mismatch <9.X.X> vs <9.X.Y>") so that it shows up in dmesg. I can even open the PR myself if you show me which file to do it in.
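In the meantime, the comparison can be done by hand. Below is a minimal sketch of the check I have in mind, not anything that exists in DRBD or piraeus-operator. It assumes the loaded module reports itself on a line of the form "version: X.Y.Z ..." (which is how I understand /proc/drbd to look, but treat that format as an assumption), and that you supply the packaged version yourself, for example from the image tag. All function names are hypothetical.

```python
import re

# Assumed shape of the module's version line, e.g. "version: 9.2.8 ...".
# Adjust the pattern if your node reports it differently.
_VERSION_RE = re.compile(r"version:\s*(\d+)\.(\d+)\.(\d+)")


def parse_module_version(proc_drbd_text):
    """Return (major, minor, patch) found in the given text, or None."""
    match = _VERSION_RE.search(proc_drbd_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def parse_version(text):
    """Parse a plain "X.Y.Z" string into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def format_version(version):
    """Turn (9, 2, 8) back into "9.2.8"."""
    return ".".join(str(number) for number in version)


def mismatch_warning(module_version, packaged_version):
    """Return a warning string if the two versions differ, else None."""
    if module_version == packaged_version:
        return None
    return "drbd kernel version mismatch <%s> vs <%s>" % (
        format_version(module_version),
        format_version(packaged_version),
    )


if __name__ == "__main__":
    # Sample input only; on a real node you would read the file instead.
    sample = "version: 9.2.8\n"
    loaded = parse_module_version(sample)
    packaged = parse_version("9.2.11")
    print(mismatch_warning(loaded, packaged))
    # prints: drbd kernel version mismatch <9.2.8> vs <9.2.11>
```

The versions are compared as integer tuples rather than strings, so that 9.2.11 and 9.2.8 are not confused by lexical ordering if the check is later extended to "older than" or "newer than".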
Thanks
Bernard.