ReadWriteMany accessMode #69
This is being worked on. By default you get a mounted directory with a file system on a DRBD device. The DRBD device/resource is (by default) not allowed to be dual-primary, and even if it were, having multiple writers on the file system would break things anyway. The only thing supported right now is raw block mode: setting DRBD two-primaries via the storage class and having the resource dual-primary during live migration (e.g., KubeVirt).
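To illustrate what that looks like, here is a minimal sketch assuming linstor-csi parameter names; the `autoPlace`, `storagePool`, and `DrbdOptions/Net/allow-two-primaries` keys and their values are assumptions to verify against your linstor-csi release. It pairs a StorageClass that enables DRBD two-primaries with a raw-block RWX claim of the kind KubeVirt attaches on two nodes during live migration.

```yaml
# Sketch only: verify the exact parameter keys against your linstor-csi release.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-two-primaries
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "2"                               # assumed key: number of replicas
  storagePool: lvm-thin                        # assumed name of a LINSTOR storage pool
  DrbdOptions/Net/allow-two-primaries: "yes"   # assumed key: allow DRBD dual-primary
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Block            # raw block only; a shared filesystem mount would corrupt data
  storageClassName: linstor-two-primaries
  resources:
    requests:
      storage: 10Gi
```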
OK. Thanks for the reply.
Here my volumes are OK on the two nodes, with the resource created correctly (no Diskless). Can I help somehow with this?
Exactly, but now you mount the FS on two nodes, which really, really is not a good idea. Contributions are always welcome, but AFAIK there is a team working on it, so just waiting might be the best option for now. @alexzhc, how is the progress on that?
OK. I was able to do ReadWriteMany access with Rook and Ceph, but performance is better with Linstor.
I don't know; I'm not working on this part. @alexzhc is still the one who could give updates on it.
Hi,
Integrating an NFS export by running an NFS-Ganesha pod on top of a DRBD RWO volume is not hard, as demonstrated by Rook-NFS. However, the NFS-Ganesha pod with its DRBD RWO volume then needs to fail over within a reasonable RTO in case of node failure. Besides the 5-minute k8s node timeout, RWO volume attachment failover in CSI currently has a hard-coded timeout of 6 minutes. Such a long failover time will definitely cause NFS mounts to go stale. We need to solve the RWO failover RTO issue before implementing NFS export.
Hi, take a look at the nfs-server-provisioner project; we're using it together with Linstor to get ReadWriteMany volumes. Failover is ensured by kube-fencing.
Hi @kvaps, it sounds interesting.
Performance is fine, you can see my benchmarks for comparison:
For the nfs-server-provisioner, we're using the helm chart; here are example values:

```yaml
persistence:
  enabled: true
  size: 50Gi
  storageClass: linstor-1
storageClass:
  name: example-nfs
```

After deployment, you can request a new volume using the `example-nfs` storage class. Fencing is very specific to your environment; here is an example for HP iLO. Meanwhile, I would advise avoiding ReadWriteMany if you don't really need it, since it adds extra abstraction layers, which might make the whole system less stable. Consider using S3 (minio) in your application instead.
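A minimal claim against that class could look like the following sketch; only the `example-nfs` storage class name is taken from the values above, while the claim name and size are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data               # placeholder name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: example-nfs   # class served by the nfs-server-provisioner above
  resources:
    requests:
      storage: 5Gi
```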
Yes. I wanted to have a deployment with two pods on two different nodes serving the same Prestashop app for load balancing. They must share a file system for image uploads, etc. I can't use S3 by design.
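For illustration, that kind of setup would mount a single ReadWriteMany claim from both replicas. This is only a sketch with hypothetical names; the image, claim name, and mount path are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prestashop
spec:
  replicas: 2                            # two pods, ideally on different nodes
  selector:
    matchLabels:
      app: prestashop
  template:
    metadata:
      labels:
        app: prestashop
    spec:
      containers:
        - name: prestashop
          image: prestashop/prestashop   # assumption: placeholder image
          volumeMounts:
            - name: uploads
              mountPath: /var/www/html/img   # assumption: upload directory
      volumes:
        - name: uploads
          persistentVolumeClaim:
            claimName: prestashop-uploads    # hypothetical RWX claim
```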
Well, we're using the same approach to host Nextcloud itself, but the user data is stored on another server using an S3 backend.
@kvaps We have tried using https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner per your suggestion with a Piraeus StorageClass; however, when the NFS pod is rescheduled onto another node due to node failure, the mount on the new node via the NFS service just hangs.
We've tried NFS versions 3 and 4 and are having the same issue. Weirdly, you can still connect via NFS on the same service if you change the NFS version parameter, but the original setting does not work. Have you experienced this issue, and do you have any advice which may help?
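For anyone debugging the same thing: the NFS version the clients mount with is normally driven by `mountOptions` on the provisioner's StorageClass. A sketch of where that setting lives; the provisioner name and option values here are assumptions for illustration, not recommendations.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-nfs
provisioner: cluster.local/nfs-server-provisioner   # assumption: depends on the helm release name
mountOptions:
  - nfsvers=4.1   # example values only
  - hard
  - timeo=600
  - retrans=3
```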
Sounds like a network problem.
@kvaps thanks for the reply, I agree. Currently trying to debug the issue with tcpdump/conntrack & Cilium. The CNI is Cilium; unfortunately, tcpdump on the service IP doesn't show anything, as the routing is done in BPF. I can see the packets getting through to the correct endpoint (
I was testing it with Cilium in kube-proxy-free mode; it was working fine in my cluster.
@kvaps I'm not using kube-proxy; however, I've found that if I force-delete the pods using the PVs and force-unmount the NFS mounts, it works. Were you deleting the pods consuming the PVs? I was trying to avoid this.
No, it was working for me without any additional removals.
@kvaps Could you share your mount options and any other server options? It looks like, for some reason, the mounts are not timing out; they're just stuck.
Unfortunately, I don't have this setup anymore.
Hi,
I'm trying to create a `StatefulSet` with a `ReadWriteMany` accessMode. It doesn't work. Is it possible to do `ReadWriteMany` access mode with Linstor and Linstor CSI?

Events on `test-0`:

`kubectl logs linstor-csi-node-pvdjb -n linstor -c linstor-csi-plugin`:

linstor controller:

`drbdadm status` on node2:

`drbdsetup events` on node2:

Definition:
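As a hypothetical illustration of the kind of definition in question: all names, the image, and sizes below are placeholders, and only the `linstor-1` storage class name is borrowed from earlier in the thread. The `ReadWriteMany` access mode in the claim template is the part that fails with filesystem-mode LINSTOR volumes.

```yaml
# Hypothetical sketch for illustration only; not the original definition.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test
spec:
  serviceName: test
  replicas: 2
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - name: app
          image: nginx                  # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteMany"]   # the access mode being asked about
        storageClassName: linstor-1      # assumption: LINSTOR-backed class from the thread
        resources:
          requests:
            storage: 1Gi
```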