
* Set the WaitForFirstConsumer phase on DataVolume when storage uses the WaitForFirstConsumer binding mode and is not bound yet. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Skip PVC if not bound in import|clone|upload controllers. This is done so the VM pod(not the cdi pod) will be the first consumer, and the PVC can be scheduled on the same location as the pod. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> fixup! Skip PVC if not bound in import|clone|upload controllers. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update importer tests to force bind the PCV by scheduling a pod for pvc, when storage class is wffc. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update datavolume tests to force bind the PCV by scheduling a pod for pvc, when storage class is wffc. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update upload controller and upload tests to correctly handle force binding the PCV by scheduling a pod for pvc, when storage class is wffc. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update clone tests to force bind the PCV by scheduling a pod for pvc when the storage class is wffc. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update cloner multi-node tests to force bind the PCV by scheduling a pod for pvc when storage class is wffc. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Correct after automerge Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Improve/simplify tests Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Fix error in import test. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update transport_test,operator_test.go Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update rbac_test.go and leaderelection_test.go Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Improve Datavolume and PVC Checks for WFFC. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Handle wffc only if feature gate is open - import-controller Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * TEST for Handle wffc only if feature gate is open - import-controller - TEST Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Handle wffc only if feature gate is open - upload-controller with test Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * rename and simplify checks Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * cleanup after rebase Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * update tests after rebase Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * update tests after rebase Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * more cleanups Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Document new WFFC behavior Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Document new HonorWaitForFirstConsumer option Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * update docs according to comments Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * extract common function, cleanup - code review fixes Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * add comment for another pr - 1210, so it can have easier merge/rebase Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * typo Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Simplify getStoragebindingMode - code review comments Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Add FeatureGates interface - code review fix Additionally pass the features gates instead of the particular feature gate value, and let shouldReconcilePVC decide what to do with the feature gate. That way shouldReconcilePVC contains all the logic, and the caller does not need to do additional calls to provide parameters. Signed-off-by: Bartosz Rybacki <brybacki@redhat.com> * Update matcher Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
2.1 KiB
Local Storage Placement for VM Disks
This document describes a special handling of PVCs which have StoragaClass with volumeBindingMode
set to WaitForFirstConsumer
.
Introduction
Local Storage PVs are bound to a specific node. With the binding mode of WaitForFirstConsumer
the binding and the provisioning is delayed until a Pod using the PVC is created. That way the Pod's scheduling constraints
are used to select or provision a PV.
CDI uses worker Pod to import/upload/clone data to the PVC. Creating such a worker Pod will trigger binding of the PVC on the node where worker Pod is scheduled. Worker Pod might have different constraints than a kubevirt VM. When the VM is scheduled on a different node than the PVC it becomes unusable. The problem might be even bigger when a VM has more than one PVC managed by CDI.
Handling the WaitForFirstConsumer mode
The CDI has a special handling for DataVolumes (DV) that use storage with WaitForFirstConsumer
mode.
The DataVolume controller creates the PersistentVolumeClaim (PVC). When the underlying status of the PVC is Pending/WaitForFirstConsumer
the CDI will not schedule any worker pods (import/clone/upload) associated with that PVC. The DataVolume controller sets
the DV status to a new phase WaitForFirstConsumer
. This allows the workload itself (ie. kubevirt)
to detect a DV phase and handle the initial scheduling which causes the PVC to change to a Bound state
(e.g. by creating and exiting a dummy pod with the same resource requirements as the actual workload).
NOTE: The workload should not attempt to use the contents of the DV until CDI has finished the transfer.
Config
To be fully compatible with any external tools that may already use CDI, this new feature has to be enabled by
the feature gate: HonorWaitForFirstConsumer
. It is available in the CDIConfig
custom resource (see cdi-config doc).
A Snippet below shows CDIConfig with HonorWaitForFirstConsumer
enabled.
apiVersion: cdi.kubevirt.io/v1beta1
kind: CDIConfig
[...]
spec:
featureGates:
- HonorWaitForFirstConsumer
[...]