* [WIP] Debug ceph csidriver not being there
We'd expect the CSIDriver object to be there, otherwise ceph install might be struggling
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Avoid possible StorageClassName nils
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
The PVC gets its DV's ownerRefs, which can point to any CR. If BlockOwnerDeletion is
true, we allow GC only if CDI has RBAC to update the owner's finalizers
(we explicitly added it for VirtualMachines). BlockOwnerDeletion is
validated by the admission plugin below, which is enabled in OpenShift but
disabled in our kubevirtci clusters:
https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#ownerreferencespermissionenforcement
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
If our PVC is already populated, there is no need to check the source PVC
for unknown size, etc.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Improve the error handling when pod creation fails
When pod creation fails, the error is usually logged without providing additional information to the user. This behavior is especially risky when the user lacks the permissions to check the logs, making it unintuitive and almost impossible to find the source of the problem.
This commit improves the error handling of the pod-creation process, so pertinent info about the failure is included in the pod's PVC.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update functional tests to check for events when pod-creation fails
Since error handling in pod-creation has been improved in our controllers, this commit introduces several changes in the corresponding functional tests to properly cover the new behavior included when pod-creation fails.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit-tests after improving error-handling of pods for proper coverage
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes and improvements on error handling for pods
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume-controller to change the running condition of datavolumes when a pod fails
Until this commit, the way of handling pod errors in the datavolume-controller has been to change the affected datavolume's phase to failed, which conflicts with the declarative approach of the controllers.
This commit modifies this behavior so that, when a pod fails, the affected datavolume's running condition is changed to false while the phase remains unchanged.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume admission webhook to enable creating clones without source PVC
This commit modifies the datavolume admission webhook to follow a more descriptive approach, enabling the creation of clones without a source PVC.
This clone will later be handled by the datavolume-controller until the source PVC is created.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify the datavolume-controller to improve the handling of clones without source
Since we are allowing the creation of clones without source PVC in the admission webhook, we need to improve the handling of these clones once in the datavolume-controller.
This commit modifies said controller, so we do proper error handling and validation until the source PVC is created.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add unit tests to check the creation of clones without source PVC
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Include a mechanism to reconcile clones without source once the source PVC is created
This commit introduces a new datavolume-controller watch so, if a clone without source is created, we make sure to reconcile it once a proper PVC is created.
It also updates/includes unit tests for proper coverage.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Include functional tests to cover the creation of clones without source PVC
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor refactoring of clone-related code in DataVolume reconciler to improve readability
After enabling the creation of clones without source PVC in the datavolume controller, the clone-related logic outside its reconciler has increased in size and become sparse.
This commit rearranges all this code and condenses it into the clone reconciler, without changing the loop behavior.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume-mutate webhook to reject the creation of clones if the source PVC's namespace doesn't exist
In previous commits, a mechanism to allow the creation of clones without source PVC was added, without ever checking if the source PVC's namespace exists or not.
This behavior could lead to permission conflicts between the user and the source's namespace since the webhook skipped all the related validation.
This commit modifies the datavolume-mutate admission webhook to reject the clone if the source PVC's namespace doesn't exist.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit tests for proper coverage of clone-validation functions
This commit adds and updates several unit tests to improve the coverage of the clone-validation mechanism after several functions were moved to the controller.
It also introduces minor changes on related code in the datavolume-controller and functional tests following PR review.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Allow empty DV size when cloning using storage API
When cloning a Data Volume, the size of the target can potentially be obtained from the source PVC, which removes the need to specify it explicitly.
Considering that, this commit changes the corresponding validation webhook to allow omitting the resources.requests.storage field when cloning a PVC using the storage API.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume-controller to allow obtaining storage size from source PVC when cloning
When cloning a PVC, if the target's size is not specified, it can be obtained from the source PVC.
This commit changes the datavolume controller so that, when it detects an empty storage size, the value is obtained from the source PVC when performing CSI and Smart cloning.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit tests for datavolume-validation after enabling cloning with empty size
This commit updates the unit testing for the datavolume validation webhook, covering the possibility of cloning a PVC without setting any storage size.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit testing for controller-related functions after enabling cloning with empty size
This commit includes unit tests for the volumeSize() function after enabling creating clones with blank size.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update the datavolume controller to create a size-detection pod when performing host-assisted clone
When performing a host-assisted clone with empty clone size, simply copying the original PVC size could lead to potential overhead miscalculations if the source's VolumeMode is "filesystem".
When that's the case, an inspection pod will be created in the datavolume controller so it extracts the size of the virtual image using qemu-img.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Include an image-size detection tool to allow cloning with empty DV size
This commit introduces a new tool in charge of collecting the virtual image size when cloning with an empty DV size. In some cases where said value is unattainable from the original PVC's spec, the datavolume controller will create a new pod containing this new tool.
The binary will then run the 'qemu-img' command and handle its results appropriately.
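As a minimal sketch of what handling the 'qemu-img' result could look like, assuming the tool runs `qemu-img info --output=json` and only needs the `virtual-size` field from its JSON output. The `imageInfo` type and `parseVirtualSize` function are hypothetical names for this illustration, not the actual binary's code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageInfo holds the one field this sketch needs from
// `qemu-img info --output=json` (hypothetical type name).
type imageInfo struct {
	VirtualSize int64 `json:"virtual-size"`
}

// parseVirtualSize extracts the virtual size in bytes from qemu-img's JSON output.
func parseVirtualSize(out []byte) (int64, error) {
	var info imageInfo
	if err := json.Unmarshal(out, &info); err != nil {
		return 0, fmt.Errorf("parsing qemu-img output: %w", err)
	}
	if info.VirtualSize <= 0 {
		return 0, fmt.Errorf("invalid virtual size %d", info.VirtualSize)
	}
	return info.VirtualSize, nil
}

func main() {
	// Sample output; a real tool would obtain it by running qemu-img via os/exec.
	sample := []byte(`{"virtual-size": 1073741824, "format": "raw", "actual-size": 4096}`)
	size, err := parseVirtualSize(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(size)
}
```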
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Optimize the clone-size lookup process to avoid creating unnecessary size-detection pods
When performing host-assisted clone with an empty DV size, in some cases, a size-detection pod is used to obtain the required capacity.
This commit optimizes this process by keeping the collected value as a PVC annotation, which is checked in subsequent clones to avoid creating redundant pods.
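The lookup order described here can be sketched roughly as follows. The annotation key and both function names are assumptions made for illustration, and the `detect` callback merely stands in for the size-detection pod:

```go
package main

import (
	"fmt"
	"strconv"
)

// annSourceSize is a hypothetical annotation key used only in this sketch.
const annSourceSize = "example.io/source-virtual-size"

// cachedSize returns the size recorded in the PVC annotations, if any.
func cachedSize(annotations map[string]string) (int64, bool) {
	raw, ok := annotations[annSourceSize]
	if !ok {
		return 0, false
	}
	size, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || size <= 0 {
		return 0, false // treat unparsable values as a cache miss
	}
	return size, true
}

// sourceSize checks the annotation first and calls detect only on a miss,
// recording the result so later clones can skip detection.
func sourceSize(annotations map[string]string, detect func() (int64, error)) (int64, error) {
	if size, ok := cachedSize(annotations); ok {
		return size, nil
	}
	size, err := detect()
	if err != nil {
		return 0, err
	}
	annotations[annSourceSize] = strconv.FormatInt(size, 10)
	return size, nil
}

func main() {
	calls := 0
	detect := func() (int64, error) { calls++; return 1 << 30, nil }
	anns := map[string]string{}
	first, _ := sourceSize(anns, detect)
	second, _ := sourceSize(anns, detect)
	fmt.Println(first, second, "detections:", calls)
}
```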
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes and improvements on mechanism for cloning with empty storage size
* Add new optional flag on size-detection binary to enable using a different URI scheme
* Improve the pod-creation mechanism so the pod is not created until the source PVC has finished the import
* Modify size-inflation mechanism to account for possible round-downs when importing the source image
* Improve the size inflation mechanism so only PVCs with filesystem as volume mode are considered
* Minor style corrections
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify the clone-controller to allow skipping the clone size validation in some cases
Due to filesystem overhead differences, the target's size can sometimes be smaller than the source's one when obtaining said value with the size-detection pod.
This commit introduces minor changes in the clone-controller so we can skip the size validation in those cases.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor changes and improvements in size-detection mechanism following PR review
* Added new UT that covers using empty storage API for non-cloning sources
* Added new watch on datavolume-controller that looks for changes in the size-detection pod
* Removed redundant and unnecessary specs on size-detection pod
* Added error handling when reading the pod's termination message
* Moved general-usage functions to 'util.go' file
* Updated 'datavolumes' documentation to reference the possibility of omitting the storage size when cloning
* Minor style corrections
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add unit tests that cover the size-detection mechanism in the DataVolume controller
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Include functional tests for cloning without specifying storage size
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Improve error handling in the creation/deletion process of the size-detection pod
This commit introduces additional handling for errors that occur during and after the creation of the size-detection pod.
It also updates several related unit tests.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes to improve fsOverhead calculations when cloning with empty storage size
* Modified the size-detection mechanism so we account for fsOverhead when cloning to filesystem volume mode in all cases
* Clean up the code for reconciling when cloning a PVC that is not ready
* Minor fix in functional test so it works when cloning from block to filesystem volume mode
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Fix smart clone request size update
If the PVC's actual size matched the expected size, the PVC spec was
not updated with the size the user actually requested, which left the
spec with a smaller size than the user requested (the data size).
This caused a discrepancy when restoring such a PVC: the restored PVC
got the smaller size instead of the user-requested size.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* review fixes
Signed-off-by: Shelly Kagan <skagan@redhat.com>
Noticed the consts were strings instead of the right types.
This changes them to the right types and modifies the usages
so they no longer need casts all over the place.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Detect storage capabilities for no-provisioner storage classes
Assume there's a persistent volume that we can look up to infer the
correct values for volume mode and access modes.
Limit ourselves to detecting no-provisioner capabilities on LSO to
avoid greatly increasing the number of storage classes we provide
capabilities for. This is similar to our current flow where we
only provide capabilities for known storage classes.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Regenerate bazel stuff for pkg/monitoring's existence
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add a watcher for no-provisioner PVs
We maintain a map of storage class names and provisioners whenever
storage classes are changed.
If a PV has one of the storage classes with no-provisioner as a
provisioner, reconcile that storage class.
This is because we infer the storage profile based on PVs, and
new ones might have different storage capabilities.
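A rough sketch of the lookup this watcher relies on, assuming the map is keyed by storage class name. The function name and the storage class names in `main` are hypothetical; `kubernetes.io/no-provisioner` is the provisioner value Kubernetes uses for statically provisioned volumes:

```go
package main

import "fmt"

// noProvisioner marks storage classes without dynamic provisioning.
const noProvisioner = "kubernetes.io/no-provisioner"

// storageClassToReconcile reports which storage class, if any, should be
// reconciled when a PV with the given storage class name changes.
// provisioners maps storage class names to their provisioner.
func storageClassToReconcile(provisioners map[string]string, pvStorageClass string) (string, bool) {
	if provisioners[pvStorageClass] == noProvisioner {
		return pvStorageClass, true
	}
	return "", false
}

func main() {
	// Illustrative storage class names, not taken from a real cluster.
	provisioners := map[string]string{
		"local-block": noProvisioner,
		"ceph-rbd":    "rook-ceph.rbd.csi.ceph.com",
	}
	for _, sc := range []string{"local-block", "ceph-rbd", "unknown"} {
		name, ok := storageClassToReconcile(provisioners, sc)
		fmt.Printf("%s: reconcile=%v name=%q\n", sc, ok, name)
	}
}
```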
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use a client to do our storage class caching
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Pass a client as an argument, not global.
Suggested by awels, thanks!
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* periodic sync CSI snapshot CRD check
It was possible for the CSI snapshot CRD check to fail silently and
prevent the smart clone controller from starting during the cdi deployment
pod start up. This would prevent smart clone from working properly.
This adds a periodic sync every minute to check for the CRDs. We also
log failures other than not-found errors so that humans can detect this
situation more easily.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Change location of the start controller call.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Test handling of populated PVC
Populated PVC created from clone operation should not start
any CDI actions. It can only update DV status.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handle prepopulated pvc with network clone
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handle prepopulated PVC for Smart and CSI clone
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update kubevirtci to overcome AfterSuite flake
Update kubevirtci to get a fix for a flake where a PVC can't be removed
because it still holds the `pvc-as-source-protection` finalizer:
https://github.com/kubernetes-csi/external-snapshotter/issues/349
More info in https://github.com/kubevirt/kubevirtci/pull/750.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Don't create multiple VolumeSnapshots
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Append checkpoint ID to multi-stage importer pods.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Ignore completed pods for multi-stage imports.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Reset current import pod when checkpoint is done.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Don't prevent pod deletion for scratch space.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Only ignore pod when retainAfterCompletion is set.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix data volume unit tests.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Tests for checkpoint suffix and completed pods.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Test for retained pods exiting for scratch space.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add functional test for retaining multistage pods.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Clean up lint error.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Remove scratch handling that is fixed elsewhere.
This is part of shouldDeletePod now.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add unit tests for long PVC/checkpoint names.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Match retainAfterCompletion test to description.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add optional VDDK initImageURL field.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Pass VDDK image URL through to PVC annotation.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Unit tests for per-DV VDDK image URL.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Functional test for VDDK initImageURL field.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Update documentation for VDDK initImageURL.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix lint error.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Check for absence of AwaitingVDDK in unit test.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
We want to silence the KubePersistentVolumeFillingUp for all our PVCs that hold virtual machine disks,
since these disks consume the entire PVC by design.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add support for archive upload
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* fix golang errors
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Change storage profile property set to support more than one set
So far CDI supported only one claim property set. We want to be able
to support more than one so that, if the user provides the DV storage
volumeMode without accessMode or vice versa, CDI can pick the
most appropriate match.
Added a second default for rook ceph block, filesystem volume mode
with RWO access mode, to support archive upload, which defaults to
filesystem mode.
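One way such matching could work is sketched below, assuming the property sets are ordered by preference and that an unset user field matches anything. The type and function names are hypothetical, and the first default shown for rook ceph block (RWX with Block) is an assumption for the example:

```go
package main

import "fmt"

// claimPropertySet is a simplified stand-in for a storage profile entry.
type claimPropertySet struct {
	AccessModes []string
	VolumeMode  string
}

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// pickPropertySet returns the first set compatible with what the user
// specified; an empty volumeMode or accessMode matches any set.
func pickPropertySet(sets []claimPropertySet, volumeMode, accessMode string) (claimPropertySet, bool) {
	for _, s := range sets {
		if volumeMode != "" && s.VolumeMode != volumeMode {
			continue
		}
		if accessMode != "" && !contains(s.AccessModes, accessMode) {
			continue
		}
		return s, true
	}
	return claimPropertySet{}, false
}

func main() {
	sets := []claimPropertySet{
		{AccessModes: []string{"ReadWriteMany"}, VolumeMode: "Block"}, // assumed first default
		{AccessModes: []string{"ReadWriteOnce"}, VolumeMode: "Filesystem"},
	}
	// Only the volume mode is given, as with archive upload.
	s, ok := pickPropertySet(sets, "Filesystem", "")
	fmt.Println(s, ok)
}
```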
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* CR fix - change to one endpoint for the user
The upload proxy identifies whether the upload is an archive
by looking at the content type annotation on
the PVC. If the content type is archive, it routes
the upload to a new archive upload URI on the upload server.
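The routing decision can be sketched as below. The annotation key is assumed to be `cdi.kubevirt.io/storage.contentType`, and the two URI paths are placeholders invented for this illustration, not the upload server's real endpoints:

```go
package main

import "fmt"

const (
	annContentType     = "cdi.kubevirt.io/storage.contentType" // assumed annotation key
	contentTypeArchive = "archive"
	uploadPath         = "/upload"         // hypothetical path
	archiveUploadPath  = "/upload-archive" // hypothetical path
)

// uploadPathFor picks the upload server URI based on the PVC's content type
// annotation, so the user only ever talks to a single proxy endpoint.
func uploadPathFor(pvcAnnotations map[string]string) string {
	if pvcAnnotations[annContentType] == contentTypeArchive {
		return archiveUploadPath
	}
	return uploadPath
}

func main() {
	fmt.Println(uploadPathFor(map[string]string{annContentType: "archive"}))
	fmt.Println(uploadPathFor(map[string]string{annContentType: "kubevirt"}))
	fmt.Println(uploadPathFor(nil))
}
```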
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add storage profile and data volume controllers unit tests
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* CR fixes
* add default volume mode to archive content type
* upload server use data processor for archive upload
* tests for volume mode with archive content type
* tests for archive upload of compressed tar
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Adjust imports according to new apis dir
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* CR small fixes
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* move apis to new staging area
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* add script to push to staging
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix lint check and api reference
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* push staging to api repo
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Deploy alerts infra as part of our installation
Conditionally deploy the infrastructure that is needed to fire alerts for our users
when bad things are happening to CDI.
Testing with `KUBEVIRT_DEPLOY_PROMETHEUS=true`
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Watch and unit test all prometheus related resources
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* add gateway for changing monitoring namespace (rbac purposes)
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* refactor test to check for exact alert name and firing state
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Align pattern of ensuring prometheus resource exists for all
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Remove potential noisy event
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Extract duplicate code to function
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Don't use empty value for prometheus label due to open issue
https://github.com/prometheus-operator/prometheus-operator/issues/4325
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
Added missing tests for change "Explicitly set the storage class name #1936".
Corrected the behavior when storage class is not provided and not available.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Add long term token (10 years) to pvcs when host assisted cloning between namespaces
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* clone controller should retry if source in use
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* minor refactor if/else -> switch
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Correct the fsOverhead calculation in profile
The calculation needs to play well with the actual resize that is done in data-processor
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Properly reverse the calculation for overhead.
Signed-off-by: Alexander Wels <awels@redhat.com>
Co-authored-by: Alexander Wels <awels@redhat.com>
* CSI Volume Clone for same namespace
CSI Volume Cloning is available on the same namespace and also
works with namespace transfer and volume expansion.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update documentation for CSI Volume Clone
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Cleanup and refactor - extract common code into functions
Remove csi-clone-controller (only set cloneOf annotation)
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Corrects reconcile results
Do not requeue reconciliation loop when not needed.
Mark DV as Failed when the PVC Claim is lost.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handles PVC recovery from ClaimLost
Make sure that CSI clone continues when the target PVC recovers from
ClaimLost to Bound or Pending.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Code Review improvements
Extracted common code for doCrossNamespaceClone and expandAfterClone, and some updates to comments/cleanups.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
Users don't want 👽 resources in clusters,
and we should also be able to tell if we're part of a broader installation.
Note:
- Operator created resources were handled in https://github.com/kubevirt/controller-lifecycle-operator-sdk/pull/18
as these labels will be common to all resources deployed by the HCO.
- Now that the controller is guaranteed to have the labels, we can set env vars
that reference the label values (fieldRef) to spare calling GET on the CR in the controllers.
(thanks mhenriks).
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Refactor: simplify by extracting methods
Prepare for new clone logic - extracted smartClone reconcile functions.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Select clone strategy based on storageProfile
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Changes from CR comments.
A series of small fixes, and cleanups.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Documentation update
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Fix smartclone sometimes not triggering.
Updated tests to use a real image instead of data that is filled.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Refactor getSnapshotClass into two functions
Signed-off-by: Alexander Wels <awels@redhat.com>
* Use constant instead of magic number for size.
Signed-off-by: Alexander Wels <awels@redhat.com>
* force bind for WFFC storage on tests.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Updated based on comments.
Fixed failing functional test.
Signed-off-by: Alexander Wels <awels@redhat.com>
* update deps and bazel
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix apidocs and unit tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix generate-verify
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
Fixed an error in the logic that updates the DV: it would ignore errors
during the update. Also corrected a small typo in the tests.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* fix synchronization between smart clone and datavolume controller
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* pvc transfer controller should be more aggressive to force binding
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Fix: Compute fs overhead only for fs volumeMode
Correctly compute fs overhead for the effective VolumeMode, meaning the one
computed from the value in the storage spec and the storageProfile.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Test: Add more tests for fs overhead
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Add an interface to watch nbdkit logs.
Useful for fishing out various pieces of information. Save VDDK library
version and connected ESX host by appending to the importer pod's
termination message. Turns nbdkit logging up to verbose for VDDK data
sources, so only the last few lines are printed for debugging.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Copy VDDK info from termination message to PVC/DV.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add unit tests for saved VDDK information.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add functional test for VDDK annotations.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix unit test, forgot to check for nil pvc.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Don't ignore errors updating PVC with VDDK info.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Watch nbdkit with Scanner instead of ReadString.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Move VDDK info test into existing functional test.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Make nbdkit stop sequence slightly clearer.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Save VDDK info in regular DV reconciler.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Don't save VDDK info when PVC is being deleted.
Also, piggyback off existing PVC update instead of introducing a new
error handling path.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix VDDK-info unit tests.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Use scanner for all nbdkit logging.
Also fix up a minor merge mistake.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Try to satisfy complaints from SonarCloud.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* use namespace transfer for smart clone
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* updates from test failures
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* add expansion func tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* add dv phases for expansion and transfer
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* rebase and integrate with storage profiles
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Create new Storage type
A new Storage type similar to the PVC Spec is now available to use
in the DataVolume Spec. This is more permissive than PVC, and together
with StorageProfile this allows CDI to apply additional logic for
missing or optional fields.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Use the StorageProfile
Handle the StorageProfile recommended params when creating the PVC for
a DataVolume. When parameters like volumeMode or accessModes are
not provided, CDI checks the StorageProfile for a given StorageClass
to set the recommended defaults. This enables users to create a DataVolume
without the need to provide all the parameters.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Allow multiple accessModes
CDI allows multiple access modes to be specified in the DataVolume.spec.storage and in the StorageProfile. This now works the same way as in PVC specification.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handle the storage.size field
The storage.size field specifies how much space a user wants to have.
When creating an image on filesystem storage, CDI takes the
file system overhead into account and requests a PVC big enough to
fit the image and the file system metadata.
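The arithmetic behind this can be sketched as follows. This is an illustration under assumptions, not CDI's exact formula: the overhead is expressed in permille to keep the math in integers, the 5.5% value is only an example, and any alignment the real code applies is ignored. Both function names are hypothetical:

```go
package main

import "fmt"

// inflate returns the smallest PVC request (in bytes) whose usable space,
// after reserving overheadPermille/1000 of the volume for the file system,
// still holds imageSize bytes. Sizes above roughly 9 PB would overflow int64.
func inflate(imageSize, overheadPermille int64) int64 {
	usablePermille := 1000 - overheadPermille
	return (imageSize*1000 + usablePermille - 1) / usablePermille // ceiling division
}

// usable reverses the calculation: the space left for the image on a PVC
// of pvcSize bytes once the overhead is subtracted.
func usable(pvcSize, overheadPermille int64) int64 {
	return pvcSize * (1000 - overheadPermille) / 1000
}

func main() {
	const overhead = 55 // 5.5%, illustrative value
	requested := int64(1) << 30
	pvc := inflate(requested, overhead)
	fmt.Println("requested:", requested)
	fmt.Println("pvc size: ", pvc)
	fmt.Println("fits:     ", usable(pvc, overhead) >= requested)
}
```

Rounding up when inflating and down when computing usable space keeps the two directions consistent: a PVC sized by `inflate` never ends up with less usable space than the user requested.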
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Test storage profile with DV
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Document Storage Profiles
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Refactor: Render the effective PVC early
The helper 'render PVC' was moved earlier in the control flow, so
it can be used in more places. Removing the need for if/else logic.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Test handling size on import, upload and clone
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Code Review: Refactor resolving of volumeMode
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Fix: render target pvc spec correctly in smart clone controller
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* sigs.k8s.io/controller-runtime/pkg/runtime/* packages are deprecated, and were moved to new paths.
Trying to upgrade sigs.k8s.io/controller-runtime to version v0.7.0 in HCO created a conflict because in v0.7.0 the deprecated packages were removed and cannot be used.
This PR replaces the deprecated packages with their new paths.
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* Run `make deps-update`
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* fix logger init
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* fix test loggers
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* Plumb new checkpoint API through to VDDK importer.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add incremental data copy from VDDK.
Create a new data source implementation similar to vddk-datasource, but
only for blocks of data that changed between two snapshots. Also factor
out common things between the two VDDK data sources.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Check block status for warm and cold imports.
Addresses a bunch of runtime issues, but progress tracking isn't right.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Find snapshots correctly.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Remove separate warm/cold VDDK importers.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Advance through the checkpoint list in the spec.
Move DataVolume to Paused after each checkpoint, and start a new
importer pod for the next available checkpoint. Keep track of which
checkpoints have been copied by adding PVC annotations associating each
checkpoint with the UID of the pod that copied it.
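The bookkeeping described above might look roughly like this sketch. The annotation prefix is a made-up placeholder and the helper names are hypothetical; only the idea of mapping each copied checkpoint to the UID of the pod that copied it comes from the description:

```go
package main

import "fmt"

// checkpoint pairs a snapshot with the one preceding it.
type checkpoint struct {
	Previous string
	Current  string
}

// annCopiedPrefix is a hypothetical annotation prefix used only in this sketch.
const annCopiedPrefix = "example.io/checkpoint-copied."

// nextCheckpoint returns the first checkpoint in the spec that has no
// "copied" annotation yet, or false when all of them are done.
func nextCheckpoint(checkpoints []checkpoint, annotations map[string]string) (checkpoint, bool) {
	for _, cp := range checkpoints {
		if _, done := annotations[annCopiedPrefix+cp.Current]; !done {
			return cp, true
		}
	}
	return checkpoint{}, false
}

// markCopied records which importer pod copied the checkpoint.
func markCopied(annotations map[string]string, cp checkpoint, podUID string) {
	annotations[annCopiedPrefix+cp.Current] = podUID
}

func main() {
	spec := []checkpoint{{Previous: "", Current: "snap-1"}, {Previous: "snap-1", Current: "snap-2"}}
	anns := map[string]string{}
	for i := 0; ; i++ {
		cp, ok := nextCheckpoint(spec, anns)
		if !ok {
			break
		}
		fmt.Println("copying", cp.Current)
		markCopied(anns, cp, fmt.Sprintf("pod-uid-%d", i))
	}
}
```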
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Allow spec updates to drive multi-stage imports.
A multi-stage import can create checkpoints at any time, so CDI needs to
be able to receive updates to the list of checkpoints. Implement this by
allowing spec changes only for fields related to multi-stage imports.
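A minimal sketch of such an admission rule, assuming a simplified spec: blank out the multi-stage fields on copies of the old and new specs, then require everything else to be identical. The `spec` type and `updateAllowed` function are hypothetical stand-ins, not the webhook's real types:

```go
package main

import (
	"fmt"
	"reflect"
)

// spec is a simplified stand-in for a DataVolume spec.
type spec struct {
	SourceURL       string
	Checkpoints     []string
	FinalCheckpoint bool
}

// updateAllowed accepts an update only if the fields unrelated to
// multi-stage imports are unchanged. The arguments are passed by value,
// so clearing fields here does not touch the caller's specs.
func updateAllowed(oldSpec, newSpec spec) bool {
	oldSpec.Checkpoints, newSpec.Checkpoints = nil, nil
	oldSpec.FinalCheckpoint, newSpec.FinalCheckpoint = false, false
	return reflect.DeepEqual(oldSpec, newSpec)
}

func main() {
	base := spec{SourceURL: "vm-disk", Checkpoints: []string{"snap-1"}}
	added := spec{SourceURL: "vm-disk", Checkpoints: []string{"snap-1", "snap-2"}, FinalCheckpoint: true}
	changed := spec{SourceURL: "other-disk", Checkpoints: []string{"snap-1"}}
	fmt.Println(updateAllowed(base, added))   // only checkpoint fields differ
	fmt.Println(updateAllowed(base, changed)) // an unrelated field differs
}
```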
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Avoid deleting destination in multi-stage import.
A multi-stage import will have an initial data copy to the destination
file followed by separate copies for individual deltas. The destination
file should not be deleted before starting these delta copies.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Get VDDK data source to pass formatting tests.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Unit tests for multi-stage import admission rules.
Make sure only updates to checkpoint-related fields are accepted.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add warm import unit tests for VDDK data source.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add VDDK warm import functional test.
Put two snapshots in the vCenter simulator inventory, and run them
through a multi-stage import process. Also clean up some issues
reported by test-lint.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add some documentation about multi-stage imports.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Pass existing multi-stage DataVolume unit tests.
Also remove MD5 sum step used for debugging, since it can take a long time.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Remove tabs from documentation.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Pass failing import-controller unit test.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* More unit tests for multi-stage field updates.
Also factor these tests into a DescribeTable.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add nbdkit retry filter.
Available as of Fedora 33 update.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Give correct file name to nbdkit in more cases.
The backing file in the spec might not always match the backing file in
the snapshot, so try harder to match those files by disk ID. May still
need to allow updates to backingFile, depending on how this gets used.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add more unit tests for datavolume-controller.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix linter error from last commit.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add unit tests for some govmomi API calls.
Move original calls into mock interfaces to make this work.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add an API for disabling smart-cloning.
We used to detect the possibility of smart-cloning and always use it
if it's there. This might not be the desirable behaviour if:
- Snapshots cost more money than a host-assisted clone
- Snapshots are broken
The API is:
kubectl edit cdi
cdi.Spec.cloneStrategyOverride = "copy"
If no value is chosen, we continue with the existing behaviour of
preferring smart clone if possible.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
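The override rule described in the commit above can be sketched as follows. This is a minimal illustration, not the CDI implementation: the function name `chooseCloneStrategy`, the `smartClonePossible` flag, and the `"snapshot"` string value are assumptions made for the example; only the `"copy"` value and the `CloneStrategySnapshot` constant name appear in this log.

```go
package main

import "fmt"

// CloneStrategy names how a PVC clone is carried out.
type CloneStrategy string

const (
	// "copy" is the override value mentioned in the commit message.
	CloneStrategyHostAssisted CloneStrategy = "copy"
	// The string value "snapshot" is an assumption for this sketch.
	CloneStrategySnapshot CloneStrategy = "snapshot"
)

// chooseCloneStrategy (hypothetical name) honours a "copy" override and
// otherwise keeps the existing behaviour: prefer smart clone when possible.
func chooseCloneStrategy(override *CloneStrategy, smartClonePossible bool) CloneStrategy {
	if override != nil && *override == CloneStrategyHostAssisted {
		return CloneStrategyHostAssisted
	}
	if smartClonePossible {
		return CloneStrategySnapshot
	}
	return CloneStrategyHostAssisted
}

func main() {
	copyOverride := CloneStrategyHostAssisted
	fmt.Println(chooseCloneStrategy(nil, true))           // no override, snapshots available
	fmt.Println(chooseCloneStrategy(&copyOverride, true)) // override forces host-assisted copy
	fmt.Println(chooseCloneStrategy(nil, false))          // snapshots unavailable
}
```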
* Remove redundant parentheses, don't open code GetActiveCDI
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add const CloneStrategySnapshot to v1alpha1 too
Pointed out by awels, thanks.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add unit tests for getCloneStrategy
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add checkpoints to DataVolume CRD and reconciliation
* Add Previous, Current, and FinalCheckpoint to DataVolume CRD
* Use checkpoints to set annotations on the PVC
* If an importer pod succeeds while checkpoint annotations are set,
then set the DataVolume status to Paused instead of Succeeded.
* Clear the PVC checkpoint annotations
Signed-off-by: Sam Lucidi <slucidi@redhat.com>
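The Paused-versus-Succeeded decision from the commit above might look roughly like this. It is a sketch under stated assumptions: the annotation key and the function name `phaseAfterImporterSuccess` are hypothetical, and the PVC is reduced to its annotation map.

```go
package main

import "fmt"

// Hypothetical annotation key, used only for this sketch.
const annCurrentCheckpoint = "example.invalid/currentCheckpoint"

// phaseAfterImporterSuccess (hypothetical name) reports the DataVolume phase
// to set once an importer pod has succeeded. If a checkpoint annotation is
// present, the import is one stage of several: the phase is Paused and the
// annotation is cleared. Otherwise the phase is Succeeded.
func phaseAfterImporterSuccess(pvcAnnotations map[string]string) string {
	if _, staged := pvcAnnotations[annCurrentCheckpoint]; staged {
		delete(pvcAnnotations, annCurrentCheckpoint)
		return "Paused"
	}
	return "Succeeded"
}

func main() {
	staged := map[string]string{annCurrentCheckpoint: "snapshot-2"}
	fmt.Println(phaseAfterImporterSuccess(staged), len(staged))
	fmt.Println(phaseAfterImporterSuccess(map[string]string{}))
}
```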
* Add new fields to DataVolume CRD creation
Signed-off-by: Sam Lucidi <slucidi@redhat.com>
* Generate updated code for the DataVolume changes
Signed-off-by: Sam Lucidi <slucidi@redhat.com>
* Add tests for multistage import annotations
Signed-off-by: Sam Lucidi <slucidi@redhat.com>
* Add library function to determine if a PVC has been populated fully.
The logic is as follows:
If the PVC has no ownerRef, we assume something else fully populated it and
return true.
If the PVC has an ownerRef and it's a DataVolume, look up the DataVolume.
If DV.status.Phase == succeeded, return true; otherwise return false.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Renamed functions to better indicate their purpose.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Generate CDI CRD using controller-tools.
This is only done for CDI CRD as it requires the existence of source
code. Other CRDs we create are created by a more bare bones pod.
CDIUninstallStrategy was missing a comment describing it, so add
one. This was spotted manually so there might be more missing.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Allow users to specify which nodes CDI pods will live on.
nodeSelector, affinity and tolerations are possible values.
This is done in the CDI CR (rather than CDIConfig) as we are
interested in having this field be populated by external operators.
Unit tests now require the existence of a CDI CR, so create it.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add a unit test covering some node placement functions
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Specify that all our pods are linux-only.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Avoid duplicate test, accidental left over.
Pointed out by awels, thanks.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Rename to cdiOperatorDeployment for clarity.
Suggested by awels
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Specify we only run on linux using the CDI CR, no need to embed this
into the code.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Don't dereference workloadPlacement for no reason
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Split off operator test to have its own AfterEach, BeforeEach.
Use even more descriptive function names.
Do all the CDI delete/restore logic in AfterEach, to ensure that
it happens and restores the deployment with the original CR even
if the test fails.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Remove XXX. This is the proper way.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Adapt to latest changes in controller_test.go (renaming import)
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Simplify, not storing intermediate value.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Don't dereference nodeplacement in callers to CreateDeployment
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Remove redundant save & restore. Unit tests do this for us.
Pointed out by awels, thanks.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Split out "find toplevel" to a utility function
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Wait for the CDI CR update to apply before continuing.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Simplify, not storing intermediate value.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Make it clear that the chosen node placement will not be schedulable.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* update k8s deps to 1.18.6 and controller runtime to 0.6.2
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* remove building code generators from docker image. This way the k8s library version only has to be updated in go.mod
Do more stuff in the bazel container. Faster and better interop
Fix unit tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* make format
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* remove unnecessary rsync
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* redo code generator dep management
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* builder uses go modules
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
consistent failed should never happen during normal operations,
it can potentially happen if a pvc claim is lost.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Set the WaitForFirstConsumer phase on DataVolume when storage uses the WaitForFirstConsumer binding mode and is not bound yet.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Skip PVC if not bound in import|clone|upload controllers.
This is done so the VM pod (not the CDI pod) will be the first consumer, and the PVC can be scheduled on the same location as the pod.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
fixup! Skip PVC if not bound in import|clone|upload controllers.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update importer tests to force bind the PVC by scheduling a pod for pvc, when storage class is wffc.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update datavolume tests to force bind the PVC by scheduling a pod for pvc, when storage class is wffc.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update upload controller and upload tests to correctly handle force binding the PVC by scheduling a pod for pvc, when storage class is wffc.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update clone tests to force bind the PVC by scheduling a pod for pvc when the storage class is wffc.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update cloner multi-node tests to force bind the PVC by scheduling a pod for pvc when storage class is wffc.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Correct after automerge
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Improve/simplify tests
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Fix error in import test.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update transport_test,operator_test.go
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Update rbac_test.go and leaderelection_test.go
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Improve Datavolume and PVC Checks for WFFC.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handle wffc only if feature gate is open - import-controller
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* TEST for Handle wffc only if feature gate is open - import-controller - TEST
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handle wffc only if feature gate is open - upload-controller with test
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* rename and simplify checks
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* cleanup after rebase
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* update tests after rebase
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* update tests after rebase
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* more cleanups
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Document new WFFC behavior
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Document new HonorWaitForFirstConsumer option
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* update docs according to comments
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* extract common function, cleanup - code review fixes
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* add comment for another pr - 1210, so it can have easier merge/rebase
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* typo
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Simplify getStoragebindingMode - code review comments
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Add FeatureGates interface - code review fix
Additionally pass the features gates instead of the particular feature gate value,
and let shouldReconcilePVC decide what to do with the feature gate. That way shouldReconcilePVC
contains all the logic, and the caller does not need to do additional calls to provide parameters.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
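The design described in the commit above, where the caller passes the feature gates and shouldReconcilePVC holds all of the decision logic, can be sketched like this. Only the names `FeatureGates`, `shouldReconcilePVC`, and `HonorWaitForFirstConsumer` come from this log; the method name, the parameters, and the `staticGates` helper are assumptions for illustration.

```go
package main

import "fmt"

// FeatureGates exposes feature gate state. The method name is an assumption.
type FeatureGates interface {
	HonorWaitForFirstConsumerEnabled() (bool, error)
}

// staticGates is a hypothetical fixed-value implementation for the example.
type staticGates struct{ honorWFFC bool }

func (g staticGates) HonorWaitForFirstConsumerEnabled() (bool, error) {
	return g.honorWFFC, nil
}

// shouldReconcilePVC decides whether a controller should act on a PVC.
// With the gate open, an unbound PVC on WaitForFirstConsumer storage is
// skipped so that the VM pod, not a CDI pod, becomes the first consumer.
func shouldReconcilePVC(bound, waitForFirstConsumer bool, gates FeatureGates) (bool, error) {
	honor, err := gates.HonorWaitForFirstConsumerEnabled()
	if err != nil {
		return false, err
	}
	if honor && waitForFirstConsumer && !bound {
		return false, nil
	}
	return true, nil
}

func main() {
	fmt.Println(shouldReconcilePVC(false, true, staticGates{honorWFFC: true}))  // skipped
	fmt.Println(shouldReconcilePVC(false, true, staticGates{honorWFFC: false})) // reconciled
	fmt.Println(shouldReconcilePVC(true, true, staticGates{honorWFFC: true}))   // reconciled
}
```

Keeping the gate lookup inside the function means callers in the import, clone, and upload controllers do not each need to query the gate themselves.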
* Update matcher
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* move upload.cdi.kubevirt.io API group to v1beta1
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* move core api to v1beta1
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix os-3.11 cluster sync and add functional tests for alpha api
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* change more occurrences of v1alpha1
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* updates after rebase
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* don't create snapshot or clone pods if pvcs in use
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* cleanup pods during functional tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* kill more pods blocking clone tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix typos
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>