This commit fixes a race condition in CDI populators so we attempt to get PVC' even if the population source doesn't exist.
This new behavior allows us to correctly delete the PVC' once the population has succeeded, even if the population source doesn't exist anymore.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Allow ImmediateBind annotation when using populators
In case of using PVC with populators if the PVC has this
annotation we prevent from waiting for it to be schedueled
and we proceed with the process.
When using datavolumes with populators in case the dv has the annotation
it will be passed to the PVC. we prevent from being in pendingPopulation
in case the created pvc has the annotaion.
Plus when having honorWaitForFirstConsumer feature gate disabled we will
put on the target PVC the immediateBind annotation.
Now we allow to use populators when having the annotation the the
feature gate disabled.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add functional tests to population using PVCs
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Support immediate binding with clone datavolume
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Pass allowed annotations from target pvc to pvc prime
This annotations are used for the import/upload/clone
pods to define netork configurations.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
---------
Signed-off-by: Shelly Kagan <skagan@redhat.com>
Co-authored-by: Shelly Kagan <skagan@redhat.com>
* Enable empty schedule in DataImportCron (#2711)
Allow disabling DataImportCron schedule and support external trigger
Signed-off-by: Ido Aharon <iaharon@redhat.com>
* expand upon #2721 (#2731)
Need to replace requeue bool with requeue duration
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Add clone from snapshot functionalities to clone-populator (#2724)
* Add clone from snapshot functionalities to the clone populator
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update clone populator unit tests to cover clone from snapshot capabilities
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Fix storage class assignation in temp-source claim for host-assisted clone from snapshot
This commit also includes other minor and styling-related fixes
Signed-off-by: Alvaro Romero <alromero@redhat.com>
---------
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Prepare CDI testing for the upcoming non-CSI lane (#2730)
* Update functional tests to skip incompatible default storage classes
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Enable the use of non-csi HPP in testing lanes
This commit modifies several scripts to allow the usage of classic HPP as the default SC in tests.
This allows us to test our non-populator flow with a non-csi provisioner.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
---------
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Allow snapshots as format for DataImportCron created sources (#2700)
* StorageProfile API for declaring format of resulting cron disk images
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Integrate recommended format in dataimportcron controller
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Take snapclass existence into consideration when populating cloneStrategy and sourceFormat
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
---------
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Remove leader election test (#2745)
Now that we are using the standard k8s leases from
the controller runtime library, there is no need to
test our implementation as it is no longer in use.
This will save some testing time and random failures.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Integration of Data volume using CDI populators (#2722)
* move cleanup out of dv deletion
It seemed off to call cleanup in the prepare function
just because we don't call cleanup unless the dv is deleting.
Instead we check in the clenup function itself if it should be
done: in this 2 specific cases in case of deletion and in case
the dv succeeded.
The cleanup will be used in future commit also for population cleanup
which we also want to happen not only on deletion.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Use populator if csi storage class exists
Add new datavolume phase PendingPopulation to
indicate wffc when using populators, this new
phase will be used in kubevirt in order to know
that there is no need for dummy pod to pass wffc phase
and that the population will occur once creating the vm.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Update population targetPVC with pvc prime annotations
The annotations will be used to update dv that uses the
populators.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Adjust UT with new behavior
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* updates after review
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Fix import populator report progress
The import pod should be taken from pvcprime
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Prevent requeue upload dv when failing to find progress report pod
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Remove size inflation in populators
The populators are handling existing PVCs.
The PVC already has a defined requested size,
inflating the PVC' with fsoverhead will only be
on the PVC' spec and will not reflect on the target
PVC, this seems undesired.
Instead if the populators is using by PVC that the
datavolume controller created the inflation will happen
there if needed.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Adjust functional tests to handle dvs using populators
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Fix clone test
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* add shouldUpdateProgress variable to know if need to update progress
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Change update of annotation from denied list to allowed list
Instead if checking if the annotation on pvcPrime is not desired
go over desired list and if the annotation exists add it.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* fix removing annotations from pv when rebinding
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* More fixes and UT
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* a bit more updates and UTs
Signed-off-by: Shelly Kagan <skagan@redhat.com>
---------
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Run bazelisk run //robots/cmd/uploader:uploader -- -workspace /home/prow/go/src/github.com/kubevirt/project-infra/../containerized-data-importer/WORKSPACE -dry-run=false (#2751)
Signed-off-by: kubevirt-bot <kubevirtbot@redhat.com>
* Allow dynamic linked build for non bazel build (#2753)
The current script always passes the static ldflag to the
compiler which will result in a static binary. We would like
to be able to build dynamic libraries instead.
cdi-containerimage-server has to be static because we
are copying it into the context of a container disk container
which is most likely based on a scratch container and has no
libraries for us to use.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Disable DV GC by default (#2754)
* Disable DV GC by default
DataVolume garbage collection is a nice feature, but unfortunately it
violates fundamental principle of Kubernetes. CR should not be
auto-deleted when it completes its role (Job with TTLSecondsAfter-
Finished is an exception), and once CR was created we can assume it is
there until explicitly deleted. In addition, CR should keep idempotency,
so the same CR manifest can be applied multiple times, as long as it is
a valid update (e.g. DataVolume validation webhook does not allow
updating the spec).
When GC is enabled, some systems (e.g GitOps / ArgoCD) may require a
workaround (DV annotation deleteAfterCompletion = "false") to prevent
GC and function correctly.
On the next kubevirt-bot Bump kubevirtci PR (with bump-cdi), it will
fail on all kubevirtci lanes with tests referring DVs, as the tests
IsDataVolumeGC() looks at CDIConfig Spec.DataVolumeTTLSeconds and
assumes default is enabled. This should be fixed there.
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Fix test waiting for PVC deletion with UID
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Fix clone test assuming DV was GCed
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Fix DIC controller DV/PVC deletion when snapshot is ready
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
---------
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
---------
Signed-off-by: Ido Aharon <iaharon@redhat.com>
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
Signed-off-by: Alvaro Romero <alromero@redhat.com>
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
Signed-off-by: Alexander Wels <awels@redhat.com>
Signed-off-by: Shelly Kagan <skagan@redhat.com>
Signed-off-by: kubevirt-bot <kubevirtbot@redhat.com>
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
Co-authored-by: Ido Aharon <iaharon@redhat.com>
Co-authored-by: Michael Henriksen <mhenriks@redhat.com>
Co-authored-by: alromeros <alromero@redhat.com>
Co-authored-by: akalenyu <akalenyu@redhat.com>
Co-authored-by: Shelly Kagan <skagan@redhat.com>
Co-authored-by: kubevirt-bot <kubevirtbot@redhat.com>
Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
* Create CRD for volumeuploadsource populator
This CRD will be used in the DataSourceRef on PVCs
to trigger population that upload to the volume.
This will be performed by the upload populator
that will be added in future commits.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Create upload populator controller
The upload populator controller can be used
standalone without the need of datavolume.
It reconciles pvc with upload dataSourceRef
and uses populators API to populated the pvc
with an upload command.
The controller creates pvc' with upload
annotation. After the upload completes it
rebinds the pv to the original target pvc and
deletes pvc prime.
Eventually we get a bound PVC which is already
populated.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Adjust upload-proxy to handle upload population
In case of pvc with datasourceref to upload population
we should create the url to the upload server with the
pvc' name.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add tests for upload population
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add unit tests for upload populator
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add preallocation to volumeuploadsource crd
Also some other small fixes
Signed-off-by: Shelly Kagan <skagan@redhat.com>
---------
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Ensure anyvolumedatasource pvc is gone
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Skip tls config upload test when openshift route is used
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
---------
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
- Split the huge DV controller into smaller op-specific DV controllers -
import, clone, upload
- Add common watch-adding function so each controller watches only its
relevant DVs
- Refactor the common Reconcile() to use interface DataVolumeReconciler
implemented by each controller
- Move all functions, structs, consts to the relevant controller
- Split the utests per controller
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
On the next kubevirt-bot Bump kubevirtci PR (with bump-cdi), it will
fail on all kubevirtci lanes with tests referring DVs, as the tests
IsDataVolumeGC() looks at CDIConfig Spec.DataVolumeTTLSeconds and
assumes default is disabled. This should be fixed there.
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Remove unneeded WaitForPVC
This is redundant after calling ForceBindPvcIfDvIsWaitForFirstConsumer.
Done in cases where the return value of WaitForPVC was not used.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use the framework CreatePVCFromDefinition
That requires less argument-passing
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use less code to get bound PVCs
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Don't return error on CreateBoundPVCFromDefinition
It's always nil as the function expects success.
Allows us to delete some error-checking paths.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
Signed-off-by: Maya Rashish <mrashish@redhat.com>
Added a step to the crypto policy tests that ensures we can talk to the server
when TLS crypto config allows it, instead of just failing when it doesn't.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Improve the error handling when pod creation fails
When pod creation fails, the error is usually logged without providing additional information to the user. This behavior is especially risky when the user lacks the permits to check the logs, making it unintuitive and almost impossible to find the source of the problem.
This commit improves the error handling of the pod-creation process, so pertinent info about the failure is included in the pod's PVC.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update functional tests to check for events when pod-creation fails
Since error handling in pod-creation has been improved in our controllers, this commit introduces several changes in the corresponding functional tests to properly cover the new behavior included when pod-creation fails.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit-tests after improving error-handling of pods for proper coverage
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes and improvements on error handling for pods
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume-controller to change the running condition of datavolumes when a pod fails
Until this commit, the way of handling pod errors in the datavolume-controller has been to change the affected datavolume's phase to failed, which conflicts with the declarative approach of the controllers.
This commit modifies this behavior so that, when a pod fails, the affected datavolume's running condition is changed to false while the phase remains unchanged.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add TLS Security Profile API
TLSSecurityProfile is used by operators to apply cluster-wide TLS security settings to operands.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Update apiserver & uploadproxy server TLS config on CDIConfig TLS knob change
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Propagate TLS config to uploadserver as well
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add functests for apiserver and upload that ensure value is respected
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Remove unused function PanicOnError
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Remove logic for skipping a test on old kubectl
1.14.0 is pretty old, assume a new enough kubectl is used now.
Simplify utils.go which has an elaborate function for detecting
the version used.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Make functions private if they're only used in one file
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Move prometheus functions to tests/framework/prometheus.go
Split utils.go into meaningful filenames.
Framework was chosen because it uses framework things, and that
can't be done from utils without circular imports.
Make functions part of the framework struct wherever possible
to avoid needing to pass extra arguments.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Move {Create,Run}KubectlCommand to framework
Allows moving more things away from tests/utils.go towards the goal
of eliminating it.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Move CreateVddkWarmImportDataVolume into own file in framework
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Move node placement utils to a separate file in framework package
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Move WaitForConditions and associated functions to util
Pass it a ClientsIface instead of pulling in framework.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Move pod log functions to framework/pod.go, delete tests/utils.go
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Fix typo
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use CombinedOutput - shows stderr in the output unconditionally
Simplifies the code.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Garbage Collect Completed DVs
See design at:
https://github.com/kubevirt/community/blob/main/design-proposals/garbage-collect-completed-dvs.md
ToDos:
-DataImportCron and DataSource controllers adaptation and func tests
-Add doc for DataVolume, CDIConfig and DataImportCron changes
-Extend unit tests and functional tests
-KubeVirt adaptation
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Controller minor fixes
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Adapt tests to GC
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Add DV mutate unit test for GC
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Improve GC skip per annotation test
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Use DescribeTable for the GC tests
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Create a test for an overhead bug
This image size and filesystem overhead combination was experimentally determined
to reproduce bz#2064936 in CI when using ceph/rbd with a Filesystem mode PV since
the filesystem capacity will be constrained by the PVC request size.
Below is the problem it tries to recreate:
When validating whether an image will fit into a PV we compare the
image's virtual size to the filesystem's reported available space to
guage whether it will fit. The current calculation reduces the apparent
available space by the configured filesystem overhead value but the
overhead is already (mostly) factored into the result of Statfs. This
causes the check to fail for PVCs that are just large enough to
accommodate an image plus overhead (ie. when using the DataVolume
Storage API with filesystem PVs with capacity constrained by the PVC
storage request size).
This was not caught in testing because HPP does not have capacity
constrained PVs and we are typically testing block volumes in the ceph
lanes. It can be triggered in our CI by allocating a Filesystem PV on
ceph-rbd storage because these volumes are capacity constrained and
subject to filesystem overhead.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Fix a target pvc validation bug
Corrects the validation logic for target volume.
Below description of the original problem:
When validating whether an image will fit into a PV we compare the
image's virtual size to the filesystem's reported available space to
guage whether it will fit. The current calculation reduces the apparent
available space by the configured filesystem overhead value but the
overhead is already (mostly) factored into the result of Statfs. This
causes the check to fail for PVCs that are just large enough to
accommodate an image plus overhead (ie. when using the DataVolume
Storage API with filesystem PVs with capacity constrained by the PVC
storage request size).
This was not caught in testing because HPP does not have capacity
constrained PVs and we are typically testing block volumes in the ceph
lanes. It can be triggered in our CI by allocating a Filesystem PV on
ceph-rbd storage because these volumes are capacity constrained and
subject to filesystem overhead.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Improve the warning message
Removed redundant and misleading part about pvc size and update the simplification
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Remove useless test
The test checks that the validation logic takes fs Overhead into account.
New validation logic does not check fs overhead. So test is no longer
relevant.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* check uploadserver using certs in secrets
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* review comments
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
We consistently fail on the check that `info.VirtualSize >= info.ActualSize`
on a real external nfs sever with different block size to what we're using upstream.
This happens because qemu-img info gives us the size on disk in ActualSize.
For the purpose of sparseness, I believe we are interested in the content size and not the padded size on disk.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
Ignore the error, it might be because the other end closed the
connection because it was done. We verify the image uploaded
by md5 anyway.
Signed-off-by: Alexander Wels <awels@redhat.com>
So the upload func being called in the eventually loop means we're going to be running
this:
https://github.com/kubevirt/containerized-data-importer/blob/main/tests/upload_test.go#L415-L428
a bunch of times, which leads to sometimes failing with
```bash
<*errors.errorString | 0xc000200140>: {
s: "io: read/write on closed pipe",
}
io: read/write on closed pipe
occurred
/root/go/src/kubevirt.io/containerized-data-importer/tests/upload_test.go:424
```
Slightly incrementing the interval in which we call the upload func seems to mitigate this flake.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
- Upload tests:
We only care that the operation will succeed eventually, were fine with one call failing.
- Annotation test:
```bash
Operation cannot be fulfilled on pods "cdi-upload-test-dv": the object has been modified; please apply your changes to the latest version and try again
```
Loop over the .Update call here as well, as were fine with it succeeding at some point.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add support for archive upload
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* fix golang errors
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Change storage profile property set to support more then one set
So far CDI supported only 1 claim propery set. We want to be able
to support more then one so in case the user provides to the
DV storage volumeMode without accessMode or vice versa cdi
will be able to fit to it the most appropriate match.
Added to rook ceph block a second default of filesystem
volume mode with RWO access mode, it will support archive
upload which has default of filesystem mode.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* CR fix - change to one endpoint for the user
upload proxy will identify if the upload is archive
or not by looking at the content type annotation on
the pvc. If the content type is archive it will route
the uplaod to upload server to a new archive upload uri.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add storage profile and data volume controllers unit tests
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* CR fixes
* add default volume mode to archive content type
* upload server use data processor for archive upload
* tests for volume mode with archive content type
* tests for archive upload of compressed tar
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Adjust imports acording to new apis dir
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* CR small fixes
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* move apis to new staging area
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* add script to push to staging
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix lint check and api reference
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* push staging to api repo
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Bump kubevirtci
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add k8s-1.21, k8s-1.22 cluster-sync goo
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Update comments to match reality - we're making local default here
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Test upgrade from newer versions of CDI
You can't install too old CDI on k8s-1.22, as we used APIs removed
in this version of kubernetes.
Try upgrading only from newer versions
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use a path under /dev for attaching a PVC
After kubernetes 1.20 it seems like / is mounted nodev, so we can't
have "devicePath: /pvc" any more.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Update default storage to use CSI hpp driver
Update the default storage for hpp lanes to be the csi driver
instead of the legacy driver.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Add check for csi driver capabilities in particular fsGroup.
Skip check for fsGroup if csi driver reports it cannot accomidate.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Addressed review comment.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Correct the fsOverhead calculation in profile
Calculation needs play well with the actual resize that is done in data-processor
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Properly reverse the calculation for overhead.
Signed-off-by: Alexander Wels <awels@redhat.com>
Co-authored-by: Alexander Wels <awels@redhat.com>
* Fix flake causing test and move to destructive lane
This test deletes the CDIConfig and waits for it to recreate with a different UID.
However, this condition alone will not suffice as the CDIConfig status might still be empty for a few seconds,
causing havoc in following tests.
Will now wait for .status to not be empty, and marking test as destructive.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Fix second flake
Test waits for *any* error after making the HTTP request for upload from the host, which doesn't cut it;
That could be an EOF (witnessed), port-forward, etc -> while the only error that would guarantee the upload process
actually got triggered is `Unexpected return value 500`.
Without seeing this error we can't continue testing the scenario (can't be sure any upload logic even triggered).
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Create new Storage type
A new Storage type similar to the PVC Spec is now available to use
in the DataVolume Spec. This is more permissive than PVC, and together
with StorageProfile this allows CDI to apply additional logic for
missing or optional fields.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Use the StorageProfile
Handle the StorageProfile recommended params when creating the PVC for
a DataVolume. When parameters like volumeMode or accessModes are
not provided, CDI checks the StorageProfile for a given StorageClass
to set the recommended defaults. This enables user to create DataVolume
without the need to provide all the parameters.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Allow multiple accessModes
CDI allows multiple access modes to be specified in the DataVolume.spec.storage and in the StorageProfile. This now works the same way as in PVC specification.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Handle the storage.size field
The storage.size specifies how much space a user wants to have.
When creating image on the fileSystem storage CDI takes into
account the file system overhead and requests PVC big enough to
fit an image and file system metadata.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Test storage profile with DV
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Document Storage Profiles
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Refactor: Render the effective PVC early
The helper 'render PVC' was moved earlier in the control flow, so
it can be used in more places. Removing the need for if/else logic.
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Test handling size on import, upload and clone
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Code Review: Refactor resolving of volumeMode
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* Fix: render target pvc spec correctly in smart clone controller
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
* When qemu image is resized, it needs to be preallocated to the requested
size. In-place preallocation (using convert function of qemu-img)
cleans the data.
With this PR preallocation is applied simply when the image is resized.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Functional tests for preallocation verify content (MD5).
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
This commit adds a test that verifies, that filesystem overhead is taken
into account when creating a blank image (with preallocation).
Also, some test cleanup.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Validate images fit on a filesystem in more cases.
Background:
When the backing store is a filesystem, we store the images
as sparse files. So the file may eventually grow to be bigger
than the available storage. This will cause unfortunate
failures down the line.
Prior to this commit, we validated the size:
- In case the backing store implicitly did it for us (block volumes)
- On async upload
- When resizing (by the operation failing if the image cannot fit
in the available space).
The Resize phase is encountered quite commonly:
Transfer->Convert->Resize
TransferFile->Resize
Adding validation here for the non-resize case covers almost all
the cases.
The only exceptions that aren't validated now are:
- DataVolumeArchive via the HTTP datasource
- VDDK
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* When resizing, take into account filesystem overhead.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add testing for too large upload/import
- Import/sync upload of too large physical size image (raw.xz, qcow2)
- Import/sync upload of too large virtual size image (raw.xz, qcow2)
- Import of a too large raw image file, if filesystem overhead is
taken into account
- Async upload of too large physical size qcow2.
The async upload cases do not mirror the sync upload ones because if
a block device is used as scratch space, it will hit a size limit
before the validation pause, and fail differently.
This scenario is identical to the sync upload case which was added.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Refactor code in a way that requires less comments to explain.
We can just validate that the requested image size will fit in the
available space, and not rely on the fact we typically resize the
images to the full size.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* When calculating usable space, round down to a multiple of 512.
Our validation is indirectly:
image after resize to usable space <= usable space
For this to pass, we need to ensure that qemu-img's rounding
up to 512 doesn't change the size.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Adjust qemu-img to the ones emitted by nbdkit:
- In some cases, we clearly don't rely on the qemu-img error,
so don't check for it.
- In one case, switch it to looking for the nbdkit equivalent
error message.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* [WIP] doc: User-facing doc for preallocation support
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* apis: CDI accepts `preallocation` option.
With this commit CDI accepts (but does handle) `preallocation` settings
for DataVolumes and in CDIConfig.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Implementing preallocation
This commit implements preallocation support for import and upload.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Functional tests for preallocation support
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Remove "preallocation for StorageClasses" config
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Removed unused function
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Updated dependencies
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Uss PVC annotation to pass preallocation parameters
DataVolume controller now uses a PVC annotation to pass preallocation
configuration to import and update controllers.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Implementing preallocation
This commit implements preallocation support for import and upload.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Make sure all import/upload paths do preallocation
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
- D/S the importer pod has verbosity=1 => the preallocation messages are not present in log
- Due to this, we hit an infinite loop in the preallocation tests
(Import completed without preallocation message => pod doesn't exist anymore but sampling continues)
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* [WIP] doc: User-facing doc for preallocation support
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* apis: CDI accepts `preallocation` option.
With this commit CDI accepts (but does handle) `preallocation` settings
for DataVolumes and in CDIConfig.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Implementing preallocation
This commit implements preallocation support for import and upload.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Functional tests for preallocation support
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Remove "preallocation for StorageClasses" config
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Removed unused function
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Fix rook-ceph test failures
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Updated dependencies
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Uss PVC annotation to pass preallocation parameters
DataVolume controller now uses a PVC annotation to pass preallocation
configuration to import and update controllers.
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Add CDIConfigSpec to CDI
Make CDIConfig singleton mirror data in "active" CDI
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix functional tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Wait for the CDIConfig Status to be updated, too
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add test IDs and a couple of positive fs overhead test cases
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* When validating disk space, reserve space for filesystem overhead
The amount of available space in a filesystem is not exactly
the advertise amount. Things like indirect blocks or metadata
may use up some of this space. Reserving it to avoid reaching
full capacity by default.
This value is configurable from the CDIConfig object spec,
both globally and per-storageclass.
The default value is 0.055, or "5.5% of the space is
reserved". This value was chosen because some filesystems
reserve 5% of the space as overhead for the root user and
this space doubles as reservation for the worst case
behaviour for unclear space usage. I've chosen a value
that is slightly higher.
This validation is only necessary because we use sparse
images instead of fallocated ones, which was done to have
reasonable alerts regarding space usage from various
storage providers.
---
Update CDIConfig filesystemOverhead status, validate, and
pass the final value to importer/upload pods.
Only the status values controlled by the config controller
are used, and it's filled out for all available storage
classes in the cluster.
Use this value in Validate calls to ensure that some of the
space is reserved for the filesystem overhead to guard from
accidents.
Caveats:
Doesn't use Default: to define the default of 0.055, instead
it is hard-coded in reconcile. It seems like we can't use a
default value.
Validates the per-storageClass values in reconcile, and
doesn't reject bad values.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use util GetStorageClassByName
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Test filesystem overhead validation against async upload endpoint
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* wait for NFS PVs to be deleted before continuing
Intended to help with flakes, but didn't make a difference.
Probably still worth doing.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Avoid using the uncached client unnecessarily
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add error handling for the case where even a default SC is not found
Note that this change isn't expected to make a difference, as we
check if the targetStorageClass is nil later on and have the same
behaviour, but this is probably more correct API usage.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add testing for the validation of filesystem overhead values
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Fix logical error in waiting for NFS PVs.
Wait for all of them, not just the last one.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* update k8s deps to 1.18.6 and controller runtime to 0.6.2
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* remove building code generators from docker image. This way the k8s ligray version only has to be updated in go.mod
Do more stuff in the bazel container. Faster and better interop
Fix unit tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* make format
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* remove unnecessary rsync
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* redo code generator dep management
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* builder uses go modules
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Add support for external openshift cluster, in particular CRC.
make cluster-sync, and functional tests should all work.
Added documentation on how to enable CRC to work.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Updates based on review.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Changes based on review comments:
- removed registry in favor of making people use an external registry.
- added ceph for external provider.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Fix review comments
Signed-off-by: Alexander Wels <awels@redhat.com>