* Add annotation for extra VDDK library arguments.
The VDDK library itself accepts infrequently-used arguments in a
configuration file, and some of these arguments have been tested to show
a significant transfer speedup in some environments. This adds an
annotation that references a ConfigMap holding the contents of this VDDK
configuration file, and mounts it to the importer pod. The first file in
the mounted directory is passed to the VDDK.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add functional test for VDDK args annotation.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add unit test for extra VDDK arguments annotation.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add documentation for extra VDDK arguments.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Simplify new functional test annotation creation.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Look for specific file instead of first file.
Instead of listing the mounted VDDK arguments directory and filtering
out hidden files, just hard-code the expected file name and ConfigMap
key.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Move extra VDDK arguments functional test.
Put this in import_test and assert the values there, instead of in the
VDDK test plugin. The VDDK plugin logs the given values, and then the
test scans the log for what it expects to see.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Clean up lint error.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Move VDDK configuration test back, change test ID.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Avoid using kubectl for scanning nbdkit logs.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Temporary: show whole nbdkit log after failure.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Revert "Temporary: show whole nbdkit log after failure."
This reverts commit 488849f8fd.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Copy extra VDDK args annotation for populators.
Also add a related unit test.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Correct VDDK args config map mount comment.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
---------
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Ensure scratch space is large enough to hold disk image
It is possible the source image is fully allocated
and in that case the scratch space would not be large
enough to hold the disk image. This takes fsOverhead
into account when calculating the scratch space size.
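A minimal sketch of the sizing rule described above, assuming the overhead is expressed as a fraction of the volume that the filesystem reserves. The helper name `requiredScratchSpace` and the 5.5% figure in `main` are illustrative assumptions, not taken from this change.

```go
package main

import (
	"fmt"
	"math"
)

// requiredScratchSpace returns how many bytes a filesystem volume must offer
// so that imageSize bytes stay usable after the fsOverhead fraction
// (0 <= fsOverhead < 1) is reserved. Hypothetical helper for illustration.
func requiredScratchSpace(imageSize int64, fsOverhead float64) int64 {
	return int64(math.Ceil(float64(imageSize) / (1 - fsOverhead)))
}

func main() {
	const gib = int64(1) << 30
	// A fully allocated 10 GiB image with an assumed 5.5% overhead needs
	// more than 10 GiB of scratch space.
	fmt.Println(requiredScratchSpace(10*gib, 0.055))
}
```

Dividing by (1 - overhead) rather than multiplying by (1 + overhead) keeps the usable part of the volume at least as large as the image.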
Signed-off-by: Alexander Wels <awels@redhat.com>
* Fix bug when copying the size of the pvc for the scratchspace
This was not copying but re-using the same struct. This caused the
size of the regular PVC to be overwritten by the scratchspace PVC
in certain conditions.
Use storage stanza instead of pvc stanza in the cloner tests.
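The aliasing problem described above can be illustrated with a small sketch. It assumes the size lives in a map-like resource list (a stand-in type, `resourceList`, is used here instead of the real Kubernetes types), so a plain assignment shares the data instead of copying it.

```go
package main

import "fmt"

// resourceList is a stand-in for a Kubernetes-style resource map
// (resource name -> size in bytes). It is an assumption for this sketch.
type resourceList map[string]int64

// copyResources returns an independent copy, so later writes to the copy
// cannot change the original.
func copyResources(in resourceList) resourceList {
	out := make(resourceList, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

func main() {
	target := resourceList{"storage": 1 << 30}

	// Buggy pattern: both names refer to the same underlying map.
	aliased := target
	aliased["storage"] = 2 << 30
	fmt.Println("target overwritten:", target["storage"] == 2<<30)

	// Fixed pattern: the scratch space gets its own copy.
	target["storage"] = 1 << 30
	scratch := copyResources(target)
	scratch["storage"] = 2 << 30
	fmt.Println("target preserved:", target["storage"] == 1<<30)
}
```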
Signed-off-by: Alexander Wels <awels@redhat.com>
---------
Signed-off-by: Alexander Wels <awels@redhat.com>
* make deps-update
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* ReourceRequirements -> VolumeResourceRequirements
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix calls to controller.Watch()
controller-runtime changed the API!
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Fix errors with actual openshift/library-go lib
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* make all works now and everything compiles
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix "make update-codegen" because generate_groups.sh deprecated
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* run "make generate"
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix transfer unittest because of change to controller-runtime
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
---------
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Enable revive linter
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Simplify cdi-func-test-proxy
This function had quite a bit of redundant code (caught by the linter).
The workgroup was never Done because all exit paths went through a
log.Fatal.
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Fix 'revive' linter warnings
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Fix tests that asserted on modified error messages
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Run make format
The formatted code has nothing to do with this PR but we may as well
include it.
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Use lower-case variables and use built-in min function in vddk-datasource
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Use contexts in cdi-func-test-proxy
This added quite a bit of boilerplate per call, so I put everything in
a loop.
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
---------
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Use direct io with qemu-img convert if target supports it
For a while now we have been switching between cache=none (direct io) and cache=writeback (page cache)
for qemu-img's writes.
We have used cache=writeback for quite some time, since https://github.com/kubevirt/containerized-data-importer/pull/1919.
However, this has proven to be problematic:
Our pods live in a constrained memory environment (default limit 600M).
cgroupsv1 compares utilization of page cache vs the host's dirty_ratio.
This means that on a standard system (30% dirty ratio) pages only get forced to disk at 0.3 * HOST_MEM (basically never),
easily triggering OOM on hosts with lots of free memory.
cgroupsv2 does come to the rescue here:
- It considers dirty_ratio against CGROUP_MEM
- Has a new memory.high knob that throttles instead of OOM killing
Sadly, k8s is yet to capitalize on memory.high since this feature is still alpha:
https://kubernetes.io/blog/2023/05/05/qos-memory-resources/
Leaving us with no way to avoid frequent OOMs.
This commit changes the way we write to bypass page cache if the target supports it,
otherwise, fall back to cache=writeback (use page cache).
There have previously been issues where the target did not support O_DIRECT. A quick example is tmpfs (RAM-based).
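A minimal sketch of the fallback described above. It assumes the caller has already probed whether the target supports O_DIRECT and passes the result in as a boolean. The helper names and paths are hypothetical, and the argument list shows only the `-t` cache option of `qemu-img convert`, not the full command line the importer uses.

```go
package main

import "fmt"

// cacheMode picks the qemu-img cache mode for the destination: "none"
// (direct I/O, bypassing the page cache) when the target supports O_DIRECT,
// "writeback" (page cache) otherwise.
func cacheMode(supportsDirectIO bool) string {
	if supportsDirectIO {
		return "none"
	}
	return "writeback"
}

// convertArgs builds an illustrative qemu-img argument list. Hypothetical
// helper; the real importer passes more options.
func convertArgs(src, dest string, supportsDirectIO bool) []string {
	return []string{"convert", "-t", cacheMode(supportsDirectIO), "-O", "raw", src, dest}
}

func main() {
	// Target that supports O_DIRECT.
	fmt.Println(convertArgs("/scratch/disk.qcow2", "/data/disk.img", true))
	// Target that does not, for example a tmpfs mount.
	fmt.Println(convertArgs("/scratch/disk.qcow2", "/dev/shm/disk.img", false))
}
```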
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Capitalize on cache mode=trynone if importer is being OOMKilled
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
---------
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Enable gofmt linter
From the docs:
> Gofmt checks whether code was gofmt-ed. By default this tool runs with
> -s option to check for code simplification.
https://golangci-lint.run/usage/linters/#gofmt
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Run gofmt on the project
Ran this command after adding the gofmt linter:
golangci-lint run ./... --fix
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Enable whitespace linter
From the docs:
> Whitespace is a linter that checks for unnecessary newlines at the
> start and end of functions, if, for, etc.
https://golangci-lint.run/usage/linters/#whitespace
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Run whitespace on the project
Ran this command after adding the whitespace linter:
golangci-lint run ./... --fix
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Enable GCI linter
Per the docs:
> Gci controls Go package import order and makes it always deterministic.
https://golangci-lint.run/usage/linters/#gci
NOTE: I noticed that many files separate their imports in a particular
way, so I set the linter to enforce this standard.
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Run GCI on the project
Ran this command after adding the GCI linter:
golangci-lint run ./... --fix
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
---------
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Enable unconvert linter
This linter's doc describes it as:
The unconvert program analyzes Go packages to identify unnecessary
type conversions; i.e., expressions T(x) where x already has type T.
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Unrestrict the number of linter warnings
It is better to show all warnings at once than to reveal them piecemeal,
particularly in CI where the feedback loop can be a bit slow.
By default, linters may only print the same message three times
(https://golangci-lint.run/usage/configuration/#issues-configuration).
The unconvert linter always prints the same message, so it is especially
affected by this setting.
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
* Remove redundant type conversions
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
---------
Signed-off-by: Edu Gómez Escandell <egomez@redhat.com>
Fix the Resources configuration
not being applied to the init container
Signed-off-by: lion <mzzgaopeng@gmail.com>
Co-authored-by: gaopeng <mzzgaopeng@gmailc.om>
Instead of changing the labels map that is passed in as parameter to the
setLabelsFromTerminationMessage function, it now returns a modified copy
of the passed in labels map and is renamed to
addLabelsFromTerminationMessage. If the passed in map is nil a new map
will be allocated and initialized.
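The copy-instead-of-mutate behavior described above could look roughly like this. The function name `addLabels` and its two-map signature are assumptions made for the sketch; the real function reads the new labels from the termination message.

```go
package main

import "fmt"

// addLabels returns a new map holding everything in labels plus everything
// in extra. The input maps are never modified, and a nil labels map simply
// yields a freshly allocated result. Hypothetical signature.
func addLabels(labels, extra map[string]string) map[string]string {
	out := make(map[string]string, len(labels)+len(extra))
	for k, v := range labels {
		out[k] = v
	}
	for k, v := range extra {
		out[k] = v
	}
	return out
}

func main() {
	existing := map[string]string{"app": "cdi"}
	merged := addLabels(existing, map[string]string{"test.kubevirt.io/test": "testvalue"})
	fmt.Println(len(existing), len(merged))

	// A nil input is handled by allocating a new map.
	fmt.Println(addLabels(nil, map[string]string{"k": "v"}))
}
```

Returning a copy means callers that still hold the original map never observe labels they did not add themselves.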
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* feat(cdi-containerimage-server): Add info endpoint
The info endpoint returns a ServerInfo object containing all
environment variables of the server serialized to json. This allows the
extraction of env vars from a containerdisk when using pullMethod node.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* feat(importer): Add conversion of env vars to label
This adds the conversion of env vars containing KUBEVIRT_IO_ to a label
key/value pair.
Example: TEST_KUBEVIRT_IO_TEST=testvalue becomes test.kubevirt.io/test:
testvalue.
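A minimal sketch of a conversion that reproduces the example above. Only the documented case is certain: how underscores in other positions are treated is an assumption here (prefix underscores become dots, the suffix is only lower-cased), and the function name `envToLabel` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// marker is the substring an env var name must contain to be converted.
const marker = "KUBEVIRT_IO_"

// envToLabel converts one NAME=value pair into a label key and value.
// It reports false when the variable does not qualify.
func envToLabel(env string) (key, value string, ok bool) {
	name, value, found := strings.Cut(env, "=")
	if !found || !strings.Contains(name, marker) {
		return "", "", false
	}
	pre, post, _ := strings.Cut(name, marker)
	// Assumption: underscores before the marker map to dots.
	pre = strings.ReplaceAll(strings.ToLower(pre), "_", ".")
	return pre + "kubevirt.io/" + strings.ToLower(post), value, true
}

func main() {
	k, v, ok := envToLabel("TEST_KUBEVIRT_IO_TEST=testvalue")
	fmt.Println(k, v, ok)

	_, _, ok = envToLabel("PATH=/usr/bin")
	fmt.Println(ok)
}
```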
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* feat(importer): Extract labels from registry datasource
This allows the registry-datasource to return a termination message with
labels extracted from the env vars of a source containerdisk when using
pullMethod pod.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* feat(importer): Extract labels from http datasource
This allows the http-datasource to return a termination message with
labels extracted from the env vars of a source containerdisk when using
pullMethod node.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* feat(controller): Set PVC labels from importer termination message
With this change the import-controller is able to set labels on destination
PVCs returned from the importer in its termination message.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* tests: Add tests for conversion of containerdisk env vars to PVC labels
This adds tests for the conversion of containerdisk env vars to PVC
labels for both pullMethods pod and node.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* fix: Fix race in import-populator
By running reconcileTargetPVC of populatorController on every reconcile
cycle, the import-populator controller is able to retry setting labels and
annotations on the target PVC when import-controller modified the target
PVC at the same time.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
---------
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
Make the communication of datasources in the importer explicit by adding
a GetTerminationMessage method to the DataSourceInterface.
Then use this method to communicate additional information to the import
controller once the importer pod has terminated, instead of writing
additional data to the termination message in the Close method of
datasources.
Signed-off-by: Felix Matouschek <fmatouschek@redhat.com>
* Avoid race condition during importer termination by returning 0 exitCode when scratch space is required
The restart policy on failure along with manual pod deletion caused some issues after the importer exited with scratch space needed.
This commit sets the exit code to 0 when exiting for scratch space required so we manually delete the pod and avoid the described race condition.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Adapt functional test to work with faster importer pod recovery
Test [test_id:1990] relied on the assumption that deleting the file from an http server would always cause the DV to restart.
The old scratch space required mechanism always caused restarts on the DV, masking some false positives: This doesn't happen in all cases since the polling from the server can keep retrying without failing if the file is restored fast enough.
This commit adapts the test to work with faster importer recoveries and adds an md5sum check to make sure the import ends up being successful despite removing the file.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
---------
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Bump k8s/OpenShift/ctrl-runtime/lifecycle-sdk & make deps-update
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Operator: adapt for dependency bump
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Controller: adapt watch calls for dependency bump
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Controller: adapt to ctrl-runtime's cache API changes
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Operator: fix unit tests by deleting resources properly in fake client
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Controller: fix unit tests by deleting resources properly in fake client
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Controller: adapt to fake client honoring status subresource
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Fix codegen script & make generate
There are some issues in the new script, so we
will still use the deprecated one.
More context in f4d1a5431b
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Functests: Adapt to NamespacedName now implementing MarshalLog
ns/name -> {"name":"name","namespace":"ns"}
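The two log forms above can be compared with a small stand-in type. This sketch uses `encoding/json` on a local struct and is only an illustration of the shape change that tests need to match. It does not use the real Kubernetes type or logger.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// namespacedName is a local stand-in for the Kubernetes type of the same
// name, defined here only for illustration.
type namespacedName struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// legacy returns the old "ns/name" string form.
func (n namespacedName) legacy() string {
	return n.Namespace + "/" + n.Name
}

// structured returns the object form that log lines now contain.
func (n namespacedName) structured() string {
	b, err := json.Marshal(n)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	n := namespacedName{Name: "name", Namespace: "ns"}
	fmt.Println(n.legacy())
	fmt.Println(n.structured())
}
```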
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Functests & API server: address deprecation of wait.PollImmediate
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
---------
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Default virt storage class
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add alert for multiple default virt storage classes
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Refactor content type funcs to not return strings
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
---------
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
We currently don't support the wffc override for blank block disks,
while there may be some use cases where that is desired.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
This commit ups the CPU request for all our installed components
(cdi-deployment, cdi-apiserver, cdi-uploadproxy, cdi-operator)
from 10m (1% of a core) to 100m (10% of a core).
The main driver of this is BZ: 2216038.
Without this change, it is pretty easy to create a large number of
concurrent clone operations and get token timeout errors.
Upping resource requests and concurrency addresses the issue
in a very direct way.
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
When importing via node container runtime cache, we always have the image handy locally.
This manifests itself in the form of a bug where we loop over
```bash
E0813 13:32:38.443088 1 data-processor.go:251] scratch space required and none found
E0813 13:32:38.443102 1 importer.go:181] scratch space required and none found
```
On registry node pull imports where images are not raw
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Remove older nbdkit
Signed-off-by: Alexander Wels <awels@redhat.com>
* When converting, always use scratch space importing instead
of nbdkit. Once we are able to get nbdkit 1.35.8 or newer
we can revert this change since that will include improvements
to the downloading speed.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Disable metrics test for import because straight import doesn't
return total, and this means the metrics are disabled.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Fix broken functional tests
Signed-off-by: Alexander Wels <awels@redhat.com>
* Address review comments
Signed-off-by: Alexander Wels <awels@redhat.com>
* Additional review comments.
Fixed functional test that was not doing the right
thing while running the test.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Always set preallocation on block devices when directly
writing to the device
Signed-off-by: Alexander Wels <awels@redhat.com>
---------
Signed-off-by: Alexander Wels <awels@redhat.com>
* Update VolumeImportSource API to support multi-stage imports
This commit modifies the VolumeImportSource API to support multi-stage imports, adding the following fields:
- Checkpoints, to represent the stages of a multistage import
- TargetClaim, the name of the specific PVC to be imported
- FinalCheckpoint, to indicate that the current Checkpoint is the final one
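As a rough picture of the three fields listed above, here is a simplified stand-in for the spec. The type names, field types, and the `validate` rule are assumptions for illustration only. The generated API types in the repository are the authoritative definition.

```go
package main

import (
	"errors"
	"fmt"
)

// checkpoint is one stage of a multi-stage import (assumed shape).
type checkpoint struct {
	Previous string
	Current  string
}

// volumeImportSourceSpec is a simplified stand-in, not the real API type.
type volumeImportSourceSpec struct {
	Checkpoints     []checkpoint // stages of the multi-stage import
	TargetClaim     string       // name of the specific PVC to be imported
	FinalCheckpoint bool         // true when the current checkpoint is the last one
}

// validate sketches one plausible rule: a multi-stage import must name the
// PVC it targets.
func (s volumeImportSourceSpec) validate() error {
	if len(s.Checkpoints) > 0 && s.TargetClaim == "" {
		return errors.New("multi-stage import requires a target claim")
	}
	return nil
}

func main() {
	spec := volumeImportSourceSpec{
		Checkpoints:     []checkpoint{{Previous: "", Current: "snapshot-1"}},
		TargetClaim:     "my-pvc",
		FinalCheckpoint: false,
	}
	fmt.Println(spec.validate())
}
```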
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Support multi-stage imports in import-populator
This commit updates the import populator to support multi-stage imports. The API and functionality remain the same as with DataVolumes, with the only difference that the used VolumeImportSource will now require a populated "TargetClaim" field that refers to the specific PVC to be populated.
The DataVolume controller is also updated to allow using the populator flow with VDDK and ImageIO sources.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add unit tests for multistage import support in populators
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add functional tests to test multistage import populator flow
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Fix multi-stage import logic in import-populator and add remaining tests
This commit fixes several bugs in the import-populator logic for multi-stage imports.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
---------
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Allow ImmediateBind annotation when using populators
When a PVC that uses populators has this annotation, we do not
wait for it to be scheduled and proceed with the population.
When using DataVolumes with populators, if the DV has the annotation
it is passed to the PVC. We avoid staying in pendingPopulation
when the created PVC has the annotation.
In addition, when the honorWaitForFirstConsumer feature gate is disabled we
put the immediateBind annotation on the target PVC.
We now allow using populators when the annotation is present or the
feature gate is disabled.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add functional tests to population using PVCs
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Support immediate binding with clone datavolume
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Pass allowed annotations from target pvc to pvc prime
These annotations are used by the import/upload/clone
pods to define network configurations.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
---------
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* touch up zero restoresize snapshot
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* clone populator
only supports PVC source now
snapshot coming soon
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* more unit tests
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* unit test for clone populator
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* func tests for clone populator
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* move clone populator cleanup function to planner
other review comments
verifier pod should mount readonly
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* add readonly flag to test executor pods
synchronize get hash calls
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* increase linter timeout
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* better/explicit readonly support for test pods
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* check pv for driver info before looking up storageclass as it may not exist
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* addressed review comments
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* chooseStrategy should generate more events
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
---------
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Create populators package to be used for all populators
This commit introduces the basic reconciler for
populators with common function that can be used
by the different populators.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* unite getcontenttype func across code
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add VolumeImportSource CRD for import populator
This commit adds the VolumeImportSource CRD into CDI.
CRs created from this CRD will be referenced in the dataSourceRef field to populate PVCs with the import populator.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Refactor common populator code to be shared among all populators
This commit introduces and modifies several functions so we can reuse common code between all populators.
Other than having a common reconcile function, a new populatorController interface has been introduced so we are able to call populator-specific methods from the populator-base reconciler.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Create Import Populator
The import populator is a controller that handles the import of data into PVCs without the need for DataVolumes while still taking advantage of the import-controller flow.
This controller creates an additional PVC (PVC prime, PVC') with import annotations. After the import process succeeds, the controller rebinds the PV to the original target PVC and deletes the PVC prime.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add functional tests to cover the import populator flow
This commit updates the import tests to cover the new import populator flow.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add unit tests for import populator
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes and enhancements in import/common populator code
* Modify indexes and other related code to support namespaced dataSourceRefs. Cross-namespace population is still not supported as it depends on alpha feature gates.
* Add functional test to cover static binding.
* Fix selected node annotation bug in scratch space PVCs
* Fix linter alerts
Signed-off-by: Alvaro Romero <alromero@redhat.com>
---------
Signed-off-by: Shelly Kagan <skagan@redhat.com>
Signed-off-by: Alvaro Romero <alromero@redhat.com>
Co-authored-by: Shelly Kagan <skagan@redhat.com>
* Start adding the golangci-lint to CI
golangci-lint is a collection of many linters. This PR adds
golangci-lint to the CI. For a start, it enables the govet linter and fixes
its single finding.
The PR adds this linter to the `test-lint` Makefile target.
The new .golangci.yml file is the configuration for the linter.
golangci-lint version was set to the latest one - v1.52.2.
It is defined in hack/build/run-linters.sh
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* golangci-lint: enable gosimple and fix findings
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* golangci-lint: enable unused and fix findings
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
---------
Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
* Google Cloud Storage Importer
This is a Google Cloud Storage importer for CDI
Signed-off-by: Marcelo Parisi <marcelo@feitoza.com.br>
* Fix auto-generated swagger and openapi
Signed-off-by: Marcelo Parisi <marcelo@feitoza.com.br>
* GCS Importer General Fixes
Signed-off-by: Marcelo Parisi <marcelo@feitoza.com.br>
* Moving back gcs-secret.txt
Moving file back to imageDir to fix unit testing.
Signed-off-by: Marcelo Parisi <marcelo@feitoza.com.br>
---------
Signed-off-by: Marcelo Parisi <marcelo@feitoza.com.br>
Co-authored-by: Marcelo Parisi <marcelo@dev-box.corp.feitoza.com.br>
* Add support for imagePullSecrets in the CDI CR, to support pulling
images from repositories that require secrets.
The imagePullSecrets is propagated to the following components: cdi-apiserver,
cdi-deployment, and cdi-uploadproxy. The definition of imagePullSecrets in
cdi-operator must be done manually.
Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>
* Modifying code to incorporate review comments.
Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>
---------
Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>
Co-authored-by: Gleb Aronsky <gleb.aronsky@windriver.com>
* Fix hostpath CSI being skipped as "Not HPP"
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Fall back to host assisted if immediate bind requested
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
---------
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add support for volume populators in CDI
This commit enables the use of volume populators in CDI, so datavolume-owned PVCs can be populated using custom logic.
Volume populators are CRDs used to populate volumes externally, independently of CDI. These CRDs can now be specified using the new DataSourceRef API field in the DataVolume spec.
When a DataVolume is created with a populated DataSourceRef field, the datavolume-controller creates the corresponding PVC accordingly but skips all the population-related steps. Once the PVC is bound, the DV phase changes to succeeded.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify CDI test infrastructure to support testing of external populators
This commit introduces several changes to CDI CI to support the testing of DataVolumes with external populators:
* A sample volume populator is now deployed in the test infrastructure, in a similar way as bad-webserver or test-proxy. This populator will be used in functional tests from now on.
* A new test file with external population tests has been introduced in the tests directory
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update dependencies to include lib-volume-populator library
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add functional tests for proper coverage of external population of DataVolumes
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes on external-population logic for DataVolumes:
* Added comments for exported structs
* Removed non-inclusive language
* Improved error messages in webhooks
* Fixed logic on datavolume-controller
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Improve DataVolume external-population logic when using the old 'DataSource' API
This commit introduces several changes into the datavolume external-population controller to improve its behavior when using the DataSource field.
It also introduces minor fixes on the generic populator logic.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add unit tests for external-population controller and DV admission
Signed-off-by: Alvaro Romero <alromero@redhat.com>
- Split the huge DV controller into smaller op-specific DV controllers -
import, clone, upload
- Add common watch-adding function so each controller watches only its
relevant DVs
- Refactor the common Reconcile() to use interface DataVolumeReconciler
implemented by each controller
- Move all functions, structs, consts to the relevant controller
- Split the utests per controller
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* remove root worker pods
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* remove selinux requirement for worker pods
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* run tests in restricted namespace and required changes
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* handle empty tar
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* add PSA label when running functional tests in OpenShift
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* cannot use restricted PSA with istio (for now)
refactor scc management
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* fix clean script
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
* Improve the error handling when pod creation fails
When pod creation fails, the error is usually logged without providing additional information to the user. This behavior is especially risky when the user lacks permission to check the logs, making it unintuitive and almost impossible to find the source of the problem.
This commit improves the error handling of the pod-creation process, so pertinent info about the failure is included in the pod's PVC.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update functional tests to check for events when pod-creation fails
Since error handling in pod-creation has been improved in our controllers, this commit introduces several changes in the corresponding functional tests to properly cover the new behavior included when pod-creation fails.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit-tests after improving error-handling of pods for proper coverage
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes and improvements on error handling for pods
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume-controller to change the running condition of datavolumes when a pod fails
Until this commit, the way of handling pod errors in the datavolume-controller has been to change the affected datavolume's phase to failed, which conflicts with the declarative approach of the controllers.
This commit modifies this behavior so that, when a pod fails, the affected datavolume's running condition is changed to false while the phase remains unchanged.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add TLS Security Profile API
TLSSecurityProfile is used by operators to apply cluster-wide TLS security settings to operands.
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Update apiserver & uploadproxy server TLS config on CDIConfig TLS knob change
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Propagate TLS config to uploadserver as well
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add functests for apiserver and upload that ensure value is respected
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Comply with restricted security context in kubernetes
Ensure CDI pods comply with the restricted security context as much as
possible (have to be root for nbdkit and block devices). Also cannot set
SeccompProfile since SCC won't allow us to set it.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Changed path /var/local/all_certs to stay in /var
Signed-off-by: Alexander Wels <awels@redhat.com>
* Allow empty DV size when cloning using storage API
When cloning a Data Volume, the size of the target can potentially be obtained from the source PVC, which removes the need to specify it explicitly.
Considering that, this commit changes the corresponding validation webhook to allow omitting the resources.request.storage field when cloning a PVC using the storage API.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify datavolume-controller to allow obtaining storage size from source PVC when cloning
When cloning a PVC, if the target's size is not specified, it can be obtained from the source PVC.
This commit changes the datavolume controller so that, when it detects an empty storage size, the value is obtained from the source when performing CSI and Smart cloning.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit tests for datavolume-validation after enabling cloning with empty size
This commit updates the unit testing for the datavolume validation webhook, covering the possibility of cloning a PVC without setting any storage size.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update unit testing for controller-related functions after enabling cloning with empty size
This commit includes unit tests for the volumeSize() function after enabling creating clones with blank size.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Update the datavolume controller to create a size-detection pod when performing host-assisted clone
When performing a host-assisted clone with an empty clone size, simply copying the original PVC size could lead to overhead miscalculations if the source's VolumeMode is "filesystem".
In that case, the datavolume controller creates an inspection pod that extracts the size of the virtual image using qemu-img.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Include an image-size detection tool to allow cloning with empty DV size
This commit introduces a new tool in charge of collecting the virtual image size when cloning with an empty DV size. In cases where that value cannot be obtained from the original PVC's spec, the datavolume controller will create a new pod containing this new tool.
The binary will then run the 'qemu-img' command and handle its results appropriately.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
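The size-detection step described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: it assumes the binary consumes the JSON emitted by `qemu-img info --output=json` and reads its `virtual-size` field. The names `imageInfo` and `parseVirtualSize` are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// imageInfo holds the only field of `qemu-img info --output=json`
// that this sketch needs. The type name is hypothetical.
type imageInfo struct {
	VirtualSize int64 `json:"virtual-size"`
}

// parseVirtualSize extracts the virtual size in bytes from qemu-img
// JSON output and rejects missing or non-positive values.
func parseVirtualSize(output []byte) (int64, error) {
	var info imageInfo
	if err := json.Unmarshal(output, &info); err != nil {
		return 0, fmt.Errorf("parsing qemu-img output: %w", err)
	}
	if info.VirtualSize <= 0 {
		return 0, errors.New("qemu-img output has no positive virtual-size")
	}
	return info.VirtualSize, nil
}

func main() {
	// Sample input shaped like qemu-img JSON output (values are made up).
	sample := []byte(`{"virtual-size": 1073741824, "format": "qcow2"}`)
	size, err := parseVirtualSize(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("virtual size in bytes:", size)
}
```

Rejecting a zero or missing size keeps a malformed result from being mistaken for a valid capacity.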
* Optimize the clone-size lookup process to avoid creating unnecessary size-detection pods
When performing host-assisted clone with an empty DV size, in some cases, a size-detection pod is used to obtain the required capacity.
This commit optimizes this process by keeping the collected value as a PVC annotation, which is checked in subsequent clones to avoid creating redundant pods.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
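A rough sketch of the caching idea above: check an annotation first and fall back to detection only when it is absent. The annotation key `example.io/detected-virtual-size` and the function `lookupOrDetect` are assumptions made for this illustration; the real key and control flow may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// sizeAnnotation is a hypothetical annotation key used only in this sketch.
const sizeAnnotation = "example.io/detected-virtual-size"

// lookupOrDetect returns the size cached in the annotations when it is
// present and valid. Otherwise it calls detect and stores the result so
// later clones can skip detection. The annotations map must be non-nil.
func lookupOrDetect(annotations map[string]string, detect func() (int64, error)) (int64, error) {
	if raw, ok := annotations[sizeAnnotation]; ok {
		if size, err := strconv.ParseInt(raw, 10, 64); err == nil && size > 0 {
			return size, nil
		}
	}
	size, err := detect()
	if err != nil {
		return 0, fmt.Errorf("detecting image size: %w", err)
	}
	annotations[sizeAnnotation] = strconv.FormatInt(size, 10)
	return size, nil
}

func main() {
	annotations := map[string]string{}
	calls := 0
	// detect stands in for running a size-detection pod.
	detect := func() (int64, error) {
		calls++
		return 1 << 30, nil
	}
	for i := 0; i < 3; i++ {
		if _, err := lookupOrDetect(annotations, detect); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("detections run:", calls)
}
```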
* Minor fixes and improvements on mechanism for cloning with empty storage size
* Add new optional flag on size-detection binary to enable using a different URI scheme
* Improve the pod-creation mechanism so the pod is not created until the source PVC has finished the import
* Modify size-inflation mechanism to account for possible round-downs when importing the source image
* Improve the size inflation mechanism so only PVCs with filesystem as volume mode are considered
* Minor style corrections
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Modify the clone-controller to allow skipping the clone size validation in some cases
Due to filesystem overhead differences, the target's size can sometimes be smaller than the source's one when obtaining said value with the size-detection pod.
This commit introduces minor changes in the clone-controller so we can skip the size validation in those cases.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor changes and improvements in size-detection mechanism following PR review
* Added new UT that covers using empty storage API for non-cloning sources
* Added new watch on datavolume-controller that looks for changes in the size-detection pod
* Removed redundant and unnecessary specs on size-detection pod
* Added error handling when reading the pod's termination message
* Moved general-usage functions to 'util.go' file
* Updated 'datavolumes' documentation to reference the possibility of omitting the storage size when cloning
* Minor style corrections
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Add unit tests that cover the size-detection mechanism in the DataVolume controller
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Include functional tests for cloning without specifying storage size
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Improve error handling in the creation/deletion process of the size-detection pod
This commit introduces additional handling for errors that occur during or after the creation of the size-detection pod.
It also updates several related unit tests.
Signed-off-by: Alvaro Romero <alromero@redhat.com>
* Minor fixes to improve fsOverhead calculations when cloning with empty storage size
* Modified the size-detection mechanism so we account for fsOverhead when cloning to filesystem volume mode in all cases
* Clean up the code for reconciling when cloning a PVC that is not ready
* Minor fix in functional test so it works when cloning from block to filesystem volume mode
Signed-off-by: Alvaro Romero <alromero@redhat.com>
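To illustrate the fsOverhead accounting mentioned in the commits above, here is a small sketch. It assumes the overhead is a fraction of the volume reserved for the filesystem, so the requested size is the image size divided by (1 - overhead), rounded up. That formula and the name `targetSize` are assumptions for illustration, not taken from the commits.

```go
package main

import (
	"fmt"
	"math"
)

// targetSize returns the storage to request for a clone target.
// Block targets need exactly imageSize bytes. Filesystem targets are
// inflated so that imageSize bytes remain after the overhead fraction
// is reserved (assumed formula: imageSize / (1 - overhead), rounded up).
func targetSize(imageSize int64, overhead float64, filesystem bool) (int64, error) {
	if imageSize < 0 {
		return 0, fmt.Errorf("negative image size %d", imageSize)
	}
	if overhead < 0 || overhead >= 1 {
		return 0, fmt.Errorf("overhead %v must be in [0, 1)", overhead)
	}
	if !filesystem {
		return imageSize, nil
	}
	return int64(math.Ceil(float64(imageSize) / (1 - overhead))), nil
}

func main() {
	for _, fs := range []bool{false, true} {
		size, err := targetSize(1<<30, 0.5, fs)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("filesystem=%v requested=%d\n", fs, size)
	}
}
```

Rounding up rather than down matters here: a target that is even slightly too small cannot hold the image.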
* Introduce controller-runtime-sdk api package
Split controller-runtime-sdk into the base package and
controller-runtime-sdk/api.
Signed-off-by: Roman Mohr <rmohr@redhat.com>
* go mod vendor
Signed-off-by: Roman Mohr <rmohr@redhat.com>
* Update code references
Signed-off-by: Roman Mohr <rmohr@redhat.com>
* Create imageio container during CDI build.
Instead of using a really old imageio, use bazel to build a new
imageio based on 2.5.0. Update the tests to use the new image
and paths in that new image. This requires a new repo in quay for
us to push the image to.
Also changed the approach to resolving the potential warm import
deadlock (scratch PVC from the previous import pod terminating while
the new pod is trying to create itself). Instead of trying to avoid it
in all scenarios, detect the state and delete the pod so the deadlock
can be resolved.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Populate test images
Signed-off-by: Alexander Wels <awels@redhat.com>
* Enable disabled test, and fix race condition where the import
controller thought it was done, but we were still on the final
import of a warm migration.
Updated the way we create the ticket on the fake imageio
Signed-off-by: Alexander Wels <awels@redhat.com>
* Append checkpoint ID to multi-stage importer pods.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Ignore completed pods for multi-stage imports.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Reset current import pod when checkpoint is done.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Don't prevent pod deletion for scratch space.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Only ignore pod when retainAfterCompletion is set.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix data volume unit tests.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Tests for checkpoint suffix and completed pods.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Test for retained pods exiting for scratch space.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add functional test for retaining multistage pods.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Clean up lint error.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Remove scratch handling that is fixed elsewhere.
This is part of shouldDeletePod now.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add unit tests for long PVC/checkpoint names.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
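The checkpoint suffix and the long PVC/checkpoint name cases above suggest a naming helper along these lines. This is only a sketch: the 63-character bound (the DNS label limit), the hash-based truncation, and the name `importerPodName` are assumptions, and the actual naming scheme may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// maxNameLength is the DNS label length limit, assumed here to be the
// relevant bound for importer pod names.
const maxNameLength = 63

// importerPodName appends the checkpoint ID to the base pod name. Names
// that would exceed the limit are truncated and given a short hash of
// the full name so different checkpoints still map to different pods.
// Inputs are assumed to be ASCII, as Kubernetes names are.
func importerPodName(base, checkpoint string) string {
	name := base + "-" + checkpoint
	if len(name) <= maxNameLength {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	suffix := hex.EncodeToString(sum[:])[:8]
	return name[:maxNameLength-len(suffix)-1] + "-" + suffix
}

func main() {
	fmt.Println(importerPodName("importer-my-pvc", "snapshot-1"))
	long := importerPodName("importer-a-very-long-persistent-volume-claim-name-for-testing", "snapshot-1")
	fmt.Println(long, len(long))
}
```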
* Match retainAfterCompletion test to description.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix failing imageio test
Always enable scratch space for imageio imports.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Only always enable scratch space with warm migration imports.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Set http(s)_proxy to lower case env variable
CURL used by nbdkit doesn't read upper case http(s)_proxy environment
variables, and thus was not using the proxy. Changed the variable to
be lower case.
Added a significant number of tests to test many more variations of
using a proxy. Also added https + auth endpoint to the file-host
container, so we can test https + auth with the proxy.
Added https endpoint to proxy, so we can test an https proxy.
Cleaned up some of the error handling in the import controller for
the proxy, in particular if a trustedCAProxy is defined.
Fixed some of the cluster wide proxy configuration so it works properly
inside an OpenShift cluster.
Signed-off-by: Alexander Wels <awels@redhat.com>
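As a small illustration of the lower-case proxy variables described above, the sketch below builds environment entries using the `http_proxy` and `https_proxy` names. The helper `proxyEnv` and the example proxy URLs are hypothetical; how the importer actually wires these values into the pod is not shown in this log.

```go
package main

import "fmt"

// proxyEnv returns NAME=value entries using lower-case variable names.
// Empty values are skipped so no empty variables are set.
func proxyEnv(httpProxy, httpsProxy string) []string {
	pairs := []struct{ name, value string }{
		{"http_proxy", httpProxy},
		{"https_proxy", httpsProxy},
	}
	var env []string
	for _, p := range pairs {
		if p.value != "" {
			env = append(env, p.name+"="+p.value)
		}
	}
	return env
}

func main() {
	// Example proxy addresses, made up for this sketch.
	for _, e := range proxyEnv("http://proxy.example:8080", "https://proxy.example:8443") {
		fmt.Println(e)
	}
}
```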
* Add https proxy support to registry import. Added extra
functional tests to test all registry import combinations
Signed-off-by: Alexander Wels <awels@redhat.com>
* Fixed some tests to work better in OpenShift.
Signed-off-by: Alexander Wels <awels@redhat.com>
* Add optional VDDK initImageURL field.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Pass VDDK image URL through to PVC annotation.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Unit tests for per-DV VDDK image URL.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Functional test for VDDK initImageURL field.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Update documentation for VDDK initImageURL.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Fix lint error.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Check for absence of AwaitingVDDK in unit test.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Update datavolume conditions when quota is exceeded while creating the pvc
When creating the pvc from the dv, the pvc size can exceed
the allowed quota. Until now, the only indication of this
was in the logs.
This adds an indication to the data volume conditions
(when possible) and emits an event.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Add functional tests to check the new conditions and event
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* tests cosmetics
-use existing functions
-add missing checks on errors
-remove unused code
-etc..
Signed-off-by: Shelly Kagan <skagan@redhat.com>
* Update HTTP data source API to allow custom headers.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Implement custom HTTP headers API.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Document custom headers in HTTP data source.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Correct secretExtraHeader comment to reference Secret.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add volume mounts for secret headers.
Replaces environment variables for headers from secrets.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Avoid failing when there are no extra headers.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Redact contents of headers that come from secrets.
Also split up getExtraHeaders to reduce Sonar Cloud complexity.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
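The redaction of secret-sourced headers mentioned above could look roughly like this when headers are logged. The type `extraHeader`, the function `loggable`, and the choice to keep the header name while hiding its value are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// extraHeader is a hypothetical representation of one extra HTTP header.
type extraHeader struct {
	Line       string // full header line, e.g. "Authorization: Basic abc"
	FromSecret bool   // true when the header was read from a Secret
}

const redacted = "<redacted>"

// loggable returns a form of the header that is safe to log. Headers
// from secrets keep their name but lose their value. A secret header
// without a colon is hidden entirely.
func loggable(h extraHeader) string {
	if !h.FromSecret {
		return h.Line
	}
	name, _, found := strings.Cut(h.Line, ":")
	if !found {
		return redacted
	}
	return strings.TrimSpace(name) + ": " + redacted
}

func main() {
	headers := []extraHeader{
		{Line: "X-Request-Source: importer", FromSecret: false},
		{Line: "Authorization: Basic c2VjcmV0", FromSecret: true},
	}
	for _, h := range headers {
		fmt.Println(loggable(h))
	}
}
```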
* Ensure all HTTP client requests use extra headers.
Missed redirect check and content length retrieval, both of which might
need the extra headers.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add some unit tests for extra HTTP headers.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Do not quote headers in nbdkit curl arguments.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add functional tests for extra HTTP headers.
Avoids new test server by specifying basic authorization headers to the
existing file host port that requires it.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Use filepath.Walk to read secrets.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Minor documentation update for secrets.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Re-run 'make generate' for verification failure.
Signed-off-by: Matthew Arnold <marnold@redhat.com>
* Add DataImportCron controller
-The new controller polls for updates to a registry source container
image, based on a given schedule. When updates to a container image are
detected, the controller imports the content into a new uniquely named
PVC in a golden image namespace.
-For each DataImportCron, the controller manages a corresponding
DataSource to always point to the most up-to-date golden
image PVC.
-DataImportCron takes ownership of an existing DataSource (with
controller: false), allowing an admin to opt-in to using auto
delivery/updates later on.
-The controller has PVC garbage collector removing old PVCs.
ToDo:
-status conditions updates
-verify full image streams support
-utests and func tests
-fixmes and commented out code
-doc
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Fix CR comments and fixmes
- isolate imagestream and registry specific code
- fix namespace of CronJob, and its job and pod to CDI namespace
- manage CronJob-DataImportCron ownership relationship with a finalizer,
handle DataImportCron deletion (CronJob etc.)
- remove CronJob and job pod for ImageStreams, use RequeueAfter and
cronexpr instead
- add k8s app cdi-source-update-poller executed by CronJob to poll source
image digest via skopeo inspect for url registry source, and annotate
the DataImportCron when the image was updated and pending for import based
on the cron schedule
- add cdi-source-update-poller and skopeo binary to the cdi-importer container
- complete dataimportcron-validate and its tests
- reconcile - use context.Context instead of context.TODO
- remove uncached client
- doc
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Fix ImageStreams watch
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Add DataImportCron DV template instead of source
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Fix CR comments
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Split updateSucceeded func
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Improve cdi-source-update-poller cmd logs
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Remove ImageStream reconcile
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Remove ImageStream watch
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Remove unnecessary AnnSourceUpdatePending
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* More CR fixes
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Idempotentify initCron
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Recreate DV in case it's not found
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Add DataImportCron spec.importsToKeep and status.currentImports
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
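One way to picture spec.importsToKeep together with the PVC garbage collector is the sketch below, which keeps the newest N imports and reports the rest for deletion. The `importRecord` type, the integer timestamps, and `importsToDelete` are hypothetical and are not the controller's actual data model.

```go
package main

import (
	"fmt"
	"sort"
)

// importRecord is a hypothetical record of one imported PVC.
type importRecord struct {
	PVCName string
	Created int64 // creation time, e.g. Unix seconds
}

// importsToDelete returns the names of all imports except the `keep`
// most recent ones. The input slice is left unmodified.
func importsToDelete(records []importRecord, keep int) []string {
	if keep < 0 {
		keep = 0
	}
	sorted := append([]importRecord(nil), records...)
	// Newest first.
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Created > sorted[j].Created })
	if len(sorted) <= keep {
		return nil
	}
	var names []string
	for _, r := range sorted[keep:] {
		names = append(names, r.PVCName)
	}
	return names
}

func main() {
	records := []importRecord{
		{PVCName: "image-a", Created: 100},
		{PVCName: "image-c", Created: 300},
		{PVCName: "image-b", Created: 200},
	}
	fmt.Println(importsToDelete(records, 2))
}
```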
* Add DataImportCron controller functional test
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Add insecure TLS support
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Remove finalizers in cluster clean script
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Bound each import to its sha256 digest instead of latest
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
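Binding an import to a digest rather than a moving tag can be sketched as a rewrite of the registry URL. The function `pinToDigest` and the example URLs are hypothetical; the sketch assumes a `docker://` style reference in which a tag is a colon after the last slash.

```go
package main

import (
	"fmt"
	"strings"
)

// pinToDigest replaces any tag or digest in a registry URL with the
// given sha256 digest, so the import refers to one exact image.
func pinToDigest(url, digest string) (string, error) {
	if !strings.HasPrefix(digest, "sha256:") {
		return "", fmt.Errorf("unsupported digest %q", digest)
	}
	ref := url
	// Drop an existing digest, if any.
	if i := strings.LastIndex(ref, "@"); i >= 0 {
		ref = ref[:i]
	}
	// Drop a tag: a colon after the last slash. A colon before the last
	// slash belongs to the scheme or to a registry port and is kept.
	slash := strings.LastIndex(ref, "/")
	if i := strings.LastIndex(ref, ":"); i > slash {
		ref = ref[:i]
	}
	return ref + "@" + digest, nil
}

func main() {
	pinned, err := pinToDigest("docker://quay.io/example/image:latest", "sha256:abc123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(pinned)
}
```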
* Add DataImportCron controller utests
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Tests CR fixes
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
* Minor tests CR fixes
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>