This PR removes the "skipped" condition for preallocation. The importer/uploader
will preallocate up to the available size. Filesystem overhead needs to be
taken into account.
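As a rough sketch of the idea (names are illustrative, not the actual CDI code), the importer would reserve blocks for the destination file up to the usable size instead of leaving it sparse:

```go
package importer

import (
	"os"

	"golang.org/x/sys/unix"
)

// preallocateTo is a minimal sketch, not the real CDI implementation:
// reserve blocks for dest up to usableSize bytes so the image cannot
// silently grow past the backing storage later on.
func preallocateTo(dest string, usableSize int64) error {
	f, err := os.OpenFile(dest, os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	// mode 0 allocates blocks and extends the file size to usableSize.
	return unix.Fallocate(int(f.Fd()), 0, 0, usableSize)
}
```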
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Validate images fit on a filesystem in more cases.
Background:
When the backing store is a filesystem, we store the images
as sparse files. So the file may eventually grow to be bigger
than the available storage. This will cause unfortunate
failures down the line.
Prior to this commit, we validated the size:
- In case the backing store implicitly did it for us (block volumes)
- On async upload
- When resizing (by the operation failing if the image cannot fit
in the available space).
The Resize phase is encountered quite commonly:
Transfer->Convert->Resize
TransferFile->Resize
Adding validation here for the non-resize case covers almost all
of the cases (see the sketch after the exception list below).
The only exceptions that aren't validated now are:
- DataVolumeArchive via the HTTP datasource
- VDDK
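As a rough sketch of what the added validation amounts to (illustrative names, not the actual CDI helpers), the requested virtual size is checked against the space left on the destination filesystem:

```go
package importer

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// validateSpace is an illustrative sketch, not the real CDI helper: check
// that the requested (virtual) image size fits in the space currently
// available on the destination filesystem.
func validateSpace(requestedSize int64, destDir string) error {
	var stat unix.Statfs_t
	if err := unix.Statfs(destDir, &stat); err != nil {
		return err
	}
	available := int64(stat.Bavail) * int64(stat.Bsize)
	if requestedSize > available {
		return fmt.Errorf("requested image size %d exceeds available space %d",
			requestedSize, available)
	}
	return nil
}
```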
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* When resizing, take into account filesystem overhead.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add testing for too large upload/import
- Import/sync upload of too large physical size image (raw.xz, qcow2)
- Import/sync upload of too large virtual size image (raw.xz, qcow2)
- Import of a too large raw image file, if filesystem overhead is
taken into account
- Async upload of too large physical size qcow2.
The async upload cases do not mirror the sync upload ones because, if
a block device is used as scratch space, it will hit a size limit
before the validation pause and fail differently.
That scenario is otherwise identical to the sync upload case which was added.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Refactor code in a way that requires fewer comments to explain.
We can just validate that the requested image size will fit in the
available space, rather than relying on the fact that we typically
resize images to their full size.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* When calculating usable space, round down to a multiple of 512.
Our validation is, indirectly:
size of image after resizing to usable space <= usable space
For this to pass, we need to ensure that qemu-img's rounding
up to 512 bytes doesn't change the size.
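A minimal sketch of the rounding, with a hypothetical helper name:

```go
package util

// roundDown is an illustrative helper: truncate size to the nearest lower
// multiple of align (512 here), so qemu-img's own rounding up to a 512-byte
// boundary cannot push the resized image past the usable space.
func roundDown(size, align int64) int64 {
	return size - size%align
}
```

For example, roundDown(usableSpace, 512) is what the validation would compare against.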
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Adjust expected qemu-img error messages to the ones emitted by nbdkit:
- In some cases, we clearly don't rely on the qemu-img error,
so don't check for it.
- In one case, switch to looking for the equivalent nbdkit
error message.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* [WIP] doc: User-facing doc for preallocation support
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* apis: CDI accepts the `preallocation` option.
With this commit, CDI accepts (but does not yet handle) `preallocation` settings
for DataVolumes and in the CDIConfig.
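Roughly, the API addition has the shape sketched below; the field names, types, and package are assumptions for illustration rather than a copy of the real CDI types:

```go
package apis // illustrative package name

// Sketch of the API shape only; the real CDI definitions may differ.
type DataVolumeSpec struct {
	// ... existing fields ...

	// Preallocation asks CDI to preallocate the disk image instead of
	// writing a sparse file. Optional; nil falls back to the CDIConfig value.
	Preallocation *bool `json:"preallocation,omitempty"`
}

type CDIConfigSpec struct {
	// ... existing fields ...

	// Preallocation is the cluster-wide default applied when a DataVolume
	// does not set its own value.
	Preallocation *bool `json:"preallocation,omitempty"`
}
```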
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Implementing preallocation
This commit implements preallocation support for import and upload.
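As one illustration of the approach (a sketch assuming the conversion goes through qemu-img; the real importer code differs), preallocation can be requested via qemu-img's preallocation option:

```go
package image

import (
	"fmt"
	"os/exec"
)

// convertWithPreallocation is a sketch: convert src to a raw dest and, when
// preallocate is set, ask qemu-img to allocate the blocks up front instead
// of producing a sparse file.
func convertWithPreallocation(src, dest string, preallocate bool) error {
	args := []string{"convert", "-O", "raw"}
	if preallocate {
		// raw output supports preallocation=off|falloc|full.
		args = append(args, "-o", "preallocation=falloc")
	}
	args = append(args, src, dest)
	if out, err := exec.Command("qemu-img", args...).CombinedOutput(); err != nil {
		return fmt.Errorf("qemu-img convert failed: %v: %s", err, out)
	}
	return nil
}
```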
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Functional tests for preallocation support
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Remove "preallocation for StorageClasses" config
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Removed unused function
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* test: Fix rook-ceph test failures
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* Updated dependencies
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
* core: Use PVC annotation to pass preallocation parameters
The DataVolume controller now uses a PVC annotation to pass the preallocation
configuration to the import and upload controllers.
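A sketch of the hand-off; the annotation key and helper below are hypothetical and only illustrate the mechanism:

```go
package controller

import corev1 "k8s.io/api/core/v1"

// annPreallocationRequested is a hypothetical annotation key used for
// illustration; the real CDI annotation name may differ.
const annPreallocationRequested = "cdi.kubevirt.io/storage.preallocation.requested"

// preallocationRequested reads the setting the DataVolume controller
// stamped onto the PVC, defaulting to false when the annotation is absent.
func preallocationRequested(pvc *corev1.PersistentVolumeClaim) bool {
	return pvc.Annotations[annPreallocationRequested] == "true"
}
```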
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
This phase can mean three things:
- HTTP, imageio, s3, upload: go directly to Convert.
- VDDK: unreachable code, pointing to an arbitrary other phase.
- registry: do minor processing after transfer.
Only registry makes actual use of this phase.
Point all current users directly to Convert, and fold the
work done in the registry Process phase into the previous
phase.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* When validating disk space, reserve space for filesystem overhead
The amount of available space in a filesystem is not exactly
the advertised amount. Things like indirect blocks or metadata
may use up some of this space. We reserve some of it to avoid
reaching full capacity by default.
This value is configurable from the CDIConfig object spec,
both globally and per storage class.
The default value is 0.055, i.e. "5.5% of the space is
reserved". It was chosen because some filesystems reserve
5% of the space as overhead for the root user, and this
reservation doubles as headroom for worst-case behaviour
of unclear space usage, so I've chosen a value that is
slightly higher.
This validation is only necessary because we use sparse
images instead of fallocated ones, which was done to have
reasonable alerts regarding space usage from various
storage providers.
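For illustration (field names are assumptions, not the real CDI API), the configurable shape and the effect of the 0.055 default look roughly like this:

```go
package apis

// FilesystemOverhead sketches the configuration described above: a global
// default plus optional per-storage-class overrides. Names are illustrative.
type FilesystemOverhead struct {
	Global       string            `json:"global,omitempty"`       // e.g. "0.055"
	StorageClass map[string]string `json:"storageClass,omitempty"` // per-SC overrides
}

// usableSpace applies the overhead fraction: with the default 0.055, a
// filesystem reporting 100GiB free is treated as having ~94.5GiB usable.
func usableSpace(available int64, overhead float64) int64 {
	return int64((1 - overhead) * float64(available))
}
```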
---
Update CDIConfig filesystemOverhead status, validate, and
pass the final value to importer/upload pods.
Only the status values controlled by the config controller
are used, and the status is filled out for all available storage
classes in the cluster.
Use this value in Validate calls to ensure that some of the
space is reserved for the filesystem overhead, to guard against
accidents.
Caveats:
- Doesn't use Default: to define the default of 0.055; instead it
is hard-coded in reconcile. It seems like we can't use a
default value there.
- Validates the per-storageClass values in reconcile, but
doesn't reject bad values.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Use util GetStorageClassByName
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Test filesystem overhead validation against async upload endpoint
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Wait for NFS PVs to be deleted before continuing
Intended to help with flakes, but didn't make a difference.
Probably still worth doing.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Avoid using the uncached client unnecessarily
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add error handling for the case where even a default SC is not found
Note that this change isn't expected to make a difference, as we
check if the targetStorageClass is nil later on and have the same
behaviour, but this is probably more correct API usage.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Add testing for the validation of filesystem overhead values
Signed-off-by: Maya Rashish <mrashish@redhat.com>
* Fix logical error in waiting for NFS PVs.
Wait for all of them, not just the last one.
Signed-off-by: Maya Rashish <mrashish@redhat.com>
Modified the function that gets the size of a block device / the available
space to return an error as well as -1, so we can distinguish the path not
existing from the binary not existing, in case the container doesn't have
the required binaries.
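A sketch of the revised shape, assuming hypothetical names and that the size is queried by shelling out to blockdev:

```go
package util

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"strings"
)

// blockDeviceSize sketches the change: return (-1, err) instead of a bare
// -1, so callers can tell "the path does not exist" apart from "the
// container is missing the required binary".
func blockDeviceSize(path string) (int64, error) {
	if _, err := os.Stat(path); err != nil {
		return -1, fmt.Errorf("path %s does not exist: %w", path, err)
	}
	out, err := exec.Command("blockdev", "--getsize64", path).CombinedOutput()
	if err != nil {
		return -1, fmt.Errorf("blockdev failed (missing binary?): %w: %s", err, out)
	}
	size, err := strconv.ParseInt(strings.TrimSpace(string(out)), 10, 64)
	if err != nil {
		return -1, err
	}
	return size, nil
}
```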
The last lane also passed, but due to slow CI it timed out before reporting results.
Signed-off-by: Alexander Wels <awels@redhat.com>