Commit Graph

10 Commits

Author SHA1 Message Date
Tomasz Barański
a29c3d4165
Preallocate even if the size is too small (#1637)
This PR removes "skipped" condition for preallocation. Importer/uploader
will preallocate to the available size. Filesystem overhead needs to be
taken into account.

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
2021-02-10 21:18:56 +01:00
Maya Rashish
3b36e1cd4f
Validate image fits in filesystem in a lot more cases. take filesystem overhead into account when resizing. (#1466)
* Validate images fit on a filesystem in more cases.

Background:
When the backing store is a filesystem, we store the images
as sparse files. So the file may eventually grow to be bigger
than the available storage. This will cause unfortunate
failures down the line.

Prior to this commit, we validated the size:
- In case the backing store implicitly did it for us (block volumes)
- On async upload
- When resizing (by the operation failing if the image cannot fit
in the available space).

The Resize phase is encountered quite commonly:
Transfer->Convert->Resize
TransferFile->Resize

Adding validation here for the non-resize case covers almost all
the cases.

The only exceptions that aren't validated now are:
- DataVolumeArchive via the HTTP datasource
- VDDK

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* When resizing, take into account filesystem overhead.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Add testing for too large upload/import

- Import/sync upload of too large physical size image (raw.xz, qcow2)
- Import/sync upload of too large virtual size image (raw.xz, qcow2)
- Import of a too large raw image file, if filesystem overhead is
taken into account

- Async upload of too large physical size qcow2.
The async upload cases do not mirror the sync upload ones because if
a block device is used as scratch space, it will hit a size limit
before the validation pause, and fail differently.
This scenario is identical to the sync upload case which was added.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Refactor code in a way that requires less comments to explain.

We can just validate that the requested image size will fit in the
available space, and not rely on the fact we typically resize the
images to the full size.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* When calculating usable space, round down to a multiple of 512.

Our validation is indirectly:
image after resize to usable space <= usable space
For this to pass, we need to ensure that qemu-img's rounding
up to 512 doesn't change the size.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Adjust qemu-img to the ones emitted by nbdkit:

- In some cases, we clearly don't rely on the qemu-img error,
so don't check for it.
- In one case, switch it to looking for the nbdkit equivalent
error message.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
2021-01-25 19:36:49 +01:00
Tomasz Barański
91a15c57d1
Preallocation support (#1498)
* [WIP] doc: User-facing doc for preallocation support

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* apis: CDI accepts `preallocation` option.

With this commit CDI accepts (but does handle) `preallocation` settings
for DataVolumes and in CDIConfig.

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* core: Implementing preallocation

This commit implements preallocation support for import and upload.

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* test: Functional tests for preallocation support

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* core: Remove "preallocation for StorageClasses" config

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* test: Removed unused function

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* test: Fix rook-ceph test failures

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* Updated dependencies
Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>

* core: Uss PVC annotation to pass preallocation parameters

DataVolume controller now uses a PVC annotation to pass preallocation
configuration to import and update controllers.

Signed-off-by: Tomasz Baranski <tbaransk@redhat.com>
2020-12-18 16:46:16 -05:00
Maya Rashish
92a1ae073a
Remove the "Process" data processor phase, simplify state machine. (#1446)
This phase can mean three things:
HTTP, imageio, s3, upload: Go directly to Convert
VDDK: unreachable code, points to arbitrary other phase.
registry: do minor processing after transfer.

Only registry makes actual use of this phase.

Point all current users directly to Convert, and fold the
work done in the registry Process phase into the previous
phase.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
2020-11-04 12:45:49 +01:00
Maya Rashish
b91887e1b7
Reserve overhead when validating that a Filesystem has enough space (#1319)
* When validating disk space, reserve space for filesystem overhead

The amount of available space in a filesystem is not exactly
the advertise amount. Things like indirect blocks or metadata
may use up some of this space. Reserving it to avoid reaching
full capacity by default.

This value is configurable from the CDIConfig object spec,
both globally and per-storageclass.

The default value is 0.055, or "5.5% of the space is
reserved". This value was chosen because some filesystems
reserve 5% of the space as overhead for the root user and
this space doubles as reservation for the worst case
behaviour for unclear space usage. I've chosen a value
that is slightly higher.

This validation is only necessary because we use sparse
images instead of fallocated ones, which was done to have
reasonable alerts regarding space usage from various
storage providers.

---

Update CDIConfig filesystemOverhead status, validate, and
pass the final value to importer/upload pods.

Only the status values controlled by the config controller
are used, and it's filled out for all available storage
classes in the cluster.

Use this value in Validate calls to ensure that some of the
space is reserved for the filesystem overhead to guard from
accidents.

Caveats:

Doesn't use Default: to define the default of 0.055, instead
it is hard-coded in reconcile. It seems like we can't use a
default value.

Validates the per-storageClass values in reconcile, and
doesn't reject bad values.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Use util GetStorageClassByName

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Test filesystem overhead validation against async upload endpoint

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* wait for NFS PVs to be deleted before continuing

Intended to help with flakes, but didn't make a difference.
Probably still worth doing.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Avoid using the uncached client unnecessarily

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Add error handling for the case where even a default SC is not found

Note that this change isn't expected to make a difference, as we
check if the targetStorageClass is nil later on and have the same
behaviour, but this is probably more correct API usage.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Add testing for the validation of filesystem overhead values

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Fix logical error in waiting for NFS PVs.

Wait for all of them, not just the last one.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
2020-10-01 18:31:32 +02:00
Alexander Wels
310e5e239f
GetAvailableSpace(block) now returns error (#1244)
Modified function that gets the size of a block device/available to return error as well as -1, so we
can distinguish the path not existing from the binary not existing in case the container doesn't have
the required binaries.

Last lane also passed, but due to slow CI timed out before reporting results.

Signed-off-by: Alexander Wels <awels@redhat.com>
2020-06-19 13:57:37 -04:00
Alexander Wels
9a2b514365
Add async endpoint for upload that closes connection immediately after transfer completes and then continues background processing. (#1095)
Signed-off-by: Alexander Wels <awels@redhat.com>
2020-02-12 16:17:26 +01:00
Alexander Wels
47fbebad42
Added unit test to verify quantity returns right value. (#968)
Add label to verifier pods so if they fail, they also print their log.

Signed-off-by: Alexander Wels <awels@redhat.com>
2019-09-23 07:56:16 -04:00
Michael Henriksen
019c843586 make clone pods use selinux type spc_t instead of privileged (#875)
* make clone pods use selinux type spc_t instead of privileged

* fix block mode related tests
2019-07-08 13:58:42 -04:00
Alexander Wels
2d6375b057 data stream refactor.
Signed-off-by: Alexander Wels <awels@redhat.com>
2019-04-10 09:18:55 -04:00