containerized-data-importer/doc/os-image-poll-and-update.md

Automated OS image import, poll and update

CDI can automate OS image import, polling, and update, keeping OS images up to date according to a given schedule. The first time a DataImportCron is scheduled, the controller imports the source image. On each subsequent scheduled poll, if the source image digest (sha256) has changed, the controller imports the image to a new source in the DataImportCron namespace and updates the managed DataSource to point to the newly created source. A garbage collector (garbageCollect: Outdated, enabled by default) keeps the last importsToKeep (3 by default) imported sources per DataImportCron and deletes older ones.

See design doc here

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  name: fedora-image-import-cron
  namespace: golden-images
spec:
  template:
    spec:
      source:
        registry:
          url: "docker://quay.io/kubevirt/fedora-cloud-registry-disk-demo:latest"
          pullMethod: node
          certConfigMap: some-certs
      storage:
        resources:
          requests:
            storage: 5Gi
        storageClassName: hostpath-provisioner
  schedule: "30 1 * * 1"
  garbageCollect: Outdated
  importsToKeep: 2
  managedDataSource: fedora
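
To illustrate what "update the managed DataSource" means, the following is a sketch of how the `fedora` DataSource might look after a successful import. The PVC name shown is hypothetical; the controller generates the actual name, so treat it only as a placeholder.

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataSource
metadata:
  name: fedora                   # matches managedDataSource in the DataImportCron
  namespace: golden-images
spec:
  source:
    pvc:
      name: fedora-0123456789ab  # hypothetical; the real name is generated by the controller
      namespace: golden-images
```

After a later poll imports a newer image, only the referenced source changes; the DataSource name stays the same, so consumers do not need to be updated.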

Instead of a source, a DataVolume can specify a sourceRef that refers to a DataSource. When such a DataVolume is created, it uses the latest imported source that the DataSource points to, as if that source had been specified directly in dv.spec.source.

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: fedora-ref
  namespace: golden-images
spec:
  sourceRef:
    kind: DataSource
    name: fedora
  storage:
    resources:
      requests:
        storage: 5Gi
    storageClassName: hostpath-provisioner
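
The example above creates the DataVolume in the same namespace as the DataSource. As a sketch of a common variation, a DataVolume in another namespace can point at the DataSource by setting `namespace` in the sourceRef. The `my-vms` namespace and the `fedora-clone` name are assumptions made for illustration, and the creating user is assumed to be permitted to clone from `golden-images`.

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: fedora-clone          # hypothetical name
  namespace: my-vms           # hypothetical consumer namespace
spec:
  sourceRef:
    kind: DataSource
    name: fedora
    namespace: golden-images  # namespace of the DataSource; defaults to the DataVolume namespace when omitted
  storage:
    resources:
      requests:
        storage: 5Gi
```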

OpenShift ImageStreams

With pullMethod: node, import from an OpenShift ImageStream is also supported, by specifying imageStream instead of url:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  name: rhel8-image-import-cron
  namespace: openshift-virtualization-os-images
spec:
  template:
    spec:
      source:
        registry:
          imageStream: rhel8-is
          pullMethod: node
      storage:
        resources:
          requests:
            storage: 5Gi
        storageClassName: hostpath-provisioner
  schedule: "0 0 * * 5"
  importsToKeep: 4
  managedDataSource: rhel8

Currently we assume the ImageStream is in the same namespace as the DataImportCron.

To create an ImageStream, one can use, for example:

  • oc import-image rhel8-is -n openshift-virtualization-os-images --from=registry.redhat.io/rhel8/rhel-guest-image --scheduled --confirm
  • oc set image-lookup rhel8-is -n openshift-virtualization-os-images

Or on CRC:

  • oc import-image cirros-is -n openshift-virtualization-os-images --from=kubevirt/cirros-container-disk-demo --scheduled --confirm
  • oc set image-lookup cirros-is -n openshift-virtualization-os-images

More information on image streams is available here and here.
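
For readers who prefer a declarative manifest, the following is a rough sketch of an ImageStream similar to what the `rhel8-is` commands above produce. It is an approximation and not the verified output of those commands: the `latest` tag name is an assumption, `importPolicy.scheduled` corresponds to `--scheduled`, and `lookupPolicy.local` corresponds to `oc set image-lookup`.

```yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: rhel8-is
  namespace: openshift-virtualization-os-images  # same namespace as the DataImportCron
spec:
  lookupPolicy:
    local: true            # what `oc set image-lookup` enables
  tags:
  - name: latest           # assumed tag name
    from:
      kind: DockerImage
      name: registry.redhat.io/rhel8/rhel-guest-image
    importPolicy:
      scheduled: true      # periodically re-import, as with --scheduled
```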

DataImportCron source formats

  • PersistentVolumeClaim
  • VolumeSnapshot

DataImportCron was originally designed to maintain only PVC sources.
However, for certain storage types, snapshot sources are known to scale better.
Some details and examples can be found in clone-from-volumesnapshot-source.

This provisioner-specific information is kept in the dataImportCronSourceFormat field of each provisioner's StorageProfile object (possible values are snapshot and pvc). The field tells the DataImportCron which type of source is preferred for that provisioner.

Some provisioners, such as Ceph RBD, are opted in automatically.
To opt in manually, edit the StorageProfile:

apiVersion: cdi.kubevirt.io/v1beta1
kind: StorageProfile
metadata:
  ...
spec:
  dataImportCronSourceFormat: snapshot

To ensure a smooth transition, existing DataImportCrons can be switched to maintaining snapshots instead of PVCs by updating their corresponding storage profiles.
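
As a sketch of the result, once the storage profile prefers snapshots, the managed DataSource would be expected to reference a VolumeSnapshot rather than a PVC. The example reuses the `fedora` DataSource from earlier, and the snapshot name is hypothetical because the controller generates the actual name.

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataSource
metadata:
  name: fedora
  namespace: golden-images
spec:
  source:
    snapshot:                    # a snapshot source instead of a pvc source
      name: fedora-0123456789ab  # hypothetical; the real name is generated by the controller
      namespace: golden-images
```

A DataVolume that uses a sourceRef to this DataSource needs no change; it is then populated from the snapshot instead of from a PVC.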