mirror of https://github.com/kubevirt/containerized-data-importer.git synced 2025-06-03 06:30:22 +00:00

Data Import Service for kubernetes, designed with kubevirt in mind.

kubevirt-bot 3ff3a7ccfa [release-v1.57] remove RWX for filesystem PVC capability from default profile of IBM Block Storage CSI driver (#3742 ) * remove RWX for filesystem PVC capability from default profile of IBM Block Storage CSI driver Signed-off-by: Ariel Kass <arielk@il.ibm.com> * W/A compress sha not existing inspired by https://github.com/kubevirt/kubevirt/pull/14112 and https://github.com/kubevirt/kubevirt/pull/14570 Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> --------- Signed-off-by: Ariel Kass <arielk@il.ibm.com> Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Ariel Kass <arielk@il.ibm.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>		2025-05-20 20:05:27 +02:00
.github	Use new-style issue templates (#2200 )	2022-03-24 19:13:23 +01:00
api/openapi-spec	[release-v1.57] Avoid creating snapshot of old storage class DataImportCron PVCs (#2885 )	2023-09-01 14:15:29 +02:00
assets	CDI operator OLM integration:	2019-05-01 13:54:28 +03:00
automation	[release-v1.57] Add new Prometheus alerts and label existing alerts (#3041 )	2024-01-03 20:29:20 +01:00
bazel	feature: support aarch64 (#1863 )	2021-07-21 17:52:32 +02:00
cluster-sync	Backport main commits to 1.57 release branch (#2764 )	2023-06-22 00:59:17 +02:00
cluster-up	let's start testing 1.26 (#2697 )	2023-04-22 04:47:54 +01:00
cmd	[release-v1.57] Manual backport of "Use direct io with qemu-img convert if pod is OOMKilled" (#3309 )	2024-06-10 12:36:27 +02:00
doc	VDDK: Fix NBD status coalescing for large blocks. (#3242 ) (#3510 )	2024-11-06 00:50:03 +01:00
hack	[release-v1.57] Prepare branch for releasing, follow up builder changes (#3514 )	2024-11-06 22:10:05 +01:00
manifests	Google Cloud Storage Import Support (#2615 )	2023-03-22 16:49:29 +00:00
pkg	[release-v1.57] remove RWX for filesystem PVC capability from default profile of IBM Block Storage CSI driver (#3742 )	2025-05-20 20:05:27 +02:00
rpm	Always use scratchspace when importing and converting (#2832 ) (#2841 )	2023-08-12 05:47:21 +02:00
staging/src/kubevirt.io	CVE-2023-45288 fix: Bump golang.org/x/net to v0.23.0 in /staging/src/kubevirt.io/containerized-data-importer-api (#3216 )	2024-04-23 03:52:20 +02:00
tests	Fix flaky DataImportCron volume snapshot test (#3583 )	2025-01-06 18:52:24 +01:00
third_party	make tools support aarch64 (#1952 )	2021-09-23 02:54:41 +02:00
tools	[release-1.57] Backport 3385 VDDK: pass snapshot ID through to nbdkit. (#3501 )	2024-11-05 15:32:03 +01:00
vendor	[release-v1.57] Bump golang.org/x/net to v0.33.0 (#3596 )	2025-01-16 16:46:46 +01:00
.bazelrc	feature: support aarch64 (#1863 )	2021-07-21 17:52:32 +02:00
.fossa.yml	Add branch name to fossa analysis (#2545 )	2023-01-19 01:11:07 +01:00
.gitignore	remove vscode settings (#2497 )	2022-12-06 19:23:24 +00:00
.golangci.yml	enable ginkgolinter and fix findings (#2703 )	2023-05-04 13:07:36 +02:00
.snyk	feat: add .snyk file (#3001 )	2023-11-30 01:55:14 +01:00
.travis.yml	Fix travis nightly to push from main. (#1655 )	2021-02-12 17:32:56 +01:00
BUILD.bazel	Don't use scratch space for registry node pull imports (#2846 )	2023-08-14 23:15:33 +02:00
CONTRIBUTING.md	Update external links. (#1946 )	2021-09-22 14:16:30 +02:00
go.mod	[release-v1.57] Bump golang.org/x/net to v0.33.0 (#3596 )	2025-01-16 16:46:46 +01:00
go.sum	[release-v1.57] Bump golang.org/x/net to v0.33.0 (#3596 )	2025-01-16 16:46:46 +01:00
LICENSE	Add Apache V2 license (#379 )	2018-08-28 14:50:21 -07:00
Makefile	[release-v1.57] Backport main commits to 1.57 release branch v2 (#2785 )	2023-07-06 16:05:38 +02:00
manual-release-notes	v1.57.1 release notes	2024-11-07 16:31:21 +02:00
OWNERS	Add owner files	2019-11-11 16:38:44 -05:00
OWNERS_ALIASES	[release-v1.57] Backport main commits to 1.57 release branch v2 (#2785 )	2023-07-06 16:05:38 +02:00
README.md	Google Cloud Storage Import Support (#2615 )	2023-03-22 16:49:29 +00:00
repo.yaml	Add missing file repo.yaml (#2275 )	2022-05-15 18:46:28 +02:00
SECURITY.md	Move apis to staging, push to containerized-data-importer-api (#1997 )	2021-10-28 13:40:24 +02:00
tools.go	Change +build ignore tag in tools.go to +build codegen so that (#1339 )	2020-08-13 19:35:56 +02:00
WORKSPACE	[release-v1.57] remove RWX for filesystem PVC capability from default profile of IBM Block Storage CSI driver (#3742 )	2025-05-20 20:05:27 +02:00

README.md

Containerized Data Importer

Containerized-Data-Importer (CDI) is a persistent storage management add-on for Kubernetes. Its primary goal is to provide a declarative way to build Virtual Machine Disks on PVCs for Kubevirt VMs.

CDI works with standard core Kubernetes resources and is storage device agnostic. While its primary focus is to build disk images for Kubevirt, it is also useful outside of a Kubevirt context for initializing your Kubernetes Volumes with data.

Introduction

Kubernetes extension to populate PVCs with VM disk images or other data

CDI provides the ability to populate PVCs with VM images or other data upon creation. The data can come from different sources: a URL, a container registry, another PVC (clone), or an upload from a client.

DataVolumes

CDI includes a CustomResourceDefinition (CRD) that provides an object of type DataVolume. The DataVolume is an abstraction on top of the standard Kubernetes PVC and can be used to automate creation and population of a PVC with data. Although you can use PVCs directly with CDI, DataVolumes are the preferred method since they offer full functionality, a stable API, and better integration with kubevirt. More details about DataVolumes can be found here.

Import from URL

This method is selected when you create a DataVolume with an http source. CDI will populate the volume using a pod that will download from the given URL and handle the content according to the contentType setting (see below). It is possible to configure basic authentication using a secret and specify custom TLS certificates in a ConfigMap.
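
As a rough illustration of the description above, the following sketch writes a DataVolume manifest with an http source to a local file. The resource name, URL, Secret name, and ConfigMap name are all hypothetical placeholders, and the two optional fields can be dropped when the endpoint needs neither authentication nor custom certificates.

```shell
# Sketch only: the name, URL, Secret, and ConfigMap below are hypothetical.
cat > dv-http.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-http-dv
spec:
  contentType: kubevirt                    # the default; "archive" is for tar archives
  source:
    http:
      url: "https://example.com/images/disk.qcow2"
      secretRef: example-http-credentials  # optional: Secret holding basic auth credentials
      certConfigMap: example-tls-certs     # optional: ConfigMap holding custom TLS certificates
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
EOF

# Submit it to a cluster where CDI is deployed:
#   kubectl create -f dv-http.yaml
```

The requested storage size is an assumption as well; it needs to be large enough to hold the converted disk image.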

Import from container registry

When a DataVolume has a registry source, CDI will populate the volume with a Container Disk downloaded from the given image URL. The only valid contentType for this source is kubevirt, and the image must be a Container Disk. More details can be found here.
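
A minimal sketch of such a DataVolume might look like the following. The image reference and resource name are hypothetical; the `docker://` prefix on the URL is how the registry endpoint is expressed in this sketch.

```shell
# Sketch only: the name and image reference below are hypothetical.
cat > dv-registry.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-registry-dv
spec:
  source:
    registry:
      url: "docker://quay.io/example/containerdisk:latest"
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 5Gi
EOF

# Submit it to a cluster where CDI is deployed:
#   kubectl create -f dv-registry.yaml
```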

Clone another PVC

To clone a PVC, create a DataVolume with a pvc source and specify namespace and name of the source PVC. CDI will attempt an efficient clone of the PVC using the storage backend if possible. Otherwise, the data will be transferred to the target PVC using a TLS secured connection between two pods on the cluster network. More details can be found here.
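
The clone request described above could be sketched as follows. The source namespace, source PVC name, and target name are hypothetical, and the requested size is assumed to be at least as large as the source PVC.

```shell
# Sketch only: namespaces and names below are hypothetical.
cat > dv-clone.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-clone-dv
spec:
  source:
    pvc:
      namespace: source-namespace   # namespace of the PVC to clone
      name: source-pvc              # name of the PVC to clone
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
EOF

# Submit it to a cluster where CDI is deployed:
#   kubectl create -f dv-clone.yaml
```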

Upload from a client

To upload data to a PVC from a client machine, first create a DataVolume with an upload source. CDI will prepare to receive data via an upload proxy, which will transfer data from an authenticated client to a pod that populates the PVC according to the contentType setting. To send data to the upload proxy you must have a valid UploadToken. See the upload documentation for details.
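
The two objects involved, a DataVolume with an upload source and a token request for its PVC, might be sketched as below. All names are hypothetical, and the `UploadTokenRequest` shape shown here is an assumption that should be checked against the upload documentation for your CDI version.

```shell
# Sketch only: names below are hypothetical.
cat > dv-upload.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-upload-dv
spec:
  source:
    upload: {}
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
EOF

# Assumed shape of the token request; pvcName refers to the PVC
# created for the DataVolume above.
cat > upload-token.yaml <<'EOF'
apiVersion: upload.cdi.kubevirt.io/v1beta1
kind: UploadTokenRequest
metadata:
  name: example-upload-dv
spec:
  pvcName: example-upload-dv
EOF

# On a cluster where CDI is deployed, one possible sequence is:
#   kubectl create -f dv-upload.yaml
#   TOKEN=$(kubectl apply -f upload-token.yaml -o "jsonpath={.status.token}")
```

The token is then presented to the upload proxy when sending the image; how the proxy is exposed depends on the cluster setup.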

Prepare an empty Kubevirt VM disk

The special source blank can be used to populate a volume with an empty Kubevirt VM disk. This source is valid only with the kubevirt contentType. CDI will create a VM disk on the PVC which uses all of the available space. See here for an example.
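
For illustration, a DataVolume using the blank source could be written as follows. The name and size are hypothetical; no contentType is set because kubevirt is the default.

```shell
# Sketch only: the name and size below are hypothetical.
cat > dv-blank.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-blank-dv
spec:
  source:
    blank: {}
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi
EOF

# Submit it to a cluster where CDI is deployed:
#   kubectl create -f dv-blank.yaml
```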

Import from oVirt

Virtual machine disks can be imported from a running oVirt installation using the imageio source. CDI will use the provided credentials to securely transfer the indicated oVirt disk image so that it can be used with kubevirt. See here for more information and examples.
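
A sketch of an imageio import, under the assumption that the oVirt engine API URL, disk ID, credentials Secret, and certificate ConfigMap shown here are replaced with real values, might look like this. Every value is a hypothetical placeholder.

```shell
# Sketch only: the URL, disk ID, Secret, and ConfigMap below are hypothetical.
cat > dv-imageio.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-imageio-dv
spec:
  source:
    imageio:
      url: "https://ovirt-engine.example.com/ovirt-engine/api"
      diskId: "00000000-0000-0000-0000-000000000000"   # ID of the oVirt disk to import
      secretRef: example-ovirt-credentials             # Secret with the oVirt credentials
      certConfigMap: example-ovirt-certs               # ConfigMap with the engine CA certificate
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
EOF

# Submit it to a cluster where CDI is deployed:
#   kubectl create -f dv-imageio.yaml
```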

Import from VMware

Disks can be imported from VMware with the vddk source. CDI will transfer the disks using vCenter/ESX API credentials and a user-provided image containing the non-redistributable VDDK library. See here for instructions.
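
The following is a sketch of a DataVolume with a vddk source. Every value is a hypothetical placeholder, and the field set is an assumption to verify against the linked instructions. The image containing the VDDK library is configured separately and is not shown.

```shell
# Sketch only: all values below are hypothetical placeholders.
cat > dv-vddk.yaml <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: example-vddk-dv
spec:
  source:
    vddk:
      url: "https://vcenter.example.com"
      backingFile: "[example-datastore] example-vm/example-vm.vmdk"   # disk file of the VM
      uuid: "00000000-0000-0000-0000-000000000000"                    # UUID of the VM
      thumbprint: "00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"  # certificate fingerprint of the vCenter/ESX host
      secretRef: example-vddk-credentials                             # Secret with the API credentials
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
EOF

# Submit it to a cluster where CDI is deployed:
#   kubectl create -f dv-vddk.yaml
```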

Content Types

CDI features specialized handling for two types of content: Kubevirt VM disk images and tar archives. The kubevirt content type indicates that the data being imported should be treated as a Kubevirt VM disk. CDI will automatically decompress and convert the file from qcow2 to raw format if needed. It will also resize the disk to use all available space. The archive content type indicates that the data is a tar archive. Compression is not yet supported for archives. CDI will extract the contents of the archive into the volume. The content type can be selected by specifying the contentType field in the DataVolume. kubevirt is the default content type. CDI only supports certain combinations of source and contentType as indicated below:

http → kubevirt, archive
registry → kubevirt
pvc → Not applicable - content is cloned
upload → kubevirt
imageio → kubevirt
vddk → kubevirt
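
The combinations listed above can be checked before a manifest is submitted. The helper below is a hypothetical convenience function, not part of CDI; it simply encodes the list, and it accepts any contentType for the pvc source on the assumption that the setting does not apply to clones.

```shell
# valid_content_type SOURCE CONTENT_TYPE
# Returns 0 when the pair appears in the list above, 1 otherwise.
# Hypothetical helper for illustration; CDI does not ship this function.
valid_content_type() {
  case "$1:$2" in
    http:kubevirt|http:archive) return 0 ;;
    registry:kubevirt|upload:kubevirt|imageio:kubevirt|vddk:kubevirt) return 0 ;;
    pvc:*) return 0 ;;   # content is cloned as-is, so this sketch does not check contentType
    *) return 1 ;;
  esac
}

if valid_content_type registry archive; then
  echo "registry + archive: supported"
else
  echo "registry + archive: not supported"
fi
```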

Deploy it

Deploying the CDI controller is straightforward. In this document the default namespace is used, but in a production setup a protected namespace that is inaccessible to regular users should be used instead.

$ export VERSION=$(curl -s https://api.github.com/repos/kubevirt/containerized-data-importer/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
$ kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml
$ kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-cr.yaml

Use it

Create a DataVolume and populate it with data from an http source

$ kubectl create -f https://raw.githubusercontent.com/kubevirt/containerized-data-importer/$VERSION/manifests/example/import-kubevirt-datavolume.yaml

The example manifests contain quite a few examples. Use them as a reference for creating DataVolumes from additional sources such as registries, S3, GCS, and your local system.

Hack it

CDI includes a self-contained development and test environment. We use Docker to build, and we provide a simple way to get a test cluster up and running on your laptop. The development tools include a version of kubectl that you can use to communicate with the cluster through the wrapper script ./cluster-up/kubectl.sh.

$ mkdir -p $GOPATH/src/kubevirt.io && cd $GOPATH/src/kubevirt.io
$ git clone https://github.com/kubevirt/containerized-data-importer && cd containerized-data-importer
$ make cluster-up
$ make cluster-sync
$ ./cluster-up/kubectl.sh .....

For development on an external cluster (not provisioned by our CI), check out the external provider.

Storage notes

CDI is designed to be storage agnostic. Since it works with the Kubernetes storage APIs, it should work well with any configuration that can produce a Bound PVC. The following are storage-specific notes that may be relevant when using CDI.

NFSv3 is not supported: CDI uses qemu-img to manipulate disk images and this program uses locking which is not compatible with the obsolete NFSv3 protocol. We recommend using NFSv4.

Connect with us

We'd love to hear from you. Reach out on GitHub via Issues or Pull Requests!

Hit us up on Slack

Shoot us an email at: kubevirt-dev@googlegroups.com