Disable DV GC by default (#2754)

* Disable DV GC by default

DataVolume garbage collection is a nice feature, but unfortunately it
violates fundamental principle of Kubernetes. CR should not be
auto-deleted when it completes its role (Job with TTLSecondsAfter-
Finished is an exception), and once CR was created we can assume it is
there until explicitly deleted. In addition, CR should keep idempotency,
so the same CR manifest can be applied multiple times, as long as it is
a valid update (e.g. DataVolume validation webhook does not allow
updating the spec).

When GC is enabled, some systems (e.g GitOps / ArgoCD) may require a
workaround (DV annotation deleteAfterCompletion = "false") to prevent
GC and function correctly.

On the next kubevirt-bot Bump kubevirtci PR (with bump-cdi), it will
fail on all kubevirtci lanes with tests referring DVs, as the tests
IsDataVolumeGC() looks at CDIConfig Spec.DataVolumeTTLSeconds and
assumes default is enabled. This should be fixed there.

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix test waiting for PVC deletion with UID

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix clone test assuming DV was GCed

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix DIC controller DV/PVC deletion when snapshot is ready

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

---------

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
This commit is contained in:
Arnon Gilboa 2023-06-20 22:09:19 +03:00 committed by GitHub
parent 8ced30251c
commit 0bc6a8aeca
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
13 changed files with 53 additions and 43 deletions

View File

@ -4937,7 +4937,7 @@
"type": "object",
"properties": {
"dataVolumeTTLSeconds": {
"description": "DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. The default is 0 sec. To disable GC use -1.",
"description": "DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. Disabled by default.",
"type": "integer",
"format": "int32"
},

View File

@ -16,7 +16,7 @@ CDI configuration in specified by administrators in the `spec.config` of the `CD
| preallocation | nil | Preallocation setting to use unless a per-dataVolume value is set |
| importProxy | nil | The proxy configuration to be used by the importer pod when accessing a http data source. When the ImportProxy is empty, the Cluster Wide-Proxy (Openshift) configurations are used. ImportProxy has four parameters: `ImportProxy.HTTPProxy` that defines the proxy http url, the `ImportProxy.HTTPSProxy` that determines the roxy https url, and the `ImportProxy.noProxy` which enforce that a list of hostnames and/or CIDRs will be not proxied, and finally, the `ImportProxy.TrustedCAProxy`, the ConfigMap name of an user-provided trusted certificate authority (CA) bundle to be added to the importer pod CA bundle. |
| insecureRegistries | nil | List of TLS disabled registries. |
| dataVolumeTTLSeconds | nil | Time in seconds after DataVolume completion it can be garbage collected. The default is 0 sec. To disable GC use -1. |
| dataVolumeTTLSeconds | nil | Time in seconds after DataVolume completion it can be garbage collected. Disabled by default. |
| tlsSecurityProfile | nil | Used by operators to apply cluster-wide TLS security settings to operands. |
filesystemOverhead configuration:
@ -38,9 +38,9 @@ kubectl patch cdi cdi --type='json' -p='[{ "op" : "add" , "path" : "/spec/confi
```bash
kubectl patch cdi cdi --type='json' -p='[{ "op" : "add" , "path" : "/spec/config/filesystemOverhead/global" , "value" : "0.0" }]'
```
To configure dataVolumeTTLSeconds (e.g. disable DataVolume garbage collection)
To configure dataVolumeTTLSeconds (e.g. enable DataVolume garbage collection)
```bash
kubectl patch cdi cdi --patch '{"spec": {"config": {"dataVolumeTTLSeconds": "-1"}}}' --type merge
kubectl patch cdi cdi --patch '{"spec": {"config": {"dataVolumeTTLSeconds": "0"}}}' --type merge
```
## Getting

View File

@ -5,8 +5,8 @@ Data Volumes(DV) are an abstraction on top of Persistent Volume Claims(PVC) and
Why is this an improvement over simply looking at the state annotation created and managed by CDI? Data Volumes provide a versioned API that other projects like [Kubevirt](https://github.com/kubevirt/kubevirt) can integrate with. This way those projects can rely on an API staying the same for a particular version and have guarantees about what that API will look like. Any changes to the API will result in a new version of the API.
### Garbage collection of successfully completed DataVolumes
Once the PVC population process is completed, its corresponding DV has no use, so it is garbage collected by default.
### Garbage collection of successfully completed DataVolumes (disabled by default)
Once the PVC population process is completed, its corresponding DV has no use, so it can be garbage collected.
Some GC motivations:
* Keeping the DV around after the fact sometimes confuses users, thinking they should modify the DV to have the matching PVC react. For example, resizing PVC seems to confuse users because they see the DV.
@ -14,7 +14,9 @@ Some GC motivations:
* Restore a backed up VM without the need to recreate the DataVolume is much simpler.
* Replicate a workload to another cluster without the need to mutate the PVC and DV with special annotations in order for them to behave as expected in the new cluster.
GC can be configured in [CDIConfig](cdi-config.md), so users cannot assume the DV exists after completion. When the desired PVC exists, but its DV does not exist, it means that the PVC was successfully populated and the DV was garbage collected. To prevent a DV from being garbage collected, it should be annotated with:
However, after several releases we decided to disable GC by default, as unfortunately it violates fundamental principle of Kubernetes. CR should not be auto-deleted when it completes its role (Job with TTLSecondsAfterFinished is an exception), and once CR was created we can assume it is there until explicitly deleted. In addition, CR should keep idempotency, so the same CR manifest can be applied multiple times, as long as it is a valid update (e.g. DataVolume validation webhook does not allow updating the spec).
GC can be configured in [CDIConfig](cdi-config.md), so users cannot assume the DV exists after completion. When the desired PVC exists, but its DV does not exist, it means that the PVC was successfully populated and the DV was garbage collected. To prevent a DV from being garbage collected (when enabled in CDIConfig), it should be annotated with:
```yaml
cdi.kubevirt.io/storage.deleteAfterCompletion: "false"
```

View File

@ -25016,7 +25016,7 @@ func schema_pkg_apis_core_v1beta1_CDIConfigSpec(ref common.ReferenceCallback) co
},
"dataVolumeTTLSeconds": {
SchemaProps: spec.SchemaProps{
Description: "DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. The default is 0 sec. To disable GC use -1.",
Description: "DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. Disabled by default.",
Type: []string{"integer"},
Format: "int32",
},

View File

@ -86,12 +86,11 @@ func (wh *dataVolumeMutatingWebhook) Admit(ar admissionv1.AdmissionReview) *admi
if err != nil {
return toAdmissionResponseError(err)
}
// Garbage collection is disabled by default
// Annotate DV for GC only if GC is enabled in CDIConfig and the DV is not annotated to prevent GC
if cc.GetDataVolumeTTLSeconds(config) >= 0 {
if modifiedDataVolume.Annotations == nil {
modifiedDataVolume.Annotations = make(map[string]string)
}
if modifiedDataVolume.Annotations[cc.AnnDeleteAfterCompletion] != "false" {
modifiedDataVolume.Annotations[cc.AnnDeleteAfterCompletion] = "true"
cc.AddAnnotation(modifiedDataVolume, cc.AnnDeleteAfterCompletion, "true")
}
}
}

View File

@ -1211,8 +1211,9 @@ func IsErrCacheNotStarted(err error) bool {
}
// GetDataVolumeTTLSeconds gets the current DataVolume TTL in seconds if GC is enabled, or < 0 if GC is disabled
// Garbage collection is disabled by default
func GetDataVolumeTTLSeconds(config *cdiv1.CDIConfig) int32 {
const defaultDataVolumeTTLSeconds = 0
const defaultDataVolumeTTLSeconds = -1
if config.Spec.DataVolumeTTLSeconds != nil {
return *config.Spec.DataVolumeTTLSeconds
}

View File

@ -724,9 +724,9 @@ func (r *DataImportCronReconciler) handleSnapshot(ctx context.Context, dataImpor
}
} else {
if cc.IsSnapshotReady(currentSnapshot) {
// Clean up PVC as that is not needed any more
pvc := &corev1.PersistentVolumeClaim{ObjectMeta: metav1.ObjectMeta{Name: desiredSnapshot.Name, Namespace: desiredSnapshot.Namespace}}
if err := r.client.Delete(ctx, pvc); err != nil && !k8serrors.IsNotFound(err) {
// Clean up DV/PVC as they are not needed anymore
r.log.Info("Deleting dv/pvc as snapshot is ready", "name", desiredSnapshot.Name)
if err := r.deleteDvPvc(ctx, desiredSnapshot.Name, desiredSnapshot.Namespace); err != nil {
return err
}
}
@ -821,13 +821,8 @@ func (r *DataImportCronReconciler) garbageCollectPVCs(ctx context.Context, names
return pvcList.Items[i].Annotations[AnnLastUseTime] > pvcList.Items[j].Annotations[AnnLastUseTime]
})
for _, pvc := range pvcList.Items[maxImports:] {
dv := cdiv1.DataVolume{ObjectMeta: metav1.ObjectMeta{Name: pvc.Name, Namespace: pvc.Namespace}}
if err := r.client.Delete(ctx, &dv); err == nil {
continue
} else if !k8serrors.IsNotFound(err) {
return err
}
if err := r.client.Delete(ctx, &pvc); err != nil && !k8serrors.IsNotFound(err) {
r.log.Info("Deleting dv/pvc", "name", pvc.Name, "pvc.uid", pvc.UID)
if err := r.deleteDvPvc(ctx, pvc.Name, pvc.Namespace); err != nil {
return err
}
}
@ -836,6 +831,20 @@ func (r *DataImportCronReconciler) garbageCollectPVCs(ctx context.Context, names
return nil
}
// deleteDvPvc deletes DV or PVC if DV was GCed
func (r *DataImportCronReconciler) deleteDvPvc(ctx context.Context, name, namespace string) error {
om := metav1.ObjectMeta{Name: name, Namespace: namespace}
dv := &cdiv1.DataVolume{ObjectMeta: om}
if err := r.client.Delete(ctx, dv); err == nil || !k8serrors.IsNotFound(err) {
return err
}
pvc := &corev1.PersistentVolumeClaim{ObjectMeta: om}
if err := r.client.Delete(ctx, pvc); err != nil && !k8serrors.IsNotFound(err) {
return err
}
return nil
}
func (r *DataImportCronReconciler) garbageCollectSnapshots(ctx context.Context, namespace string, selector labels.Selector, maxImports int) error {
snapList := &snapshotv1.VolumeSnapshotList{}
@ -850,6 +859,7 @@ func (r *DataImportCronReconciler) garbageCollectSnapshots(ctx context.Context,
return snapList.Items[i].Annotations[AnnLastUseTime] > snapList.Items[j].Annotations[AnnLastUseTime]
})
for _, snap := range snapList.Items[maxImports:] {
r.log.Info("Deleting snapshot", "name", snap.Name, "uid", snap.UID)
if err := r.client.Delete(ctx, &snap); err != nil && !k8serrors.IsNotFound(err) {
return err
}

View File

@ -92,8 +92,8 @@ spec:
properties:
dataVolumeTTLSeconds:
description: DataVolumeTTLSeconds is the time in seconds after
DataVolume completion it can be garbage collected. The default
is 0 sec. To disable GC use -1.
DataVolume completion it can be garbage collected. Disabled
by default.
format: int32
type: integer
featureGates:
@ -2327,8 +2327,8 @@ spec:
properties:
dataVolumeTTLSeconds:
description: DataVolumeTTLSeconds is the time in seconds after
DataVolume completion it can be garbage collected. The default
is 0 sec. To disable GC use -1.
DataVolume completion it can be garbage collected. Disabled
by default.
format: int32
type: integer
featureGates:
@ -4536,8 +4536,7 @@ spec:
properties:
dataVolumeTTLSeconds:
description: DataVolumeTTLSeconds is the time in seconds after DataVolume
completion it can be garbage collected. The default is 0 sec. To
disable GC use -1.
completion it can be garbage collected. Disabled by default.
format: int32
type: integer
featureGates:

View File

@ -928,7 +928,7 @@ type CDIConfigSpec struct {
Preallocation *bool `json:"preallocation,omitempty"`
// InsecureRegistries is a list of TLS disabled registries
InsecureRegistries []string `json:"insecureRegistries,omitempty"`
// DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. The default is 0 sec. To disable GC use -1.
// DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. Disabled by default.
// +optional
DataVolumeTTLSeconds *int32 `json:"dataVolumeTTLSeconds,omitempty"`
// TLSSecurityProfile is used by operators to apply cluster-wide TLS security settings to operands.

View File

@ -469,7 +469,7 @@ func (CDIConfigSpec) SwaggerDoc() map[string]string {
"filesystemOverhead": "FilesystemOverhead describes the space reserved for overhead when using Filesystem volumes. A value is between 0 and 1, if not defined it is 0.055 (5.5% overhead)",
"preallocation": "Preallocation controls whether storage for DataVolumes should be allocated in advance.",
"insecureRegistries": "InsecureRegistries is a list of TLS disabled registries",
"dataVolumeTTLSeconds": "DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. The default is 0 sec. To disable GC use -1.\n+optional",
"dataVolumeTTLSeconds": "DataVolumeTTLSeconds is the time in seconds after DataVolume completion it can be garbage collected. Disabled by default.\n+optional",
"tlsSecurityProfile": "TLSSecurityProfile is used by operators to apply cluster-wide TLS security settings to operands.",
"imagePullSecrets": "The imagePullSecrets used to pull the container images",
}

View File

@ -2636,11 +2636,7 @@ var _ = Describe("all clone tests", func() {
snapshot = utils.WaitSnapshotReady(f.CrClient, snapshot)
By("Snapshot ready, no need to keep PVC around")
err = f.DeletePVC(pvc)
Expect(err).ToNot(HaveOccurred())
deleted, err := utils.WaitPVCDeleted(f.K8sClient, pvc.Name, f.Namespace.Name, 2*time.Minute)
Expect(err).ToNot(HaveOccurred())
Expect(deleted).To(BeTrue())
utils.CleanupDvPvc(f.K8sClient, f.CdiClient, f.Namespace.Name, pvc.Name)
}
BeforeEach(func() {

View File

@ -157,7 +157,7 @@ var _ = Describe("DataImportCron", func() {
case cdiv1.DataImportCronSourceFormatPvc:
pvc, err := f.K8sClient.CoreV1().PersistentVolumeClaims(ns).Get(context.TODO(), name, metav1.GetOptions{})
Expect(err).ToNot(HaveOccurred())
deleteDvPvc(f, name)
utils.CleanupDvPvcNoWait(f.K8sClient, f.CdiClient, f.Namespace.Name, name)
deleted, err := f.WaitPVCDeletedByUID(pvc, time.Minute)
Expect(err).ToNot(HaveOccurred())
Expect(deleted).To(BeTrue())

View File

@ -134,19 +134,22 @@ func DeleteDataVolume(clientSet *cdiclientset.Clientset, namespace, name string)
})
}
// CleanupDvPvc Deletes PVC if DV was GCed, otherwise wait for PVC to be gone
// CleanupDvPvc deletes DV or PVC if DV was GCed, and waits for PVC to be gone
func CleanupDvPvc(k8sClient *kubernetes.Clientset, cdiClient *cdiclientset.Clientset, namespace, name string) {
CleanupDvPvcNoWait(k8sClient, cdiClient, namespace, name)
deleted, err := WaitPVCDeleted(k8sClient, name, namespace, 2*time.Minute)
gomega.Expect(err).ToNot(gomega.HaveOccurred())
gomega.Expect(deleted).To(gomega.BeTrue())
}
// CleanupDvPvcNoWait deletes DV or PVC if DV was GCed
func CleanupDvPvcNoWait(k8sClient *kubernetes.Clientset, cdiClient *cdiclientset.Clientset, namespace, name string) {
err := cdiClient.CdiV1beta1().DataVolumes(namespace).Delete(context.TODO(), name, metav1.DeleteOptions{})
if apierrs.IsNotFound(err) {
// Must have been GCed, delete PVC
err = DeletePVC(k8sClient, namespace, name)
gomega.Expect(err).ToNot(gomega.HaveOccurred())
return
}
gomega.Expect(err).ToNot(gomega.HaveOccurred())
deleted, err := WaitPVCDeleted(k8sClient, name, namespace, 2*time.Minute)
gomega.Expect(err).ToNot(gomega.HaveOccurred())
gomega.Expect(deleted).To(gomega.BeTrue())
}
// NewCloningDataVolume initializes a DataVolume struct with PVC annotations