7 Commits

Author SHA1 Message Date
Aviv Litman
3bb70209d0
Refactor monitoring code (#3009)
* refactor monitoring

Signed-off-by: avlitman <alitman@redhat.com>

* Upgrade pointer to pnt

Signed-off-by: avlitman <alitman@redhat.com>

* fix controller base and ready gauge

Signed-off-by: avlitman <alitman@redhat.com>

---------

Signed-off-by: avlitman <alitman@redhat.com>
2024-01-02 09:17:18 +01:00
Arnon Gilboa
edda5abe0f
Add new Prometheus alerts and label existing alerts (#2998)
* Add Prometheus alerts and label existing alerts

- CDINoDefaultStorageClass - not having a default (or virt default)
SC is not in itself an OpenShift error, as admins may prefer their cluster
users to use only explicit SC names. However, in the CDI context, when a
DV is created with the default SC but no default exists, we fire
an error event and the PVC stays Pending for the default SC, so when
there are such Pending PVCs we fire an alert.

- CDIDefaultStorageClassDegraded - when the default (or virt default)
SC does not support CSI/Snapshot clone (smart clone) or does not have
ReadWriteMany access mode (for live migration).

- CDIStorageProfilesIncomplete - add storageClass and provisioner
labels.

- CDIDataImportCronOutdated - add dataImportCron namespace and name
labels.

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
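
The firing rules described for the two new alerts can be sketched as plain predicates. This is a minimal illustration, not the CDI implementation: the `StorageClass` dataclass, its field names, and both function names are hypothetical, and returning False for the degraded check when no default exists is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StorageClass:
    """Hypothetical, simplified view of a storage class."""
    name: str
    is_default: bool = False            # default (or virt default) SC
    supports_smart_clone: bool = False  # CSI or snapshot clone
    supports_rwx: bool = False          # ReadWriteMany access mode


def no_default_sc_alert(storage_classes: Iterable[StorageClass],
                        pending_default_pvcs: int) -> bool:
    """Fire only when no default SC exists AND PVCs are Pending for it."""
    has_default = any(sc.is_default for sc in storage_classes)
    return not has_default and pending_default_pvcs > 0


def default_sc_degraded_alert(storage_classes: List[StorageClass]) -> bool:
    """Fire when the default SC lacks smart clone or ReadWriteMany."""
    default = next((sc for sc in storage_classes if sc.is_default), None)
    if default is None:
        # Assumption: a missing default is covered by the other alert.
        return False
    return not (default.supports_smart_clone and default.supports_rwx)
```

Note that a cluster without a default SC and without Pending PVCs stays silent, which matches the point that explicit SC names are a legitimate admin choice.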

* CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Create stub VolumeSnapshotClass for testing

Including the VolumeSnapshot/Class/Content crds for the
CDIDefaultStorageClassDegraded alert func test.

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add snapshot manifests for tests

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Deploy snapshot CRDs in the hpp destructive lane

Remove stub snapshot CRDs

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add label explanation to new metric help

Also rename the metric kubevirt_cdi_storageprofile_status to
kubevirt_cdi_storageprofile_info since it always reports value 1,
where the label values provide the details about the storage
class and storage profile.

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
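
The rename follows the common "info metric" pattern: the sample value is constantly 1 and the labels carry the data. A minimal sketch of that pattern, using a hypothetical in-memory `InfoMetric` class rather than a real Prometheus client, and with label names chosen only for illustration:

```python
class InfoMetric:
    """Hypothetical info-style metric: every sample has the value 1."""

    def __init__(self, name, label_names):
        self.name = name
        self.label_names = tuple(label_names)
        self.samples = {}  # label-value tuple -> 1

    def set_info(self, **labels):
        # One sample per distinct label combination; the value never varies.
        key = tuple(labels[n] for n in self.label_names)
        self.samples[key] = 1

    def render(self):
        # Produce text lines in the Prometheus exposition style.
        lines = []
        for key in sorted(self.samples):
            labels = ",".join(
                f'{n}="{v}"' for n, v in zip(self.label_names, key)
            )
            lines.append(f"{self.name}{{{labels}}} {self.samples[key]}")
        return lines
```

Because the value is always 1, queries filter or join on the labels instead of comparing the value.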

* Revert NoProvisioner check removal

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Nicify StorageProfile metric update

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

---------

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
2023-12-19 12:29:08 +01:00
Arnon Gilboa
0ee4a61987
Get rid of DataImportCron finalizer (#2144)
* Get rid of DataImportCron finalizer

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Remove CRDs deletion in operator deletion

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Cleanups

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
2022-02-12 05:56:08 +01:00
Michael Henriksen
d56e0cca05
23 libs (#2077)
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2022-01-07 16:56:25 +01:00
Arnon Gilboa
d77abc3fa9
Add DataSource controller to update the Ready condition (#2085)
even when there is no DataImportCron associated

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
2022-01-06 21:06:23 +01:00
akalenyu
3bff27dd43
Add alert for DataImportCron not being up to date (#2063)
* Add alert for DataImportCron failing

DataImportCrons now have conditions (particularly UpToDate) that tell us if
things are going as planned. We can utilize those to alert whenever we're not UpToDate for a while.

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Address CR review; don't List, increment when needed via corresponding instance

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Address review & bugfix: don't update metric if err occurs

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* upToDateCondition => prevUpToDateCondition so it's clear we're deciding if we should inc/dec based on that

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Don't store state in controller; change metric type to GaugeVec (bool metric per DIC)

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
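
The end state of this series (one boolean gauge per DataImportCron, no state kept in the controller, no metric update on error) can be sketched as below. The dict stands in for a labeled gauge, and the function and parameter names are hypothetical rather than taken from the CDI code.

```python
def record_up_to_date(gauge, namespace, name, up_to_date, err=None):
    """Set the per-DataImportCron gauge: 1 means outdated, 0 means UpToDate."""
    if err is not None:
        # Leave the previous value in place if reconciliation failed.
        return
    # The value is derived from the current condition alone, so the
    # controller does not need to remember a previous condition.
    gauge[(namespace, name)] = 0 if up_to_date else 1


def outdated_count(gauge):
    """Number of DataImportCrons currently reported as outdated."""
    return sum(gauge.values())
```

Setting an absolute value per instance, instead of incrementing and decrementing a shared counter, keeps the metric correct even if a reconcile is repeated or skipped.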
2021-12-24 01:10:52 +01:00
Arnon Gilboa
fe018f1dc5
Add DataImportCron status conditions (#2045)
* Add DataImportCron status conditions

The `DataImportCron` controller updates the status conditions in a
controlled `DataImportCron` and its managed `DataSource`.

DataImportCron:
- UpToDate - indicates whether the most recent import was successful and
    `DataSource` is up-to-date. Updated to False whenever the source
     digest (latest sha256) is updated.
- Progressing - indicates whether the cron is currently in the process
    of importing. Updated to True if there is a current import and its
    `DataVolume` is `ImportInProgress`, otherwise False.

DataSource:
- Ready - indicates that the corresponding PVC exists and is populated.
    Updated according to the `DataVolumeReady` condition of the
    `DataImportCron.Status.LastImportedPVC` `DataVolume`, if that
    `DataVolume` exists; otherwise False. Unlike the `DataImportCron`
    `UpToDate` condition, this one does not care about a newer source digest.

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
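
The three conditions above can be read as simple functions of observed state. The sketch below is an illustration only: the function and parameter names are hypothetical, and modeling "up to date" as a digest comparison is an assumption based on the note that UpToDate turns False when the source digest changes.

```python
from typing import Dict, Optional


def data_import_cron_conditions(last_import_succeeded: bool,
                                imported_digest: str,
                                source_digest: str,
                                current_import_phase: Optional[str]) -> Dict[str, bool]:
    """Derive the DataImportCron conditions from observed state."""
    # UpToDate: last import succeeded and no newer source digest is known.
    up_to_date = last_import_succeeded and imported_digest == source_digest
    # Progressing: there is a current import whose DataVolume is importing.
    progressing = current_import_phase == "ImportInProgress"
    return {"UpToDate": up_to_date, "Progressing": progressing}


def data_source_ready(data_volume_ready: Optional[bool]) -> bool:
    """Derive DataSource Ready; None means the DataVolume does not exist."""
    if data_volume_ready is None:
        return False
    # Ready ignores newer source digests, unlike UpToDate.
    return data_volume_ready
```

With these definitions a DataSource can be Ready while its DataImportCron is not UpToDate, for example right after a new source digest is published.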

* CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add DataImportCron RetentionPolicy and remove OwnerReferences

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* More CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add tests for retention policies and datasource/datavolume recreation if deleted

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add status condition tests

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* SetRecommendedLabels for all created CRs

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
2021-12-16 02:21:01 +01:00