* Add alert for incomplete storage profiles
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Run metric tests on both openshift and k8s
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add functional test for storageprofile metrics
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Delete profile as a follow up to storage class getting deleted
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Address review, alter tests to cover List metric approach
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Address review; individually loop over metric decrement, shorten reconcile.Result{}
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Address review; deletion timestamp not possible when err/teardown in AfterEach
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add degraded alert
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add unusual restart count metric
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add actual firing alerts (degraded/restartcount)
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Test newly added metrics
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Review: Rename metric to match conventions, func to check if test is eligible to run metric tests
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Get rid of similar funcs, reconcile more generally
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Deploy alerts infra as part of our installation
Conditionally deploy the infrastructure that is needed to fire alerts for our users
when bad things are happening to CDI.
Testing with `KUBEVIRT_DEPLOY_PROMETHEUS=true`
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Watch and unit test all prometheus related resources
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* add gateway for changing monitoring namespace (rbac purposes)
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* refactor test to check for exact alert name and firing state
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Align pattern of ensuring prometheus resource exists for all
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Remove potential noisy event
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Extract duplicate code to function
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
* Dont use empty value for prometheus label due to open issue
https://github.com/prometheus-operator/prometheus-operator/issues/4325
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>