40 Commits

Author SHA1 Message Date
akalenyu
eb639a6ac5
Change some relationship labels on update as well (#2018)
* Update operator-lifecycle-sdk to get fix for labels on upgrade

Update dep to get https://github.com/kubevirt/controller-lifecycle-operator-sdk/pull/19

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Reconcile labels also for CDIConfig

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Reconcile labels on storageprofile

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Reconcile remaining operator resources for updated labels

BZ#2017478

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
2021-11-23 16:16:49 +01:00
akalenyu
fd332a3165
Degraded/unusual restartcount alerts (#2009)
* Add degraded alert

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Add unusual restart count metric

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Add actual firing alerts (degraded/restartcount)

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Test newly added metrics

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Review: Rename metric to match conventions, func to check if test is eligible to run metric tests

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Get rid of similar funcs, reconcile more generally

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
2021-11-18 01:05:01 +01:00
Arnon Gilboa
7087b57cd2
Add DataImportCron controller (#1949)
* Add DataImportCron controller

-The new controller polls for updates to a registry source container
image, based on a given schedule. When updates to a container image are
detected, the controller imports the content into a new uniquely named
PVC in a golden image namespace.
-For each DataImportCron, the controller manages a corresponding
DataSource to always point to the most up-to-date golden
image PVC.
-DataImportCron takes ownership of an existing DataSource (with
controller: false), allowing an admin to opt-in to using auto
delivery/updates later on.
-The controller has PVC garbage collector removing old PVCs.

ToDo:
-status conditions updates
-verify full image streams support
-utests and func tests
-fixmes and commented out code
-doc

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix CR comments and fixmes

- isolate imagestream and registry specific code
- fix namespace of CronJob, and its job and pod to CDI namespace
- manage CronJob-DataImportCron ownership relationship with a finalizer,
  handle DataImportCron deletion (CronJob etc.)
- remove CronJob and job pod for ImageStreams, use RequeueAfter and
  cronexpr instead
- add k8s app cdi-source-update-poller executed by CronJob to poll source
  image digest via skopeo inspect for url registry source, and annotate
  the DataImportCron when the image was updated and pending for import based
  on the cron schedule
- add cdi-source-update-poller and skopeo binary to the cdi-importer container
- complete dataimportcron-validate and its tests
- reconcile - use context.Context instead of context.TODO
- remove uncached client
- doc

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix ImageStreams watch

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add DataImportCron DV template instead of source

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix CR comments

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Split updateSucceeded func

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Improve cdi-source-update-poller cmd logs

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Remove ImageStream reconcile

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Remove ImageStream watch

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Remove unnecessary AnnSourceUpdatePending

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* More CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Idempotentify initCron

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Recreate DV in case it's not found

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add DataImportCron spec.importsToKeep and status.currentImports

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add DataImportCron controller functional test

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add insecure TLS support

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Remove finalizers in cluster clean script

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Bound each import to its sha256 digest instead of latest

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add DataImportCron controller utests

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Tests CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Minor tests CR fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
2021-11-11 20:09:48 +01:00
Michael Henriksen
aedaf513ec
Move apis to staging, push to containerized-data-importer-api (#1997)
* move apis to new staging area

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* add script to push to staging

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* fix lint check and api reference

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* push staging to api repo

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2021-10-28 13:40:24 +02:00
akalenyu
50c93e8b0e
Deploy alerts infra as part of our installation (#1979)
* Deploy alerts infra as part of our installation

Conditionally deploy the infrastructure that is needed to fire alerts for our users
when bad things are happening to CDI.

Testing with `KUBEVIRT_DEPLOY_PROMETHEUS=true`

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Watch and unit test all prometheus related resources

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* add gateway for changing monitoring namespace (rbac purposes)

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* refactor test to check for exact alert name and firing state

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Align pattern of ensuring prometheus resource exists for all

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Remove potential noisy event

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Extract duplicate code to function

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>

* Don't use empty value for prometheus label due to open issue

https://github.com/prometheus-operator/prometheus-operator/issues/4325

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
2021-10-26 21:26:07 +02:00
Vishesh Tanksale
abcb176429
Removing cdi-prometheus-metrics service for CDI installation (#1892)
Signed-off-by: Vishesh Ajay Tanksale <vtanksale@apple.com>

Co-authored-by: Vishesh Ajay Tanksale <vtanksale@apple.com>
2021-08-16 13:18:30 +02:00
akalenyu
2254cf0c1f
Add relationship labels (#1864)
Users don't want 👽 resources in clusters,
and we should also be able to tell if we're part of a broader installation.

Note:
- Operator created resources were handled in https://github.com/kubevirt/controller-lifecycle-operator-sdk/pull/18
as these labels will be common to all resources deployed by the HCO.
- Now that the controller is guaranteed to have the labels, we can set env vars
that reference the label values (fieldRef) to spare calling GET on the CR in the controllers.
(thanks mhenriks).

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
2021-07-28 20:05:24 +02:00
Arnon Gilboa
13275ce351
OS image poll and update API (#1808)
* Add CRD for DataSource definition

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add optional sourceRef to DataSource in DataVolumeSpec

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add CRD for DataImportCron definition

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add DataSource and DataImportCron generated files

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Code review fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* More code review fixes

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Code generated after rebase

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Fix DV source reference in utests

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Remove api validation tests for missing data volume source

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>

* Add standard fields to condition structs

Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
2021-06-14 13:58:42 +02:00
Michael Henriksen
d92c2f459d
update deps and bazel (#1815)
* update deps and bazel

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* fix apidocs and unit tests

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* fix generate-verify

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2021-06-08 01:31:59 +02:00
akalenyu
0428dc5465
Stop using deprecated admissionregistration, apiregistration v1beta1 (#1804)
Switch admissionregistration.k8s.io/v1beta1 and apiregistration.k8s.io/v1beta1 to v1,
as they are deprecated and will be removed in k8s-1.22.

apiextensions.k8s.io/v1beta1 was updated to v1 by https://github.com/kubevirt/containerized-data-importer/pull/1307.

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
2021-05-26 22:52:47 +02:00
Michael Henriksen
ee2f8376bb
fix custom cert rotation params (#1775)
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2021-05-06 20:19:39 +02:00
Michael Henriksen
3447bb84c7
Cluster scoped DataVolume/PVC namespace transfer API (#1673)
* Cluster-scoped namespace transfer api and controller

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* unit tests

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* ObjectTransfer webhook

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* new functests

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* experiment with termination grace period

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* quota test

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2021-02-24 20:45:24 +01:00
Bartosz Rybacki
386dbf413f
Add CRD for the StorageProfile definition (#1629)
Signed-off-by: Bartosz Rybacki <brybacki@redhat.com>
2021-02-18 02:53:02 +01:00
Michael Henriksen
7c05e8f093
Designate CDI as CDIConfig authority (#1516)
* Formally designate CDI as owner of CDIConfig by adding annotation cdi.kubevirt.io/configAuthority

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* More robust upgrade handling.  No error if beta api not installed yet.

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-12-04 02:52:40 +01:00
Jakub Dzon
7f368900de
Updated controller-lifecycle-operator-sdk dependency (#1389)
Signed-off-by: Jakub Dzon <jdzon@redhat.com>
2020-09-24 14:39:29 +02:00
Jakub Dzon
5aa47587d3
Introducing operator lifecycle sdk (#1350)
Signed-off-by: Jakub Dzon <jdzon@redhat.com>
2020-09-17 23:25:26 +02:00
Maya Rashish
e3436e0199
Allow specifying nodeSelector, affinity and tolerations for CDI pods (#1346)
* Generate CDI CRD using controller-tools.

This is only done for CDI CRD as it requires the existence of source
code. Other CRDs we create are created by a more bare bones pod.

CDIUninstallStrategy was missing a comment describing it, so add
one. This was spotted manually so there might be more missing.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Allow users to specify which nodes CDI pods will live on.

nodeSelector, affinity and tolerations are possible values.

This is done in the CDI CR (rather than CDIConfig) as we are
interested in having this field be populated by external operators.

Unit tests now require the existence of a CDI CR, so create it.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Add a unit test covering some node placement functions

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Specify that all our pods are linux-only.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Avoid duplicate test, accidental left over.

Pointed out by awels, thanks.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Rename to cdiOperatorDeployment for clarity.

Suggested by awels

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Specify we only run on linux using the CDI CR, no need to embed this
into the code.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Don't dereference workloadPlacement for no reason

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Split off operator test to have its own AfterEach, BeforeEach.

Use even more descriptive function names.

Do all the CDI delete/restore logic in AfterEach, to ensure that
it happens and restores the deployment with the original CR even
if the test fails.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Remove XXX. This is the proper way.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Adapt to latest changes in controller_test.go (renaming import)

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Simplify, not storing intermediate value.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Don't dereference nodeplacement in callers to CreateDeployment

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Remove redundant save & restore. Unit tests do this for us.

Pointed out by awels, thanks.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Split out "find toplevel" to a utility function

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Wait for the CDI CR update to apply before continuing.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Simplify, not storing intermediate value.

Signed-off-by: Maya Rashish <mrashish@redhat.com>

* Make it clear that the chosen node placement will not be schedulable.

Signed-off-by: Maya Rashish <mrashish@redhat.com>
2020-09-03 22:13:18 +02:00
Alexander Wels
6cf86d5984
Add events to operator (#1182)
* Add events to operator condition changes
Add events to operator create/delete/update of managed resources.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Updated unit tests based on comments

Signed-off-by: Alexander Wels <awels@redhat.com>

* rebase on betav1

Signed-off-by: Alexander Wels <awels@redhat.com>

* Removed start events to reduce event generation spam

Signed-off-by: Alexander Wels <awels@redhat.com>
2020-08-27 18:59:15 +02:00
Alexander Wels
6dce12f090
Move CRDS from apiextensions v1beta1 to v1. (#1307)
* Move CRDS from apiextensions v1beta1 to v1.
Ensure that our code based schema validation matches the types in the api.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Ran go mod tidy and vendor in an attempt to see if we could use a newer controller-runtime, but our go version is too old.
Addressed review comments.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Addressed more review comments and fixed k8s-1.18 functional test failing.

Signed-off-by: Alexander Wels <awels@redhat.com>

* Remove categories 'all' from cluster scoped CRDs

Signed-off-by: Alexander Wels <awels@redhat.com>
2020-08-01 01:01:50 +02:00
Michael Henriksen
9e2c79b1e0
move api groups to v1beta1 (#1232)
* move upload.cdi.kubevirt.io API group to v1beta1

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* move core api to v1beta1

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* fix os-3.11 cluster sync and add functional tests for alpha api

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* change more occurrences of v1alpha1

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* updates after rebase

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-07-10 15:47:38 +02:00
Nahshon Unna Tsameret
ece10521e9
[Upgrade Operator] Make sure that ObservedVersion is updated (#1213)
Fix #1212

Make sure that the `Status.ObservedVersion` field is updated on upgrade, even if it was not set in the previous version.

Signed-off-by: Nahshon Unna-Tsameret <nunnatsa@redhat.com>
2020-05-26 15:17:31 +02:00
Michael Henriksen
fba04c868b
use dedicated SCC (#1174)
* use dedicated SCC

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* SCC was not getting on initial deploy

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-04-15 15:38:03 +02:00
Alexander Wels
5ae438935c
Create prometheus service in cdi namespace. (#1170)
Signed-off-by: Alexander Wels <awels@redhat.com>
2020-04-15 01:41:59 +02:00
Michael Henriksen
03c36c8cd8
wait for all old resources to be deleted when installing CDI (#1156)
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-03-27 05:18:32 +01:00
Michael Henriksen
64d7a26a65
need to use uncached client in certain places (#1107)
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-02-16 17:30:46 +01:00
Michael Henriksen
0b9fb15e86
operator create apiservice and webhook configurations (#1103)
Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-02-11 05:45:15 +01:00
Michael Henriksen
bd4c4c950b
cert rotation (#1091)
* initial cert rotation controller

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* fix typo

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-02-03 23:36:58 +01:00
Michael Henriksen
99f8af5b86
k8s client upgrade to 1.16 (#1079)
* initial client upgrade to 1.16

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* fix Route detection in OpenShift

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2020-01-14 13:43:17 +01:00
Michael Henriksen
97c23cfa5a
remove DOCKER_REPO from operator (#1022)
* remove DOCKER_REPO from operator

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>

* make generate and update CDI schema

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
2019-11-14 02:59:16 +01:00
Alexander Wels
28b0b7b70b
Set conditions properly while deploying. (#948)
Signed-off-by: Alexander Wels <awels@redhat.com>
2019-09-04 12:15:28 -04:00
Alexander Wels
45eecea14e
Added conditions to match the HCO requirements. (#910)
Signed-off-by: Alexander Wels <awels@redhat.com>
2019-08-28 18:36:44 -04:00
Michael Henriksen
412b6e10ca
CDI upgrade support (#929)
* * Initial upgrade support
* - Detect from reconcile loop that it is upgrade flow
* - Set ObservedVersion to target when upgrade is finished
* - Delete unused objects at the end of upgrade

* * operator controller unit test - detect upgrade
* cdi upgrade unit tests
* - verify upgrade flow is detected when version is updated
* - verify on upgrade objects are updated
* - verify on upgrade unused objects are deleted

* * optimize cleanupUnusedResources function
* fix logging error

* * CR fixes
* remove unused methods in unit tests
* use reflect.DeepEqual to compare runtime.Objects in unit test
* check DeletionTimeStamp before entering upgrade

* * unit tests - CR is deleted during/before upgrade

* * CR fixes:
* - invoke Deletion callbacks before and after resource deletion in cleanupUnusedResources function
* - when looking for object to delete - search not only by name but by namespace as well

* * delete unused resources of previous version if CDI CR is marked for deletion during upgrade
* add unit test for this case

* * should not start upgrade if versions are identical

* * add unit tests to verify there is no upgrade on identical versions

* CR fix - return error

* don't think we have to explicitly cleanup old resources when CDI deleted during upgrade

* refactor code and properly handle deleting resources on upgrade

* reconcile loop now does three way merge to better handle upgrade
2019-08-27 08:43:49 -04:00
Michael Henriksen
f8b79ba5bc
CCC reconciliation in callbacks also improved merge route creation TODO
2019-08-05 22:55:42 -04:00
Michael Henriksen
834b85ecbf
Network clone (#897)
* network cloning

* fix clone progress
2019-08-01 16:01:25 -04:00
annastopel
f634cdaa17
CDI operator OLM integration:
- Generate OLM related manifests for CDI in _out/manifests/release/olm
  - OLM bundle:
    - cdi CSV manifest
    - cdi crd manifest
    - cdi package manifest
  - operatorsource manifest
  - subscription manifest
  - operatorgroup manifest
- Modify cdi-operator role not to be cluster-admin but more specific
- Move all final manifests to _out/manifests directory and update travis with new manifests location
- Provide API for vendoring CDI OLM manifests generation code

Note:
  - OLM CSV update to be supported in a separate PR
  - OLM bundle integration in travis is to be supported together with CSV update
2019-05-01 13:54:28 +03:00
Michael Henriksen
680e223277
allow for override of registry and tag in CDI object
2019-04-19 10:58:12 -04:00
Michael Henriksen
d2a3b1cc2f
operator creates upload proxy route
2019-03-26 09:16:24 -04:00
Michael Henriksen
3892a7310d
add configmap for insecure registries
2019-02-25 20:12:56 -05:00
Michael Henriksen
cb660fcfe0
add tests for controller runtime bootstrapping. gotta hit our coverage numbers, bro. otherwise kind of useless IMO
2019-01-29 13:15:46 -05:00
Michael Henriksen
277193f18a
operator unit tests
2019-01-29 12:50:27 -05:00