Commit Graph

56 Commits

Author SHA1 Message Date
Tuomas Katila
4e06690063 operator: gpu: prevent scenario where CRs both enable and disable resource management
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-11-10 12:31:24 +02:00
Tuomas Katila
f9221c46fd operator: remove one-cr-per-kind limitation
Differentiate objects by adding cr names as suffixes
Drop kind book keeping and related functions from controllers

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-11-09 13:05:40 +02:00
Tuomas Katila
88ae7c83eb sgx & gpu crds: improve comments and note sgx's initimage replacement with NFD rules
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-09-15 16:06:02 +03:00
Tuomas Katila
691dfc3483 gpu: refactor nfdhook functionality to plugin
NFD v0.14+ doesn't support binary NFD hooks by default, so there is
a need to move the label creation away from the GPU nfdhook.

Move extended resource label creation to plugin, and drop labels that were
already marked deprecated (platform_gen, media_version etc.).

Drop init-container from deployment files and operator. It is still possible
to use an initcontainer, but the default deployments do not support it.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-09-12 16:20:33 +03:00
Mikko Ylinen
69f5ccfe66 operator: update controller-gen to v0.13.0
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-09-05 14:30:10 +03:00
Mikko Ylinen
e428cd6c19 go.mod: update to k8s 1.27.1 and controller runtime 0.15.x
k8s 1.27.x triggers build errors on controller-runtime 0.14.x
so we will need to update to 0.15.x at the same time.

Changes include:

* k8s e2e framework moved to use Ginkgo context so we add
  test context to all our test nodes.
* adapt Ginkgo parameter modifications.
* adapt SGX admissionwebhook to InjectDecoder removal.
* adapt deviceplugins and FPGA CRDs to controller-runtime
  API changes.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-05-09 14:49:24 +03:00
Tuomas Katila
342554c666 lint fixes found from 0.26.1 release preparation
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-05-02 13:52:36 +03:00
Hyeongju Lee
ed08d11aa3
Merge pull request #1392 from mythi/PR-2023-019
sgx: stop using local source hooks for EPC registration
2023-05-02 12:26:12 +03:00
Mikko Ylinen
3a4c0e574f sgx: stop using local source hooks for EPC registration
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-04-28 14:59:41 +03:00
Mikko Ylinen
5bab034e47 operator: accept image SHA digests
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-04-28 14:59:21 +03:00
Hyeongju Johannes Lee
a6037eae3c
qat: add configuration of cfgServices to qat initcontainer
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-12-12 21:48:21 +02:00
Ed Bartosh
9dea92541a
Merge pull request #1088 from hj-johannes-lee/dlb-initcontainer
dlb: add initcontainer to plugin
2022-10-07 14:43:12 +03:00
Hyeongju Johannes Lee
11b04425c2 dlb: add initcontainer to plugin
initcontainer enables vfs and configures vfs
 - only first pf is used to configure a vf
 - only one vf is configured from the pf
add dlb-initcontainer kustomize overlay
update CRD to have initImage
implment operator to run initcontainer
update e2e test to run initcontainer overlay
update envtest to test initimage

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-10-06 17:11:03 +03:00
Tuomas Katila
8426fb975f operator: gpu: prevent resourcemanager's use without shared devices
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2022-10-05 13:09:06 +03:00
Mikko Ylinen
0f5afc258d operator: move to controller-tools v0.10.0
With the latest version of controller-tools, we get to set
reinvocationPolicy tag so that we no longer have to add that
field manually in our Admission Webhook manifests.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-09-21 19:37:00 +03:00
Hyeongju Johannes Lee
d3c8063ff3 qat: implement preferredAllocation policies
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-04-07 14:14:00 +03:00
Hyeongju Johannes Lee
df419b3a82 qat: add initimage to plugin
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-03-30 13:46:42 -07:00
Ed Bartosh
6b27cf1f7c Implement IAA plugin, operator, demo
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2022-03-04 15:58:42 +02:00
Oleg Zhurakivskyy
cfc8eb18cb operator: Support upgrade of plugins
The upgrade of the deployed plugins can be done by simply installing
a new release of the operator.

The operator auto-upgrades operator-managed plugins (CR images
and thus corresponding deployed daemonsets) to the current release
of the operator.

The [registry-url]/[namespace]/[image] are kept intact on the upgrade.

No upgrade is done for:

- Non-operator managed deployments
- Operator deployments without numeric tags

Closes #702

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2022-02-18 12:52:55 +02:00
Mikko Ylinen
51df411cb1 dsa: make initImage spec consistent with other APIs
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-01-11 08:17:35 +02:00
Hyeongju Johannes Lee
4c7219dee0 operator: update to 0.23.0 images
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-01-05 17:27:00 +02:00
Ed Bartosh
cec004c398 lint: enable wsl check
Fixes: #392

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2021-12-17 11:48:48 +02:00
Oleg Zhurakivskyy
6bba74acef dsa: Rename idxd-initcontainer to idxd-config-initcontainer
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-11-30 15:32:29 +02:00
Mikko Ylinen
2f514b9187 operator: fix SgxDevicePlugin webhook.Defaulter
The Defaulter wrongly assigned the "intel-sgx-initcontainer" to
Spec.Image instead of Spec.InitImage.

We don't see any problems because the default DaemonSet yaml
uses "intel-sgx-initcontainer".

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-22 15:39:36 +02:00
Dmitry Rozhkov
471549c11d
Merge pull request #753 from hj-johannes-lee/dlb-operator
operator: Add DLB support
2021-11-18 10:23:16 +02:00
Xu, Guoshu
e4c4a8f7ac GPU devices resource preferred allocation methods.
1. Implement PreferredAllocator interface.
2. Provide 3 preferred allocation policies: balancedPolicy, packedPolicy and nonePolicy.
3. Provide the cmdline interface: -allocation-policy balanced/packed/none, to select which preferred allocation policy to use.
4. Add operator support.

Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-17 22:55:10 +08:00
Hyeongju Johannes Lee
ff9034822b operator: Add DLB support
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-11-17 01:51:47 -08:00
Oleg Zhurakivskyy
a7c612f7fc dsa: Rename dsa initcontainer to idxd
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-11-09 12:00:44 +02:00
Oleg Zhurakivskyy
594a696879 operator: dsa: Add provisioning configurability
The provisioning config can be optionally stored in the ProvisioningConfig
configMap which is then passed to initcontainer through the volume mount.

There's also a possibility for a node specific congfiguration through
passing a nodename via NODE_NAME into initcontainer's environment
and passing a node specific profile via configMap volume mount.

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-11-09 10:31:50 +02:00
Mikko Ylinen
45f4666beb allow v1 CRD API only
controller-gen v0.7.0 dropped the support for v1beta1 CRD API as it
was also dropped in k8s.io v1.22.

update 'make generate' to only allow v1 CRD APIs and run it with
controller-gen v0.7.0.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-10-19 12:36:32 +03:00
Mikko Ylinen
3f5d92782f operator: update to 0.22.0 images
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-10-01 14:38:24 +03:00
Oleg Zhurakivskyy
94a13fc96f operator: dsa: Add InitImage for initcontainer
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-10-01 11:26:05 +03:00
Hyeongju Johannes Lee
8fc5df7e37 Add govet-fieldalignment
Add govet-fieldalignment to .golangci.yml
Fix errors that come from adding govet-fieldalignment
- by reordering the fields of structs
- by putting nolint:govet annotations

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-09-20 20:59:04 +03:00
Mikko Ylinen
c6b947e594 webhooks: use common constant for plugin min version
Move CRD validating webhooks' image min version checks to use
common constant across all the plugins.

After the change, we carry the same min version for all devices
and this version becomes easier to maintain when we make new
releases.

Each CRD webhook still carries its own xyzMinVersion if we decide
to go back to CRD specific versions later.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-09-13 18:28:17 +03:00
Eero Tamminen
4b26c94b29 Group struct members by type to minimize struct size
And avoid golangci-lint error.
2021-06-29 17:47:32 +03:00
Eero Tamminen
86a86e2863 Add "-enable-monitoring" GPU plugin option operator support
Based on Ukri's examples and tested by Ukri (thanks!).
2021-06-29 17:33:03 +03:00
Ukri Niemimuukko
39f7c4c747 gpu resource manager operator parts
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2021-06-24 11:49:08 +03:00
Oleg Zhurakivskyy
748863a3ea qatdeviceplugin_webhook: Reuse common function for the image name validation
Closes #605

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-06-14 09:00:22 +00:00
Mikko Ylinen
383778a24b qat: fix C4xxx driver name
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-06-10 08:45:23 +03:00
Oleg Zhurakivskyy
7d40758d14 webhook_common: Extract only relevant parts for the validation
Since we currently validate only the image name and the tag,
ignore registry, vendor and extract only relevant parts.

Closes #605

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-03-30 13:14:09 +00:00
Oleg Zhurakivskyy
5cea278170 deployments: Add 4xxxvf and c4xxvf to recognized QAT devices
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-02-18 10:37:10 +00:00
Mikko Ylinen
37618d4f85 operator: move deviceplugin/v1 CRDs to cluster scope
The device plugins daemonsets are cluster wide and currently only
one device plugin instance per device is possible so making the
corresponding deviceplugin/v1 CRDs non-namespaced (i.e., scope: cluster)
fits better.

Previously, the device plugin daemonset was deployed in the same
namespace as the CR for that device but with the cluster scoped CRDs
we default to use the same namespace as the operator, unless overridden
via DEVICEPLUGIN_NAMESPACE env variable or a command line parameter
to operator manager deployment.

Three additional changes in this commit:
- enable DSA envtest tests
- update controller-runtime to v0.8.1
- change device plugin envtest suite to use klog/v2

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-02-11 11:41:47 +02:00
Ed Bartosh
dac99ad81d operator: DSA: [re]generated files
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2021-02-09 02:13:35 +02:00
Ed Bartosh
884f8e3dfe operator: add DSA support
Fixes: #443

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2021-02-09 02:13:27 +02:00
Mikko Ylinen
3e7e818fb6
Merge pull request #518 from rojkov/full-operator-config-samples
operator: extend sample configs to include all possible specs
2020-12-21 20:53:21 +02:00
Dmitry Rozhkov
fdde9a8126 operator: extend sample configs to include all possible specs 2020-12-17 11:52:00 +02:00
Mikko Ylinen
d63037c2e1 Move to Admission v1 API
Update to controller-runtime v0.7.0 and Admission types to v1 with it.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-12-17 11:02:21 +02:00
Dmitry Rozhkov
f0fa9df292 operator: prepare for publishing at operatorhub.io 2020-11-24 18:35:56 +02:00
Dmitry Rozhkov
7e621f7905 upgrade controller-gen to v0.4.1
The new versions adds admissionReviewVersions annotation and makes it
mandatory.
2020-11-18 11:44:37 +02:00
Ukri Niemimuukko
c935570bab operator: GPU-plugin initImage
This adds the initImage field to the custom resource definition
and takes it into use.

The fpga webhook image validation function is split off into a
separate file.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2020-11-09 20:55:12 +02:00