289 Commits

Author SHA1 Message Date
Tuomas Katila
95b7230374 gpu: enable monitoring for the default installations
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-12-08 08:42:08 +02:00
Mikko Ylinen
fdb376c46c operator: update kube-rbac-proxy to v0.15.0
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-11-16 10:25:44 +02:00
Mikko Ylinen
2ae4eb9f0f sgx: use NodeFeatureRule SGX label in the NFD overlay
Our SGX README guides users to first deploy NFD and create NodeFeatureRules
when sgx_plugin/overlays/epc-nfd is used. However, it turns out
the "SGX enabled" label is not being used by the plugin DaemonSet.

Use "intel.feature.node.kubernetes.io/sgx": "true" as the nodeSelector
value when the kustomization overlay with NFD is used.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-11-13 11:20:53 +02:00
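
The nodeSelector change in 2ae4eb9f0f can be illustrated with a small Python sketch of how a nodeSelector is evaluated: a node is eligible only if every key/value pair in the selector is present in its labels. The label key and value come from the commit message; the function name and the sample node labels are hypothetical and only serve the illustration.

```python
from typing import Dict

# Label named in the commit message as the nodeSelector for the NFD overlay.
SGX_NODE_SELECTOR: Dict[str, str] = {
    "intel.feature.node.kubernetes.io/sgx": "true",
}


def node_matches(node_labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Return True when every selector key/value pair is present on the node."""
    return all(node_labels.get(key) == value for key, value in selector.items())


# Hypothetical nodes: one labeled via a NodeFeatureRule, one without the label.
sgx_node = {
    "kubernetes.io/os": "linux",
    "intel.feature.node.kubernetes.io/sgx": "true",
}
plain_node = {"kubernetes.io/os": "linux"}

print(node_matches(sgx_node, SGX_NODE_SELECTOR))
print(node_matches(plain_node, SGX_NODE_SELECTOR))
```

With such a selector, the plugin DaemonSet would be scheduled only on nodes that carry the SGX label.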
Tuomas Katila
0164bf340c nfd/gpu: drop empty keys from platform rules
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-10-23 10:56:14 +03:00
Mikko Ylinen
319843c94e vpu: remove deprecated plugin
The VPU plugin can only be used with devices that are
no longer supported by upper layers, such as OpenVINO.

The deprecation plan for the plugin was announced earlier
this year and post v0.28 marks the date when the plugin is removed
from the repo.

Releases before v0.29 have the plugin available should it
be needed.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-10-02 15:28:11 +03:00
Tuomas Katila
646cee6e12 operator: update to 0.28.0 images
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-09-25 09:10:20 +03:00
Mikko Ylinen
834f598f80 deployments: update to NFD v0.14.1 and drop custom GPU deployment
With recent NFD versions (v0.13+), it's no longer necessary to
start NFD with custom nfd-master args/rbac settings to get numeric
labels registered as extended resources.

The same can be specified via NodeFeatureRules which also works for
"local" source with feature files.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-09-20 14:02:52 +03:00
Tuomas Katila
88ae7c83eb sgx & gpu crds: improve comments and note sgx's initimage replacement with NFD rules
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-09-15 16:06:02 +03:00
Tuomas Katila
5ca3df666a xpum-sidecar: update deployment yaml to a newer one
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-09-14 13:21:35 +03:00
Tuomas Katila
ea659a5e4b nfd: add rules to label nodes with different GPUs
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-09-12 16:20:33 +03:00
Tuomas Katila
691dfc3483 gpu: refactor nfdhook functionality to plugin
NFD v0.14+ doesn't support binary NFD hooks by default, so there is
a need to move the label creation away from the GPU nfdhook.

Move extended resource label creation to plugin, and drop labels that were
already marked deprecated (platform_gen, media_version etc.).

Drop the init container from the deployment files and the operator. It is still
possible to use an init container, but the default deployments do not support it.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-09-12 16:20:33 +03:00
Mikko Ylinen
69f5ccfe66 operator: update controller-gen to v0.13.0
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-09-05 14:30:10 +03:00
Mikko Ylinen
23314c8bd1 deployments: update NFD to v0.13.4
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-09-05 14:29:48 +03:00
Mikko Ylinen
7f685b5d89 sgx: add QuoteVerification demo and cleanup hostNetwork dependency
hostNetwork usage for SGX demo pods is not absolutely necessary so it's
better to clean it up and make IAS "security" scanners happier. It was
originally used to be able to use a "localhost" PCCS, but this change now
adds an example of how a proper PCCS URL can be configured using jq.

Additionally, SGX DCAP Quote Verification is added.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-08-31 14:23:19 +03:00
Hersh Pathak
f1bb5b7270 Update references to OpenShift
Remove obsolete content related to OpenShift version of operator.
Update links to point to Intel Technology Enabling for OpenShift: https://github.com/intel/intel-technology-enabling-for-openshift.
Signed-off-by: Hersh Pathak <hersh.pathak@intel.com>
2023-08-24 08:56:44 -07:00
Tuomas Katila
4212145126 e2e: gpu: add a basic tensorflow test
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-08-22 15:51:35 +03:00
Mikko Ylinen
c3a3561cb8 webhooks: stop handling Pod updates
FPGA and SGX webhooks mutate container resources which
are immutable. Therefore, stop processing pod updates
and act on creation only.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-08-14 15:18:51 +03:00
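
The behavior described in c3a3561cb8 (mutate on creation only, leave updates alone) can be sketched in Python as follows. This is an illustration under assumptions, not the project's webhook code: the handler name, the response shape, and the placeholder resource name `example.com/placeholder` are all hypothetical. The only Kubernetes fact relied on is that admission requests carry an `operation` field such as CREATE or UPDATE.

```python
import copy
from typing import Any, Dict


def handle_pod_admission(request: Dict[str, Any]) -> Dict[str, Any]:
    """Mutate the Pod only on CREATE; pass every other operation through."""
    if request.get("operation") != "CREATE":
        # Container resources are immutable after creation, so do nothing.
        return {"allowed": True, "mutated": False, "object": request.get("object")}

    # Work on a copy so the incoming object is left untouched.
    pod = copy.deepcopy(request["object"])
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.setdefault("resources", {}).setdefault("limits", {})
        # Hypothetical mutation standing in for the real webhook logic.
        limits["example.com/placeholder"] = "1"
    return {"allowed": True, "mutated": True, "object": pod}


pod = {"spec": {"containers": [{"name": "app"}]}}
created = handle_pod_admission({"operation": "CREATE", "object": pod})
updated = handle_pod_admission({"operation": "UPDATE", "object": pod})
print(created["mutated"], updated["mutated"])
```

In practice the same effect may also be achieved by registering the webhook for create operations only, so that update requests never reach the handler.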
Mikko Ylinen
6faf978ae1 deployments: update NFD to v0.13.3.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-08-07 09:13:45 +03:00
Tuomas Katila
e92b752d75 deployments: move from 'vars' to 'replacements'
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-08-03 10:37:44 +03:00
Tuomas Katila
cb04ca0deb deployments: move from 'patchesStrategicMerge' to 'patches'
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-08-03 10:37:44 +03:00
Tuomas Katila
ec2930b331 deployments: move from 'bases' to 'resources'
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-08-03 10:37:44 +03:00
Manish Regmi
c3259ee22f Add SELinux Labels for DSA and IAA
Proper SELinux labels are required for the plugins to run in SELinux-enabled
clusters such as OpenShift. These labels are custom-made for the
plugins and are part of the container-selinux package.

Signed-off-by: Manish Regmi <manish.regmi@intel.com>
2023-07-20 16:02:08 -04:00
Mikko Ylinen
89986b9972
Merge pull request #1477 from hj-johannes-lee/PR-2023-023
Makefile: update versions & FPGA: fix naked return error from linter
2023-07-20 18:33:57 +03:00
Hyeongju Johannes Lee
bf286c689d update version of controller gen to v0.12.1
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2023-07-20 10:17:44 +03:00
Mikko Ylinen
34baf982b8 operator: add missing IaaDevicePlugin finalizers RBAC
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-07-18 08:25:19 +03:00
Tuomas Katila
71cac592cf
Merge pull request #1465 from hj-johannes-lee/PR-2023-019
dsa: replace token with read_buffer in conf
2023-06-30 09:56:24 +03:00
Hyeongju Johannes Lee
521b0275dc dsa: replace token with read_buffer in conf
Token-related attributes have been deprecated. To remove the warnings,
replace them with read_buff.

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2023-06-29 00:07:42 +03:00
Tuomas Katila
20c5bf97ff operator: update sample versions to 0.27.1
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-06-26 16:13:53 +03:00
Tuomas Katila
967a043ca2 operator: update samples to 0.27.0 version
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-06-05 09:06:28 +03:00
Tuomas Katila
13097ac78d operator: increase memory resources to 100/120Mi
Fixes: #1416

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-05-22 08:42:40 +03:00
Mikko Ylinen
e428cd6c19 go.mod: update to k8s 1.27.1 and controller runtime 0.15.x
k8s 1.27.x triggers build errors on controller-runtime 0.14.x
so we will need to update to 0.15.x at the same time.

Changes include:

* k8s e2e framework moved to use Ginkgo context so we add
  test context to all our test nodes.
* adapt Ginkgo parameter modifications.
* adapt SGX admissionwebhook to InjectDecoder removal.
* adapt deviceplugins and FPGA CRDs to controller-runtime
  API changes.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-05-09 14:49:24 +03:00
Mikko Ylinen
16724043b2 operator: move to controller-tools v0.12.0
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-05-05 15:02:36 +03:00
Hyeongju Johannes Lee
807949d29e qat_dpdk_app: add test-crypto1 for gen4 devices
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2023-05-02 03:39:21 -07:00
Hyeongju Lee
ed08d11aa3
Merge pull request #1392 from mythi/PR-2023-019
sgx: stop using local source hooks for EPC registration
2023-05-02 12:26:12 +03:00
Mikko Ylinen
6b5e65a137 operator: update kube-rbac-proxy image to v0.14.1
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-05-02 09:16:24 +03:00
Mikko Ylinen
3a4c0e574f sgx: stop using local source hooks for EPC registration
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-04-28 14:59:41 +03:00
Hyeongju Johannes Lee
1a41402903 qat init: make conf optional
2023-04-27 12:48:27 -07:00
Tuomas Katila
974829ff7c gpu: try to fetch PodList from kubelet API
In large clusters and with resource management, the load
from gpu-plugins can become heavy for the api-server.
This change starts fetching pod listings from the kubelet
and uses the api-server as a backup. Any error other than a timeout
will also move the logic back to using the api-server.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-03-30 12:43:02 +03:00
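
The fallback described in 974829ff7c can be sketched in Python as below. This is one reading of the commit message rather than the plugin's implementation: it assumes a timeout is treated as transient (the api-server answers this call and the kubelet is tried again next time), while any other error switches the lister to the api-server permanently. The class and callable names are hypothetical.

```python
from typing import Callable, List


class PodLister:
    """Prefers a node-local pod source and falls back to a cluster-wide one."""

    def __init__(
        self,
        from_kubelet: Callable[[], List[str]],
        from_api_server: Callable[[], List[str]],
    ) -> None:
        self._from_kubelet = from_kubelet
        self._from_api_server = from_api_server
        self.use_kubelet = True

    def list_pods(self) -> List[str]:
        if self.use_kubelet:
            try:
                return self._from_kubelet()
            except TimeoutError:
                # Assumption: a timeout is transient, so keep preferring kubelet.
                pass
            except Exception:
                # Assumption: any other error moves the logic back to the api-server.
                self.use_kubelet = False
        return self._from_api_server()


def kubelet_times_out() -> List[str]:
    raise TimeoutError("kubelet did not answer in time")


def kubelet_broken() -> List[str]:
    raise ConnectionError("kubelet endpoint refused the connection")


def api_server() -> List[str]:
    return ["pod-a"]


transient = PodLister(kubelet_times_out, api_server)
broken = PodLister(kubelet_broken, api_server)
print(transient.list_pods(), transient.use_kubelet)
print(broken.list_pods(), broken.use_kubelet)
```

Keeping the kubelet as the preferred source is what takes load off the api-server in large clusters; the api-server path exists only so that pod listings remain available when the kubelet cannot be used.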
Mikko Ylinen
934c00f5fc qat: add support for QAT 402xx
Based on
https://lore.kernel.org/linux-crypto/20230303165650.81405-1-damian.muszynski@intel.com/

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-03-09 15:06:30 +02:00
Mikko Ylinen
1090c12c74 deployments: fix kubectl SGX w/ NFD instructions
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-03-07 09:19:30 +02:00
Mikko Ylinen
4d96c1b49d deployments: update SGX NodeFeatureRules
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-03-07 09:19:30 +02:00
Mikko Ylinen
3dc815cda9 deployments: fix FPGA plugin namespace
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-02-21 21:15:06 +02:00
Mikko Ylinen
eb632f625a deployments: remove unused deviceplugins RBAC rules
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-02-21 20:14:03 +02:00
Mikko Ylinen
5c6e60eeb1 operator: move to controller-tools v0.11.3
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-02-21 20:14:03 +02:00
Tuomas Katila
3a1880ec8b Remove overlays using kube-system
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-02-13 12:47:22 +02:00
Mikko Ylinen
f559d8717d
Merge pull request #1327 from eero-t/nfd-features
Use more generic name for NFD features host directory volume
2023-02-13 11:45:26 +02:00
Eero Tamminen
2f3dc23651 Use more generic name for NFD features host directory volume
NFD hooks are deprecated and going away:
https://github.com/kubernetes-sigs/node-feature-discovery/issues/856

This makes the mount names more future-proof, and shows where later
changes need to be done (to change operator mount directory, and
switch hook-using deployments e.g. to feature files).

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
2023-02-08 18:20:41 +02:00
Mikko Ylinen
c65d4ab896 operator: update to 0.26.0 images
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-01-20 11:49:51 +02:00
Mikko Ylinen
019c6b6cc5 deployments: update to NFD v0.12.1
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-01-19 12:00:06 +02:00
Mikko Ylinen
90aeca48c5 deployments: update SGX configuration
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-01-12 09:41:17 +02:00