Additional objects are shared between device plugin CRs. Once the last
CR is removed, the additional objects are also removed.
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
Differentiate objects by adding cr names as suffixes
Drop kind book keeping and related functions from controllers
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
NFD v0.14+ doesn't support binary NFD hooks by default, so there is
a need to move the label creation away from the GPU nfdhook.
Move extended resource label creation to plugin, and drop labels that were
already marked deprecated (platform_gen, media_version etc.).
Drop init-container from deployment files and operator. It is still possible
to use an initcontainer, but the default deployments do not support it.
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
In large clusters and with resource management, the load
from gpu-plugins can become heavy for the api-server.
This change will start fetching pod listings from kubelet
and use api-server as a backup. Any other error than timeout
will also move the logic back to using api-server.
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
NFD hooks are deprecated and going away:
https://github.com/kubernetes-sigs/node-feature-discovery/issues/856
This makes the mount names more future-proof, and shows where later
changes need to be done (to change operator mount directory, and
switch hook-using deployments e.g. to feature files).
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
The amount of GPU plugin parameters has increased but the
args slice capacity has not been changed. Update it to avoid
slice reallocations.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
The upgrade of the deployed plugins can be done by simply installing
a new release of the operator.
The operator auto-upgrades operator-managed plugins (CR images
and thus corresponding deployed daemonsets) to the current release
of the operator.
The [registry-url]/[namespace]/[image] are kept intact on the upgrade.
No upgrade is done for:
- Non-operator managed deployments
- Operator deployments without numeric tags
Closes#702
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
In order to make controllers consistent, I add a nodeselector constraint of daemonset to dlb, fpga, qat too.
Since the same code is commonly used in many files, I add a function that replaces duplicated code.
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Resources in clusters with OwnerReferencesPermissionEnforcement
(e.g., OpenShift) get stricter checks for metadata.ownerReferences.
This appears via errors like:
“is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to
a resource you can’t set finalizers on: ...”
The fix is to add "update" permissions to finalizers subresource
for the xDevicePlugins resources.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
1. Implement PreferredAllocator interface.
2. Provide 3 preferred allocation policies: balancedPolicy, packedPolicy and nonePolicy.
3. Provide the cmdline interface: -allocation-policy balanced/packed/none, to select which preferred allocation policy to use.
4. Add operator support.
Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
The device plugins daemonsets are cluster wide and currently only
one device plugin instance per device is possible so making the
corresponding deviceplugin/v1 CRDs non-namespaced (i.e., scope: cluster)
fits better.
Previously, the device plugin daemonset was deployed in the same
namespace as the CR for that device but with the cluster scoped CRDs
we default to use the same namespace as the operator, unless overridden
via DEVICEPLUGIN_NAMESPACE env variable or a command line parameter
to operator manager deployment.
Three additional changes in this commit:
- enable DSA envtest tests
- update controller-runtime to v0.8.1
- change device plugin envtest suite to use klog/v2
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
This adds the initImage field to the custom resource definition
and takes it into use.
The fpga webhook image validation function is split off into a
separate file.
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>