Commit Graph

15 Commits

Author SHA1 Message Date
Tuomas Katila
95b7230374 gpu: enable monitoring for the default installations
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-12-08 08:42:08 +02:00
Tuomas Katila
691dfc3483 gpu: refactor nfdhook functionality to plugin
NFD v0.14+ doesn't support binary NFD hooks by default, so there is
a need to move the label creation away from the GPU nfdhook.

Move extended resource label creation to plugin, and drop labels that were
already marked deprecated (platform_gen, media_version etc.).

Drop init-container from deployment files and operator. It is still possible
to use an initcontainer, but the default deployments do not support it.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-09-12 16:20:33 +03:00
Tuomas Katila
cb04ca0deb deployments: move from 'patchesStrategicMerge' to 'patches'
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-08-03 10:37:44 +03:00
Tuomas Katila
ec2930b331 deployments: move from 'bases' to 'resources'
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-08-03 10:37:44 +03:00
Tuomas Katila
974829ff7c gpu: try to fetch PodList from kubelet API
In large clusters and with resource management, the load
from gpu-plugins can become heavy for the api-server.
This change will start fetching pod listings from kubelet
and use api-server as a backup. Any other error than timeout
will also move the logic back to using api-server.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-03-30 12:43:02 +03:00
Tuomas Katila
3a1880ec8b Remove overlays using kube-system
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-02-13 12:47:22 +02:00
Tuomas Katila
d1e8350c6e gpu: add new nfd + monitoring + shared-dev deployment option
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2023-01-05 14:13:13 +02:00
Ukri Niemimuukko
1d09cd6549 align gpu kustomize object naming with operator naming
Operator has used "gpu-manager" as part of the cluster object names
it creates. Kustomize based deployments can be aligned with that.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2022-09-26 19:50:55 +03:00
Tuomas Katila
8ecf258a82 gpu: add nodeSelector to fractional overlay
Updated documentation indicates that fractional overlay
uses nfd so maybe it should.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2022-09-22 10:45:11 +03:00
Tuomas Katila
c562db9b28 gpu: Improve installation options and documentation
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2022-09-15 15:19:23 +03:00
Alex Nordlund
79986a6096 Replace to-be obsolete patchesStrategicMerge with patches 2022-08-25 21:07:32 +02:00
Alex Nordlund
0636e2d3a1 Replace obsolete patches with patchesStrategicMerge
This was made obsolete in v1.0.9
https://github.com/kubernetes-sigs/kustomize/blob/v1.0.9/pkg/types/kustomization.go#L129
And stopped working in v3.0.3
https://github.com/kubernetes-sigs/kustomize/issues/1373
2022-08-25 09:28:03 +02:00
Ukri Niemimuukko
2341119d5e remove monitoring arg from fractional resource overlay
An easter-egg slipped in the args. This removes it.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2021-06-16 12:09:07 +03:00
Ukri Niemimuukko
2c4d529d66 gpu_plugin: fractional resource management
Fractional resource management feature

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@intel.com>
2021-06-04 13:06:50 +03:00
Antti Kervinen
d568f050c5 gpu_plugin: add kustomizations
- Default deployment: `kubectl apply -k deployments/gpu_plugin`
- Default deployment does not specify namespace anymore
  (was: `kube-system`).
- Variant: deploy only on nodes with Intel GPU label by NFD:
  `kubectl apply -k deployments/gpu_plugin/overlays/nfd_labeled_nodes`
- Variant: deploy to `kube-system` instead of user-defined namespace
  (or "default"):
  `kubectl apply -k deployments/gpu_plugin/overlays/namespace_kube-system`
- GPU plugin README updated.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2020-02-07 14:56:52 +02:00