From 36046d90a4bd042aafc733d5f339c2bc77e12897 Mon Sep 17 00:00:00 2001
From: Eero Tamminen
Date: Mon, 3 Jan 2022 19:05:06 +0200
Subject: [PATCH] Make GPU plugin / resource label limitations more explicit

While the labeling limit is obvious after a little thought, IMHO
limitations like this should either be stated out front, or be
in their own section in the README.

Commit does the former for the GPU plugin fractional resources, and
the latter for the NFD hook / labeling.
---
 cmd/gpu_nfdhook/README.md | 2 ++
 cmd/gpu_plugin/README.md  | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/cmd/gpu_nfdhook/README.md b/cmd/gpu_nfdhook/README.md
index 30d532d6..9c20de2e 100644
--- a/cmd/gpu_nfdhook/README.md
+++ b/cmd/gpu_nfdhook/README.md
@@ -50,4 +50,6 @@ name | type | description|
 |`gpu.intel.com/platform_.tiles`| number | GPU tile count in the GPUs of the named platform.
 |`gpu.intel.com/platform_.present`| string | "true" for indicating the presense of the GPU platform.
 
+## Limitations
+
 For the above to work as intended, GPUs on the same node must be identical in their capabilities.
diff --git a/cmd/gpu_plugin/README.md b/cmd/gpu_plugin/README.md
index 0231da38..316b805a 100644
--- a/cmd/gpu_plugin/README.md
+++ b/cmd/gpu_plugin/README.md
@@ -138,6 +138,9 @@ With the experimental fractional resource feature you can use additional kuberne
 resources, such as GPU memory, which can then be consumed by deployments. PODs will then only
 deploy to nodes where there are sufficient amounts of the extended resources for the containers.
 
+(For this to work properly, all GPUs in a given node should provide an equal amount of resources,
+i.e. heterogeneous GPU nodes are not supported.)
+
 Enabling the fractional resource feature isn't quite as simple as just enabling the related command
 line flag. The DaemonSet needs additional RBAC-permissions and access to the kubelet podresources
 gRPC service, plus there are other dependencies to