Commit Graph

19 Commits

Author SHA1 Message Date
Eero Tamminen
ca9aa32556 Add "-enable-monitoring" option to GPU plugin
Make "i915_monitoring" resource (granting access to all GPUs) optional
so that it can be enabled only when it's needed.
2021-05-05 17:09:09 +03:00
Eero Tamminen
713c1ab170 Move GPU plugin CLI options to a struct
To help in:
* adding more CLI options in next and later commits, and
* to replace magic newDevicePlugin() input parameters with
  explicitly named one(s)
2021-05-05 17:09:09 +03:00
Eero Tamminen
79b86fea2d Skip PF for "i915" resource when it has VFs
NOTE: this has impact only for GPUs which are virtualized with SR-IOV.

Access to physical devices (PFs) is disabled for "i915" resource when
they have configured virtual devices (VFs).

This is because:

* GPU resources are expected to be evenly split between VFs in such
  configurations

* But PF resource amount is expected to differ from VFs and typically
  retain only enough resources (just few MB of RAM), to be able to
  provide GPU metrics that are not available from VFs

* Neither the current GPU plugin, nor Kubernetes scheduling in
  general, has proper support for heterogeneous GPUs (= capability
  based scheduling)

Therefore "i915" resource needs to be limited to GPU devices with
homogeneous amount of resources, which in SR-IOV configurations is
expected to be the case only with VFs (when such are present).
2021-05-05 14:13:48 +03:00
Eero Tamminen
e418c00fca Add "i915_monitoring" resource to GPU plugin
Which mounts all (Intel) GPU devices to requesting container.

This is needed e.g. to get GPU metrics from the node.  Requesting pod
does not know how many GPUs are on the node it gets assigned to, so
there needs to means to request them all.

(Only alternative for the new resource would be requesting Privileged
mode, which is clearly worse as that would grant pod access also to
all other devices and capabilities.)

This commit also:

* Adds "i915_monitoring" resource testing to: go test -v -run Scan

* Splits GPU plugin tests mock file system setup to a separate
  createTestFiles() function because otherwise TestScan() does not
  pass project's golangci-lint complexity limits
2021-04-27 14:21:05 +03:00
Eero Tamminen
f9158c1c3b Update GPU plugin copyrights 2021-04-01 15:20:35 +03:00
Eero Tamminen
384d37ead0 Add test for multiple GPU devices 2021-04-01 15:20:35 +03:00
Eero Tamminen
49354693fb Fix GPU plugin test setup + better error message
Tests fail depending in which order they are run, unless mocked files
are cleaned between test runs.

Without this, the next commit would fail.
2021-04-01 15:20:35 +03:00
Dmitry Rozhkov
73aea0aa1b linter: enable gosec check 2020-06-11 17:56:24 +03:00
Dmitry Rozhkov
70f862f2aa add golangci linter
In this initial commit the following checks are disabled due to
excessive amount of changes required:
- dupl (duplicate code)
- funlen (function length)
- goerr113 (errors handling expressions)
- gomnd (magic numbers)
- gosec (security)
- nakedret (naked returns)
- wsl (forces to use empty lines)
- errcheck (checking for unchecked errors)
- staticcheck (static analysis)
2020-06-08 14:01:13 +03:00
Dmitry Rozhkov
aabc45cbb5 gpu: increase code coverage for unit tests 2020-05-19 16:14:40 +03:00
Graham Whaley
626bbb6ee2 gpu: move to using klog
Move from fmt to klog for logging and debug.
Also add an extra info level message noting when we find
new devices.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2020-03-20 11:54:38 +00:00
Dmitry Rozhkov
44ff734be6 gpu: add log messages for not found cards
Let a user know the plugin can't find any Intel GPU if that's
the case. It might be cumbersome to realize that the plugin runs
on a host which doesn't have any Intel GPUs.

Also make the code less nested for better readability.
2019-05-24 16:19:06 +03:00
Dmitry Rozhkov
eccd70c600 replace glog with simpler home-grown debug logging 2018-08-16 17:40:16 +03:00
Dmitry Rozhkov
2ff6c5929a Use annotated errors for tracing 2018-08-16 17:31:19 +03:00
Dmitry Rozhkov
40246f64ad gpu_plugin: add -shared-dev-num option
The DRM driver of Intel i915 GPUs allows sharing one device
between many containers.

Make it possible to use the same device from different containers.
The exact number of containers sharing the same device can be limited
with the new option -shared-dev-num set to 1 by default.

closes #53
2018-08-14 14:54:49 +03:00
Dmitry Rozhkov
bbee3fde77 refactor device plugins to increase code reuse
Every device plugin is supposed to implement PluginInterfaceServer
interface to be exposed as a gRPC service. But this functionality is
common for all our device plugins and can be hidden in a Manager
which manages all gRPC servers dynamically.

The only mandatory functionality that needs to be provided by a device
plugin and which differentiate one plugin from another is the code
scanning the host for devices present on it.

Refactor the internal deviceplugin package to accept only
one mandatory method implementation from device plugins - Scan().

In addition to that  a device plugin can optionally implement a
PostAllocate() method which mutates responses returned by
PluginInterfaceServer.Allocate() method.

Also to narrow the gap between these device plugins and the
kubevirt's collection the naming scheme for resources has been changed.
Now device plugins provide a namespace for the device types they
operate with. E.g. for resources in format "color.example.com/<color>"
the namespace would be "color.example.com". So, the resource name
"intel.com/fpga-region-fffffff" becomes "fpga.intel.com/region-fffffff".
2018-07-30 15:29:33 +03:00
ssehgal
3eb2b10f75 Enabling support for QuickAssist Devices 2018-07-23 17:35:37 +01:00
Ed Bartosh
6a3953fc85 reformatted *.go with gofmt -s -w
This is done to fix https://goreportcard.com warnnigs:

gofmt 33%
Gofmt formats Go programs. We run gofmt -s on your code, where -s is for the "simplify" command

intel-device-plugins-for-kubernetes/cmd/fpga_plugin/fpga_plugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)

intel-device-plugins-for-kubernetes/internal/deviceplugin/deviceplugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)

intel-device-plugins-for-kubernetes/cmd/gpu_plugin/gpu_plugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)

intel-device-plugins-for-kubernetes/cmd/fpga_plugin/fpga_plugin.go
Line 1: warning: file is not gofmted with -s (gofmt)
2018-05-28 16:59:19 +03:00
Alexander Kanevskiy
d4d77a71e4 Initial public code release 2018-05-18 18:30:54 +03:00