Move all the fpga components to using klog for logging
and debug. This includes replacing our homebrew 'fatal()'
with klog.Error().
Modify the deployment files to move from `-debug` to
`-v`, and set their default level to '1' (Info), rather
than full debug mode ('4').
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Extended fpga plugin to support both in-tree(DFL) and
out-of-tree (OPAE) kernel drivers.
- fpga_crihook: move JSON parsing to separate functions
- decreased cyclomatic complexity of the CRI hook main() function
- increased readability
- increased test coverage
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
When the device's firmware crashes the OPAE driver reports all properties
of the device as a stream of binary ones. This effectively makes
interface and afu IDs look like "ffffffffffffffffffffffffffffffff".
Mark such devices as Unhealthy.
closes#77
Every device plugin is supposed to implement PluginInterfaceServer
interface to be exposed as a gRPC service. But this functionality is
common for all our device plugins and can be hidden in a Manager
which manages all gRPC servers dynamically.
The only mandatory functionality that needs to be provided by a device
plugin and which differentiate one plugin from another is the code
scanning the host for devices present on it.
Refactor the internal deviceplugin package to accept only
one mandatory method implementation from device plugins - Scan().
In addition to that a device plugin can optionally implement a
PostAllocate() method which mutates responses returned by
PluginInterfaceServer.Allocate() method.
Also to narrow the gap between these device plugins and the
kubevirt's collection the naming scheme for resources has been changed.
Now device plugins provide a namespace for the device types they
operate with. E.g. for resources in format "color.example.com/<color>"
the namespace would be "color.example.com". So, the resource name
"intel.com/fpga-region-fffffff" becomes "fpga.intel.com/region-fffffff".
When kubelet notifies the plugin about its restart by removing
the plugin's socket we do reconnect to kubelet, but we don't
send the current list of monitored devices to kubelet. As result
kubelet is not aware of discovered devices if it restarts.
Always send the current list of monitored devices to kubelet
upon ListAndWatch() request.
Plugin sets container annotation com.intel.fpga.mode to
intel.com/fpga-region in region mode.
This should allow to configure CRI-O to run reprogramming hooks
only when annotation is set.
This refactoring brings in device Cache running in its own
thread and scanning FPGA devices once every 5 secs. Then no timers
are used inside ListAndWatch() method of device managers and
no need to run scanning periodically in every device manager's
thread.
Cache generates update events and the plugin creates, updates or
deletes device managers on the fly upon receiving the events.
Introducing new modes of operations is a matter of adding a single
function converting and filtering the content of Cache.
This is done to fix https://goreportcard.com warnnigs:
gofmt 33%
Gofmt formats Go programs. We run gofmt -s on your code, where -s is for the "simplify" command
intel-device-plugins-for-kubernetes/cmd/fpga_plugin/fpga_plugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)
intel-device-plugins-for-kubernetes/internal/deviceplugin/deviceplugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)
intel-device-plugins-for-kubernetes/cmd/gpu_plugin/gpu_plugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)
intel-device-plugins-for-kubernetes/cmd/fpga_plugin/fpga_plugin.go
Line 1: warning: file is not gofmted with -s (gofmt)
Added mode ("af" or "region") prefix to the resource name to
distingush between announced functions and regions, e.g.
intel.com/fpga-af-f7df405cbd7acf7222f144b0b93acd18
intel.com/fpga-region-ce48969398f05f33946d560708be108a
By default the fpga plugin announce regions' interface IDs. With
added `-mode af` switch the plugins announces IDs of accelerator
functions instead of regions.