When kubelet notifies the plugin about its restart by removing
the plugin's socket, we reconnect to kubelet but don't
send it the current list of monitored devices. As a result,
kubelet is not aware of discovered devices after it restarts.
Always send the current list of monitored devices to kubelet
upon a ListAndWatch() request.
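
The "send first, then watch" behaviour can be sketched roughly as follows. This is a minimal illustration and not the plugin's actual code: `deviceList`, `listAndWatch` and the `send` callback are hypothetical stand-ins for the real device plugin API types and gRPC stream.

```go
package main

import "fmt"

// deviceList is a hypothetical stand-in for the device list a plugin reports.
type deviceList []string

// listAndWatch sends the current device list immediately, then forwards
// every later update until the updates channel is closed.
func listAndWatch(current deviceList, updates <-chan deviceList, send func(deviceList) error) error {
	// Send the known state first so a restarted kubelet learns about
	// devices that were discovered before it reconnected.
	if err := send(current); err != nil {
		return err
	}
	for devs := range updates {
		if err := send(devs); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	updates := make(chan deviceList, 1)
	updates <- deviceList{"dev0", "dev1"}
	close(updates)

	_ = listAndWatch(deviceList{"dev0"}, updates, func(d deviceList) error {
		fmt.Println("sent:", d)
		return nil
	})
}
```
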
When running in region mode we don't need to know AFU IDs,
so don't read them in that mode.
It's important not to try to read them because in region mode
AFUs are supposed to be reprogrammed on the fly anyway and the
afu_id files may become busy.
Device discovery can get an EBUSY error when an AFU is being programmed.
This causes the plugin to crash with the error:
Device scan failed: read /sys/class/fpga/intel-fpga-dev.0/intel-fpga-port.0/afu_id:
device or resource busy
This error should be ignored, as this is a valid use case.
Ignoring it is harmless because the AFU will be rediscovered on the next run,
once reprogramming is done.
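
One way to skip a busy afu_id file is to test the read error against EBUSY, sketched below. The names `readFunc`, `readAFUID` and `busyRead` are hypothetical and only illustrate the idea; in real use `os.ReadFile` would be passed as the reader.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
	"syscall"
)

// readFunc matches the signature of os.ReadFile so a fake can be substituted.
type readFunc func(path string) ([]byte, error)

// readAFUID returns the trimmed contents of an afu_id file. When the read
// fails with EBUSY it returns ok=false and a nil error, so the caller can
// skip the device and pick it up again on the next scan.
func readAFUID(read readFunc, path string) (id string, ok bool, err error) {
	data, err := read(path)
	if errors.Is(err, syscall.EBUSY) {
		return "", false, nil
	}
	if err != nil {
		return "", false, err
	}
	return strings.TrimSpace(string(data)), true, nil
}

// busyRead simulates a port that is currently being reprogrammed.
func busyRead(path string) ([]byte, error) {
	return nil, &os.PathError{Op: "read", Path: path, Err: syscall.EBUSY}
}

func main() {
	id, ok, err := readAFUID(busyRead, "afu_id")
	fmt.Printf("id=%q ok=%v err=%v\n", id, ok, err)
}
```
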
Moved the code that walks sysfs into a separate function,
getSysFsInfo, to decrease the cyclomatic complexity of the scanFPGAs
function.
This is required to get the next commit through our CI check.
The plugin sets the container annotation com.intel.fpga.mode to
intel.com/fpga-region in region mode.
This should make it possible to configure CRI-O to run reprogramming hooks
only when the annotation is set.
In the `af` mode the plugin announces AFUs and tells kubelet
to pass only AFU ports to containers.
In the `region` mode the plugin announces region interfaces and tells
kubelet to pass only AFU ports to containers.
In the `regiondevel` mode the plugin announces region interfaces and
tells kubelet to pass AFU ports and FME devices to containers, so the
containers have full access to the regions.
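
The three modes described above can be summarised as a small lookup, sketched here. The `modeBehavior` type and `behaviorFor` function are hypothetical names used only for illustration; they are not taken from the plugin.

```go
package main

import "fmt"

// modeBehavior is a hypothetical summary of what the plugin does in a mode.
type modeBehavior struct {
	Announces string // what is advertised to kubelet
	PassPorts bool   // AFU port devices are passed to containers
	PassFME   bool   // FME devices are passed to containers
}

// behaviorFor maps a mode name to the behaviour described in the text.
func behaviorFor(mode string) (modeBehavior, error) {
	switch mode {
	case "af":
		return modeBehavior{Announces: "AFUs", PassPorts: true}, nil
	case "region":
		return modeBehavior{Announces: "region interfaces", PassPorts: true}, nil
	case "regiondevel":
		return modeBehavior{Announces: "region interfaces", PassPorts: true, PassFME: true}, nil
	default:
		return modeBehavior{}, fmt.Errorf("unknown mode %q", mode)
	}
}

func main() {
	for _, m := range []string{"af", "region", "regiondevel"} {
		b, _ := behaviorFor(m)
		fmt.Printf("%-12s %+v\n", m, b)
	}
}
```
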
This refactoring brings in a device Cache running in its own
thread and scanning FPGA devices once every 5 seconds. As a result, no timers
are used inside the ListAndWatch() method of device managers, and
there is no need to run scanning periodically in every device manager's
thread.
The Cache generates update events, and the plugin creates, updates or
deletes device managers on the fly upon receiving them.
Introducing a new mode of operation is a matter of adding a single
function that converts and filters the content of the Cache.
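
The core of such a cache is comparing the previous scan with the current one and emitting what changed. The sketch below shows only that diffing step; the periodic scanning and the event channel are left out. `updateInfo` and `diff` are hypothetical names, and the string values stand in for whatever per-device state a real cache would track.

```go
package main

import (
	"fmt"
	"sort"
)

// updateInfo is a hypothetical event payload listing what changed between scans.
type updateInfo struct {
	Added, Updated, Removed []string
}

// diff compares two scan results keyed by device ID.
func diff(old, cur map[string]string) updateInfo {
	var u updateInfo
	for id, state := range cur {
		prev, seen := old[id]
		switch {
		case !seen:
			u.Added = append(u.Added, id)
		case prev != state:
			u.Updated = append(u.Updated, id)
		}
	}
	for id := range old {
		if _, ok := cur[id]; !ok {
			u.Removed = append(u.Removed, id)
		}
	}
	// Map iteration order is random; sort for deterministic events.
	sort.Strings(u.Added)
	sort.Strings(u.Updated)
	sort.Strings(u.Removed)
	return u
}

func main() {
	old := map[string]string{"region-a": "port0", "region-b": "port1"}
	cur := map[string]string{"region-b": "port1,port2", "region-c": "port3"}
	fmt.Printf("%+v\n", diff(old, cur))
}
```

A consumer would create a device manager for each added ID, update the one for each changed ID, and stop the one for each removed ID.
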
This is done to fix https://goreportcard.com warnings:
gofmt 33%
Gofmt formats Go programs. We run gofmt -s on your code, where -s is for the "simplify" command
intel-device-plugins-for-kubernetes/cmd/fpga_plugin/fpga_plugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)
intel-device-plugins-for-kubernetes/internal/deviceplugin/deviceplugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)
intel-device-plugins-for-kubernetes/cmd/gpu_plugin/gpu_plugin_test.go
Line 1: warning: file is not gofmted with -s (gofmt)
intel-device-plugins-for-kubernetes/cmd/fpga_plugin/fpga_plugin.go
Line 1: warning: file is not gofmted with -s (gofmt)
Fixed the following golint warnings:
./cmd/fpga_plugin/fpga_plugin.go:71:2: struct field fpgaId should be fpgaID
./cmd/fpga_plugin/fpga_plugin.go:78:44: func parameter fpgaId should be fpgaID
./cmd/fpga_plugin/fpga_plugin.go:104:8: var interfaceId should be interfaceID
./cmd/fpga_plugin/fpga_plugin.go:120:7: var interfaceIdFile should be interfaceIDFile
./cmd/fpga_plugin/fpga_plugin.go:156:8: range var fpgaId should be fpgaID
./cmd/fpga_plugin/fpga_plugin.go:254:6: range var fpgaId should be fpgaID
./cmd/fpga_plugin/fpga_plugin.go:254:14: should omit 2nd value from range; this loop is equivalent to `for fpgaId := range ...`
./internal/deviceplugin/deviceplugin.go:30:6: exported type DeviceInfo should have comment or be unexported
./internal/deviceplugin/deviceplugin.go:35:6: exported type Server should have comment or be unexported
./internal/deviceplugin/deviceplugin.go:39:1: exported method Server.Serve should have comment or be unexported
./internal/deviceplugin/deviceplugin.go:43:1: exported method Server.Stop should have comment or be unexported
Added a mode ("af" or "region") prefix to the resource name to
distinguish between announced functions and regions, e.g.
intel.com/fpga-af-f7df405cbd7acf7222f144b0b93acd18
intel.com/fpga-region-ce48969398f05f33946d560708be108a
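
Building such a name amounts to simple string composition, as in the sketch below. The `resourceName` helper and the `namespace` constant are hypothetical; only the resulting format is taken from the examples above.

```go
package main

import "fmt"

// namespace is inferred from the example resource names; treat it as an assumption.
const namespace = "intel.com/fpga"

// resourceName joins the namespace, the mode and the interface or AFU ID.
func resourceName(mode, id string) string {
	return namespace + "-" + mode + "-" + id
}

func main() {
	fmt.Println(resourceName("af", "f7df405cbd7acf7222f144b0b93acd18"))
	fmt.Println(resourceName("region", "ce48969398f05f33946d560708be108a"))
}
```
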
Split the README into 3 parts:
- main part with high level description of the repository
- Build and test Intel GPU Device Plugin for Kubernetes
- Build and test Intel FPGA Device Plugin for Kubernetes
Added Intel logo to the main README.md
Fixes #2
By default the FPGA plugin announces regions' interface IDs. With
the added `-mode af` switch the plugin announces IDs of accelerator
functions instead of regions.
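
A switch like this can be handled with Go's standard `flag` package, roughly as sketched below. The `parseMode` helper, the flag set name and the default value "region" are assumptions made for illustration; the text only states that regions are announced by default.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// parseMode reads the -mode switch from args and validates it.
// The default "region" is an assumption based on the described behaviour.
func parseMode(args []string) (string, error) {
	fs := flag.NewFlagSet("fpga_plugin", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the flag package's own error output quiet here
	mode := fs.String("mode", "region", "what to announce: region or af")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch *mode {
	case "af", "region":
		return *mode, nil
	}
	return "", fmt.Errorf("unsupported mode %q", *mode)
}

func main() {
	mode, err := parseMode(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("announcing in mode:", mode)
}
```
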