Commit Graph

63 Commits

Author SHA1 Message Date
Dmitry Rozhkov
e87d94d4fb fpga: finalize plugin kustomization
closes #318
2020-07-01 11:57:45 +03:00
Dmitry Rozhkov
73aea0aa1b linter: enable gosec check 2020-06-11 17:56:24 +03:00
Dmitry Rozhkov
70f862f2aa add golangci linter
In this initial commit the following checks are disabled due to
the excessive amount of changes required:
- dupl (duplicate code)
- funlen (function length)
- goerr113 (errors handling expressions)
- gomnd (magic numbers)
- gosec (security)
- nakedret (naked returns)
- wsl (forces to use empty lines)
- errcheck (checking for unchecked errors)
- staticcheck (static analysis)
2020-06-08 14:01:13 +03:00
Dmitry Rozhkov
aabc45cbb5 gpu: increase code coverage for unit tests 2020-05-19 16:14:40 +03:00
Dmitry Rozhkov
99fcb69d33 fpga: compress fpga AF resource names 2020-04-29 11:59:50 +03:00
Dmitry Rozhkov
6c2eacfae5 webhook: remove mode of operation
fpga: make AFU resource name 63 char long

webhook: drop mode from README

webhook: extend mappings description

webhook: tighten CRD definitions

webhook: drop mapping to non-existing afuId

explicitly state mappings names can be in any format

use consistent terminology across fpga webhook and plugin
2020-04-22 13:55:43 +03:00
Dmitry Rozhkov
8fc187f4d8 move to k8s v1.18.2 release
Also fix the plugins and e2e tests
2020-04-17 12:40:18 +03:00
Ed Bartosh
2ec6677ab0 fpga tests cleanup
- used t.Run api for better visibility
- used ioutil.TempDir to create temporary directories

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2020-03-31 14:36:15 +03:00
Graham Whaley
71d08224ee fpga: move to using klog for logs and debug
Move all the fpga components to using klog for logging
and debug. This includes replacing our homebrew 'fatal()'
with klog.Error().

Modify the deployment files to move from `-debug` to
`-v`, and set their default level to '1' (Info), rather
than full debug mode ('4').

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2020-03-24 14:31:53 +00:00
Ed Bartosh
cf731f3c18 fpga plugin: increase test coverage 2020-03-24 15:46:39 +02:00
Ed Bartosh
29be713a96 fpga_plugin: use time.Ticker instead of time.Sleep
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2020-03-24 13:32:35 +02:00
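The change named in 29be713a96 (time.Ticker instead of time.Sleep) can be sketched roughly as below. This is a minimal illustration, not the plugin's actual code: `scanLoop`, its parameters, and the `stop` channel are hypothetical names introduced here. A ticker fires at a fixed interval and can be combined with a stop signal in one `select`, which a plain sleep cannot do.

```go
package main

import (
	"fmt"
	"time"
)

// scanLoop (hypothetical helper) calls scan on every tick until stop is
// closed, then returns the number of scans performed.
func scanLoop(interval time.Duration, stop <-chan struct{}, scan func()) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop() // release ticker resources when the loop ends

	count := 0
	for {
		select {
		case <-stop:
			return count
		case <-ticker.C:
			scan()
			count++
		}
	}
}

func main() {
	stop := make(chan struct{})
	calls := 0
	// Stop the loop from inside the scan callback after three scans.
	n := scanLoop(10*time.Millisecond, stop, func() {
		calls++
		if calls == 3 {
			close(stop)
		}
	})
	fmt.Println("scans performed:", n)
}
```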
Mikko Ylinen
332fbdc35c
Merge pull request #300 from askervin/55B_fpga_kustomization
fpga plugin kustomization, stage 2
2020-02-24 22:20:27 +02:00
Antti Kervinen
5fe8174077 fpga_plugin: add kustomization files
- Add script/fpga-plugin-prepare-for-kustomization.sh, creates contents
  for the secret needed by the fpga plugin webhook.
- Single-command fpga plugin + webhook deployment for both modes:
  - `kubectl create -k deployments/fpga_plugin/overlays/af`
  - `kubectl create -k deployments/fpga_plugin/overlays/region`
- Change intel-fpga-plugin image CMD to ENTRYPOINT.
2020-02-24 16:32:26 +02:00
Ed Bartosh
ca5d144e8e
Merge pull request #296 from mythi/gomod
fpga_plugin: drop dependency to k8s.io/kubernetes
2020-02-24 14:10:12 +02:00
Ed Bartosh
13836c2d09
Merge pull request #299 from mythi/gitclone
READMEs: use git clone to get the code
2020-02-24 12:42:32 +02:00
Mikko Ylinen
61c135d1d6 fpga_plugin: drop dependency to k8s.io/kubernetes
This commit drops fpga_plugin dependency to k8s.io/kubernetes which
was used to get GetHostname(). After this change, the plugin node
name can be set using the new -node-name parameter. The default value
is read from the NODE_NAME environment variable.

If the node annotation override check fails, we continue with the default
mode parameter and do not exit like we did previously.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-02-21 18:48:30 +02:00
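The -node-name behaviour described in 61c135d1d6 might look roughly like the sketch below: a flag whose default comes from the NODE_NAME environment variable. Only the flag name and the variable name come from the commit message; `parseNodeName`, the flag set name, and the help text are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseNodeName (hypothetical helper) parses args and returns the node name.
// An explicit -node-name wins; otherwise the NODE_NAME environment variable
// is used as the default.
func parseNodeName(args []string) (string, error) {
	fs := flag.NewFlagSet("fpga_plugin", flag.ContinueOnError)
	nodeName := fs.String("node-name", os.Getenv("NODE_NAME"),
		"node name (defaults to the NODE_NAME environment variable)")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *nodeName, nil
}

func main() {
	name, err := parseNodeName(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("node name: %q\n", name)
}
```

In a pod spec, NODE_NAME would typically be populated from the downward API (`spec.nodeName`), which removes the need for a hostname lookup.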
Mikko Ylinen
f145541caf READMEs: use git clone to get the code
go get'ing does not work due to our k8s.io/kubernetes dependency
so guide users to use git clone to get the code.

Fixes: #290

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-02-20 08:04:07 +02:00
Antti Kervinen
d04aa77ac5 fpga_plugin: orchestration/orchestrated fixed in READMEs
Not touching "orchestration programmed". Fixing only instances where
this refers directly to the mode recognized by the webhook-deploy.sh
script.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2020-02-17 16:32:54 +02:00
Ed Bartosh
1f4928790f Implement function for DeviceInfo creation
- Made DeviceInfo fields private
- Implement NewDeviceInfo constructor
2020-02-07 15:26:37 +02:00
Graham Whaley
27bc562478 fpga plugin: docs: Clean up and expand README
Expand and re-arrange the README. Add some details about what the
plugin and other FPGA components provide.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2020-01-30 16:19:19 +00:00
Graham Whaley
958ab2aa7e fpga: docs: Add diagrams for FPGA modes
Add draw.io and their generated PNG files for both
orchestrated and preprogrammed FPGA modes. These will
then be used in the documentation.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2020-01-27 14:55:15 +00:00
Dmitry Rozhkov
814e2e1a50 bump k8s dependencies up to v1.17.0 2020-01-09 11:19:58 +02:00
Ed Bartosh
06c07a5961 deployments/fpga_plugin: limit host mounts
The default deployment gives rather wide host mounts.

Limited sysfs mount only to the subdirectory the plugin
needs.

Mounted sysfs and dev mounts read-only.

Added notes that FPGA plugin can be run as non-root user.
2019-12-12 13:07:19 +02:00
Ubuntu
4f28657b6b fpga: fixed documentation and demo 2019-09-10 19:30:20 -05:00
Ed Bartosh
de9df8373e fpga_plugin: support in-tree kernel driver
Extended fpga plugin to support both in-tree (DFL) and
out-of-tree (OPAE) kernel drivers.

- fpga_crihook: move JSON parsing to separate functions
- decreased cyclomatic complexity of the CRI hook main() function
- increased readability
- increased test coverage

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2019-08-24 18:27:15 +03:00
Dmitry Rozhkov
4bf8c5e685 Fix compilation issues 2019-02-19 16:12:56 +02:00
Dmitry Rozhkov
54332c5eea announce deviceplugin API public 2019-01-21 17:20:01 +02:00
Dmitry Rozhkov
7662cb9154 extend API to receive full specs instead of strings 2019-01-21 17:15:27 +02:00
Dmitry Rozhkov
5231a9cc1f fpga_plugin: don't exit if OPAE driver is not loaded 2018-09-05 14:41:30 +03:00
Dmitry Rozhkov
f08df95a28 Clarify the purpose of /var/run/kubernetes/admin.kubeconfig 2018-08-24 09:45:01 +03:00
Ed Bartosh
917b68206e
Merge pull request #90 from rojkov/readme
clarify fpga plugin modes in README.md
2018-08-20 11:41:23 +03:00
Dmitry Rozhkov
009a6ebfb6 clarify fpga plugin modes in README.md 2018-08-20 10:28:11 +03:00
Dmitry Rozhkov
eccd70c600 replace glog with simpler home-grown debug logging 2018-08-16 17:40:16 +03:00
Dmitry Rozhkov
2ff6c5929a Use annotated errors for tracing 2018-08-16 17:31:19 +03:00
Ed Bartosh
cf8d6bbc3f fix broken links in the FPGA plugin documentation 2018-08-14 15:00:48 +03:00
Dmitry Rozhkov
92f72e4196 fpga_plugin: indicate unhealthy devices
When the device's firmware crashes the OPAE driver reports all properties
of the device as a stream of binary ones. This effectively makes
interface and afu IDs look like "ffffffffffffffffffffffffffffffff".

Mark such devices as Unhealthy.

closes #77
2018-08-13 11:52:51 +03:00
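The health check described in 92f72e4196 can be illustrated with a short sketch. It assumes IDs arrive as hex strings and treats any ID made up solely of `f` digits as the sign of a crashed device. `isAllOnes`, `deviceHealth`, and the sample IDs are hypothetical; the "Healthy"/"Unhealthy" strings mirror the values used by the Kubernetes device plugin API but are declared locally here.

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins for the device plugin API health values.
const (
	healthy   = "Healthy"
	unhealthy = "Unhealthy"
)

// isAllOnes reports whether id is non-empty and consists only of 'f' hex
// digits, i.e. the driver returned a stream of binary ones.
func isAllOnes(id string) bool {
	id = strings.ToLower(strings.TrimSpace(id))
	return id != "" && strings.Trim(id, "f") == ""
}

// deviceHealth (hypothetical helper) marks a device unhealthy if either of
// its IDs reads back as all ones.
func deviceHealth(interfaceID, afuID string) string {
	if isAllOnes(interfaceID) || isAllOnes(afuID) {
		return unhealthy
	}
	return healthy
}

func main() {
	// Sample IDs below are made up for illustration.
	fmt.Println(deviceHealth("ce48969398f05f33946d560708be108a", "d8424dc4a4a3c413f89e433683f9040b"))
	fmt.Println(deviceHealth("ffffffffffffffffffffffffffffffff", "ffffffffffffffffffffffffffffffff"))
}
```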
Mary Camp
51bb79bc60
Merge branch 'master' into fpga-readme-edits 2018-07-31 13:18:40 -04:00
MCamp859
be66697049 Replaced "afu" with "af" in 2 places.
Signed-off-by: MCamp859 <mary.camp@ptiglobal.net>
2018-07-31 10:33:18 -04:00
MCamp859
a29e04f614 Edits to FPGA readme files for format and text flow.
Signed-off-by: MCamp859 <mary.camp@ptiglobal.net>
2018-07-30 16:13:47 -04:00
Dmitry Rozhkov
1e7dbac162 Update README.md files to reflect changes caused by refactoring
update demo files
2018-07-30 15:29:33 +03:00
Dmitry Rozhkov
bbee3fde77 refactor device plugins to increase code reuse
Every device plugin is supposed to implement PluginInterfaceServer
interface to be exposed as a gRPC service. But this functionality is
common for all our device plugins and can be hidden in a Manager
which manages all gRPC servers dynamically.

The only mandatory functionality that needs to be provided by a device
plugin and which differentiates one plugin from another is the code
scanning the host for devices present on it.

Refactor the internal deviceplugin package to accept only
one mandatory method implementation from device plugins - Scan().

In addition to that, a device plugin can optionally implement a
PostAllocate() method which mutates responses returned by
PluginInterfaceServer.Allocate() method.

Also, to narrow the gap between these device plugins and KubeVirt's
collection, the naming scheme for resources has been changed.
Now device plugins provide a namespace for the device types they
operate with. E.g. for resources in format "color.example.com/<color>"
the namespace would be "color.example.com". So, the resource name
"intel.com/fpga-region-fffffff" becomes "fpga.intel.com/region-fffffff".
2018-07-30 15:29:33 +03:00
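The shape of the refactoring in bbee3fde77 can be sketched as follows: a plugin supplies only a scan method, and the resource names are formed as "<namespace>/<device type>". This is a simplified illustration, not the package's real API. `Scanner`, `DeviceTree`, `colorPlugin`, and `resourceNames` are hypothetical, and the Scan() signature here is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// DeviceTree maps a device type to the IDs of devices of that type
// (hypothetical shape for this sketch).
type DeviceTree map[string][]string

// Scanner is the one mandatory method a plugin provides in this sketch.
type Scanner interface {
	Scan() (DeviceTree, error)
}

// colorPlugin is a toy plugin for the "color.example.com" namespace.
type colorPlugin struct{}

func (colorPlugin) Scan() (DeviceTree, error) {
	return DeviceTree{"red": {"dev0"}, "blue": {"dev1", "dev2"}}, nil
}

// resourceNames scans once and returns the sorted resource names in the
// form "<namespace>/<device type>".
func resourceNames(namespace string, s Scanner) ([]string, error) {
	tree, err := s.Scan()
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(tree))
	for devType := range tree {
		names = append(names, namespace+"/"+devType)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	names, err := resourceNames("color.example.com", colorPlugin{})
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

With the namespace "fpga.intel.com" and the device type "region-fffffff", the same joining rule gives the "fpga.intel.com/region-fffffff" name quoted in the commit message.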
Dmitry Rozhkov
ff813285ed typo fixed 2018-07-20 10:44:37 +03:00
Dmitry Rozhkov
8f977b7782 Send device list upon reconnecting to kubelet
When kubelet notifies the plugin about its restart by removing
the plugin's socket we do reconnect to kubelet, but we don't
send the current list of monitored devices to kubelet. As a result,
kubelet is not aware of discovered devices if it restarts.

Always send the current list of monitored devices to kubelet
upon ListAndWatch() request.
2018-07-11 12:04:43 +03:00
Dmitry Rozhkov
bb2403deb9 fpga: ignore afu_id in region mode
When running in the region mode we don't need to know AFU IDs,
so don't read them while in that mode.

It's important not to try to read them because in the region mode
AFUs are supposed to be reprogrammed on the fly anyway and the
afu_id files may become busy.
2018-07-04 12:02:07 +03:00
Ed Bartosh
54fd4f6f8f fpga: ignore EBUSY error when reading afu_id
Device discovery can get an EBUSY error when an AFU is being programmed.
It causes the plugin to crash with the error:
  Device scan failed: read /sys/class/fpga/intel-fpga-dev.0/intel-fpga-port.0/afu_id:
      device or resource busy

This error should be ignored as this is a valid use case.
This is harmless as the AFU will be rediscovered on the next run, when
reprogramming is done.
2018-07-03 11:09:09 +03:00
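The EBUSY handling described in 54fd4f6f8f could be sketched like this. The read function is injected so that the busy case can be simulated without real hardware. `readAFUID` and `busyReader` are hypothetical names, and returning a separate `ok` flag is one possible design rather than what the plugin necessarily does.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
	"syscall"
)

// readAFUID (hypothetical helper) reads an afu_id file via readFile.
// An EBUSY error is not treated as a failure: ok is false and err is nil,
// so the caller can skip the device and pick it up on the next scan.
func readAFUID(readFile func(string) ([]byte, error), path string) (id string, ok bool, err error) {
	data, err := readFile(path)
	if err != nil {
		if errors.Is(err, syscall.EBUSY) {
			return "", false, nil // AFU is being programmed; try again later
		}
		return "", false, err
	}
	return strings.TrimSpace(string(data)), true, nil
}

// busyReader simulates sysfs reporting EBUSY during reprogramming.
func busyReader(path string) ([]byte, error) {
	return nil, &os.PathError{Op: "read", Path: path, Err: syscall.EBUSY}
}

func main() {
	path := "/sys/class/fpga/intel-fpga-dev.0/intel-fpga-port.0/afu_id"
	id, ok, err := readAFUID(busyReader, path)
	fmt.Printf("id=%q ok=%v err=%v\n", id, ok, err)
}
```

In real use, `os.ReadFile` would be passed as the read function.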
Ed Bartosh
6a571e7d5b fpga: decrease cyclomatic complexity of scanFPGAs
Moved code that goes through sysfs to the separate function
getSysFsInfo to decrease cyclomatic complexity of the scanFPGAs
function.

This is required to get the next commit through our CI check.
2018-07-03 11:09:09 +03:00
Ed Bartosh
cbd7173b1f fpga: set container annotations
Plugin sets container annotation com.intel.fpga.mode to
intel.com/fpga-region in region mode.

This should allow CRI-O to be configured to run reprogramming hooks
only when the annotation is set.
2018-06-29 16:58:02 +03:00
Dmitry Rozhkov
861b23308d Check node's annotations to set mode of FPGA plugin 2018-06-20 09:45:43 +03:00
Dmitry Rozhkov
371376cf73 fpga: restore mode names in resource prefixes
The latest refactoring of FPGA scanning accidentally removed
mode names from resource name prefixes.

Restore them.
2018-06-18 14:10:53 +03:00
Dmitry Rozhkov
4a1b311e62 fix up misspelling 2018-06-15 15:25:43 +03:00