Commit Graph

373 Commits

Author SHA1 Message Date
Mikko Ylinen
c064bfc4f1 demo: add intel-opencl-icd
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-02-24 11:06:27 +02:00
Hyeongju Johannes Lee
5fe2c3ef4d dlb: update the link to dlb driver
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-02-18 20:00:24 +02:00
Ed Bartosh
d4966e089c
Merge pull request #857 from ozhuraki/operator-upgrade
operator: Support upgrade of plugins
2022-02-18 17:55:53 +02:00
Oleg Zhurakivskyy
f29171b067 operator: Add a documentation on upgrade
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2022-02-18 12:52:55 +02:00
Mikko Ylinen
72c4552253 deployments: move SGX NFD config to an NFD kustomize overlay
Start using the newly created NodeFeatureRule configs with SGX.
This allows dropping the custom worker config.

Additionally, split the example NFD deployment into two steps

1) plain NFD (+SGX json patches)
2) NodeFeatureRule creation

NodeFeatureRule creation is not guaranteed to succeed when it's
part of the same kustomization with the CRD creation. Users may
also have NFD already running so allowing 2) alone works better
in that scenario.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-02-18 11:17:57 +02:00
Hyeongju Johannes Lee
d70397ebfb dlb: update README
Remove commands for building and loading dlb2 driver

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2022-02-16 16:10:51 +02:00
Mikko Ylinen
a74774f939 docs: update cert-manager installation instructions
The webhooks' default deployments depend on cert-manager. Our existing
documentation points to a specific cert-manager version giving users
the impression that it should be used. However, that is not the case.

Update the documentation so that we just point to cert-manager
installation page. With this, we don't have to hard-code to any
specific version.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-02-16 11:26:37 +02:00
Mikko Ylinen
1185f2329b crypto-perf: drop SYS_ADMIN capabilities
SYS_ADMIN capabilities are not necessary when using
vfio-pci.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-02-16 11:26:20 +02:00
Oleg Zhurakivskyy
656676b267 operator: Set klogr's format to FormatKlog
The default "Serialize" breaks multiline output.

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2022-02-09 16:49:35 +02:00
Ed Bartosh
8626d47d8b operator: implement NFD labelling rules
- added labelling rules for all supported devices
- updated operator installation instructions

Fixes: #768

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-02-08 17:01:03 +02:00
Ed Bartosh
55f3e17dd0 add 'annotations' parameter to the NewDeviceInfo API
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-02-07 15:15:30 +02:00
Tuomas Katila
6f57c55ef8 Add a total tile count to node's labels
This label isn't dependent on the debugfs as the platform
specific tile count is.

Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
2022-01-26 09:57:33 +02:00
Ukri Niemimuukko
7520393041 gpu_nfdhook: gpu-numbers and pci-groups
This adds a new label "gpu-numbers" for short numbered lists of
gpus, omitting "card" from the names. Also adds splitting of long
label values.

Similarly this adds a new label "pci-groups" for PCI groups. Grouping
can be controlled by env var GPU_PCI_GROUPING_LEVEL. The env var
dictates how many pci-folder names need to match for GPUs
to be considered part of the same group.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2022-01-25 09:17:56 +02:00
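The grouping rule described in commit 7520393041 can be illustrated with a minimal Go sketch. It assumes a device's PCI location is available as a slash-separated path and that the grouping level counts leading path components; the `gpu` type, the example paths, and both function names are hypothetical and not taken from the plugin's code.

```go
package main

import (
	"fmt"
	"strings"
)

// gpu is a hypothetical record: the GPU number (card name without "card")
// and an assumed slash-separated PCI path.
type gpu struct {
	number  string
	pciPath string
}

// pciGroupKey returns the first `level` components of the PCI path.
// Two GPUs share a group when their keys are equal.
func pciGroupKey(pciPath string, level int) string {
	parts := strings.Split(strings.Trim(pciPath, "/"), "/")
	if level > len(parts) {
		level = len(parts)
	}
	if level < 0 {
		level = 0
	}
	return strings.Join(parts[:level], "/")
}

// groupByPCI maps each group key to the GPU numbers that share it.
func groupByPCI(gpus []gpu, level int) map[string][]string {
	groups := make(map[string][]string)
	for _, g := range gpus {
		key := pciGroupKey(g.pciPath, level)
		groups[key] = append(groups[key], g.number)
	}
	return groups
}

func main() {
	gpus := []gpu{
		{"0", "pci0000:00/0000:00:01.0/0000:01:00.0"},
		{"1", "pci0000:00/0000:00:01.0/0000:02:00.0"},
		{"2", "pci0000:80/0000:80:01.0/0000:81:00.0"},
	}
	// Map iteration order is random, so the groups print in any order.
	for key, members := range groupByPCI(gpus, 2) {
		fmt.Println(key, strings.Join(members, "."))
	}
}
```

With a level of 2, GPUs 0 and 1 land in one group and GPU 2 in another; raising the level to 3 would put every GPU in its own group.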
Mikko Ylinen
c306f5ef68 qat: detect noiommu mode with VFIO
If the kernel has CONFIG_VFIO_NOIOMMU enabled and the node admin
has explicitly set enable_unsafe_noiommu_mode VFIO parameter,
VFIO taints the kernel and writes "vfio-noiommu" to the IOMMU
group name. If these conditions are true, the /dev/vfio/ devices
are prefixed with "noiommu-".

This use case is documented for DPDK, so we don't want to break
it. (It was broken before because we added DeviceMounts for
/dev/vfio/<iommugroup> files that did not exist.)

See DPDK documentation for further information and warnings.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-01-10 06:11:59 +02:00
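The path selection described in commit c306f5ef68 can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes the caller has already read the IOMMU group's name attribute into a string.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const vfioDir = "/dev/vfio"

// vfioGroupDevice returns the /dev/vfio path to mount for an IOMMU group.
// groupName is the content of the group's name attribute; per the commit
// message it contains "vfio-noiommu" when unsafe no-IOMMU mode is active,
// in which case the device file carries a "noiommu-" prefix.
func vfioGroupDevice(group, groupName string) string {
	if strings.Contains(groupName, "vfio-noiommu") {
		return path.Join(vfioDir, "noiommu-"+group)
	}
	return path.Join(vfioDir, group)
}

func main() {
	fmt.Println(vfioGroupDevice("12", ""))              // /dev/vfio/12
	fmt.Println(vfioGroupDevice("0", "vfio-noiommu\n")) // /dev/vfio/noiommu-0
}
```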
Eero Tamminen
36046d90a4 Make GPU plugin / resource label limitations more explicit
While the labeling limit is obvious after a little thought, IMHO
limitations like this should either be stated up front or be in
their own section in the README. This commit does the former for the GPU
plugin fractional resources, and the latter for the NFD hook / labeling.
2022-01-04 11:43:08 +02:00
Ukri Niemimuukko
46dcffc33e README typofix
Label descriptions had extra underscores.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2021-12-28 12:01:40 +02:00
Hyeongju Johannes Lee
a2d13eea4c dlb: update README
Remove the sentence about the pre-built image since a Docker Hub image
for the dlb plugin is available.

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-12-22 03:49:49 -08:00
Hyeongju Johannes Lee
74ecd6919c dsa: Fix the names still left as idxd-initcontainer
There are a few things left un-renamed after #771.
Rename those to idxd-config-initcontainer.

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-12-21 04:39:19 -08:00
Mikko Ylinen
e09d52f6ff
Merge pull request #816 from hj-johannes-lee/dlb-flag-parse
dlb:Fix the problem that klog is not printed
2021-12-21 13:31:49 +02:00
Hyeongju Johannes Lee
515bd5908c dlb:Fix the problem that klog is not printed
Add flag parsing for command line parameters so that the klog
parameters are no longer ignored.
2021-12-21 01:58:58 -08:00
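The problem fixed in commit 515bd5908c is a general property of Go's `flag` package: a registered flag keeps its default until the flag set is parsed. The sketch below uses only the standard library; the `-v` flag is a stand-in for the verbosity flag a logging library registers, and `verbosity` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
)

// verbosity registers a -v flag on a private FlagSet and returns its value.
// When parse is false the arguments are never examined, which mirrors a
// program that registers logging flags but forgets to parse them.
func verbosity(args []string, parse bool) int {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	v := fs.Int("v", 0, "log verbosity (stand-in for a logger's -v flag)")
	if parse {
		if err := fs.Parse(args); err != nil {
			return -1
		}
	}
	return *v
}

func main() {
	args := []string{"-v=4"}
	fmt.Println(verbosity(args, false)) // 0: the flag is silently ignored
	fmt.Println(verbosity(args, true))  // 4: the flag takes effect
}
```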
Mikko Ylinen
c7e18d8b25 qat: rework driver binding
The new_id based driver binding is failing on kernels 5.11+ when the
QAT VF is not bound to any driver: attempts to write to new_id with
the same device ID repeatedly error with "file exists".

Move the new_id initialization to the beginning of the startup and
write the enabled device IDs only once.

This commit also fixes an issue where VF devices were not correctly detected
in virtual machines where the VF was not bound to any driver.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-12-21 08:20:02 +02:00
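The "write the enabled device IDs only once" idea from commit c7e18d8b25 can be sketched like this. The helper name and the injected `write` callback are hypothetical, the vendor/device pairs are example values, and the demo writes to a temporary file rather than a real sysfs `new_id` attribute.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeNewIDsOnce passes each device ID to write exactly once, skipping
// duplicates so the same ID is never registered twice. Entries use the
// "vendor device" form; 8086 is the Intel vendor ID.
func writeNewIDsOnce(write func(entry string) error, deviceIDs []string) (int, error) {
	seen := make(map[string]bool)
	written := 0
	for _, id := range deviceIDs {
		if seen[id] {
			continue
		}
		seen[id] = true
		if err := write("8086 " + id); err != nil {
			return written, fmt.Errorf("registering device ID %s: %w", id, err)
		}
		written++
	}
	return written, nil
}

func main() {
	// Stand-in for the driver's new_id file.
	newID := filepath.Join(os.TempDir(), "new_id-demo")
	defer os.Remove(newID)

	write := func(entry string) error {
		return os.WriteFile(newID, []byte(entry), 0o600)
	}
	n, err := writeNewIDsOnce(write, []string{"37c9", "4941", "37c9"})
	fmt.Println(n, err)
}
```

Running this once at startup, before any per-device binding, avoids the repeated writes that the commit message says fail with "file exists" on newer kernels.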
Mikko Ylinen
b48ca7f686 qat: update dpdkdrv unit tests
After a closer review, it was noticed that some of the QAT dpdkdrv
unit tests need updating:

- "Broken igb_uio DPDKdriver..." is actually testing unknown device ID
and we already have tests for it -> drop.
- "igb_uio DPDKdriver with one kernel bound device (not QAT device)" is
testing something impossible: an unknown VF devID is originated from a
QAT PF -> drop.
- creating files for unbind/new_id etc. is unnecessary because
os.WriteFile() creates them during the tests -> drop these lines to
simplify unit test maintenance.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-12-21 08:20:02 +02:00
dependabot[bot]
9a16e80f2b build(deps): bump google.golang.org/grpc from 1.42.0 to 1.43.0
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.42.0 to 1.43.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.42.0...v1.43.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

---
In addition to the changes made by dependabot, add nolint comments to ignore staticcheck (SA1019) errors.
insecure.NewCredentials(), the recommended alternative, is still declared experimental,
so keep grpc.WithInsecure() with a nolint comment.

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-12-20 04:50:39 -08:00
Ed Bartosh
cec004c398 lint: enable wsl check
Fixes: #392

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2021-12-17 11:48:48 +02:00
Eero Tamminen
bcc737bd2a Adapt GPU label support to debugfs DRM entry changes
GPU generation "gen" number is replaced in the capability files of
latest kernels with separate display, graphics, and media versions.

For compatibility with newer kernels, provide "gen" based on the new
labels (but without decimals), and for older kernel compatibility, new
labels based on the "gen".

Because different kernels match different items from the action map,
the whole capability file gets parsed. Capability file parsing is
optimized by using a prefix check instead of scanf.

"platform_gen" label is deprecated, and can be dropped whenever it
becomes inconvenient (lint complains about line count etc).
2021-12-16 21:22:31 +02:00
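The compatibility logic described in commit bcc737bd2a can be sketched as below. The line prefixes, the label names, and the choice of the graphics version as the source for "gen" are all assumptions made for illustration; the commit message does not spell them out.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseCapabilities scans capability text line by line using prefix checks
// and returns label values keyed by a short (hypothetical) label name.
func parseCapabilities(text string) map[string]string {
	prefixes := map[string]string{
		"gen: ":              "gen",
		"graphics version: ": "graphics_version",
		"media version: ":    "media_version",
		"display version: ":  "display_version",
	}
	labels := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		for prefix, name := range prefixes {
			if strings.HasPrefix(line, prefix) {
				labels[name] = strings.TrimSpace(strings.TrimPrefix(line, prefix))
			}
		}
	}

	gen, hasGen := labels["gen"]
	graphics, hasGraphics := labels["graphics_version"]
	switch {
	case !hasGen && hasGraphics:
		// Newer kernels: derive "gen" from the new value, dropping decimals.
		labels["gen"] = strings.SplitN(graphics, ".", 2)[0]
	case hasGen:
		// Older kernels: fill any missing new labels from "gen".
		for _, name := range []string{"graphics_version", "media_version", "display_version"} {
			if _, ok := labels[name]; !ok {
				labels[name] = gen
			}
		}
	}
	return labels
}

func main() {
	newer := parseCapabilities("graphics version: 12.10\nmedia version: 12.00\n")
	fmt.Println(newer["gen"]) // 12
	older := parseCapabilities("gen: 9\n")
	fmt.Println(older["media_version"]) // 9
}
```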
Eero Tamminen
599fc18e71 Provide workaround for the media issue and document it
The issue is with VA-API and QSV, not VPL media API.
2021-12-15 18:40:33 +02:00
Hyeongju Johannes Lee
37dc1b124e dlb: update README
Add info on how to configure dlb driver and vfs.

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-12-14 12:05:35 -08:00
Mikko Ylinen
e83a811ec7 sgx: update README
The cmdline flags talked about the old device nodes. With the
upstream driver, the device nodes are /dev/sgx_[enclave|provision].

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-12-01 14:33:33 +02:00
Oleg Zhurakivskyy
fee2e12996 idxd-initcontainer: Drop libkmod, libudev
- Make libkmod, libudev optional
- Include accel-config, libjson-c, libuuid sources

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-11-30 15:32:23 +02:00
Mikko Ylinen
1c4ee778b3 sgx: update NFD deployment
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-25 17:13:03 +02:00
Dmitry Rozhkov
db20ce1fe4
Merge pull request #754 from mythi/PR-2021-062
qat: update default flags and deploy without ConfigMap
2021-11-22 10:00:02 +02:00
Hyeongju Johannes Lee
84d8408a4f README: add that operator supports for DSA and DLB plugins
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-11-19 02:38:58 -08:00
Mikko Ylinen
b921a4a458 qat: update default flags and deploy without ConfigMap
To make QAT plugin deployment consistent with the other plugins
we update the default flags and deploy without the flag settings
provided by the ConfigMap.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-18 14:02:36 +02:00
Dmitry Rozhkov
471549c11d
Merge pull request #753 from hj-johannes-lee/dlb-operator
operator: Add DLB support
2021-11-18 10:23:16 +02:00
Dmitry Rozhkov
42cde4ff6c
Merge pull request #742 from guoshuxu/dev
GPU devices resource preferred allocation methods.
2021-11-18 10:22:03 +02:00
Xu, Guoshu
e4c4a8f7ac GPU devices resource preferred allocation methods.
1. Implement PreferredAllocator interface.
2. Provide 3 preferred allocation policies: balancedPolicy, packedPolicy and nonePolicy.
3. Provide the cmdline interface: -allocation-policy balanced/packed/none, to select which preferred allocation policy to use.
4. Add operator support.

Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-17 22:55:10 +08:00
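One possible reading of the balanced and packed policies named in commit e4c4a8f7ac is sketched below. It assumes each card exposes several shared device IDs, that "balanced" always takes from the card with the most free IDs, and that "packed" drains the card with the fewest free IDs first. The function names and the ID format are hypothetical, and the none policy is left out.

```go
package main

import "fmt"

// better reports whether a card with n free IDs should be preferred over the
// current best candidate. Ties are broken by card name for determinism.
func better(n, bestN int, card, best, policy string) bool {
	if n == bestN {
		return card < best
	}
	if policy == "packed" {
		return n < bestN
	}
	return n > bestN // treated as "balanced"
}

// pick chooses up to size device IDs from the free IDs of each card.
func pick(free map[string][]string, size int, policy string) []string {
	// Work on a copy so the caller's map is left untouched.
	left := make(map[string][]string, len(free))
	for card, ids := range free {
		left[card] = append([]string(nil), ids...)
	}

	var chosen []string
	for len(chosen) < size {
		best := ""
		for card, ids := range left {
			if len(ids) == 0 {
				continue
			}
			if best == "" || better(len(ids), len(left[best]), card, best, policy) {
				best = card
			}
		}
		if best == "" {
			break // no free IDs remain
		}
		chosen = append(chosen, left[best][0])
		left[best] = left[best][1:]
	}
	return chosen
}

func main() {
	free := map[string][]string{
		"card0": {"card0-0", "card0-1"},
		"card1": {"card1-0", "card1-1", "card1-2"},
	}
	fmt.Println(pick(free, 2, "balanced")) // [card1-0 card0-0]
	fmt.Println(pick(free, 2, "packed"))   // [card0-0 card0-1]
}
```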
Hyeongju Johannes Lee
ff9034822b operator: Add DLB support
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-11-17 01:51:47 -08:00
Leow Chun Fung
1bbb0a6a7c Support for PCI VPU device 8086/4fc0 and 8086/4fc1
2021-11-16 22:13:33 +07:00
Ed Bartosh
80829f72b1 ci: improve golangci job
- used the same go version as for the project build
- used verbose output
- fixed gofmt check failures

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2021-11-13 00:32:25 +02:00
Ed Bartosh
b03227f9d4 dlb: add documentation
Document DLB plugin

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2021-11-11 12:25:25 +02:00
Hyeongju Johannes Lee
8362028560 dlb: Add new device plugin
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-11-11 11:51:49 +02:00
Oleg Zhurakivskyy
a7c612f7fc dsa: Rename dsa initcontainer to idxd
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-11-09 12:00:44 +02:00
Oleg Zhurakivskyy
cdaf6b3807 dsa: Add a documentation on provisioning with ConfigMap
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-11-09 10:31:50 +02:00
Hyeongju Johannes Lee
13f4ce82a1 Remove nolint annot.
Remove the annotation nolint:funlen since funlen is not used anymore.
2021-10-11 11:36:24 +03:00
Mikko Ylinen
e6cf299750 gpu: update READMEs
Commit 00a59e8f7d was not complete in that it didn't update
the corresponding documentation. This commit fixes that.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-10-08 11:57:16 +03:00
Oleg Zhurakivskyy
30ebc8e5d1 dsa: Add a documentation on provisioning with initcontainer
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2021-10-01 12:16:50 +03:00
Mikko Ylinen
9d0d6cbe11 qat: set c6xxvf and 4xxxvf to default devices
The devices enabled by default are different between the
kustomize and operator based deployments.

This change harmonizes the defaults to c6xxvf and 4xxxvf
in both deployment options.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-09-23 10:50:38 +03:00
Dmitry Rozhkov
19d54b9fe8
Merge pull request #707 from uniemimu/mem_read
gpu nfdhook: new memory amount reading logic
2021-09-23 10:33:41 +03:00
Ukri Niemimuukko
64290020d7 gpu nfdhook: new memory amount reading logic
This changes the memory reading to be done through lmem_total_bytes
file instead of the addr_range file.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2021-09-21 13:50:41 +03:00
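Reading a memory amount from a single-value file, as commit 64290020d7 describes for lmem_total_bytes, could look roughly like this. The directory layout and the assumption that the file holds one decimal byte count are illustrative guesses; the demo uses a temporary directory instead of sysfs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseMemoryBytes parses file content that holds a single decimal byte count.
func parseMemoryBytes(content string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(content), 10, 64)
}

// readLocalMemory reads lmem_total_bytes from the given directory.
// The directory stands in for wherever the driver exposes the file.
func readLocalMemory(dir string) (uint64, error) {
	data, err := os.ReadFile(filepath.Join(dir, "lmem_total_bytes"))
	if err != nil {
		return 0, err
	}
	return parseMemoryBytes(string(data))
}

func main() {
	dir, err := os.MkdirTemp("", "card0")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)

	// Fake an 8 GiB device for the demo.
	file := filepath.Join(dir, "lmem_total_bytes")
	if err := os.WriteFile(file, []byte("8589934592\n"), 0o600); err != nil {
		fmt.Println(err)
		return
	}
	total, err := readLocalMemory(dir)
	fmt.Println(total, err)
}
```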
Hyeongju Johannes Lee
8fc5df7e37 Add govet-fieldalignment
Add govet-fieldalignment to .golangci.yml
Fix errors that come from adding govet-fieldalignment
- by reordering the fields of structs
- by putting nolint:govet annotations

Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
2021-09-20 20:59:04 +03:00
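The effect of reordering struct fields, which commit 8fc5df7e37 uses to satisfy the fieldalignment check, can be seen with two hypothetical structs that hold the same fields in a different order.

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded places an 8-byte field between two 1-byte fields, so the compiler
// inserts padding after a and after c.
type padded struct {
	a bool
	b int64
	c bool
}

// compact holds the same fields with the largest first, so the two bools
// share the tail of the struct.
type compact struct {
	b int64
	a bool
	c bool
}

// sizes returns the in-memory size of each layout.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(compact{})
}

func main() {
	p, c := sizes()
	fmt.Println("padded:", p, "compact:", c)
}
```

On a typical 64-bit platform the first layout takes 24 bytes and the second 16; the exact numbers depend on the target architecture.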