intel-device-plugins-for-kubernetes

mirror of https://github.com/intel/intel-device-plugins-for-kubernetes.git synced 2025-06-03 03:59:37 +00:00

Author	SHA1	Message	Date
Hyeongju Johannes Lee	8362028560	dlb: Add new device plugin Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>	2021-11-11 11:51:49 +02:00
Oleg Zhurakivskyy	a7c612f7fc	dsa: Rename dsa initcontainer to idxd Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>	2021-11-09 12:00:44 +02:00
Oleg Zhurakivskyy	cdaf6b3807	dsa: Add a documentation on provisioning with ConfigMap Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>	2021-11-09 10:31:50 +02:00
Hyeongju Johannes Lee	13f4ce82a1	Remove nolint annot. Remove the annotation nolint:funlen since funlen is not used anymore.	2021-10-11 11:36:24 +03:00
Mikko Ylinen	e6cf299750	gpu: update READMEs Commit `00a59e8f7d` was not complete in that it didn't update the corresponding documentation. This commit fixes that. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-10-08 11:57:16 +03:00
Oleg Zhurakivskyy	30ebc8e5d1	dsa: Add a documentation on provisioning with initcontainer Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>	2021-10-01 12:16:50 +03:00
Mikko Ylinen	9d0d6cbe11	qat: set c6xxvf and 4xxxvf to default devices The devices enabled by default are different between the kustomize and operator based deployments. This change harmonizes the defaults to c6xxvf and 4xxxvf in both deployment options. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-09-23 10:50:38 +03:00
Dmitry Rozhkov	19d54b9fe8	Merge pull request #707 from uniemimu/mem_read gpu nfdhook: new memory amount reading logic	2021-09-23 10:33:41 +03:00
Ukri Niemimuukko	64290020d7	gpu nfdhook: new memory amount reading logic This changes the memory reading to be done through lmem_total_bytes file instead of the addr_range file. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-09-21 13:50:41 +03:00
Hyeongju Johannes Lee	8fc5df7e37	Add govet-fieldalignment Add govet-fieldalignment to .golangci.yml Fix errors that come from adding govet-fieldalignment - by reordering the fields of structs - by putting nolint:govet annotations Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>	2021-09-20 20:59:04 +03:00
Ukri Niemimuukko	0670a82cb1	gpu rm linter comment fixes Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-09-10 14:35:13 +03:00
Li Ning	dcc12d9089	documentation: remove deprecated toc section in README The 'Verify node kubelet config' content was removed in `6b208f8`. Signed-off-by: Li Ning <ning.a.li@transwarp.io>	2021-09-07 19:38:41 +08:00
Hyeongju Johannes Lee	4bc70ac544	Add goerr113 linter check Add goerr113 linter check Fix the usage of fmt.Errorf() by wrapping errors Fix the usage of errors.New()	2021-09-03 11:02:14 +03:00
Hyeongju Johannes Lee	09ba9fde00	Update tool versions and fix errors and warnings that originated from the update Update tool versions Fix the errors and warnings originated from the update: -Correct type deviceInfo (->DeviceInfo) to make it public -Fix gpu_plugin.go and vpu_plugin_test.go where stylecheck errors occur -Fix deprecation warnings -Rename type 'PatcherManager' to 'Manager' to solve exported errors -Rename type 'SgxMutator' to 'Mutator' to solve exported errors Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>	2021-08-25 07:09:34 +00:00
Mikko Ylinen	cfe2d65f32	Merge pull request #659 from 0x161e-swei/sgx-nfd-operator-dependency Add SGX webhook operator as dependency of sgx-nfd	2021-07-28 06:20:32 +03:00
Shijia Wei	9b66176ca5	Add SGX admissionwebhook as dependency of sgx-nfd daemonset; Mentioned dependency of the cert-manager in DaemonSet deployment method in SGX README.	2021-07-27 00:39:59 -05:00
Ed Bartosh	8a54a9ba64	webhook: document mappings deployment Fixes: #580 Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2021-07-26 14:23:10 +03:00
Eero Tamminen	83e7de0d41	Make GPU plugin intro information more generic & accurate - Information on specific HW & virtualization types on which GPU plugin is tested belongs to release notes, not to README intro (where it has already become obsolete) - HW offloading is provided by driver backends, not frontends (e.g. OneVPL is just one of the media driver frontends)	2021-06-22 18:27:17 +03:00
Ukri Niemimuukko	b0130e693f	more documentation for fractional resources This adds a section heading, TOC link, command line flag description and a short explanation of what other dependent configuration changes are needed with fractional resources in order for the command line flag to achieve something useful. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-06-14 16:25:38 +03:00
Ed Bartosh	98f80b5f47	Merge pull request #652 from uniemimu/hookupdate add link to gpu_nfdhook and update hook README	2021-06-13 12:15:46 +03:00
Eero Tamminen	a2faa3a8fc	Add section on GPU plugin options to its README	2021-06-11 19:55:43 +03:00
Ukri Niemimuukko	cbf7bab114	add link to gpu_nfdhook and update hook README This adds a link from gpu-plugin README to the nfdhook README, and updates the nfdhook README with label descriptions. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-06-11 18:54:44 +03:00
skaajas	956154c1db	Updated GPU plugin-specific readme general description.	2021-06-11 15:50:14 +03:00
Ed Bartosh	9d8fb392f5	Merge pull request #637 from uniemimu/skip add pf skip to gpu nfdhook	2021-06-11 10:57:39 +03:00
Ukri Niemimuukko	e3bf21dbe9	gpu_plugin: add documentation links to gpu aware scheduling Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-06-10 19:46:35 +03:00
Ukri Niemimuukko	7ca5cfcfd6	add pf skip to gpu nfdhook This corresponds to the previous gpu-plugin skip code. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-06-10 18:44:57 +03:00
Mikko Ylinen	383778a24b	qat: fix C4xxx driver name Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-06-10 08:45:23 +03:00
Ed Bartosh	e180bfdf07	Merge pull request #644 from mythi/PR-2021-034 qat: do not fail if driver/unbind file does not exist	2021-06-09 11:38:52 +03:00
Mikko Ylinen	e8115d1c8d	qat: do not fail if driver/unbind file does not exist <device>/driver symlink does not exist if the device is not bound to any driver. bindDevice() failed when writing to <device>/driver/unbind errored but IsNotExist() error is acceptable in case there's no driver to unbind. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-06-09 11:09:24 +03:00
Dmitry Rozhkov	6aa1a47c9a	Merge pull request #638 from uniemimu/fractional gpu_plugin: fractional resource management	2021-06-09 10:58:10 +03:00
Ukri Niemimuukko	2c4d529d66	gpu_plugin: fractional resource management Fractional resource management feature Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com> Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@intel.com>	2021-06-04 13:06:50 +03:00
Mikko Ylinen	facb4214a2	tree-wide: drop deprecated io/ioutil Go 1.16 release notes announced the deprecation of io/ioutil [1]. It's easy for us to move to use what is recommended so just do it. [1] https://golang.org/doc/go1.16#ioutil Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-06-02 13:41:15 +03:00
Mikko Ylinen	06dbc1331b	images: move intel-qat-plugin-kerneldrv to Debian Also, update the documentation to reflect what is needed to enable and use '-mode kernel'. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-06-02 13:39:39 +03:00
Leow Chun Fung	8e4b58c0f6	Implement support for PCI-based VPU	2021-05-19 18:15:17 +07:00
Mikko Ylinen	c3cf958c85	images: move most plugin images to distroless/static All but one (VPU) of the published container images can be built with static binaries which allows us to use distroless/static as the base image. Moreover, when combined with stripping the plugin binaries, we can get both build time and image size savings. This is the part 1 (out of 2) of the rework. Part 2 will finish the change by making some adjustments to VPU plugin image and moving the FPGA/SGX/GPU initcontainers to distroless/static too. Partial: #516 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com> Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2021-05-19 09:44:47 +03:00
Eero Tamminen	c575ce9099	Document GPU plugin test code test-case struct members	2021-05-06 11:02:57 +03:00
Eero Tamminen	57c8d76e1b	Add minimal GPU plugin options testing Tests plugin scan results in setups having none, one and multiple eligible GPU devices, with and without SRIOV enabled, with two different options values. This does not cover verifying number of devices added under "i915_monitoring" resource as that would be much larger change.	2021-05-05 17:09:09 +03:00
Eero Tamminen	ca9aa32556	Add "-enable-monitoring" option to GPU plugin Make "i915_monitoring" resource (granting access to all GPUs) optional so that it can be enabled only when it's needed.	2021-05-05 17:09:09 +03:00
Eero Tamminen	713c1ab170	Move GPU plugin CLI options to a struct To help in: * adding more CLI options in next and later commits, and * to replace magic newDevicePlugin() input parameters with explicitly named one(s)	2021-05-05 17:09:09 +03:00
Eero Tamminen	06fac8128f	Move GPU plugin sysfs device compatibility checks to own function To reduce scan() function complexity before adding more functionality to it.	2021-05-05 17:08:49 +03:00
Eero Tamminen	79b86fea2d	Skip PF for "i915" resource when it has VFs NOTE: this has impact only for GPUs which are virtualized with SR-IOV. Access to physical devices (PFs) is disabled for "i915" resource when they have configured virtual devices (VFs). This is because: * GPU resources are expected to be evenly split between VFs in such configurations * But PF resource amount is expected to differ from VFs and typically retain only enough resources (just few MB of RAM), to be able to provide GPU metrics that are not available from VFs * Neither the current GPU plugin, nor Kubernetes scheduling in general, has proper support for heterogeneous GPUs (= capability based scheduling) Therefore "i915" resource needs to be limited to GPU devices with homogeneous amount of resources, which in SR-IOV configurations is expected to be the case only with VFs (when such are present).	2021-05-05 14:13:48 +03:00
Dmitry Rozhkov	38a59a57ea	Merge pull request #626 from mythi/PR-2021-028 sgx: add note about the SGX DCAP driver usage	2021-04-28 08:42:02 +03:00
Mikko Ylinen	111b833ea8	sgx: add note about the SGX DCAP driver usage The SGX DCAP out-of-tree v1.41 driver is also known to work with the SGX plugin. However, the default NFD labeling does not work with the out-of-tree driver so warn users about it. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-04-27 22:10:21 +03:00
Eero Tamminen	e418c00fca	Add "i915_monitoring" resource to GPU plugin Which mounts all (Intel) GPU devices to requesting container. This is needed e.g. to get GPU metrics from the node. Requesting pod does not know how many GPUs are on the node it gets assigned to, so there needs to be a means to request them all. (Only alternative for the new resource would be requesting Privileged mode, which is clearly worse as that would grant pod access also to all other devices and capabilities.) This commit also: * Adds "i915_monitoring" resource testing to: go test -v -run Scan * Splits GPU plugin tests mock file system setup to a separate createTestFiles() function because otherwise TestScan() does not pass project's golangci-lint complexity limits	2021-04-27 14:21:05 +03:00
Ed Bartosh	08c2094329	update to cert-manager v1.3.1 Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2021-04-22 14:45:39 +03:00
Dmitry Rozhkov	3892baa4be	Merge pull request #615 from eero-t/gpu-plugin-testing-improvements Gpu plugin testing improvements	2021-04-20 09:47:10 +03:00
Mikko Ylinen	280bdceb2a	sgx: add separate admissionwebhook image Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-04-14 08:09:33 +03:00
Ed Bartosh	31614592c6	Merge pull request #599 from ozhuraki/operator-select-device-type Make it possible to select supported devices in the operator	2021-04-12 19:09:59 +03:00
Ukri Niemimuukko	bb44156d4f	gpu_nfdhook: make memory parsing more robust This adds support for parsing also hex and octal amounts. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-04-09 16:23:48 +03:00
Oleg Zhurakivskyy	6fbf7c9182	operator: README: Document per device deployment Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>	2021-04-08 10:53:04 +00:00
Oleg Zhurakivskyy	2d27602ed0	operator: Add --device command line to operator Add --device command line to operator's main.go which defines the controllers/webhooks to set up. Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>	2021-04-08 10:33:47 +00:00
Eero Tamminen	f9158c1c3b	Update GPU plugin copyrights	2021-04-01 15:20:35 +03:00
Eero Tamminen	8ca19d408f	Fix GPU plugin error messages	2021-04-01 15:20:35 +03:00
Eero Tamminen	384d37ead0	Add test for multiple GPU devices	2021-04-01 15:20:35 +03:00
Eero Tamminen	49354693fb	Fix GPU plugin test setup + better error message Tests fail depending on which order they are run in, unless mocked files are cleaned between test runs. Without this, the next commit would fail.	2021-04-01 15:20:35 +03:00
Mikko Ylinen	97bcecda04	operator: update usage guidelines As the operator container image is available from a registry, we should guide users to use it rather than build and deploy it locally. Further, drop (un)deploy-operator targets in favor of simply using kubectl for deployment. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-03-30 15:33:09 +03:00
Dmitry Shmulevich	c8b5dce247	added an option to create a node label if epc memory is present updated README for SGX device plugin Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@gmail.com>	2021-03-18 11:53:49 -07:00
Ukri Niemimuukko	f89b61f923	add tile count label Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2021-02-26 20:39:48 +02:00
Mikko Ylinen	15ad4ed54b	ci: drop master branch from workflow triggers Also, polish the remaining docs hits to 'master'. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-02-23 10:51:04 +02:00
DougTW	7153923cfc	Edited qat_plugin README Replaced multiple instances of master with main. Reworded line 15 "Verify QAT device plugin is registered" removed 'on master' and corresponding section heading. Related to pr499. Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-18 13:59:40 +02:00
Mikko Ylinen	abfa3496a2	sgx: update SGX SDK/DCAP versions Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-02-18 09:31:28 +02:00
Mikko Ylinen	f8c20905aa	update to cert-manager v1.2.0 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-02-12 15:39:07 +02:00
Mikko Ylinen	37618d4f85	operator: move deviceplugin/v1 CRDs to cluster scope The device plugins daemonsets are cluster wide and currently only one device plugin instance per device is possible so making the corresponding deviceplugin/v1 CRDs non-namespaced (i.e., scope: cluster) fits better. Previously, the device plugin daemonset was deployed in the same namespace as the CR for that device but with the cluster scoped CRDs we default to use the same namespace as the operator, unless overridden via DEVICEPLUGIN_NAMESPACE env variable or a command line parameter to operator manager deployment. Three additional changes in this commit: - enable DSA envtest tests - update controller-runtime to v0.8.1 - change device plugin envtest suite to use klog/v2 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-02-11 11:41:47 +02:00
Mikko Ylinen	c1f609c34a	Merge pull request #560 from DougTW/dm-edits-gpu-plugin edited gpu_plugin README; changed 2 instances of master to main.	2021-02-10 15:00:54 +02:00
Mikko Ylinen	2409427939	Merge pull request #561 from DougTW/dm-edits-operator Edited operator README. Changed 1 instance of master to main, line 78.	2021-02-10 10:36:56 +02:00
Mikko Ylinen	667aa943a4	Merge pull request #563 from DougTW/dm-edits-sgx-plugin Editing sgx_plugin README. Replacing 'master' with 'main'.	2021-02-10 10:35:51 +02:00
Ed Bartosh	d446be3c3d	Merge pull request #558 from DougTW/dm-edits-fpga-adms-readme fpga_admissionwebhook README.md; changed master to main	2021-02-10 10:15:04 +02:00
DougTW	a856f3215d	Editing sgx_plugin README. Replacing 'master' with 'main'. Related to pr499. Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-09 17:17:05 -08:00
DougTW	80a7e4e651	Edited operator README. Changed 1 instance of master to main, line 78. Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-09 16:59:20 -08:00
DougTW	625b30fd1b	Fixes 560. Edited gpu_plugin README. Restored master to line 157 Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-09 16:49:30 -08:00
Mikko Ylinen	965936d8c3	Merge pull request #553 from bart0sh/PR0103-implement-dsa-operator operator: add DSA support	2021-02-09 16:24:41 +02:00
DougTW	28cbebc81b	edited gpu_plugin README; changed 2 instances of master to main. Related to PR 499. Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-08 18:40:47 -08:00
DougTW	467d4082d3	fpga_plugin-readme; changed one instance of master to main. Related to PR 499. Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-08 18:14:34 -08:00
DougTW	5ee1b6ce23	fpga_admissionwebhook README.md; changed master to main Signed-off-by: DougTW <doug.martin@intel.com>	2021-02-08 17:24:46 -08:00
Ed Bartosh	884f8e3dfe	operator: add DSA support Fixes: #443 Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2021-02-09 02:13:27 +02:00
Mikko Ylinen	7561501a51	Merge pull request #550 from dmitsh/ds-ext-res added implementation of EPC extended resource advertiser	2021-02-08 19:53:46 +02:00
Dmitry Shmulevich	3c3a3d1145	added implementation of EPC extended resource advertiser Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@gmail.com>	2021-02-04 17:35:17 -08:00
Mikko Ylinen	e94857ce5d	docs: harmonize device plugins operator naming Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-02-04 15:12:37 +02:00
Mikko Ylinen	0892a34705	move to k8s.io v1.20.x and klog/v2 v2.4.0 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-01-21 15:34:39 +02:00
Dmitry Rozhkov	771b0c7432	Merge pull request #544 from mythi/PR-2021-003 sgx: change getDefaultPodCount() logic	2021-01-13 10:31:16 +02:00
Mikko Ylinen	ed3a650ddd	sgx: change getDefaultPodCount() logic Decouple the default enclaveLimit/provisionLimit from core count. With this change, the default limit is constant and it can be made relative to core count by setting PODS_PER_CORE multiplier via env variable. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-01-12 20:24:46 +02:00
Ed Bartosh	6b208f8acf	documentation: remove kubelet configuration check Removed device plugin socket check from the documentation as device plugin support is enabled by default in Kubelet. Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2021-01-12 13:00:20 +02:00
Mikko Ylinen	da4a9fca96	qat: add note about vfio-pci module parameters Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2021-01-11 18:48:43 +02:00
Ed Bartosh	b007dc26f5	dsa: fix kubectl command line Fixed kubectl command line to get allocatable DSA resources Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-12-30 15:37:16 +02:00
Ed Bartosh	2e4de52f2b	implement DSA demo - Implemented demo image that runs accel-config tests - Added testing instructions to the documentation Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-12-28 14:45:25 +02:00
Ukri Niemimuukko	5d31dca018	gpu_nfdhook: remove devfs dependency This removes the devfs dependency. Sysfs is sufficient for scanning presence of GPUs. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2020-12-23 15:43:48 +02:00
Mikko Ylinen	aef2e1655e	qat: run TestScanPrivate tests in parallel Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-23 11:18:21 +02:00
Mikko Ylinen	26d4b6f3a8	qat: fix device ID validation It looks that for a long time now we have accepted a setup where a valid QAT device ID is accepted as a QAT device resource even though the device is not "enabled" via kernelVfDrivers parameter. Fix device ID validation to skip valid QAT devices that are not explicitly specified in kernelVfDrivers. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-21 14:33:27 +02:00
Mikko Ylinen	85fce2dcab	qat: rework device scanning The updated dp.scan() changes the way how VF devices are detected. The main reason for the change is to take into account cases where the QAT VF driver is not present in the system at all but only the PF driver is loaded (and the SR-IOV devices are enabled). The rework also takes into account bare metal and VM deployments and adds a test case for checking the virtualized environment. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-18 15:33:25 +02:00
Mikko Ylinen	2155a24e73	qat: add new devices and change defaults The plugin now detects/accepts 4xxx and c4xxx devices too and defaults to those drivers that are part of Linux mainline. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-17 15:23:00 +02:00
Mikko Ylinen	621122e456	sgx_epchook: update to cpuid/v2 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-15 19:58:13 +02:00
Ed Bartosh	2e7367eab3	fpga hook: language cleanup Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-12-10 10:58:40 +02:00
Mikko Ylinen	312b771ab7	Merge pull request #494 from bart0sh/PR0093-DSA-draft Implement DSA plugin	2020-12-09 15:15:46 +02:00
Mikko Ylinen	18ec3a449e	qat: move to path/filepath We have both "path" and "path/filepath" but the latter provides everything needed so move it completely. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-08 07:38:20 +02:00
Mikko Ylinen	ad8bbcea21	qat: rework bus-device-function handling The code was stripping out "0000:" (bus) and then adding it back in several places. That's not necessary so this change simplifies QAT VF addr handling by operating using full BDF IDs. Moreover, simplify function calls: use getDpdkDevice() once for each VF device. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-12-08 07:37:16 +02:00
Ed Bartosh	174643436a	implement DSA plugin Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-12-03 17:24:48 +02:00
Dmitry Rozhkov	f0fa9df292	operator: prepare for publishing at operatorhub.io	2020-11-24 18:35:56 +02:00
Mikko Ylinen	d65cb902e6	sgx: move to RFC v4x device API The SGX device nodes have changed from /dev/sgx/[enclave|provision] to /dev/sgx_[enclave|provision] in v4x RFC patches according to the LKML feedback. This change moves to use the new device nodes. Backwards compatibility is provided by adding /dev/sgx directory mount to containers. This assumes the cluster admin has installed the udev rules provided in the README to make the old device nodes as symlinks to the new device nodes. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-11-18 21:17:28 +02:00
Dmitry Rozhkov	5ec466b2eb	add known issue for operator	2020-11-12 11:23:41 +02:00
Alexander D. Kanevskiy	75355c9937	Merge pull request #497 from bart0sh/PR0094-move-GetAPIVersion-out-of-NewPort fpga: move GetAPIVersion call out of NewPort and NewFME	2020-11-11 12:09:13 +02:00
Ed Bartosh	2c73e2a0b3	fpga: move GetAPIVersion call out of NewPort and NewFME This call is implemented by calling ioctl, which raises "open /dev/intel-fpga-port.X: operation not permitted" error when called inside unprivileged container. This breaks FPGA plugin. Calling this API from fpga_tool is still OK, so moving calls there should fix the issue. Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-11-10 16:44:20 +02:00
Dmitry Rozhkov	5f0da56045	Upgrade to k8s v1.19.3	2020-11-10 16:09:20 +02:00
Ed Bartosh	680da54fd9	fpga: improve port init Used generic newPort API instead of device-specific newDflPort and newIntelFpgaPort. Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-11-01 01:47:49 +02:00
Dmitry Rozhkov	25a52b0b74	Merge pull request #478 from bart0sh/PR0091-FPGA-SRIO-V fpga: reimplement device discovering	2020-10-30 10:05:05 +02:00
Mikko Ylinen	0f6eefee23	sgx: add documentation This commit documents the SGX building blocks for Kubernetes and how to deploy them in the cluster. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-10-27 15:02:40 +02:00
Ed Bartosh	243870a707	fpga: reimplement device discovering Reimplemented discovering of the FPGA devices using APIs from pkg/fpga/intel_fpga_linux. The APIs are also used in the fpga_tool utility. The API is more advanced and supports SR-IOV among other things. Fixes: #372 Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-10-26 21:45:52 +02:00
Dmitry Rozhkov	87143355ba	Merge pull request #483 from mythi/sgx-nfd sgx: make SGX NFD kustomization overlay independent	2020-10-26 13:25:36 +02:00
Ukri Niemimuukko	5b5180ae00	gpu_nfdhook memory amount reading from sysfs This adds reading of the GPU memory amount from the sysfs. As a fallback the environment variable GPU_MEMORY_OVERRIDE remains. Another environment variable GPU_MEMORY_RESERVED can be used to reserve a dedicated byte amount outside of kubernetes usage. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2020-10-26 09:45:43 +02:00
Mikko Ylinen	161298190f	sgx: make SGX NFD kustomization overlay independent With the addition of SGX webhook in the operator, full SGX stack depends on having the operator deployed first. SgxDevicePlugin CRD is set to get intel-sgx-plugin and intel-sgx-initcontainer deployed by the operator. As a pre-requisite, node-feature-discovery must be deployed but it is currently deployed via sgx_plugin kustomization overlay only. It's better to allow NFD with the SGX specific settings deployed with a kustomization of its own. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-10-23 12:44:36 +03:00
Mikko Ylinen	e9dec450d6	improve docs for no_proxy when using cert-manager Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-10-21 14:57:41 +03:00
Mikko Ylinen	4e5eae62c4	update to cert-manager v1.0.3 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-10-16 22:37:57 +03:00
Ukri Niemimuukko	505eadaf94	gpu-plugin nfd-hook This adds an nfd-hook for the gpu-plugin, which will create labels for the GPUs that can then be used for POD deployment purposes or creation of GPU extended resources which allow then finer grained GPU resource management. The nfd-hook will install to the host system when the intel-gpu-initcontainer is run. It is added into the plugin deployment yaml. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2020-10-01 12:02:57 +03:00
Kevin Putnam	1d149ffee6	Documentation: Fixes broken links and standardizes headers. Signed-off-by: Kevin Putnam <kevin.putnam@intel.com>	2020-09-22 08:32:21 -07:00
Dmitry Rozhkov	1b82ab9df6	sync README.md files with the current state of the code Closes #356	2020-09-16 10:54:39 +03:00
Mikko Ylinen	33a4f8f546	sgx: add SgxDevicePlugin CRD and admission webhook Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-09-10 15:31:26 +03:00
Ukri Niemimuukko	b2991b94e1	gpu_plugin: reduce topology scanning for high shared dev count For every created device info, a new topology scan is performed in the filesystem. The shared dev count was implemented so that for each shared device, a new device info was created, which resulted in the topology scan happening as many times per Scan-round, as there were shared devs. This fixes the issue by making the device info to be shared among the shared devices. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2020-09-08 18:57:29 +03:00
Dmitry Rozhkov	9bdf3a4def	Merge pull request #440 from mythi/ctrl-runtime-062 go.mod: update controller-runtime to v0.6.2	2020-09-03 12:02:06 +03:00
Dmitry Rozhkov	41e23dab3f	Merge pull request #438 from mythi/updates-20200901 .gitignore + kind + cert-manager v1.0.0	2020-09-03 12:00:33 +03:00
Alexander Kanevskiy	c74cb563dc	Implemented SR-IOV Release/Assign ioctl fpgatool now able to prepare FME via kernel ioctl to release and assign ports for SR-IOV configurations.	2020-09-02 18:16:53 +03:00
Mikko Ylinen	f0d4754d53	move to cert-manager v1.0.0 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-09-02 18:07:05 +03:00
Mikko Ylinen	76aa7b91f0	go.mod: update controller-runtime to v0.6.2 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-09-02 15:16:12 +03:00
Dmitry Rozhkov	71075d4478	lint: enable exportloopref, prealloc and scopelint checks	2020-08-31 11:10:51 +03:00
Dmitry Rozhkov	be713f1c8b	lint: enable errcheck	2020-08-28 16:14:14 +03:00
Mikko Ylinen	6b2148d22c	Merge pull request #431 from rojkov/staticcheck linter: enable staticcheck	2020-08-26 18:08:09 +03:00
Ukri Niemimuukko	7244bd0f25	gpu_plugin: README.md update Move remark about GVT-d to end of introduction. Remove remarks about GVT-g for the time being. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2020-08-25 13:45:10 +03:00
Dmitry Rozhkov	7ff08ee874	linter: enable staticcheck	2020-08-25 09:54:59 +03:00
Mikko Ylinen	a5f648077e	sgx: add NFD EPC source, README and deployment YAMLs Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-08-24 16:33:45 +03:00
Ismo Puustinen	3ab60b4027	sgx: add tests for the plugin. Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>	2020-08-24 16:33:45 +03:00
Ismo Puustinen	8751afb6c7	sgx: add new plugin. The SGX plugin exposes two device files as separate resources: * /dev/sgx/enclave as sgx.intel.com/enclave * /dev/sgx/provision as sgx.intel.com/provision The number of resources is configurable, but it's intended to be equal to the pod count by default, so that any pod requiring access would have it. The access control (who can do SGX remote attestation) is done outside this plugin. Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>	2020-08-24 16:33:45 +03:00
Mikko Ylinen	cd068c797a	ci: update tool versions Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-08-21 17:04:04 +03:00
Dmitry Rozhkov	200e2f8181	operator: add simple FPGA operator combined with FPGA webhook	2020-08-18 17:32:23 +03:00
Ed Bartosh	4794072273	Merge pull request #422 from rojkov/fpga-kubebuilder fpga webhook: reimplement to use kubebuilder framework	2020-08-18 13:31:31 +03:00
Dmitry Rozhkov	a62c6f7d5e	fpga webhook: reimplement to use kubebuilder framework Simplify upgrade procedure to newer versions of kubernetes by relying on the kubebuilder framework rather than using codegen directly. Closes #377	2020-08-17 12:09:03 +03:00
Mikko Ylinen	1cfb849eef	qat: update QAT software stack Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-08-12 23:08:59 +03:00
Dmitry Rozhkov	e87d94d4fb	fpga: finalize plugin kustomization closes #318	2020-07-01 11:57:45 +03:00
Mikko Ylinen	2f16509fe3	Merge pull request #376 from rojkov/operator-v3 operator: initial version with gpu and qat controllers	2020-06-25 15:49:49 +03:00
Dmitry Rozhkov	6b2fa0a264	operator: initial version with gpu and qat controllers	2020-06-25 13:48:41 +03:00
Alexander D. Kanevskiy	79ef9d54e2	Merge pull request #397 from rojkov/nakedret fpga_tool: enable nakedret check	2020-06-24 20:52:33 +03:00
Dmitry Rozhkov	7177409f19	fpga webhook: rework deployment to use kustomize Contributes to #318	2020-06-23 15:53:36 +03:00
Dmitry Rozhkov	339cdee501	linter: enable nakedret check	2020-06-23 12:04:35 +03:00
Mikko Ylinen	bc22a07638	Merge pull request #398 from rojkov/gosec linter: enable gosec check	2020-06-16 16:16:02 +03:00
Dmitry Rozhkov	73aea0aa1b	linter: enable gosec check	2020-06-11 17:56:24 +03:00
Dmitry Rozhkov	828e12f896	doc: add note about proxy to webhook doc	2020-06-11 16:06:54 +03:00
Dmitry Rozhkov	70f862f2aa	add golangci linter In this initial commit the following checks are disabled due to excessive amount of changes required: - dupl (duplicate code) - funlen (function length) - goerr113 (errors handling expressions) - gomnd (magic numbers) - gosec (security) - nakedret (naked returns) - wsl (forces to use empty lines) - errcheck (checking for unchecked errors) - staticcheck (static analysis)	2020-06-08 14:01:13 +03:00
Dmitry Rozhkov	aabc45cbb5	gpu: increase code coverage for unit tests	2020-05-19 16:14:40 +03:00
Dmitry Rozhkov	c63dbf61b8	fpgawebhook: move to v2 API of fpga.intel.com group	2020-05-04 15:43:20 +03:00
Dmitry Rozhkov	99fcb69d33	fpga: compress fpga AF resource names	2020-04-29 11:59:50 +03:00
Dmitry Rozhkov	6c2eacfae5	webhook: remove mode of operation fpga: make AFU resource name 63 char long webhook: drop mode from README webhook: extend mappings description webhook: tighten CRD definitions webhook: drop mapping to non-existing afuId explicitly state mappings names can be in any format use consistent terminology across fpga webhook and plugin	2020-04-22 13:55:43 +03:00
Dmitry Rozhkov	8fc187f4d8	move to k8s v1.18.2 release Also fix the plugins and e2e tests	2020-04-17 12:40:18 +03:00
Mikko Ylinen	e4a57899d2	qat: fix UIO mounts DPDK uses /sys/class/uio/uioX/device/[control\|resource] and we had special mounts for the individual uioX paths. However, it turned out this wasn't working as expected: host's /sys/class/uio/uioX/device/ was mounted to container's /sys/class/uio and DPDK failed to find uioX/device/[control\|resource] files. Moreover, workloads requesting more than one QAT resource, still saw only one path. While cri-o/containerd give sysfs read-only mounts, DPDK needs device/config RW. Therefore, we need to mount host /sys/class/uio/uioX to container /sys/class/uio/uioX for each requested device. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-04-01 09:08:55 +03:00
Ed Bartosh	2ec6677ab0	fpga tests cleanup - used t.Run api for better visibility - used ioutil.TempDir to create temporary directories Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-03-31 14:36:15 +03:00
Ed Bartosh	a668c596b2	fpga_crihook: improve unit tests - increased test coverage to 91.4% - cleaned up the code - removed unused test data Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-03-31 11:57:06 +03:00
Alek Du	cfbb69ddd6	vpu: improve test coverage Changed code a little bit to improve test coverage: * call Scan in test code * call Scan without hddl socket * call Scan with 0 SharedDevNum * move SharedDevNum in newDevicePlugin * use Ticker instead of Sleep Signed-off-by: Alek Du <alek.du@intel.com>	2020-03-31 14:12:59 +08:00
Graham Whaley	71d08224ee	fpga: move to using klog for logs and debug Move all the fpga components to using klog for logging and debug. This includes replacing our homebrew 'fatal()' with klog.Error(). Modify the deployment files to move from `-debug` to `-v`, and set their default level to '1' (Info), rather than full debug mode ('4'). Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-03-24 14:31:53 +00:00
Ed Bartosh	cf731f3c18	fpga plugin: increase test coverage	2020-03-24 15:46:39 +02:00
Ed Bartosh	29be713a96	fpga_plugin: use time.Ticker instead of time.Sleep Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2020-03-24 13:32:35 +02:00
Mikko Ylinen	a6bf48f8db	dpdkdrv: improve unit test coverage Add NewDevicePlugin() tests to improve test coverage. This also contributes to "input validation" (part of #321) that wasn't done properly before. Fixes: #325 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-03-24 08:23:44 +02:00
Mikko Ylinen	336d2b34bc	Merge pull request #340 from grahamwhaley/20200316_klog_vpu vpu: move to using klog	2020-03-24 08:10:21 +02:00
Graham Whaley	626bbb6ee2	gpu: move to using klog Move from fmt to klog for logging and debug. Also add an extra info level message noting when we find new devices. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-03-20 11:54:38 +00:00
Graham Whaley	82713d0cf9	vpu: move to using klog Move to using klog for logging and debug for vpu plugin. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-03-20 11:38:20 +00:00
Mikko Ylinen	15d4b10715	Merge pull request #329 from grahamwhaley/20200312_klog klog: Add klog logging to framework and qat plugins	2020-03-19 16:59:44 +02:00
Graham Whaley	f8dbc896a1	devicemanager: qat: use klog for logging and debug Move the framework, and the qat driver, to use `klog` for logging and debug. This has some noticeable effects: 1) Our default log output gains a bunch of annotation: From: QAT device plugin started in 'dpdk' mode To: I0312 11:51:02.057728 6053 qat_plugin.go:64] QAT device plugin started in 'dpdk' mode (there is now a command line option to drop those annotations if necessary). 2) We gain a bunch of command line parameters from klog for controlling log levels and output. We go from 5 arguments to 17: --- Usage of ./cmd/qat_plugin/qat_plugin: -add_dir_header If true, adds the file directory to the header -alsologtostderr log to standard error as well as files -debug enable debug output -dpdk-driver string DPDK Device driver for configuring the QAT device (default "vfio-pci") -kernel-vf-drivers string Comma separated VF Device Driver of the QuickAssist Devices in the system. Devices supported: DH895xCC,C62x,C3xxx and D15xx (default "dh895xccvf,c6xxvf,c3xxxvf,d15xxvf") -log_backtrace_at value when logging hits line file:N, emit a stack trace -log_dir string If non-empty, write log files in this directory -log_file string If non-empty, use this log file -log_file_max_size uint Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800) -logtostderr log to standard error instead of files (default true) -max-num-devices int maximum number of QAT devices to be provided to the QuickAssist device plugin (default 32) -mode string plugin mode which can be either dpdk (default) or kernel (default "dpdk") -skip_headers If true, avoid header prefixes in the log messages -skip_log_headers If true, avoid headers when opening log files -stderrthreshold value logs at or above this threshold go to stderr (default 2) -v value number for the log level verbosity -vmodule value comma-separated list of pattern=N settings for file-filtered logging --- 3) Our `-debug` flag is now replaced by the `klog` `-v n` flag. NOTE: This is potentially a minor breaking change. Applying this debug overlay to any previous (pre-klog edit) images will cause the container to fail to launch, as it will not recognise the new `-v` arguments. We also update the kustomize deployment to move from using DEBUG env vars to adding a VERBOSITY var that controls both the log verbosity and now the debug mode enabling. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-03-19 11:20:48 +00:00
Mikko Ylinen	b021152eb8	qat: kerneldrv: fix device registration when run in VMs Kerneldrv checks for available devices based on adf_ctl output. We only accepted two cases: PFs if IOMMU is off and VFs if IOMMU is on. The right check is to only skip PFs if IOMMU is on and allow other cases. This fixes two scenarios: when run in VMs, we accept VFs regardless of (v)IOMMU presence. Moreover, do not hard code domain '0000:' because it is not the case always. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-03-16 20:17:57 +02:00
Alek Du	7c2bc3bda0	vpu_plugin: add kustomizations - Default deployment: `kubectl apply -k deployments/vpu_plugin` - Default deployment does not specify namespace anymore (was: `kube-system`) - Variant: deploy to `kube-system` instead of user-defined namespace (or `default`) `kubectl apply -k deployments/vpu_plugin/overlays/namespace_kube-system` - VPU plugin README updated. - Change volume mounts to readonly when possible Signed-off-by: Alek Du <alek.du@intel.com>	2020-02-25 14:53:26 +08:00
Mikko Ylinen	332fbdc35c	Merge pull request #300 from askervin/55B_fpga_kustomization fpga plugin kustomization, stage 2	2020-02-24 22:20:27 +02:00
Antti Kervinen	5fe8174077	fpga_plugin: add kustomization files - Add script/fpga-plugin-prepare-for-kustomization.sh, creates contents for the secret needed by the fpga plugin webhook. - Single-command fpga plugin + webhook deployment for both modes: - `kubectl create -k deployments/fpga_plugin/overlays/af` - `kubectl create -k deployments/fpga_plugin/overlays/region` - Change intel-fpga-plugin image CMD to ENTRYPOINT.	2020-02-24 16:32:26 +02:00
Ed Bartosh	ca5d144e8e	Merge pull request #296 from mythi/gomod fpga_plugin: drop dependency to k8s.io/kubernetes	2020-02-24 14:10:12 +02:00
Ed Bartosh	13836c2d09	Merge pull request #299 from mythi/gitclone READMEs: use git clone to get the code	2020-02-24 12:42:32 +02:00
Mikko Ylinen	61c135d1d6	fpga_plugin: drop dependency to k8s.io/kubernetes This commit drops fpga_plugin dependency to k8s.io/kubernetes which was used to get GetHostname(). After this change, the plugin node name can be set using the new -node-name parameter. The default value is read from the NODE_NAME environment variable. If the node annotation override check fails, we continue with the default mode parameter and do not exit like we did previously. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-02-21 18:48:30 +02:00
Mikko Ylinen	f145541caf	READMEs: use git clone to get the code go get'ing does not work due to our k8s.io/kubernetes dependency so guide users to use git clone to get the code. Fixes: #290 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-02-20 08:04:07 +02:00
Antti Kervinen	d04aa77ac5	fpga_plugin: orchestration/orchestrated fixed in READMEs Not touching "orchestration programmed". Fixing only instances where this refers directly to the mode recognized by the webhook-deploy.sh script. Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>	2020-02-17 16:32:54 +02:00
Dmitry Rozhkov	3db440d2d4	Merge pull request #288 from askervin/kustomize-gpu gpu_plugin: add kustomizations	2020-02-11 10:54:14 +02:00
Ed Bartosh	1f4928790f	Implement function for DeviceInfo creation - Made DeviceInfo fields private - Implement NewDeviceInfo constructor	2020-02-07 15:26:37 +02:00
Antti Kervinen	d568f050c5	gpu_plugin: add kustomizations - Default deployment: `kubectl apply -k deployments/gpu_plugin` - Default deployment does not specify namespace anymore (was: `kube-system`). - Variant: deploy only on nodes with Intel GPU label by NFD: `kubectl apply -k deployments/gpu_plugin/overlays/nfd_labeled_nodes` - Variant: deploy to `kube-system` instead of user-defined namespace (or "default"): `kubectl apply -k deployments/gpu_plugin/overlays/namespace_kube-system` - GPU plugin README updated. Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>	2020-02-07 14:56:52 +02:00
Mikko Ylinen	f036b72cff	Merge pull request #286 from askervin/kustomize qat_plugin: add kustomizations	2020-02-06 13:53:08 +02:00
Antti Kervinen	ec8eef6daa	qat_plugin: add kustomizations - Default deployment: `kubectl apply -k deployments/qat_plugin` - Debug variant: `kubectl apply -k deployments/qat_plugin/overlays/debug` - Single-resource `yaml` naming convention: applying x-y-z.yaml configures k8s resource named x-y-z. - QAT plugin README updated. Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>	2020-02-05 15:48:57 +02:00
Mikko Ylinen	28a89a2820	qat: README: clarify crypto-perf usage crypto-perf instructions were outdated and had implicit assumptions about the environment. More specifically: Clear Linux builds DPDK libraries as shared so for the compress and crypto test applications to run, the memory and QAT PMD libraries must be explicitly preloaded using '-d' parameter. Also, the test-crypto1 and test-compress1 deployments expect the cluster is configured with CPU Manager's static policy. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2020-02-04 13:32:10 +02:00
Mikko Ylinen	0c89f242aa	Merge pull request #283 from alekdu/fix_readme vpu: refactor the vpu plugin readme	2020-02-04 13:28:38 +02:00
Alek Du	6321c424ca	vpu: refactor the vpu plugin readme Just follow the standard format to fix the vpu plugin readme. Also added the ubuntu OpenVINO demo job long logs. Signed-off-by: Alek Du <alek.du@intel.com>	2020-02-04 18:15:27 +08:00
Ed Bartosh	20ea365e62	Merge pull request #268 from grahamwhaley/20200117_fpga_readme fpga: docs: update all the READMEs	2020-02-03 12:52:09 +02:00
Ed Bartosh	7e6e053349	Merge pull request #279 from rojkov/cleanup Cleanup	2020-01-31 15:59:34 +02:00
Graham Whaley	07e902334f	fpga: crio: docs: update README Update the CRI-O webhook README, adding notes about what it is and does, and that it is normally installed as part of the device plugin daemonset. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-30 16:19:19 +00:00
Graham Whaley	f39a374e9d	fpga_admission: docs: expand README Expand the FPGA webhook admission controller README. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-30 16:19:19 +00:00
Graham Whaley	27bc562478	fpga plugin: docs: Clean up and expand README Expand and re-arrange the README. Add some details about what the plugin and other FPGA components provide. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-30 16:19:19 +00:00
Dmitry Rozhkov	456c8f3ff1	fpga: fix stutter reported by golint	2020-01-30 15:17:27 +02:00
Dmitry Rozhkov	7695e450de	fpga_crihook: remove unused struct field	2020-01-29 17:17:06 +02:00
Dmitry Rozhkov	3a845cfe15	fpga: rename files to make them linux-only	2020-01-29 17:17:06 +02:00
Graham Whaley	6537e38499	gpu: do not fail if device scanning fails If we fail to scan for GPU devices (note, that is potentially different from not finding any devices during a scan), then warn on it, and go around the poll loop again. Do not treat it as a fatal error or we might end up in a re-launch death deploy loop... Of course, getting a warning in your logs every 5s could also be annoying, but is somewhat 'less fatal'. Fixes: #260 Fixes: #230 Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-29 09:24:50 +00:00
Mikko Ylinen	9d76946b49	Merge pull request #269 from grahamwhaley/20200121_qat_readme qat: docs: Update the README	2020-01-29 07:29:27 +02:00
Alek Du	887e56e780	VPU: Add Intel Movidius MyriadX VPU plugin support This patch is to support Intel VCAC-A card (with MyriadX 2485 VPUs), for other later on VPUs, we will reuse this plugin and add support. VCAC-A board info is at: https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/media-analytics-vcac-a-accelerator-card-by-celestica-datasheet.pdf Also add openvino HDDL VPU demo for Intel VCAC-A card. Signed-off-by: Alek Du <alek.du@intel.com>	2020-01-28 23:17:50 +08:00
Graham Whaley	1ca19696e0	qat: docs: Update the README Update the QAT README. Add some descriptions. Add information about the command line and config options. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-27 16:51:00 +00:00
Graham Whaley	958ab2aa7e	fpga: docs: Add diagrams for FPGA modes Add draw.io and their generated PNG files for both orchestrated and preprogrammed FPGA modes. These will then be used in the documentation. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-27 14:55:15 +00:00
Graham Whaley	88cec1fd16	fpga_tool: doc: add a basic README The fpga_tool had no README. Add a basic one. Desired as we should at least reference the tool from the fpga_plugins document. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-17 16:36:40 +00:00
Graham Whaley	79a86c10e8	docs: gpu: Add more details, re-arrange section order Re-arrange the section order a little (such as putting the use of the DaemonSet before the sudo hand-deploy), and add a lot more detail of what to expect, and how to check if the pod has launched correctly. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-17 13:34:13 +00:00
Graham Whaley	6705a8e461	docs: gpu: add high level details to README Fill out the introduction to the GPU README to give some details around what the plugin supports and how. Signed-off-by: Graham Whaley <graham.whaley@intel.com>	2020-01-16 15:27:22 +00:00
Ed Bartosh	7aca59e032	Merge pull request #245 from rojkov/update-v1.17.0 bump k8s dependencies up to v1.17.0	2020-01-15 13:07:55 +02:00
Ed Bartosh	1b1206e39a	fpga: change webhook service port Changed port webhook is listening on from 443 to 8443 to be able to bind to it from non-root user account.	2020-01-14 16:31:12 +02:00
Dmitry Rozhkov	814e2e1a50	bump k8s dependencies up to v1.17.0	2020-01-09 11:19:58 +02:00
Ed Bartosh	06c07a5961	deployments/fpga_plugin: limit host mounts The default deployment gives rather wide host mounts. Limited sysfs mount only to the subdirectory the plugin needs. Mounted sysfs and dev mounts read-only. Added notes that FPGA plugin can be run as non-root user.	2019-12-12 13:07:19 +02:00
Mikko Ylinen	fd631fc31c	deployments/gpu_plugin: limit host mounts The default deployment gives rather wide host mounts. We can limit the mounts only to the subdirectories the plugin needs and mount them read-only. Also, add notes that both QAT and GPU plugins can be run as non-root user. Fixes: #228 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2019-12-11 12:54:36 +02:00
Alexander Kanevskiy	67825dcc06	Fix admission hook for pods generated by ReplicaSet In the pods generated automatically by Deployment/ReplicaSets fields name and namespace might be missing. We can use information about namespace from request itself.	2019-10-25 17:40:42 +03:00
Ed Bartosh	57b4927eda	crihook: simplified NewHookEnv signature	2019-09-16 12:56:35 +03:00
Ed Bartosh	8d21aff5ac	crihook: removed unused field	2019-09-16 12:51:50 +03:00
Ed Bartosh	73ac87cd8d	crihook: fix forgotten error check	2019-09-16 12:50:29 +03:00
Ed Bartosh	a6b3a217e8	crihook: fix ineffective Errorf call Returned error instead of calling errors.Errorf with no effect.	2019-09-16 12:49:26 +03:00
Ubuntu	4f28657b6b	fpga: fixed documentation and demo	2019-09-10 19:30:20 -05:00
Alexander Kanevskiy	cd263ba287	Update README file for fpga_crihook Initcontainer is now built in main build process, no need to download anything special. Added note about checking OCI hooks configuration parameter in CRI-O Fixes: #192	2019-08-25 02:37:07 +03:00
Alexander Kanevskiy	2430e204d5	fpga_tool: UX improvements - user readable output for fpgainfo/fmeinfo/portinfo commands - new commands: list, list-fme, list-port - new -q flag to suppress headers, progress and too verbose messages - install command will now fail if destination file already exists - new --force flag: allows overwrite files in install command - removed development and debug output	2019-08-25 02:37:07 +03:00
Alexander Kanevskiy	71bb38f496	Implemented native FPGA flashing Removed dependency to OPAE libraries	2019-08-25 02:37:01 +03:00
Ed Bartosh	de9df8373e	fpga_plugin: support in-tree kernel driver Extended fpga plugin to support both in-tree(DFL) and out-of-tree (OPAE) kernel drivers. - fpga_crihook: move JSON parsing to separate functions - decreased cyclomatic complexity of the CRI hook main() function - increased readability - increased test coverage Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2019-08-24 18:27:15 +03:00
Alexander Kanevskiy	186ec6613c	FPGA: migrate to ClearLinux environment - Migrate to OPAE 1.3.2 - Build all the tools from the source - ignore files in workspace - minimal fpga_tool utility to check gbs/aocx file parsing and flashing - implemented kernel IOCTL based flashing of bitstreams - add PCI and sysfs functions	2019-08-24 02:55:19 +03:00
Mikko Ylinen	832e4aaf3c	crypto-perf: add kustomization and move to deployments We plan to use crypto-perf for simple QAT testing. This commit adds kustomization to make the deployment easier. The original .yaml is also moved to deployments/ with some changes. For instance, it turns out also vfio-pci mode with DPDK needs CAP_SYS_ADMIN (See PR: #187 which states that only igb_uio would need it). kustomize is available as part of kubectl since kubernetes v1.14. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2019-08-20 22:01:44 +03:00
Dmitry Rozhkov	a2debf6fb4	qat: fix typo	2019-08-19 12:52:16 +03:00
Mikko Ylinen	d92b528ab6	qat: document kerneldrv mode and build instructions -mode kerneldrv comes with no documentation. This patch adds a few notes about it and instructions how to get it built if a user chooses to have it enabled. Closes: #197 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2019-08-19 09:56:57 +03:00
Dmitry Rozhkov	8390388f89	qat: make users explicitly opt in to have kernel mode compiled in	2019-08-14 13:41:44 +03:00
Mikko Ylinen	08a079ead2	crypto-perf: use IPC_LOCK to ensure mmap() works Change SYS_ADMIN to IPC_LOCK capability to ensure DPDK gets to mmap() hugepages. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2019-06-12 07:31:01 +03:00
Dmitry Rozhkov	156970adca	Merge pull request #185 from mythi/iommu qat_plugin: kerneldrv: register VF devices when IOMMU is on	2019-05-31 14:08:29 +03:00
Mikko Ylinen	4ba6af14b9	qat_plugin: kerneldrv: register VF devices when IOMMU is on When IOMMU is on in the system, the physical function (PF) devices cannot be used. This prevented using kerneldrv as it was only written to work with PFs. However, QAT bare metal functions can also be used when IOMMU is enabled. In this case, they must be used via virtual functions (VF). This commit makes it possible to use kerneldrv when IOMMU is on. The added side benefit is we can now slice the same QAT HW for both "dpdk" and "kernel" usages simultaneously. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2019-05-29 22:10:26 +03:00
Alexander Kanevskiy	4dc19851ee	Pass correct PCI bus/device/function to fpgaconf Partially helps with #148	2019-05-29 16:08:52 +03:00
Mikko Ylinen	4a80aa83e2	qat_plugin: kerneldrv: get device.id from inst_id In adf_ctl output, qat_devX is a sequence number that includes both PF and VF devices: qat_dev0 - type: c6xx, inst_id: 0, node_id: 1, bsf: 84:00.0, #accel: 5 #engines: 10 state: up qat_dev1 - type: c6xx, inst_id: 1, node_id: 1, bsf: 85:00.0, #accel: 5 #engines: 10 state: up qat_dev2 - type: c6xx, inst_id: 2, node_id: 1, bsf: 86:00.0, #accel: 5 #engines: 10 state: up qat_dev3 - type: c6xxvf, inst_id: 0, node_id: 1, bsf: 84:01.0, #accel: 1 #engines: 1 state: up qat_dev4 - type: c6xxvf, inst_id: 1, node_id: 1, bsf: 84:01.1, #accel: 1 #engines: 1 state: up ... X cannot be used as the config file identifier because it does not match the real id of the device. inst_id gives this so move to use that to find the right config file. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2019-05-28 15:49:45 +03:00
Alexander D. Kanevskiy	5a7c5079d4	Merge pull request #183 from rojkov/admission-readme gpu: fix grammar	2019-05-24 17:29:54 +02:00
Alexander D. Kanevskiy	dd70b69c76	Merge pull request #182 from rojkov/verbosity gpu: add log messages for not found cards	2019-05-24 16:33:50 +02:00
Dmitry Rozhkov	f5d5cd32ed	gpu: fix grammar	2019-05-24 16:45:59 +03:00
Dmitry Rozhkov	44ff734be6	gpu: add log messages for not found cards Let a user know the plugin can't find any Intel GPU if that's the case. It might be cumbersome to realize that the plugin runs on a host which doesn't have any Intel GPUs. Also make the code less nested for better readability.	2019-05-24 16:19:06 +03:00
Dmitry Rozhkov	ea63ad94f2	webhook: add note on mapping applicability	2019-05-24 10:28:37 +03:00
Dmitry Rozhkov	da132f6584	qat: add kernel mode plugin	2019-04-25 14:15:32 +03:00
Rivera Gonzalez, Julio C	22b9c61c4d	Adding support for dh895xcc devices This commit adds the possibility to qat2_plugin use pci, devices with communication chipset 8925 to 8955. Signed-off-by: Rivera Gonzalez, Julio C <julio.c.rivera.gonzalez@intel.com>	2019-04-25 14:14:09 +03:00
Dmitry Rozhkov	ca569b0f70	qat: initial support for openssl QAT engine	2019-04-25 14:14:09 +03:00
Ed Bartosh	ea5a06dfae	Merge pull request #172 from rojkov/issue-167-namespaced-fpga-mappings fpga: mutate pods with CRDs from its corresponding namespace	2019-04-09 14:35:56 +03:00
Dmitry Rozhkov	565045f6f2	fpga: mutate pods with CRDs from its corresponding namespace CRDs for AF or Region mappings are scoped to namespaces. So an admitted pod has to be mutated with CRDs existing in the same namespace as the pod's. Closes #167	2019-04-02 12:17:08 +03:00
Dmitry Rozhkov	4bf8c5e685	Fix compilation issues	2019-02-19 16:12:56 +02:00
nolancon	52df9329e4	Re-order devices in scan loop Fixes: #146 Removed whitespace	2019-01-23 13:41:22 +00:00
Dmitry Rozhkov	54332c5eea	announce deviceplugin API public	2019-01-21 17:20:01 +02:00
Dmitry Rozhkov	7662cb9154	extend API to receive full specs instead of strings	2019-01-21 17:15:27 +02:00
Dmitry Rozhkov	58b62f579b	qat: fix numbering of env vars An `Allocate()` request can be used to allocate resources for many containers thus `counter` needs to be reset for each container response.	2018-12-12 13:42:05 +02:00
ssehgal	100ecf8340	Improving consumption of devices by updating the environment variable name based on number of devices requested in a container(e.g. QAT0, QAT1)	2018-12-05 15:11:23 +00:00
nolancon	1bb035cc64	PostAllocate implemented in QAT device plugin	2018-12-05 15:11:23 +00:00
ssehgal	eb6d48a512	QAT README update and crypto perf image tag correction	2018-12-03 14:03:55 +00:00
Ed Bartosh	1215bc7fb7	admissionwebhook: fix region regexp Region regexp doesn't allow to have dots, which results in incorrect matching of arria10.dcp1.0 region.	2018-11-28 19:56:35 +02:00
Mikko Ylinen	794b3077bd	qat_plugin: readme: list all known VF devices Not all QAT chips (e.g, 37c9) are available in pci.ids which makes "grep QAT" to not show them. Scan all known VF PCI ids in a loop to ensure all configured devices are shown. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2018-11-28 10:32:31 +02:00
Mikko Ylinen	187f8040f0	qat_plugin: use vfio-pci as the default driver vfio-pci uses IOMMU memory protection and is a safer default. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2018-11-28 10:32:31 +02:00
Frederik Carlier	d6016dedf9	Fix typos	2018-11-22 20:44:00 +00:00
Mikko Ylinen	00bbe922de	qat: deployment: set parameters via ConfigMap For easier deployments, fetch plugin command line arguments from ConfigMap. When using ConfigMaps, qat_plugin.yaml needs no changes and can always be used as is. qat_plugin_default_configmap.yaml uses built-in defaults. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2018-11-20 13:43:00 +02:00
Dmitry Rozhkov	c2b635e627	webhook: reformat source code with gofmt 1.11	2018-10-04 11:03:24 +03:00
Dmitry Rozhkov	06487dcded	crihook: do program multiple devices at once	2018-10-04 10:19:23 +03:00
Dmitry Rozhkov	6ce053a0a6	crihook: drop unused test data	2018-10-04 10:19:23 +03:00
Dmitry Rozhkov	dc21749a83	crihook: optimize regexp application	2018-10-04 10:19:23 +03:00
Dmitry Rozhkov	f1623cc5e9	webhook: add support for multiple FPGAs per container	2018-10-04 10:19:23 +03:00
Dmitry Rozhkov	90776a63c7	webhook: make debug message meaningful	2018-10-04 10:19:23 +03:00
Ed Bartosh	14b4168cbd	add GPU plugin deployment Added DaemonSet yaml Added deployment instructions to plugin's README	2018-09-14 13:55:08 +03:00

533 Commits