Commit Graph

88 Commits

Author SHA1 Message Date
Dmitry Rozhkov
5f0da56045 Upgrade to k8s v1.19.3 2020-11-10 16:09:20 +02:00
Ukri Niemimuukko
c935570bab operator: GPU-plugin initImage
This adds the initImage field to the custom resource definition
and takes it into use.

The fpga webhook image validation function is split off into a
separate file.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2020-11-09 20:55:12 +02:00
Mikko Ylinen
a8105befe0 demo: kustomize sgx sample deployments
adding kustomization to deploy sample jobs that demonstrate

1. launching of plain sample enclave application
2. SGX ECDSA quote generation "out-of-proc" using aesmd
3. SGX ECDSA quote generation "in-proc"

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-10-27 15:02:40 +02:00
Dmitry Rozhkov
87143355ba
Merge pull request #483 from mythi/sgx-nfd
sgx: make SGX NFD kustomization overlay independent
2020-10-26 13:25:36 +02:00
Mikko Ylinen
0bffaf2f2d SGX: provide SGX aesmd sample
SGX aesmd (architectural enclave service daemon) can be used for SGX
DCAP Quote Generation. This commit adds a sample deployment that by
default talks to an Intel reference PCCS (Provisioning Certificate
Caching Service).

The default config provided is for a "single node" cluster that has
the PCCS service on localhost.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-10-23 13:21:17 +03:00
Mikko Ylinen
790bfd0fd2 operator: add sgxdeviceplugin-sample CRD
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-10-23 13:20:20 +03:00
Mikko Ylinen
161298190f sgx: make SGX NFD kustomization overlay independent
With the addition of SGX webhook in the operator, full SGX stack
depends on having the operator deployed first. SgxDevicePlugin CRD
is set to get intel-sgx-plugin and intel-sgx-initcontainer deployed
by the operator.

As a pre-requisite, node-feature-discovery must be deployed but it
is currently deployed via sgx_plugin kustomization overlay only.

It's better to allow NFD with the SGX-specific settings to be deployed
with a kustomization of its own.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-10-23 12:44:36 +03:00
Mikko Ylinen
f0a6302282 CRDs: disable CRD conversion webhooks
We currently build using trivialVersions=true and don't deal with
multiversion APIs and their conversion webhooks.

Therefore, drop the registration of the conversion webhooks.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-10-14 14:48:40 +03:00
Mikko Ylinen
e054440a32 webhooks: move to admissionregistration.k8s.io/v1
With controller-gen 0.4.0, admissionregistration defaults to v1 API.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-10-14 14:48:40 +03:00
Ukri Niemimuukko
505eadaf94 gpu-plugin nfd-hook
This adds an nfd-hook for the gpu-plugin, which will create labels
for the GPUs. The labels can then be used for pod deployment purposes
or for the creation of GPU extended resources, which in turn allow
finer-grained GPU resource management.

The nfd-hook will install to the host system when the
intel-gpu-initcontainer is run. It is added into the plugin deployment
yaml.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2020-10-01 12:02:57 +03:00
Mikko Ylinen
335ca93d39 qat: add kustomize overlay to enable SR-IOV
This commit adds two initcontainers in a kustomize overlay to QAT
deployment. The overlay can be used to prepare QAT setup on a freshly
booted system.

Note: containerd/cri-o seem to have issues mounting sysfs rw even
if the container is privileged. Therefore, we do a special /sys:/sys
bind mount for 'cat sriov_totalvfs | tee sriov_numvfs' to work.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-09-15 07:39:25 +03:00
Mikko Ylinen
33a4f8f546 sgx: add SgxDevicePlugin CRD and admission webhook
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-09-10 15:31:26 +03:00
Mikko Ylinen
f0d4754d53 move to cert-manager v1.0.0
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-09-02 18:07:05 +03:00
Dmitry Rozhkov
378620b54b
Merge pull request #434 from mythi/update-20200828
operator updates
2020-09-01 14:49:09 +03:00
Mikko Ylinen
d8cd5814d7 operator: regenerate CRDs and small webhook/controller updates
This commit also changes validatePluginImage() to allow the
image version as a parameter so that it can be used by
other webhooks too.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-08-31 11:29:04 +03:00
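To illustrate why passing the version as a parameter makes such a check reusable across webhooks, here is a minimal sketch. The commit does not show the real signature or logic of validatePluginImage(), so the parameters, the exact-match comparison, and the version string "0.19.0" below are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// validatePluginImage checks that image names expectedName with the tag
// expectedVersion. Hypothetical signature: the real function's parameters
// and comparison rules are not shown in the commit message.
func validatePluginImage(image, expectedName, expectedVersion string) error {
	// Drop any registry/repository prefix such as "intel/" or "localhost:5000/intel/".
	ref := image[strings.LastIndex(image, "/")+1:]

	name, tag, found := strings.Cut(ref, ":")
	if !found {
		return fmt.Errorf("image %q has no version tag", image)
	}
	if name != expectedName {
		return fmt.Errorf("expected image name %q, got %q", expectedName, name)
	}
	if tag != expectedVersion {
		return fmt.Errorf("expected image version %q, got %q", expectedVersion, tag)
	}
	return nil
}

func main() {
	// Each webhook can call the same helper with its own name and version.
	fmt.Println(validatePluginImage("intel/intel-gpu-plugin:0.19.0", "intel-gpu-plugin", "0.19.0"))
	fmt.Println(validatePluginImage("intel/intel-sgx-plugin:0.18.0", "intel-sgx-plugin", "0.19.0"))
}
```

The sketch deliberately ignores digest references (`@sha256:...`) and version ranges; a real webhook would likely need to handle both.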
Mikko Ylinen
597b985cdf sgx: move hookinstall job to an initcontainer
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-08-28 11:01:35 +03:00
Mikko Ylinen
a5f648077e sgx: add NFD EPC source, README and deployment YAMLs
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-08-24 16:33:45 +03:00
Dmitry Rozhkov
200e2f8181 operator: add simple FPGA operator combined with FPGA webhook 2020-08-18 17:32:23 +03:00
Dmitry Rozhkov
a62c6f7d5e fpga webhook: reimplement to use kubebuilder framework
Simplify upgrade procedure to newer versions of kubernetes by relying on the
kubebuilder framework rather than using codegen directly.

Closes #377
2020-08-17 12:09:03 +03:00
Dmitry Rozhkov
e87d94d4fb fpga: finalize plugin kustomization
closes #318
2020-07-01 11:57:45 +03:00
linjiach
9cdb9a1446 add mappings for d5005-matrix-mult-orchestrated 2020-06-29 14:01:00 +00:00
Ed Bartosh
0c9831bf5c mapping-collection: add mappings for arria10.dcp1.2-nlb3-preprogrammed
This mapping will be used in the new demo screencast for FPGA plugin
deployment in preprogrammed mode.
2020-06-29 12:01:17 +03:00
Mikko Ylinen
2f16509fe3
Merge pull request #376 from rojkov/operator-v3
operator: initial version with gpu and qat controllers
2020-06-25 15:49:49 +03:00
Dmitry Rozhkov
6b2fa0a264 operator: initial version with gpu and qat controllers 2020-06-25 13:48:41 +03:00
linjiach
179a70179d
extend afu id length to 40 for aocx unique id
The OpenCL bitstream (.aocx) has a unique ID longer than 32 characters. Extend the limit to 40 to accommodate it.
2020-06-25 00:31:50 -07:00
Dmitry Rozhkov
7177409f19 fpga webhook: rework deployment to use kustomize
Contributes to #318
2020-06-23 15:53:36 +03:00
Mikko Ylinen
c8ed2bb798 deployments: qat: add an overlay for Apparmor annotations
Some Ubuntu systems may run with Apparmor LSM policy enforcement, causing
the default QAT daemonset to fail with (un)bind errors.

This commit adds a sample kustomize overlay to deploy the QAT daemonset with
the Apparmor unconfined policy.

Fixes: #381

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-06-01 07:50:35 +03:00
Ed Bartosh
8b429fd99d
Merge pull request #358 from rojkov/webhook-modeless
fpga: make admission webhook mode-less
2020-05-05 10:24:13 +03:00
Dmitry Rozhkov
c63dbf61b8 fpgawebhook: move to v2 API of fpga.intel.com group 2020-05-04 15:43:20 +03:00
Dmitry Rozhkov
99fcb69d33 fpga: compress fpga AF resource names 2020-04-29 11:59:50 +03:00
Dan Garfield
3be22ab9af
Add node selector to restrict to x86
Without the node selector this will deploy on arm nodes with a constantly retrying pod.
2020-04-23 20:55:20 -06:00
Dmitry Rozhkov
6c2eacfae5 webhook: remove mode of operation
fpga: make AFU resource name 63 char long

webhook: drop mode from README

webhook: extend mappings description

webhook: tighten CRD definitions

webhook: drop mapping to non-existing afuId

explicitly state mappings names can be in any format

use consistent terminology across fpga webhook and plugin
2020-04-22 13:55:43 +03:00
Dmitry Rozhkov
bb03a8916f
Merge pull request #359 from bart0sh/PR0080-fpga-e2e-tests
implement e2e tests for FPGA plugin
2020-04-14 16:01:25 +03:00
Ed Bartosh
182601bdf7 run fpga plugin in arbitrary namespace 2020-04-09 17:01:56 +03:00
Ed Bartosh
7d8a33c30f fpga webhook: fix deployment issue
Webhook uses region CRDs even if run in preprogrammed mode.

Adding them to the base configuration should fix this deployment error:
  Failed to list *v1.FpgaRegion: the server could not find the requested resource

Fixes: #361
2020-04-09 15:21:33 +03:00
Ed Bartosh
1ce6a1fb89 fix flag provided but not defined error again
The same fix as previous:
  The `-v 1` arg is treated as a single word, so klog throws a
  "flag provided but not defined: -v 1" error.

This time it's in the webhook kustomize base.
2020-04-09 10:58:31 +03:00
Dmitry Rozhkov
7a86e8416f fix flag provided but not defined error
The `-v 1` arg is treated as a single word, so klog throws a
"flag provided but not defined: -v 1" error.
2020-04-06 11:38:59 +03:00
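The failure described in the two "flag provided but not defined" commits can be reproduced with Go's standard flag package, on which klog's flags are registered. The sketch below is an illustration only: it defines a plain integer `-v` flag as a stand-in for klog's own verbosity flag, and the function name `parseVerbosity` is made up for the example.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseVerbosity parses args with a FlagSet that defines only an integer -v
// flag (a stand-in for klog's -v) and returns the parsed level.
func parseVerbosity(args []string) (int, error) {
	fs := flag.NewFlagSet("plugin", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	level := fs.Int("v", 0, "number for the log level verbosity")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *level, nil
}

func main() {
	// One argv element containing a space: the whole string "v 1" is taken
	// as the flag name, which is not defined.
	_, err := parseVerbosity([]string{"-v 1"})
	fmt.Println("single word:", err)

	// Two argv elements: flag name "v" followed by its value "1".
	level, err := parseVerbosity([]string{"-v", "1"})
	fmt.Println("two words:", level, err)

	// The name=value form also fits in a single argv element.
	level, err = parseVerbosity([]string{"-v=1"})
	fmt.Println("name=value:", level, err)
}
```

In a container spec this corresponds to listing `-v` and `1` as separate items under `args`, or using `-v=1` as a single item.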
Graham Whaley
71d08224ee fpga: move to using klog for logs and debug
Move all the fpga components to using klog for logging
and debug. This includes replacing our homebrew 'fatal()'
with klog.Error().

Modify the deployment files to move from `-debug` to
`-v`, and set their default level to '1' (Info), rather
than full debug mode ('4').

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2020-03-24 14:31:53 +00:00
Mikko Ylinen
15d4b10715
Merge pull request #329 from grahamwhaley/20200312_klog
klog: Add klog logging to framework and qat plugins
2020-03-19 16:59:44 +02:00
Graham Whaley
f8dbc896a1 devicemanager: qat: use klog for logging and debug
Move the framework, and the qat driver, to use `klog`
for logging and debug.

This has some noticeable effects:

1) Our default log output gains a bunch of annotation:
From:
    QAT device plugin started in 'dpdk' mode
To:
    I0312 11:51:02.057728    6053 qat_plugin.go:64] QAT device plugin started in 'dpdk' mode

(there is now a command line option to drop those annotations if
necessary).

2) We gain a bunch of command line parameters from klog for controlling log
levels and output. We go from 5 arguments to 17:

---
Usage of ./cmd/qat_plugin/qat_plugin:
  -add_dir_header
        If true, adds the file directory to the header
  -alsologtostderr
        log to standard error as well as files
  -debug
        enable debug output
  -dpdk-driver string
        DPDK Device driver for configuring the QAT device (default "vfio-pci")
  -kernel-vf-drivers string
        Comma separated VF Device Driver of the QuickAssist Devices in the system. Devices supported: DH895xCC,C62x,C3xxx and D15xx (default "dh895xccvf,c6xxvf,c3xxxvf,d15xxvf")
  -log_backtrace_at value
        when logging hits line file:N, emit a stack trace
  -log_dir string
        If non-empty, write log files in this directory
  -log_file string
        If non-empty, use this log file
  -log_file_max_size uint
        Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
  -logtostderr
        log to standard error instead of files (default true)
  -max-num-devices int
        maximum number of QAT devices to be provided to the QuickAssist device plugin (default 32)
  -mode string
        plugin mode which can be either dpdk (default) or kernel (default "dpdk")
  -skip_headers
        If true, avoid header prefixes in the log messages
  -skip_log_headers
        If true, avoid headers when opening log files
  -stderrthreshold value
        logs at or above this threshold go to stderr (default 2)
  -v value
        number for the log level verbosity
  -vmodule value
        comma-separated list of pattern=N settings for file-filtered logging
---

3) Our `-debug` flag is now replaced by the `klog` `-v n` flag.

*NOTE:* This is potentially a minor breaking change. Applying
this debug overlay to any previous (pre-klog edit) images will
cause the container to fail to launch, as it will not recognise
the new `-v` arguments.

We also update the kustomize deployment to move from using
DEBUG env vars to adding a VERBOSITY var that controls both
the log verbosity and now the debug mode enabling.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2020-03-19 11:20:48 +00:00
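Tools that consumed the old, unannotated log output may need to recognise the klog header shown in the commit above. A minimal sketch of splitting such a line into its parts follows; the regular expression is written against the sample line quoted in the commit and klog's documented `Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg` layout, and the names `klogHeader`, `entry`, and `parseKlogLine` are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// klogHeader matches the header layout "Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg".
var klogHeader = regexp.MustCompile(
	`^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^\s:]+):(\d+)\] (.*)$`)

// entry holds the pieces of one klog-formatted line (hypothetical type).
type entry struct {
	Severity string // I, W, E or F
	File     string
	Line     string
	Message  string
}

// parseKlogLine returns the parsed entry and true if line carries a klog
// header, or a zero entry and false otherwise.
func parseKlogLine(line string) (entry, bool) {
	m := klogHeader.FindStringSubmatch(line)
	if m == nil {
		return entry{}, false
	}
	return entry{Severity: m[1], File: m[5], Line: m[6], Message: m[7]}, true
}

func main() {
	withHeader := "I0312 11:51:02.057728    6053 qat_plugin.go:64] QAT device plugin started in 'dpdk' mode"
	plain := "QAT device plugin started in 'dpdk' mode"

	for _, line := range []string{withHeader, plain} {
		if e, ok := parseKlogLine(line); ok {
			fmt.Printf("severity=%s file=%s line=%s msg=%q\n", e.Severity, e.File, e.Line, e.Message)
		} else {
			fmt.Printf("no klog header: %q\n", line)
		}
	}
}
```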
Alek Du
05fee4bf41 vpu: mount myd_ion device for topology hints to work
Previously, the /dev/ion device was just an arbitrary string and the
plugin did not need the device for anything. After adding the checks for
topology hints, the device node must be bind mounted in the plugin
container.

Signed-off-by: Alek Du <alek.du@intel.com>
2020-03-17 10:50:59 +08:00
Mikko Ylinen
1d41852013 qat: mount VFIO devices for topology hints to work
Previously, /dev/vfio/xx devices were just arbitrary strings and the
plugin did not need the devices for anything. After adding the checks
for topology hints, we need to read the devices attached to those so
the device nodes must be bind mounted in the plugin container.

Moreover, be more verbose about any errors coming from the topology code.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-03-02 08:13:09 +02:00
Alek Du
7c2bc3bda0 vpu_plugin: add kustomizations
- Default deployment: `kubectl apply -k deployments/vpu_plugin`
- Default deployment does not specify namespace anymore
  (was: `kube-system`)
- Variant: deploy to `kube-system` instead of user-defined namespace
  (or `default`)
  `kubectl apply -k deployments/vpu_plugin/overlays/namespace_kube-system`
- VPU plugin README updated.
- Change volume mounts to readonly when possible

Signed-off-by: Alek Du <alek.du@intel.com>
2020-02-25 14:53:26 +08:00
Antti Kervinen
5fe8174077 fpga_plugin: add kustomization files
- Add script/fpga-plugin-prepare-for-kustomization.sh, creates contents
  for the secret needed by the fpga plugin webhook.
- Single-command fpga plugin + webhook deployment for both modes:
  - `kubectl create -k deployments/fpga_plugin/overlays/af`
  - `kubectl create -k deployments/fpga_plugin/overlays/region`
- Change intel-fpga-plugin image CMD to ENTRYPOINT.
2020-02-24 16:32:26 +02:00
Antti Kervinen
d568f050c5 gpu_plugin: add kustomizations
- Default deployment: `kubectl apply -k deployments/gpu_plugin`
- Default deployment does not specify namespace anymore
  (was: `kube-system`).
- Variant: deploy only on nodes with Intel GPU label by NFD:
  `kubectl apply -k deployments/gpu_plugin/overlays/nfd_labeled_nodes`
- Variant: deploy to `kube-system` instead of user-defined namespace
  (or "default"):
  `kubectl apply -k deployments/gpu_plugin/overlays/namespace_kube-system`
- GPU plugin README updated.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2020-02-07 14:56:52 +02:00
Mikko Ylinen
f036b72cff
Merge pull request #286 from askervin/kustomize
qat_plugin: add kustomizations
2020-02-06 13:53:08 +02:00
Antti Kervinen
ec8eef6daa qat_plugin: add kustomizations
- Default deployment: `kubectl apply -k deployments/qat_plugin`
- Debug variant: `kubectl apply -k deployments/qat_plugin/overlays/debug`
- Single-resource `yaml` naming convention:
  applying x-y-z.yaml configures k8s resource named x-y-z.
- QAT plugin README updated.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2020-02-05 15:48:57 +02:00
Mikko Ylinen
df7492d763 crypto-perf: fix readonly rootfs deployment
We had securityContext specified twice and the latter was overwriting
readOnlyRootFilesystem=true.

With this commit, the container is properly mounted readonly. However,
we need a tmpfs for DPDK runtime data so an emptyDir volume is added
(NB: see kubernetes/issues/48912 for discussion on emptyDir mount options).

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-02-04 13:39:07 +02:00
Alek Du
887e56e780 VPU: Add Intel Movidius MyriadX VPU plugin support
This patch adds support for the Intel VCAC-A card (with MyriadX 2485 VPUs).
For later VPUs, we will reuse this plugin and add support.

VCAC-A board info is at:
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/media-analytics-vcac-a-accelerator-card-by-celestica-datasheet.pdf

Also add openvino HDDL VPU demo for Intel VCAC-A card.

Signed-off-by: Alek Du <alek.du@intel.com>
2020-01-28 23:17:50 +08:00
Dmitry Rozhkov
a44fc06b21
Merge pull request #242 from bart0sh/PR0066-secure-fpga-weebhook
Secure fpga weebhook
2020-01-15 10:22:42 +02:00