klog has added ktesting/textlogger and is going to deprecate
klogr. The deprecation is going to trigger golangci-lint (staticcheck)
errors, so rework the logging and move to ktesting/textlogger.
The commit also fixes the log level setting with the operator.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Differentiate objects by adding CR names as suffixes.
Drop kind bookkeeping and related functions from controllers.
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
Strictly speaking, 'Gen4' and 'Gen2' are wrong expressions in this case,
because Gen4 resources are read as 'generic' in VMs. To prevent any
confusion, use just the names of the QAT services such as 'dc', 'cy' or
'generic'
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Structure is as follows:
Describe("GPU plugin")
  BeforeEach("deploys plugin")
  Context("When device resources are available")
    BeforeEach("checks if resources are available")
    It("runs a pod requesting resources")
Signed-off-by: hj-johannes-lee <hyeongju.lee@intel.com>
Structure is as follows:
Describe("FPGA plugin")
  Context("af mode")
    BeforeEach("run device plugin")
    It("runs a pod requesting resources")
  Context("region mode")
    BeforeEach("run device plugin")
    It("runs a pod requesting resources")
Signed-off-by: hj-johannes-lee <hyeongju.lee@intel.com>
When err is declared and a later part of the code declares it again,
the linter complains as follows:
shadow: declaration of err shadows declaration at line 51
So, name the first declaration errFailedToLocateRepoFile so that
the other 'err's do not all need different names and can be
declared as 'err' without a linter error.
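
The renaming described above can be illustrated with a minimal sketch. Only the name errFailedToLocateRepoFile comes from this commit; locateRepoFile, validatePath and describe are hypothetical helpers made up for the example, not code from the repository.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// locateRepoFile is a hypothetical stand-in for the real lookup.
func locateRepoFile(name string) (string, error) {
	if name == "" {
		return "", errors.New("empty file name")
	}
	return "/repo/" + name, nil
}

// validatePath is a second hypothetical step that can also fail.
func validatePath(path string) error {
	if strings.HasSuffix(path, ".lock") {
		return errors.New("lock files are not accepted")
	}
	return nil
}

func describe(name string) (string, error) {
	// The first error gets a descriptive name instead of plain 'err'.
	path, errFailedToLocateRepoFile := locateRepoFile(name)
	if errFailedToLocateRepoFile != nil {
		return "", fmt.Errorf("locate %q: %w", name, errFailedToLocateRepoFile)
	}

	// Later declarations can use plain 'err' without redeclaring
	// an outer variable of the same name.
	if err := validatePath(path); err != nil {
		return "", fmt.Errorf("validate %q: %w", path, err)
	}

	return path, nil
}

func main() {
	path, err := describe("go.mod")
	fmt.Println(path, err)
}
```

Giving only the outermost error a distinct name keeps the remaining error handling in the usual 'err' idiom.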
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Instead of creating another overlay, copy the existing overlay and modify it.
This helps with multi-level overlays with specific namespace selections.
Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>
Structure is as follows:
Describe("SGX plugin")
  BeforeEach("deploys plugin")
  Context("When device resources are available")
    BeforeEach("checks if resources are available")
    It("runs a pod requesting resources")
  AfterEach("undeploys plugin")
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Move the duplicated code for testing plugins with the operator to the
operator module.
Replace the code for deploying the operator webhook in the operator module
with code that uses utils.Kubectl to make undeploying simple.
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
setInitContainer() adds "init-sriov-numvfs" to initContainers
but uses initcontainerName constant to search where to add
the QAT configMap volumeMount. Fix by moving all code to use
the const.
It was also noticed in the controller logs that setting Pod
Volumes is not idempotent, so a broken DaemonSet gets created:
""intel-device-plugins-manager: Reconciler error "err="DaemonSet.apps
\"intel-qat-plugin\" is invalid: spec.template.spec.volumes[6].name:
Duplicate value: \"qat-config\"" controller="qatdeviceplugin"
controllerGroup="deviceplugin.intel.com"
Finally, change 'qat-config' to 'intel-qat-config-volume' to
better describe that it's a volume.
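
An idempotent way of adding the volume can be sketched as follows. This is a simplified illustration, not the controller's code: Volume stands in for the Kubernetes Pod volume type, and ensureVolume and the constant name qatConfigVolumeName are hypothetical; only the volume name 'intel-qat-config-volume' comes from this commit.

```go
package main

import "fmt"

// Volume is a simplified stand-in for a Pod volume; only the name matters here.
type Volume struct {
	Name string
}

// qatConfigVolumeName is a hypothetical constant holding the new volume name.
const qatConfigVolumeName = "intel-qat-config-volume"

// ensureVolume appends a volume only when no volume with that name exists,
// so calling it on every reconcile does not create duplicates.
func ensureVolume(volumes []Volume, name string) []Volume {
	for _, v := range volumes {
		if v.Name == name {
			return volumes
		}
	}
	return append(volumes, Volume{Name: name})
}

func main() {
	vols := []Volume{{Name: "devdir"}}
	vols = ensureVolume(vols, qatConfigVolumeName)
	vols = ensureVolume(vols, qatConfigVolumeName) // second reconcile: no duplicate
	fmt.Println(len(vols))
}
```

Using one constant for both the lookup and the append avoids the name mismatch that the first part of this commit fixes for the init container.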
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
runTests=4 is the DSA test, and it does not run anything for now.
So, use runTests=1, which is the symmetric test code.
In addition, make CpaSampleCode's params const and add comments
for the sample code's param numbering.
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
To mitigate spurious errors such as:
E0515 15:28:06.626887 1995892 listener.go:48] "controller-runtime/metrics:
metrics server failed to listen. You may want to disable the metrics
server or use another port if it is due to conflicts" err="error
listening on :8080: listen tcp :8080: bind: address already in use"
disable metrics completely. Moreover, check the error value returned by
NewManager() before proceeding with the tests to avoid crashes.
This makes envtest more robust but the up()/down() logic needs
careful review to ensure there are no race conditions.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
k8s 1.27.x triggers build errors on controller-runtime 0.14.x
so we will need to update to 0.15.x at the same time.
Changes include:
* k8s e2e framework moved to use Ginkgo context so we add
test context to all our test nodes.
* adapt Ginkgo parameter modifications.
* adapt SGX admissionwebhook to InjectDecoder removal.
* adapt deviceplugins and FPGA CRDs to controller-runtime
API changes.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
- add /usr/share/qat/calgary32, necessary for running the dc test,
to the Dockerfile
- add an e2e test for dc that runs cpa-sample with a different
resource and command
- remove the openssl yaml file and use podSpec instead for the cy test
- add a common func runCpaSampleCode for both the cy and dc tests,
which follow the same process with a few differences
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
crypto-perf cannot be run in the e2e tests running in CI/CD.
So, do not run it by default.
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Add a process to create a configMap for the cy service.
The structure of the e2e test flow is as follows:
BeforeEach(creating a configMap)
JustBeforeEach(deploying QAT plugin)
Context(
- BeforeEach(checks if resources are available)
- It(runs a test pod)
[- It(runs another test pod)]*
)
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
'AfterEach' was added to prevent the plugin pod failure that is
caused by the 'BeforeEach' that deploys a plugin pod. If it is
inside a 'Context', the same problem still occurs. Since the
current e2e tests generally have one 'Context', the problem was not
visible, but the logic is still wrong and would cause the
same problem if more 'Contexts' are added. So, this commit moves
'AfterEach' to the correct location.
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Though the namespace is deleted after each It(), the deletion is not
guaranteed to have completed. Because of this, the device plugin did not
get deleted before the next one was deployed. This can cause a
temporary crash of the new plugin and is sometimes the cause
of e2e test failures. This commit fixes it by ensuring that the previous
device plugin gets deleted after each run.
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Structure is as follows:
Describe("QAT plugin")
  BeforeEach("deploys plugin")
  Context("When device resources are available")
    BeforeEach("checks if resources are available")
    It("runs a pod requesting resources")
    It("runs another pod requesting resources if there is")
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Structure is as follows:
Describe("IAA plugin")
  Describe("without using operator")
    BeforeEach("deploys plugin")
    Context("When device resources are available")
      BeforeEach("checks if resources are available")
      It("runs a pod requesting resources")
  Describe("with using operator")
    It("deploys with operator")
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Structure is as follows:
Describe("DSA plugin")
  Describe("without using operator")
    BeforeEach("deploys plugin")
    Context("When device resources are available")
      BeforeEach("checks if resources are available")
      It("runs a pod requesting resources")
  Describe("with using operator")
    It("deploys with operator")
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Structure is as follows:
Describe("DLB plugin")
  BeforeEach("deploys plugin")
  Context("When device resources are available")
    BeforeEach("checks if resources are available")
    It("runs a pod requesting resources")
Signed-off-by: Hyeongju Johannes Lee <hyeongju.lee@intel.com>
Apparently some of the simulated CI nodes have an out-of-sync boot configuration.
To get the configuration in sync, the e2e NFD timeout
should be increased (it takes 250-260 seconds for NFD to come up with the right
boot parameters in the simulated environments).
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>