Best Practice
In this section you will learn how to configure cert-manager to comply with popular security standards such as the CIS Kubernetes Benchmark, the NSA Kubernetes Hardening Guide, or the BSI Kubernetes Security Recommendations.
And you will learn about best practices for deploying cert-manager in production, such as those enforced by tools like Datree and its built-in rules, and those documented by the likes of Learnk8s in their "Kubernetes production best practices" checklist.
Overview
The default cert-manager resources in the Helm chart or YAML manifests (Deployment, Pod, ServiceAccount, etc.) are designed for backwards compatibility rather than for best practice or maximum security. You may find that the default resources do not comply with the security policy on your Kubernetes cluster; in that case, you can modify the installation configuration using Helm chart values to override the defaults.
Use Liveness Probes
An example of this recommendation is found in the Datree Documentation: Ensure each container has a configured liveness probe:
Liveness probes allow Kubernetes to determine when a pod should be replaced. They are fundamental in configuring a resilient cluster architecture.
The cert-manager webhook and controller Pods do have liveness probes, but only the webhook liveness probe is enabled by default. The cainjector Pod does not have a liveness probe, yet. More information below.
webhook
The cert-manager webhook has a liveness probe which is enabled by default and the timings and thresholds can be configured using Helm values.
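For example, the webhook probe timings and thresholds can be overridden in the Helm values file; the numbers shown here are illustrative, not recommendations:

```yaml
# Helm values: tune the webhook liveness probe
webhook:
  livenessProbe:
    initialDelaySeconds: 60
    periodSeconds: 10
    timeoutSeconds: 1
    successThreshold: 1
    failureThreshold: 3
```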
controller
ℹ️ The cert-manager controller liveness probe was introduced in cert-manager release `1.12`.

The cert-manager controller has a liveness probe, but it is disabled by default. You can enable it using the Helm chart value `livenessProbe.enabled=true`, but first read the background information below.
đĸ The controller liveness probe is a new feature in cert-manager release 1.12 and it is disabled by default, as a precaution, in case it causes problems in the field. Please get in touch and tell us if you have enabled the controller liveness probe in production and whether you would like it to be turned on by default. Please also include any circumstances where the controller has become stuck and where the liveness probe has been necessary to automatically restart the process.
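Enabling the probe is a one-line change in the Helm values file:

```yaml
# Helm values: enable the controller liveness probe (disabled by default)
livenessProbe:
  enabled: true
```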
The liveness probe for the cert-manager controller is an HTTP probe which connects to the `/livez` endpoint of a healthz server which listens on port 9443 and runs in its own thread. The `/livez` endpoint currently reports the combined status of the following sub-systems, and each sub-system has its own `/livez` endpoint. These are:

- `/livez/leaderElection`: Returns an error if the leader election record has not been renewed or if the leader election thread has exited without also crashing the parent process.
ℹ️ In future, more sub-systems could be checked by the `/livez` endpoint, similar to how Kubernetes ensures logging is not blocked and has health checks for each controller.

đ Read about how to access individual health checks and verbose status information (cert-manager uses the same healthz endpoint multiplexer as Kubernetes).
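If you want to inspect the endpoint manually, you can port-forward to the controller and query it with `curl`. This is a sketch, assuming the default `cert-manager` namespace and Deployment name, and that the healthz server serves plain HTTP on port 9443:

```shell
# Forward local port 9443 to the controller's healthz port
kubectl -n cert-manager port-forward deploy/cert-manager 9443:9443 &

# Query the combined liveness endpoint; ?verbose shows per-check status
curl "http://localhost:9443/livez?verbose"

# Query an individual sub-system check
curl "http://localhost:9443/livez/leaderElection"
```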
cainjector
The cainjector Pod does not have a liveness probe or a `/livez` healthz endpoint, but there is justification for it in the GitHub issue: cainjector in a zombie state after attempting to shut down. Please add your remarks to that issue if you have also experienced this specific problem, and add your remarks to Helm: Allow configuration of readiness, liveness and startup probes for all created Pods if you have a general request for a liveness probe in cainjector.
Background Information
The cert-manager controller process and the cainjector process both use the Kubernetes leader election library, to ensure that only one replica of each process can be active at any one time.
The Kubernetes control-plane components also use this library.
The leader election code runs in a loop in a separate thread (go routine). If it initially wins the leader election race and if it later fails to renew its leader election lease, it exits. If the leader election thread exits, all the other threads are gracefully shutdown and then the process exits. Similarly, if any of the other main threads exit unexpectedly, that will trigger the orderly shutdown of the remaining threads and the process will exit.
This adheres to the principle that Containers should crash when there's a fatal error. Kubernetes will restart the crashed container, and if it crashes repeatedly, there will be increasing time delays between successive restarts.
For this reason, the liveness probe should only be needed if there is a bug in this orderly shutdown process, or if there is a bug in one of the other threads which causes the process to deadlock and not shutdown.
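You can observe the leader election in action by inspecting the Lease resources that the library maintains. This sketch assumes the default leader election namespace (`kube-system`); the exact lease names may vary between cert-manager versions:

```shell
# List the leader election Lease resources held by cert-manager components;
# the HOLDER column shows which replica currently holds each lease
kubectl -n kube-system get lease | grep cert-manager
```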
You may want to enable the liveness probe anyway, for defense against unforeseen bugs and deadlocks, but you will need to monitor the processes closely and tweak the various liveness probe time settings and thresholds, if necessary.
đ Read Configure Liveness, Readiness and Startup Probes in the Kubernetes documentation, paying particular attention to the notes and cautions in that document.

đ Read Shooting Yourself in the Foot with Liveness Probes for more cautionary information about liveness probes.
Restrict Auto-Mount of Service Account Tokens
This recommendation is described in the Kyverno Policy Catalogue as follows:
Kubernetes automatically mounts ServiceAccount credentials in each Pod. The ServiceAccount may be assigned roles allowing Pods to access API resources. Blocking this ability is an extension of the least privilege best practice and should be followed if Pods do not need to speak to the API server to function. This policy ensures that mounting of these ServiceAccount tokens is blocked.
The cert-manager components do need to speak to the API server, but we still recommend setting `automountServiceAccountToken: false` for the following reasons:
- Setting `automountServiceAccountToken: false` will allow cert-manager to be installed on clusters where Kyverno (or some other policy system) is configured to deny Pods that have this field set to `true`. The Kubernetes default value is `true`.
- With `automountServiceAccountToken: true`, all the containers in the Pod will mount the ServiceAccount token, including side-car and init containers that might have been injected into the cert-manager Pod resources by Kubernetes admission controllers. The principle of least privilege suggests that it is better to explicitly mount the ServiceAccount token into the cert-manager containers.
So it is recommended to set `automountServiceAccountToken: false` and manually add a projected `Volume` to each of the cert-manager Deployment resources, containing the ServiceAccount token, CA certificate and namespace files that would normally be added automatically by the Kubernetes ServiceAccount controller, and to explicitly add a read-only `VolumeMount` to each of the cert-manager containers.
An example of this configuration is included in the Helm Chart Values file below.
Best Practice Helm Chart Values
Download the following Helm chart values file and supply it to `helm install`, `helm upgrade`, or `helm template` using the `--values` flag:

đ values.best-practice.yaml
```yaml
# Helm chart values which make cert-manager comply with CIS, BSI and NSA
# security benchmarks and other best practices for deploying cert-manager in
# production.
#
# Read the rationale for these values in:
# * https://cert-manager.io/docs/installation/best-practice/

global:
  priorityClassName: system-cluster-critical

replicaCount: 2
podDisruptionBudget:
  enabled: true
  minAvailable: 1
automountServiceAccountToken: false
serviceAccount:
  automountServiceAccountToken: false
volumes:
  - name: serviceaccount-token
    projected:
      defaultMode: 0444
      sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            name: kube-root-ca.crt
            items:
              - key: ca.crt
                path: ca.crt
        - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: serviceaccount-token
    readOnly: true

webhook:
  replicaCount: 3
  podDisruptionBudget:
    enabled: true
    minAvailable: 1
  automountServiceAccountToken: false
  serviceAccount:
    automountServiceAccountToken: false
  volumes:
    - name: serviceaccount-token
      projected:
        defaultMode: 0444
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
  volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: serviceaccount-token
      readOnly: true

cainjector:
  extraArgs:
    - --namespace=cert-manager
    - --enable-certificates-data-source=false
  replicaCount: 2
  podDisruptionBudget:
    enabled: true
    minAvailable: 1
  automountServiceAccountToken: false
  serviceAccount:
    automountServiceAccountToken: false
  volumes:
    - name: serviceaccount-token
      projected:
        defaultMode: 0444
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
  volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: serviceaccount-token
      readOnly: true

startupapicheck:
  automountServiceAccountToken: false
  serviceAccount:
    automountServiceAccountToken: false
  volumes:
    - name: serviceaccount-token
      projected:
        defaultMode: 0444
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
  volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: serviceaccount-token
      readOnly: true
```
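For example, assuming the chart is installed from the jetstack Helm repository into the `cert-manager` namespace (the release name, namespace, and chart version flags here are illustrative; check the installation documentation for the options appropriate to your chart version):

```shell
# Install or upgrade cert-manager with the best-practice values file
helm upgrade cert-manager jetstack/cert-manager \
  --install \
  --namespace cert-manager \
  --create-namespace \
  --values values.best-practice.yaml
```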
Other
This list of recommendations is a work-in-progress. If you have other best practice recommendations please contribute to this page.