Taints and Tolerations are a powerful toolkit for steering workloads toward, or away from, particular node groups in your Kubernetes environment. However, you’ll want to be careful when using them with Deployment objects.

If, for instance, you’re deploying a node agent as a DaemonSet meant to run on every node in your cluster, you may want it to tolerate every taint a node has (e.g. type:cpu-optimized) so that the workload truly gets scheduled across the entire node group. A toleration with no key and the Exists operator matches every taint, so the way you do this looks like the following (applied to the pod spec):

        tolerations:
          - operator: "Exists"
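
For context, here is a minimal sketch of where that snippet sits in a full DaemonSet manifest. The name, labels, and image are hypothetical placeholders, not taken from any real agent:

        apiVersion: apps/v1
        kind: DaemonSet
        metadata:
          name: node-agent                  # hypothetical name
        spec:
          selector:
            matchLabels:
              app: node-agent
          template:
            metadata:
              labels:
                app: node-agent
            spec:
              tolerations:
                - operator: "Exists"        # no key given: tolerates every taint
              containers:
                - name: agent
                  image: example.com/node-agent:1.0   # placeholder image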

While this makes sense and is probably safe for a DaemonSet, this override isn’t suited to Deployment-type workloads. A Deployment’s pods may be placed on any available node in your cluster, depending on how the scheduler ranks the nodes, and with an override like the one above, those pods can get scheduled onto cordoned nodes!

This can happen because cordoning is implemented as a taint on the node, specifically:

        node.kubernetes.io/unschedulable:NoSchedule
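
As a rough sketch, the relevant part of a cordoned Node object looks something like this. Other fields are omitted, and the exact output may vary with your Kubernetes version:

        # excerpt of a Node object after it has been cordoned
        spec:
          unschedulable: true
          taints:
            - key: node.kubernetes.io/unschedulable
              effect: NoSchedule

A pod that tolerates every taint tolerates this one too, which is why the blanket override lets it through.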

If, say, your Deployment has a replica count of 1 and the above override is in place, the scheduler may well favor a cordoned node for it, because that node was probably drained and so has ample resources available.
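
A more targeted option for a Deployment is to tolerate only the taints you actually mean to tolerate. Here is a minimal sketch, assuming the example taint from earlier (type:cpu-optimized) has the key type and the value cpu-optimized. The NoSchedule effect is also an assumption, since the example does not state one:

        tolerations:
          - key: "type"                 # assumed taint key
            operator: "Equal"
            value: "cpu-optimized"      # assumed taint value
            effect: "NoSchedule"        # assumed effect

Because this toleration names a specific key, it does not match node.kubernetes.io/unschedulable, so the Deployment’s pods should stay off cordoned nodes.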

You can read more about Taints and Tolerations here.

Mario Loria is a builder of diverse infrastructure with modern workloads on both bare-metal and cloud platforms. He's traversed roles in system administration, network engineering, and DevOps. You can learn more about him here.