Do Kubernetes requests have a runtime impact?
Trick question ahead!
Explain the difference between CPU and memory requests/limits in terms of scheduling and runtime impact on Kubernetes pods. Do requests have a runtime impact?
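Requests are first and foremost used for scheduling decisions. A pod with resource requests is only placed on a node whose allocatable CPU and memory, minus what the pods already running there have requested, still cover the pod's requests. The scheduler compares requests against this remaining capacity, not against the node's current usage.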
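The placement rule above can be sketched in a few lines of Python. This is a simplified model, not the scheduler's actual code: the function name `fits`, the plain dictionaries, and the units (millicores for CPU, MiB for memory) are all assumptions made for illustration.

```python
def fits(node_allocatable, scheduled_requests, pod_requests):
    """Return True if the pod's requests fit into the node's unreserved capacity.

    node_allocatable:   dict of resource -> allocatable amount on the node
    scheduled_requests: list of dicts, the requests of pods already on the node
    pod_requests:       dict of resource -> amount requested by the new pod
    """
    for resource, wanted in pod_requests.items():
        # Capacity is reserved by requests, regardless of actual usage.
        reserved = sum(r.get(resource, 0) for r in scheduled_requests)
        if reserved + wanted > node_allocatable.get(resource, 0):
            return False
    return True


# Hypothetical node: 4 CPUs (4000m) and 8 GiB (8192 MiB) allocatable.
node = {"cpu": 4000, "memory": 8192}
running = [{"cpu": 3000, "memory": 2048}]

small_pod_fits = fits(node, running, {"cpu": 500, "memory": 1024})
large_pod_fits = fits(node, running, {"cpu": 1500, "memory": 1024})
```

Note that a pod without any requests always "fits" in this model, which is why such pods can end up on nodes that are already busy.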
Limits are primarily used to enforce maximum resource usage at runtime. If a limit is exceeded, the container is throttled (CPU) or OOM-killed (memory). Memory limits are therefore more dangerous than CPU limits: exceeding a CPU limit only slows the container down, whereas exceeding a memory limit kills it, after which it is restarted according to the pod's restart policy.
Interestingly, one can argue that requests have an indirect runtime impact. The combination of requests and limits dictates the Quality of Service (QoS) class of a pod in Kubernetes: BestEffort (no requests or limits on any container), Burstable (at least one request or limit defined, but not requests equal to limits everywhere) or Guaranteed (requests equal to limits for both CPU and memory on every container).
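The classification rules can be written down as a small Python function. This is a minimal sketch, assuming only CPU and memory are considered and that quantities are given as plain numbers (millicores, MiB); the `Container` dataclass and the `qos_class` function are hypothetical names, not part of any Kubernetes client library.

```python
from dataclasses import dataclass, field

RESOURCES = ("cpu", "memory")


@dataclass
class Container:
    requests: dict = field(default_factory=dict)
    limits: dict = field(default_factory=dict)


def _is_guaranteed(container):
    """A container qualifies if CPU and memory both have limit == request."""
    for resource in RESOURCES:
        limit = container.limits.get(resource)
        if limit is None:
            return False
        # Kubernetes defaults an unset request to the limit.
        if container.requests.get(resource, limit) != limit:
            return False
    return True


def qos_class(containers):
    """Derive the pod's QoS class from the requests/limits of its containers."""
    if all(not c.requests and not c.limits for c in containers):
        return "BestEffort"
    if all(_is_guaranteed(c) for c in containers):
        return "Guaranteed"
    return "Burstable"


best_effort = qos_class([Container()])
burstable = qos_class([Container(requests={"cpu": 100})])
guaranteed = qos_class([
    Container(requests={"cpu": 500, "memory": 256},
              limits={"cpu": 500, "memory": 256}),
])
```

A pod that sets only limits also ends up as Guaranteed here, because the missing requests are treated as equal to the limits.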
When a node faces resource pressure (especially memory pressure, which can end in OOM kills), pods are evicted roughly in order of QoS class (BestEffort < Burstable < Guaranteed), so BestEffort pods, for example, are killed first. Thus requests have an indirect runtime impact: they affect the QoS class, which in turn affects eviction behavior.
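As a rough illustration of that ordering, the following Python sketch sorts pods by QoS class only. It is deliberately simplified: the real kubelet also takes into account how far a pod's usage exceeds its requests and the pod's priority, which this model ignores. The pod names and the `eviction_order` function are made up for the example.

```python
# Lower rank means evicted earlier under node pressure (simplified model).
EVICTION_RANK = {"BestEffort": 0, "Burstable": 1, "Guaranteed": 2}


def eviction_order(pods):
    """Sort (name, qos_class) tuples so that the first entry is evicted first."""
    return sorted(pods, key=lambda pod: EVICTION_RANK[pod[1]])


# Hypothetical pods on a node under memory pressure.
pods = [
    ("db", "Guaranteed"),
    ("batch-job", "BestEffort"),
    ("web", "Burstable"),
]

ordered = eviction_order(pods)
first_evicted = ordered[0][0]
```

In this model the pod without any requests or limits goes first, which is exactly the indirect runtime effect of setting (or not setting) requests.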