Do Kubernetes requests have a runtime impact?

25 Apr 2025

Trick question ahead!

Explain the difference between CPU and memory requests/limits in terms of scheduling and runtime impact on Kubernetes pods. Do requests have a runtime impact?

Requests are first and foremost used for scheduling decisions. A pod with resource requests is only placed on a node that still has at least the requested amount of CPU and memory unallocated.
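To make the scheduling side concrete, here is a minimal sketch of the fit check in Python. It is not the scheduler's actual code; the names Resources and fits are made up for illustration. The point it shows is that the check compares the pod's requests against the node's allocatable capacity minus the requests of pods already placed there, not against actual usage.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Hypothetical container for CPU (millicores) and memory (bytes)."""
    cpu_millicores: int
    memory_bytes: int


def fits(allocatable: Resources, already_requested: Resources, pod: Resources) -> bool:
    """Return True if the pod's requests fit into what is still unallocated.

    Simplified: only CPU and memory are considered, and actual usage
    on the node plays no role.
    """
    free_cpu = allocatable.cpu_millicores - already_requested.cpu_millicores
    free_mem = allocatable.memory_bytes - already_requested.memory_bytes
    return pod.cpu_millicores <= free_cpu and pod.memory_bytes <= free_mem
```

For example, on a node with 4000m allocatable CPU of which 3500m are already requested, a pod requesting 500m still fits, while a pod requesting 600m does not, no matter how idle the node actually is.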

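Limits are primarily used to enforce maximum resource usage at runtime. If a limit is exceeded, the container is throttled (CPU) or killed (memory). Memory limits are therefore more dangerous than CPU limits: hitting a CPU limit only slows the container down, whereas exceeding a memory limit gets it OOM-killed and restarted (in place on the same node, according to the pod's restart policy, not rescheduled).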
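CPU throttling can be illustrated with a bit of arithmetic. On Linux, a CPU limit is typically enforced as a CFS quota: the amount of CPU time the container may use per scheduling period. The sketch below assumes the common default period of 100 ms; the function name cfs_quota_us is made up for illustration.

```python
def cfs_quota_us(cpu_limit_millicores: int, period_us: int = 100_000) -> int:
    """CPU time (in microseconds) a container may use per CFS period.

    Assumes the default 100 ms period. Once the quota is used up, the
    container is throttled until the next period starts.
    """
    return cpu_limit_millicores * period_us // 1000
```

A limit of 500m thus corresponds to 50 ms of CPU time per 100 ms period. There is no equivalent of throttling for memory, which is why exceeding a memory limit ends in a kill.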

Interestingly, one can argue that requests have an indirect runtime impact. The combination of requests and limits determines the Quality of Service (QoS) class of a pod in Kubernetes:

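BestEffort (no CPU or memory requests or limits set),
Burstable (at least one request or limit set, but requests and limits do not match for every container) or
Guaranteed (CPU and memory requests equal limits for every container).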
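The classification can be sketched as a small Python function. This is a simplified illustration, not Kubernetes' implementation: the dict layout and the name qos_class are assumptions, quantities are compared as plain strings (so "1" and "1000m" would wrongly count as different), and only CPU and memory are considered. It does reflect the rule that an unset request defaults to the limit.

```python
def qos_class(containers: list) -> str:
    """Classify a pod from its containers' requests and limits.

    Each container is a dict with optional "requests" and "limits" dicts
    keyed by "cpu" and "memory" (hypothetical layout, values pre-normalized).
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # An unset request defaults to the limit, if there is one.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

Note that a single container without matching requests and limits is enough to turn an otherwise Guaranteed pod into a Burstable one.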

When a node comes under resource pressure (especially memory pressure, which can end in OOM kills), pods are evicted roughly in order of QoS class: BestEffort first, then Burstable, and Guaranteed last. Requests therefore have an indirect runtime impact: they determine the QoS class, which in turn affects eviction behavior.
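One place where this shows up is the oom_score_adj value the kubelet assigns to containers, which tells the kernel's OOM killer which processes to prefer. The sketch below follows the heuristic as described in the Kubernetes documentation on node out-of-memory behavior; the function name is made up, and the exact clamping in the kubelet may differ slightly, so treat the numbers as illustrative.

```python
def oom_score_adj(qos: str, memory_request_bytes: int, node_memory_bytes: int) -> int:
    """Approximate oom_score_adj per QoS class (higher means killed sooner).

    Follows the documented heuristic; the real kubelet may clamp
    slightly differently.
    """
    if qos == "Guaranteed":
        return -997
    if qos == "BestEffort":
        return 1000
    # Burstable: the larger the memory request relative to the node's
    # memory, the lower the score, i.e. the better protected the container.
    adj = 1000 - (1000 * memory_request_bytes) // node_memory_bytes
    return min(max(2, adj), 999)
```

For a Burstable container, a higher memory request lowers the score, so the request itself changes how likely the container is to be picked by the OOM killer.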
