Target Audience: Backend developers who want to manage resources efficiently in Kubernetes
Prerequisites: Pod and Deployment concepts
After reading this: You will understand the difference between requests and limits, and how to configure appropriate resources.

TL;DR
  • requests: Minimum guaranteed resources used for scheduling
  • limits: Maximum usable resources
  • CPU excess causes throttling, memory excess causes OOMKilled

## Why Resource Configuration Is Needed

Running Pods without resource configuration causes several problems.

| Problem | After Resource Configuration |
| --- | --- |
| One Pod monopolizes node resources | Cap maximum usage with limits |
| Resources not considered during scheduling | Schedule based on requests |
| Random Pod termination on memory shortage | Eviction priority based on QoS class |
| Unpredictable resource usage | Explicit resource allocation |

## requests and limits

Kubernetes provides two resource control methods.

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```

```mermaid
flowchart LR
    subgraph Resources
        REQ[requests: 256Mi]
        USE[Actual usage]
        LIM[limits: 512Mi]
    end
    REQ -->|guaranteed| USE
    USE -->|limited| LIM
```

Comparing requests and limits:

| Item | requests | limits |
| --- | --- | --- |
| Role | Minimum guarantee | Maximum allowance |
| Scheduling | Used as the placement criterion | Not used |
| At runtime | Always available | Restricted when exceeded |
| If unset | Defaults to limits (when limits are set) | Unlimited |
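In a manifest, the `resources` block lives under each container spec, not at the Pod level. A minimal Deployment sketch (the name `demo-api` and the image are placeholders, not values from this guide):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-api            # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-api
  template:
    metadata:
      labels:
        app: demo-api
    spec:
      containers:
      - name: app
        image: nginx:1.27   # placeholder image
        resources:          # configured per container, not per Pod
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
```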

## CPU Resources

CPU is a compressible resource: a container that exceeds its CPU limit is throttled, not terminated.

### CPU Units

| Notation | Meaning | Example |
| --- | --- | --- |
| 1 | 1 vCPU | cpu: 1 |
| 1000m | 1 vCPU | cpu: 1000m |
| 500m | 0.5 vCPU | cpu: 500m |
| 100m | 0.1 vCPU | cpu: 100m |

The m suffix stands for millicores: 1000m = 1 CPU.

### CPU Behavior

```mermaid
flowchart LR
    subgraph "requests: 250m"
        A[Pod A]
    end
    subgraph "requests: 500m"
        B[Pod B]
    end
    CPU[1 CPU]
    A -->|guaranteed 250m| CPU
    B -->|guaranteed 500m| CPU
    CPU -->|remaining 250m| C[Contention]
```

| Situation | Behavior |
| --- | --- |
| Total usage < node capacity | All Pods use as much as they need |
| Total usage > node capacity | CPU time is distributed in proportion to requests |
| Pod usage > limits | CPU throttling (performance degradation) |
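Because exceeding a CPU limit only throttles the container, one common pattern is to set CPU requests for scheduling but leave the CPU limit unset, letting the container burst into idle node capacity. A sketch of this pattern (an assumption on my part that your workload tolerates variable CPU, not a universal rule):

```yaml
resources:
  requests:
    cpu: "250m"       # guaranteed share, used for scheduling
    memory: "256Mi"
  limits:
    memory: "512Mi"   # keep the memory limit; the CPU limit is intentionally omitted
```

Note that this makes the Pod Burstable rather than Guaranteed (see the QoS classes below).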

## Memory Resources

Memory is an incompressible resource: a container that exceeds its memory limit is terminated (OOMKilled).

### Memory Units

| Notation | Meaning |
| --- | --- |
| 256Mi | 256 mebibytes (256 × 2^20 bytes) |
| 1Gi | 1 gibibyte (1 × 2^30 bytes) |
| 256M | 256 megabytes (256 × 10^6 bytes) |
| 1G | 1 gigabyte (1 × 10^9 bytes) |

> **Mi vs M**: Mi (mebibyte) is 2^20 bytes, while M (megabyte) is 10^6 bytes. Kubernetes manifests typically use Mi and Gi.

### Memory Behavior

| Situation | Behavior |
| --- | --- |
| Usage < requests | Normal execution |
| requests < usage < limits | Normal execution (if the node has spare memory) |
| Usage > limits | OOMKilled (container restarts) |
| Node memory shortage | Eviction based on QoS class |

## QoS Classes

Kubernetes assigns QoS (Quality of Service) classes to Pods based on resource configuration.

| Class | Condition | Priority |
| --- | --- | --- |
| Guaranteed | Every container has requests = limits | Highest (terminated last) |
| Burstable | requests < limits, or only partially set | Medium |
| BestEffort | No resource configuration | Lowest (terminated first) |
```yaml
# Guaranteed
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "256Mi"
    cpu: "250m"
```

```yaml
# Burstable
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```

```yaml
# BestEffort (no resource configuration)
# resources: {}
```

Termination order when node memory is insufficient:

```mermaid
flowchart LR
    M[Memory shortage] --> BE[BestEffort terminated first]
    BE --> BU[Burstable terminated next]
    BU --> GU[Guaranteed terminated last]
```

## Recommended Starting Values

Here are starting points for commonly used workload types in production.

| Workload | requests (CPU/Mem) | limits (CPU/Mem) | Characteristics |
| --- | --- | --- | --- |
| Spring Boot API | 250m / 512Mi | 1000m / 1Gi | Consider JVM heap size |
| Node.js API | 100m / 128Mi | 500m / 256Mi | Lightweight |
| Python Flask | 100m / 128Mi | 500m / 512Mi | Increase for ML libraries |
| Nginx (proxy) | 50m / 64Mi | 200m / 128Mi | Very lightweight |
| Redis (cache) | 100m / 256Mi | 500m / 512Mi | Memory-focused |
| Batch Job | 500m / 512Mi | 2000m / 2Gi | Throughput-focused |
> **Warning**: These values are only starting points. Monitor actual usage (`kubectl top pods`) and adjust.
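For example, the Nginx (proxy) row of the table translates into the following `resources` block (again, starting values to be tuned against real usage, not final numbers):

```yaml
resources:
  requests:
    cpu: "50m"
    memory: "64Mi"
  limits:
    cpu: "200m"
    memory: "128Mi"
```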

### Java Applications

```yaml
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "1000m"
```

For Java applications, set memory limits generously but also adjust JVM heap size.

```yaml
env:
- name: JAVA_OPTS
  value: "-Xms256m -Xmx768m"
```

JVM heap is typically set to 70-80% of container memory limits.
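As an alternative to a fixed `-Xmx`, JVMs from Java 10 onward can size the heap as a percentage of the container memory limit via `-XX:MaxRAMPercentage`, so the heap tracks the limit automatically when you change it. A sketch using `JAVA_TOOL_OPTIONS` (which the JVM picks up without application changes):

```yaml
env:
- name: JAVA_TOOL_OPTIONS
  # ~75% of the container limit for the heap, matching the 70-80% guideline above
  value: "-XX:InitialRAMPercentage=50.0 -XX:MaxRAMPercentage=75.0"
```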

## Resource Configuration Guidelines

| Item | Recommendation |
| --- | --- |
| requests | ~90% of typical usage |
| limits (CPU) | 2-4x requests, or unset |
| limits (memory) | 1.5-2x requests |
| Guaranteed QoS | Use for critical workloads |

## LimitRange

Set default resource limits at namespace level.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:          # Default limits
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:   # Default requests
      cpu: "100m"
      memory: "128Mi"
    max:              # Maximum allowed
      cpu: "2"
      memory: "2Gi"
    min:              # Minimum allowed
      cpu: "50m"
      memory: "64Mi"
    type: Container
```

The default values are applied to containers that do not specify resources.
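With a LimitRange like the one above in the namespace, a container that omits `resources` is admitted with the defaults filled in. A sketch (the Pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-resources-demo   # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.27       # placeholder image
    # No resources block: the LimitRange injects
    # requests 100m/128Mi and limits 500m/512Mi at admission.
```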

## ResourceQuota

Limit total resource amount for entire namespace.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "10"
```

This limits the following in that namespace:

| Item | Limit |
| --- | --- |
| CPU requests total | 4 cores |
| Memory requests total | 8Gi |
| CPU limits total | 8 cores |
| Memory limits total | 16Gi |
| Maximum Pod count | 10 |

## Practice: Configuring and Checking Resources

### Check Resource Usage

```bash
# Check node resources
kubectl describe node <node-name>

# Pod resource usage (requires metrics-server)
kubectl top pods
kubectl top nodes
```

### Simulate a Resource Shortage

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: memory-demo
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
```

```bash
# Create the Pod
kubectl apply -f memory-demo.yaml

# Check status (OOMKilled occurs because 150M exceeds the 100Mi limit)
kubectl get pod memory-demo
kubectl describe pod memory-demo
```

### Check QoS Class

```bash
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'
```

## Next Steps

Once you understand resource management, proceed to the next steps:

| Goal | Recommended Doc |
| --- | --- |
| Auto-scaling | Scaling |
| Configure health checks | Health Checks |
| Resource optimization | Resource Optimization |