
Overview

The Nodes page provides comprehensive monitoring of Kubernetes cluster nodes. View node status, resource capacity, resource allocation, and which pods are running on each node.

List View

Features

  • Node Status: Real-time node health and readiness status
  • Resource Monitoring: CPU, memory, and pod capacity tracking
  • Pod Distribution: See how many pods are running on each node
  • Utilization Metrics: Percentage of resources allocated

Node Status

Nodes can have multiple conditions indicating their health:
Ready
Node is healthy and ready to accept pods.
Indicator: Green badge with “Ready”
What it means:
  • Kubelet is functioning properly
  • Sufficient resources available
  • Network connectivity established
  • Container runtime operational
NotReady
Node has issues and cannot accept new pods.
Indicator: Red badge with “NotReady”
Common causes:
  • Kubelet not responding
  • Network connectivity lost
  • Disk pressure
  • Memory pressure
  • PID pressure
SchedulingDisabled
Node is cordoned and won’t accept new pods.
Indicator: Orange badge with “SchedulingDisabled”
What it means:
  • Node manually cordoned (kubectl cordon; see the commands after these status descriptions)
  • Under maintenance
  • Being prepared for removal
  • Existing pods continue running
DiskPressure
Node is running out of disk space.
Indicator: Yellow badge with “DiskPressure”
Thresholds:
  • Available disk < 10% (default)
  • Available inodes < 5%
Actions taken:
  • Evict pods with BestEffort QoS
  • Prevent new pod scheduling
MemoryPressure
Node is running low on memory.
Indicator: Yellow badge with “MemoryPressure”
Thresholds:
  • Available memory < 100Mi (default)
Actions taken:
  • Evict pods based on QoS and memory usage
  • Prevent new BestEffort pods
PIDPressure
Node is running out of process IDs.
Indicator: Yellow badge with “PIDPressure”
What it means:
  • Too many processes running
  • Approaching PID limit
  • Pods may fail to start
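The SchedulingDisabled state above is typically entered and cleared manually. For reference, the standard commands are:
kubectl cordon <node-name>      # mark the node unschedulable (SchedulingDisabled)
kubectl uncordon <node-name>    # re-enable scheduling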

Table Columns

  • Name: Node name (clickable to view details)
  • Status: Node health status badge
  • Roles: Node roles (master, worker, etc.)
  • CPU: CPU capacity and allocation percentage
  • Memory: Memory capacity and allocation percentage
  • Pods: Number of pods running / maximum pods
  • Age: Time since node joined cluster

Resource Indicators

Each node shows resource utilization with color-coded progress bars:
Example display: 8 cores (45% allocated)
  • Green: < 70% allocated
  • Yellow: 70-90% allocated
  • Red: > 90% allocated
The bar compares the node’s allocatable CPU with the CPU requested by pods.
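The same allocation figures appear in the “Allocated resources” section of kubectl describe node, which is a quick way to cross-check from the command line (the grep window size below is arbitrary):
kubectl describe node <node-name> | grep -A 10 "Allocated resources"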

Detail View

Click any node name to view comprehensive details.

Overview Section

1. Basic Information

  • Name: Node hostname
  • Status: Overall node health status
  • Roles: Kubernetes roles assigned
  • Labels: All node labels (topology, instance type, zone, etc.)
  • Annotations: Node annotations
  • Created: When node joined cluster
2. Node Info

  • Kubelet Version: Kubernetes version running
  • Container Runtime: Runtime and version (containerd, CRI-O, or Docker)
  • OS Image: Operating system and version
  • Kernel Version: Linux kernel version
  • Architecture: CPU architecture (amd64, arm64, etc.)
  • Operating System: linux, windows
3. Network Information

  • Internal IP: Node internal IP address
  • Hostname: Node hostname
  • External IP: Public IP (if available)
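Most of these fields can also be read with kubectl; the -o wide output adds the internal/external IPs, OS image, kernel version, and container runtime to the listing:
kubectl get nodes -o wide
kubectl describe node <node-name>   # full labels, annotations, and addresses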

Resource Capacity

Detailed breakdown of node resources:

CPU Resources

Total CPU cores available on the node.
Example: 8 cores (physical CPU)

Memory Resources

Total memory available on the node.
Example: 32 GB

Pod Capacity

Maximum number of pods the node can run:
  • Capacity: Maximum pods (e.g., 110)
  • Running: Current pod count (e.g., 48)
  • Available: Remaining capacity (e.g., 62)
Pod capacity is determined by the kubelet --max-pods flag and CNI plugin limitations.
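For reference, a minimal sketch of setting this limit through the kubelet configuration file instead of the flag (assuming the node’s kubelet reads a KubeletConfiguration file; the path varies by distribution):
# KubeletConfiguration (commonly /var/lib/kubelet/config.yaml; path varies)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 110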

Storage Resources

Ephemeral Storage
Temporary storage on the node’s root filesystem.
  • Capacity: Total ephemeral storage
  • Allocatable: Available for pods
  • Requests: Storage requested by pods
Used for:
  • Container layers
  • EmptyDir volumes
  • Container logs
  • Container writable layers
Persistent Volumes
List of PersistentVolumes bound to this node. Shows:
  • PV name
  • Capacity
  • Access modes
  • Status
  • Claim reference

Pods Section

View all pods running on this node. Pod Table Columns:
  • Name: Pod name with namespace
  • Status: Pod phase (Running, Pending, etc.)
  • Namespace: Pod namespace
  • CPU Requests: CPU requested by pod
  • Memory Requests: Memory requested by pod
  • Restarts: Container restart count
  • Age: Pod uptime
Click any pod name to navigate to the pod detail page
Pod Grouping: Pods are grouped by namespace for easier viewing:
  • System pods (kube-system)
  • Application pods (default, custom namespaces)
  • Monitoring pods (monitoring namespace)
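The same list can be reproduced with kubectl by filtering on the pod’s spec.nodeName field:
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>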

Conditions

Detailed node condition history:
  • Ready (True/False): Node is healthy and ready
  • MemoryPressure (True/False): Node is low on memory
  • DiskPressure (True/False): Node is low on disk space
  • PIDPressure (True/False): Node is low on process IDs
  • NetworkUnavailable (True/False): Network is incorrectly configured
Each condition shows:
  • Status: True, False, Unknown
  • Reason: Why condition is in current state
  • Message: Detailed explanation
  • Last Heartbeat: When condition last updated
  • Last Transition: When condition changed state
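The raw condition objects, including reason, message, and timestamps, live in the node’s status and can be read directly:
kubectl describe node <node-name>                                # human-readable Conditions table
kubectl get node <node-name> -o jsonpath='{.status.conditions}'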

Events

Recent events related to this node:
Normal events
Routine node operations. Examples:
  • NodeReady: Node became ready
  • RegisteredNode: Node registered with API server
  • Starting kubelet: Kubelet service started
  • NodeHasSufficientMemory: Memory available
Warning events
Node issues and problems. Examples:
  • NodeNotReady: Node lost connectivity
  • Rebooted: Node rebooted
  • EvictionThresholdMet: Disk/memory pressure
  • ContainerGCFailed: Garbage collection failed
  • ImageGCFailed: Image cleanup failed
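To pull the same events from the API, filter by the involved object:
kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=<node-name>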

Taints

Node taints prevent certain pods from scheduling on the node. Taint Effects:
NoSchedule
Hard constraint: pods without a matching toleration won’t be scheduled. Example:
taints:
- key: "dedicated"
  value: "gpu"
  effect: "NoSchedule"
Requires pod toleration:
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"
Common Taints:
  • node.kubernetes.io/not-ready: Node not ready (automatic)
  • node.kubernetes.io/unreachable: Node unreachable (automatic)
  • node.kubernetes.io/disk-pressure: Disk pressure detected
  • node.kubernetes.io/memory-pressure: Memory pressure detected
  • node.kubernetes.io/pid-pressure: PID pressure detected
  • node.kubernetes.io/network-unavailable: Network not ready
  • node.kubernetes.io/unschedulable: Node cordoned

Resource Allocation

Understanding Requests vs Limits

Requests
Minimum guaranteed resources for a pod.
  • Used by scheduler to place pods
  • Pod won’t be scheduled if node lacks requested resources
  • Pod always gets at least this much
Example:
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"

Quality of Service (QoS)

Pods are assigned QoS classes affecting eviction order:
Guaranteed
Highest priority, last to be evicted. Requirements:
  • Every container has CPU and memory limits
  • Requests equal limits for both CPU and memory
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"
Burstable
Medium priority, evicted after BestEffort. Requirements:
  • At least one container has CPU or memory request/limit
  • Doesn’t meet Guaranteed criteria
resources:
  requests:
    cpu: "250m"
    memory: "128Mi"
  limits:
    cpu: "1000m"
    memory: "512Mi"
BestEffort
Lowest priority, first to be evicted. Requirements:
  • No containers have CPU or memory requests/limits
# No resources specified
containers:
- name: app
  image: nginx
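The class assigned to a running pod is recorded in its status and can be verified with:
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'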

Eviction Policies

When a node runs out of resources, the kubelet evicts pods. Eviction Signals:
  • memory.available < 100Mi
  • nodefs.available < 10% (root filesystem)
  • nodefs.inodesFree < 5%
  • imagefs.available < 15% (image filesystem)
Eviction Order:
  1. BestEffort pods using most resources
  2. Burstable pods exceeding requests
  3. Burstable pods within requests
  4. Guaranteed pods (last resort)
Within each category, the lowest-priority pods are evicted first.
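These signals correspond to the kubelet’s hard-eviction thresholds, which are configurable. A sketch using KubeletConfiguration (the values shown mirror the defaults listed above):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"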

Troubleshooting

Node NotReady

Symptom: Node shows NotReady status
Diagnostic Steps:
1. Check Node Conditions

View Conditions section for specific issues:
  • MemoryPressure
  • DiskPressure
  • PIDPressure
  • NetworkUnavailable
2. Check Events

Look for warning events:
  • “Kubelet stopped posting node status”
  • “Container runtime not responding”
  • “Failed to initialize network plugin”
3. Check Kubelet

SSH to the node and check the kubelet:
systemctl status kubelet
journalctl -u kubelet -f
4. Check Resources

Verify node has sufficient resources:
df -h    # Disk space
free -h  # Memory
top      # CPU and processes
Common Causes:
Node can’t communicate with the API server. Solutions:
  • Check network connectivity
  • Verify firewall rules
  • Test DNS resolution
  • Check API server endpoints
Node is out of disk space. Solutions:
  • Clean up container images: docker system prune
  • Remove old logs: journalctl --vacuum-time=3d
  • Increase disk size
  • Check for filled volumes
Node is low on memory. Solutions:
  • Evict non-essential pods
  • Increase node memory
  • Reduce pod memory requests
  • Fix memory leaks in applications
Docker/containerd not responding. Solutions:
  • Restart container runtime: systemctl restart docker (or systemctl restart containerd)
  • Check runtime logs
  • Verify runtime configuration
  • Update runtime if outdated

High Resource Utilization

Symptom: Node consistently at >90% CPU or memory
Solutions:
1. Identify Resource Hogs

Check the Pods section to find the pods using the most resources.
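If metrics-server is installed in the cluster, the same information is available from the CLI:
kubectl top node <node-name>
kubectl top pod --all-namespaces --sort-by=cpu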
2. Optimize Pod Requests

Reduce over-requested resources:
# Before
requests:
  cpu: "2000m"

# After (if only using 500m)
requests:
  cpu: "500m"
3. Distribute Pods

Move pods to other nodes:
kubectl drain <node> --ignore-daemonsets   # evict pods and cordon the node
kubectl uncordon <node>                    # re-enable scheduling when ready
4. Scale Cluster

Add more nodes to cluster:
  • Increase node pool size
  • Add larger nodes
  • Enable cluster autoscaling

Pod Evictions

Symptom: Pods being evicted from node
Reasons:
  1. Resource Pressure
    • MemoryPressure
    • DiskPressure
    • PIDPressure
  2. Manual Drain
    • Node maintenance
    • Node upgrade
    • Cluster scaling
  3. Preemption
    • Higher priority pods need resources
    • Cluster autoscaler needs to consolidate
Solutions:
  • Increase node resources
  • Reduce pod resource usage
  • Adjust pod priorities
  • Clean up unused resources

Best Practices

  • Keep node utilization at 60-80% for optimal performance and headroom
  • Base requests on actual usage, not maximum possible usage
  • Label nodes for targeted scheduling (see the nodeSelector example after this list):
kubectl label nodes <node> workload=gpu
kubectl label nodes <node> environment=production
  • Enable cluster autoscaling to automatically add/remove nodes based on demand
  • Schedule node updates and security patches regularly
  • Set up alerts for NotReady nodes and resource pressure
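Pods can then target the labeled nodes with a nodeSelector; a minimal fragment using the workload=gpu label from the command above:
spec:
  nodeSelector:
    workload: gpu
  containers:
  - name: app
    image: nginx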

Next Steps