
Overview

The Nodes page provides comprehensive monitoring of Kubernetes cluster nodes. View node status, resource capacity, resource allocation, and which pods are running on each node.

List View

Features

  • Node Status: Real-time node health and readiness status
  • Resource Monitoring: CPU, memory, and pod capacity tracking
  • Pod Distribution: See how many pods are running on each node
  • Utilization Metrics: Percentage of resources allocated

Node Status

Nodes can have multiple conditions indicating their health:
Ready
Node is healthy and ready to accept pods.
Indicator: Green badge with “Ready”
What it means:
  • Kubelet is functioning properly
  • Sufficient resources available
  • Network connectivity established
  • Container runtime operational
NotReady
Node has issues and cannot accept new pods.
Indicator: Red badge with “NotReady”
Common causes:
  • Kubelet not responding
  • Network connectivity lost
  • Disk pressure
  • Memory pressure
  • PID pressure
SchedulingDisabled
Node is cordoned and won’t accept new pods.
Indicator: Orange badge with “SchedulingDisabled”
What it means:
  • Node manually cordoned (kubectl cordon; see the commands after these status descriptions)
  • Under maintenance
  • Being prepared for removal
  • Existing pods continue running
DiskPressure
Node is running out of disk space.
Indicator: Yellow badge with “DiskPressure”
Thresholds:
  • Available disk < 10% (default)
  • Available inodes < 5%
Actions taken:
  • Evict pods with BestEffort QoS
  • Prevent new pod scheduling
MemoryPressure
Node is running low on memory.
Indicator: Yellow badge with “MemoryPressure”
Thresholds:
  • Available memory < 100Mi (default)
Actions taken:
  • Evict pods based on QoS and memory usage
  • Prevent new BestEffort pods
PIDPressure
Node is running out of process IDs.
Indicator: Yellow badge with “PIDPressure”
What it means:
  • Too many processes running
  • Approaching PID limit
  • Pods may fail to start
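The SchedulingDisabled state above is typically entered and cleared manually. For reference, the standard commands are:
kubectl cordon <node-name>      # mark the node unschedulable (SchedulingDisabled)
kubectl uncordon <node-name>    # re-enable scheduling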

Table Columns

  • Name: Node name (clickable to view details)
  • Status: Node health status badge
  • Roles: Node roles (master, worker, etc.)
  • CPU: CPU capacity and allocation percentage
  • Memory: Memory capacity and allocation percentage
  • Pods: Number of pods running / maximum pods
  • Age: Time since node joined cluster

Resource Indicators

Each node shows resource utilization with color-coded progress bars:
Example display: 8 cores (45% allocated)
  • Green: < 70% allocated
  • Yellow: 70-90% allocated
  • Red: > 90% allocated
The bar compares the node’s allocatable CPU with the CPU requested by pods.
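The same allocation figures appear in the “Allocated resources” section of kubectl describe node, which is a quick way to cross-check from the command line (the grep window size below is arbitrary):
kubectl describe node <node-name> | grep -A 10 "Allocated resources"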

Detail View

Click any node name to view comprehensive details.

Overview Section

1. Basic Information

  • Name: Node hostname
  • Status: Overall node health status
  • Roles: Kubernetes roles assigned
  • Labels: All node labels (topology, instance type, zone, etc.)
  • Annotations: Node annotations
  • Created: When node joined cluster
2. Node Info

  • Kubelet Version: Kubernetes version running
  • Container Runtime: Runtime and version (containerd, CRI-O, or Docker)
  • OS Image: Operating system and version
  • Kernel Version: Linux kernel version
  • Architecture: CPU architecture (amd64, arm64, etc.)
  • Operating System: linux, windows
3. Network Information

  • Internal IP: Node internal IP address
  • Hostname: Node hostname
  • External IP: Public IP (if available)
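Most of these fields can also be read with kubectl; the -o wide output adds the internal/external IPs, OS image, kernel version, and container runtime to the listing:
kubectl get nodes -o wide
kubectl describe node <node-name>   # full labels, annotations, and addresses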

Resource Capacity

Detailed breakdown of node resources:

CPU Resources

Total CPU cores available on the node.
Example: 8 cores (physical CPU)

Memory Resources

Total memory available on the node.
Example: 32 GB

Pod Capacity

Maximum number of pods the node can run:
  • Capacity: Maximum pods (e.g., 110)
  • Running: Current pod count (e.g., 48)
  • Available: Remaining capacity (e.g., 62)
Pod capacity is determined by the kubelet --max-pods flag and CNI plugin limitations.
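For reference, a minimal sketch of setting this limit through the kubelet configuration file instead of the flag (assuming the node’s kubelet reads a KubeletConfiguration file; the path varies by distribution):
# KubeletConfiguration (commonly /var/lib/kubelet/config.yaml; path varies)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 110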

Storage Resources

Ephemeral Storage
Temporary storage on the node’s root filesystem.
  • Capacity: Total ephemeral storage
  • Allocatable: Available for pods
  • Requests: Storage requested by pods
Used for:
  • Container layers
  • EmptyDir volumes
  • Container logs
  • Container writable layers
Persistent Volumes
List of PersistentVolumes bound to this node. Shows:
  • PV name
  • Capacity
  • Access modes
  • Status
  • Claim reference

Pods Section

View all pods running on this node. Pod Table Columns:
  • Name: Pod name with namespace
  • Status: Pod phase (Running, Pending, etc.)
  • Namespace: Pod namespace
  • CPU Requests: CPU requested by pod
  • Memory Requests: Memory requested by pod
  • Restarts: Container restart count
  • Age: Pod uptime
Click any pod name to navigate to the pod detail page
Pod Grouping: Pods are grouped by namespace for easier viewing:
  • System pods (kube-system)
  • Application pods (default, custom namespaces)
  • Monitoring pods (monitoring namespace)
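The same list can be reproduced with kubectl by filtering on the pod’s spec.nodeName field:
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>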

Conditions

Detailed node condition history:
  • Ready (True/False): Node is healthy and ready
  • MemoryPressure (True/False): Node is low on memory
  • DiskPressure (True/False): Node is low on disk space
  • PIDPressure (True/False): Node is low on process IDs
  • NetworkUnavailable (True/False): Network is incorrectly configured
Each condition shows:
  • Status: True, False, Unknown
  • Reason: Why condition is in current state
  • Message: Detailed explanation
  • Last Heartbeat: When condition last updated
  • Last Transition: When condition changed state
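The raw condition objects, including reason, message, and timestamps, live in the node’s status and can be read directly:
kubectl describe node <node-name>                                # human-readable Conditions table
kubectl get node <node-name> -o jsonpath='{.status.conditions}'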

Events

Recent events related to this node:
Normal events
Routine node operations. Examples:
  • NodeReady: Node became ready
  • RegisteredNode: Node registered with API server
  • Starting kubelet: Kubelet service started
  • NodeHasSufficientMemory: Memory available
Warning events
Node issues and problems. Examples:
  • NodeNotReady: Node lost connectivity
  • Rebooted: Node rebooted
  • EvictionThresholdMet: Disk/memory pressure
  • ContainerGCFailed: Garbage collection failed
  • ImageGCFailed: Image cleanup failed
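To pull the same events from the API, filter by the involved object:
kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=<node-name>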

Taints

Node taints prevent certain pods from scheduling on the node. Taint Effects:
NoSchedule
Hard constraint: pods without a matching toleration won’t be scheduled. Example:
taints:
- key: "dedicated"
  value: "gpu"
  effect: "NoSchedule"
Requires pod toleration:
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"
Common Taints:
  • node.kubernetes.io/not-ready: Node not ready (automatic)
  • node.kubernetes.io/unreachable: Node unreachable (automatic)
  • node.kubernetes.io/disk-pressure: Disk pressure detected
  • node.kubernetes.io/memory-pressure: Memory pressure detected
  • node.kubernetes.io/pid-pressure: PID pressure detected
  • node.kubernetes.io/network-unavailable: Network not ready
  • node.kubernetes.io/unschedulable: Node cordoned

Resource Allocation

Understanding Requests vs Limits

Requests
Minimum guaranteed resources for a pod.
  • Used by scheduler to place pods
  • Pod won’t be scheduled if node lacks requested resources
  • Pod always gets at least this much
Example:
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"

Quality of Service (QoS)

Pods are assigned QoS classes affecting eviction order:
Guaranteed
Highest priority, last to be evicted. Requirements:
  • Every container has CPU and memory limits
  • Requests equal limits for both CPU and memory
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"
Burstable
Medium priority, evicted after BestEffort. Requirements:
  • At least one container has CPU or memory request/limit
  • Doesn’t meet Guaranteed criteria
resources:
  requests:
    cpu: "250m"
    memory: "128Mi"
  limits:
    cpu: "1000m"
    memory: "512Mi"
BestEffort
Lowest priority, first to be evicted. Requirements:
  • No containers have CPU or memory requests/limits
# No resources specified
containers:
- name: app
  image: nginx
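The class assigned to a running pod is recorded in its status and can be verified with:
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'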

Eviction Policies

When a node runs out of resources, the kubelet evicts pods. Eviction Signals:
  • memory.available < 100Mi
  • nodefs.available < 10% (root filesystem)
  • nodefs.inodesFree < 5%
  • imagefs.available < 15% (image filesystem)
Eviction Order:
  1. BestEffort pods using most resources
  2. Burstable pods exceeding requests
  3. Burstable pods within requests
  4. Guaranteed pods (last resort)
Within each category, the lowest-priority pods are evicted first.
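These signals correspond to the kubelet’s hard-eviction thresholds, which are configurable. A sketch using KubeletConfiguration (the values shown mirror the defaults listed above):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"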

Troubleshooting

Node NotReady

Symptom: Node shows NotReady status
Diagnostic Steps:
1. Check Node Conditions

View Conditions section for specific issues:
  • MemoryPressure
  • DiskPressure
  • PIDPressure
  • NetworkUnavailable
2. Check Events

Look for warning events:
  • “Kubelet stopped posting node status”
  • “Container runtime not responding”
  • “Failed to initialize network plugin”
3. Check Kubelet

SSH to the node and check the kubelet:
systemctl status kubelet
journalctl -u kubelet -f
4. Check Resources

Verify node has sufficient resources:
df -h    # Disk space
free -h  # Memory
top      # CPU and processes
Common Causes:
Node can’t communicate with the API server. Solutions:
  • Check network connectivity
  • Verify firewall rules
  • Test DNS resolution
  • Check API server endpoints
Node is out of disk space. Solutions:
  • Clean up container images: docker system prune
  • Remove old logs: journalctl --vacuum-time=3d
  • Increase disk size
  • Check for filled volumes
Node is low on memory. Solutions:
  • Evict non-essential pods
  • Increase node memory
  • Reduce pod memory requests
  • Fix memory leaks in applications
Docker/containerd not responding. Solutions:
  • Restart container runtime: systemctl restart docker (or systemctl restart containerd)
  • Check runtime logs
  • Verify runtime configuration
  • Update runtime if outdated

High Resource Utilization

Symptom: Node consistently at >90% CPU or memory
Solutions:
1. Identify Resource Hogs

Check the Pods section to find the pods using the most resources.
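If metrics-server is installed in the cluster, the same information is available from the CLI:
kubectl top node <node-name>
kubectl top pod --all-namespaces --sort-by=cpu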
2. Optimize Pod Requests

Reduce over-requested resources:
# Before
requests:
  cpu: "2000m"

# After (if only using 500m)
requests:
  cpu: "500m"
3. Distribute Pods

Move pods to other nodes:
kubectl drain <node> --ignore-daemonsets   # evict pods and cordon the node
kubectl uncordon <node>                    # re-enable scheduling when ready
4. Scale Cluster

Add more nodes to cluster:
  • Increase node pool size
  • Add larger nodes
  • Enable cluster autoscaling

Pod Evictions

Symptom: Pods being evicted from node
Reasons:
  1. Resource Pressure
    • MemoryPressure
    • DiskPressure
    • PIDPressure
  2. Manual Drain
    • Node maintenance
    • Node upgrade
    • Cluster scaling
  3. Preemption
    • Higher priority pods need resources
    • Cluster autoscaler needs to consolidate
Solutions:
  • Increase node resources
  • Reduce pod resource usage
  • Adjust pod priorities
  • Clean up unused resources

Best Practices

  • Keep node utilization at 60-80% for optimal performance and headroom
  • Base requests on actual usage, not maximum possible usage
  • Label nodes for targeted scheduling (see the nodeSelector example after this list):
kubectl label nodes <node> workload=gpu
kubectl label nodes <node> environment=production
  • Enable cluster autoscaling to automatically add/remove nodes based on demand
  • Schedule node updates and security patches regularly
  • Set up alerts for NotReady nodes and resource pressure
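Pods can then target the labeled nodes with a nodeSelector; a minimal fragment using the workload=gpu label from the command above:
spec:
  nodeSelector:
    workload: gpu
  containers:
  - name: app
    image: nginx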

Next Steps