Node CPU Usage
Check Items
Check whether the node's CPU usage is above 90%.
Solution
- Upgrade the cluster during off-peak hours.
- Check whether too many pods are deployed on the node. If so, reschedule some of them to other nodes that have spare capacity.
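To find the affected nodes before retrying the check, you can compare each node's CPU usage against the 90% threshold yourself. The following is a minimal sketch, not the method the pre-upgrade check uses: it assumes `kubectl` is configured for the cluster and that the Kubernetes Metrics API is available (for example through metrics-server), so that `kubectl top nodes` returns data. The function names are hypothetical, and the CPU percentage reported by `kubectl top nodes` may not match the value the check measures.

```python
import subprocess

# Threshold taken from the check item above.
CPU_THRESHOLD_PERCENT = 90


def parse_top_nodes(output):
    """Map node name to CPU usage (%) from `kubectl top nodes` text output.

    Expected columns: NAME  CPU(cores)  CPU%  MEMORY(bytes)  MEMORY%
    """
    usage = {}
    for line in output.strip().splitlines():
        fields = line.split()
        # Skip the header row and any malformed line.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        cpu_percent = fields[2]
        # Nodes without metrics do not report a percentage; skip them.
        if not cpu_percent.endswith("%"):
            continue
        try:
            usage[fields[0]] = int(cpu_percent.rstrip("%"))
        except ValueError:
            continue
    return usage


def nodes_over_threshold(usage, threshold=CPU_THRESHOLD_PERCENT):
    """Return the sorted names of nodes whose CPU usage is above the threshold."""
    return sorted(name for name, percent in usage.items() if percent > threshold)


def check_cluster():
    """Query the cluster (assumes kubectl and the Metrics API are available)."""
    result = subprocess.run(
        ["kubectl", "top", "nodes"],
        capture_output=True, text=True, check=True,
    )
    return nodes_over_threshold(parse_top_nodes(result.stdout))
```

Calling `check_cluster()` returns the names of nodes that currently exceed the threshold. For each of them, `kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>` lists the pods running on that node, which helps decide what to reschedule.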
Parent topic: Troubleshooting for Pre-upgrade Check Exceptions