CCE Node Problem Detector
Add-on Overview
CCE Node Problem Detector (node-problem-detector, NPD) is an add-on that monitors abnormal events on cluster nodes and can connect to a third-party monitoring platform. It runs as a daemon on each node, collects node problems reported by different daemons, and reports them to the API server. It can be deployed as a DaemonSet or run as a standalone daemon.
Add-on Parameters
Table 2 flavor parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
description | No | String | Add-on description |
name | Yes | String | Add-on specification name. The value is fixed at Single-instance. |
replicas | Yes | String | Number of pods. The default value is 1. |
resources | Yes | resources object | Container resource (CPU and memory) quotas |

Table 3 custom parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
feature_gate | No | String | Feature gate used to enable beta features |
multiAZBalance | No | Bool | Multi-AZ deployment |
multiAZEnabled | No | Bool | Whether to deploy the add-on pods in multiple AZs. The default value is false. If set to true, cross-AZ deployment is enforced. If set to false, cross-AZ deployment is preferred but not required. |
npc | Yes | npc object (Table 5) | node-problem-controller (NPC) configuration |
tolerations | No | List<Object> (Table 6) | Tolerations of the add-on |
node_match_expressions | No | List<Object> (Table 7) | Node affinity configuration of the add-on |

Table 4 resources parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
limitsCpu | Yes | String | CPU size limit (unit: m) |
limitsMem | Yes | String | Memory size limit (unit: Mi) |
name | Yes | String | Add-on name. The value is fixed at custom-resources. |
requestsCpu | Yes | String | Requested CPU size (unit: m) |
requestsMem | Yes | String | Requested memory size (unit: Mi) |
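
The resources fields above can be assembled and sanity-checked before a request is sent. The following is a minimal sketch, not part of the add-on API: the helper names are hypothetical, and the check that requests do not exceed limits follows the general Kubernetes rule rather than anything stated in this section.

```python
def parse_cpu_millicores(value: str) -> int:
    """Parse a CPU quantity such as "100m" into millicores."""
    if not value.endswith("m"):
        raise ValueError(f"expected millicores such as '100m', got {value!r}")
    return int(value[:-1])


def parse_memory_mib(value: str) -> int:
    """Parse a memory quantity such as "300Mi" into MiB."""
    if not value.endswith("Mi"):
        raise ValueError(f"expected MiB such as '300Mi', got {value!r}")
    return int(value[:-2])


def make_resources_entry(name: str, requests_cpu: str, limits_cpu: str,
                         requests_mem: str, limits_mem: str) -> dict:
    """Build one entry of the resources list (hypothetical helper).

    The request <= limit check is a local sanity check based on the usual
    Kubernetes rule; it is an assumption, not documented add-on behavior.
    """
    if parse_cpu_millicores(requests_cpu) > parse_cpu_millicores(limits_cpu):
        raise ValueError("requestsCpu must not exceed limitsCpu")
    if parse_memory_mib(requests_mem) > parse_memory_mib(limits_mem):
        raise ValueError("requestsMem must not exceed limitsMem")
    return {
        "limitsCpu": limits_cpu,
        "limitsMem": limits_mem,
        "name": name,
        "requestsCpu": requests_cpu,
        "requestsMem": requests_mem,
    }


# Values taken from the example request in this section.
entry = make_resources_entry("node-problem-detector", "30m", "100m", "100Mi", "300Mi")
```

Keeping the quantities as strings in the returned dictionary matches the String type listed in the table; the parsed integers are used only for the local comparison.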

Table 5 npc parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
maxTaintedNode | Yes | String or Int | Maximum number of nodes that NPC can taint when the same fault occurs on multiple nodes. This cap limits the impact of fault handling on the cluster. The value can be an integer or a percentage (for example, 10%). |
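
To illustrate the two accepted formats of maxTaintedNode, the sketch below converts a value into an absolute node count for a cluster of a given size. The function name is hypothetical, and rounding a percentage down is an assumption: this section does not say how NPC rounds.

```python
from typing import Union


def resolve_max_tainted_nodes(value: Union[int, str], total_nodes: int) -> int:
    """Turn a maxTaintedNode value into an absolute node count.

    Accepts an int (3), a numeric string ("3"), or a percentage ("10%").
    ASSUMPTION: percentages are rounded down; the real NPC rounding
    behavior is not documented in this section.
    """
    if isinstance(value, bool):
        raise TypeError("maxTaintedNode must be an int or a string")
    if isinstance(value, int):
        count = value
    else:
        text = value.strip()
        if text.endswith("%"):
            percent = int(text[:-1])
            if not 0 <= percent <= 100:
                raise ValueError(f"percentage out of range: {value!r}")
            count = total_nodes * percent // 100  # floor division (assumed)
        else:
            count = int(text)
    if count < 0:
        raise ValueError("maxTaintedNode must not be negative")
    # Never report more nodes than the cluster has.
    return min(count, total_nodes)


limit_from_percent = resolve_max_tainted_nodes("10%", 50)
limit_from_int = resolve_max_tainted_nodes(3, 50)
```

Under the stated rounding assumption, "10%" in a 50-node cluster resolves to 5 nodes, and the integer 3 is used as given.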

Table 6 tolerations parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
key | No | String | Taint key |
effect | No | String | Taint effect to tolerate, for example NoExecute |
operator | No | String | Operator, for example Exists |
tolerationSeconds | No | Int | Toleration time window, in seconds |
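
The toleration fields can be built with a small helper, as in the sketch below. The helper name and its defaults are hypothetical; the keys and values reproduce the tolerations used in the example request in this section.

```python
from typing import Optional


def make_toleration(key: str, operator: str = "Exists",
                    effect: str = "NoExecute",
                    toleration_seconds: Optional[int] = None) -> dict:
    """Build one tolerations entry (hypothetical helper).

    tolerationSeconds is optional, so it is omitted when not provided.
    """
    toleration = {"key": key, "operator": operator, "effect": effect}
    if toleration_seconds is not None:
        toleration["tolerationSeconds"] = toleration_seconds
    return toleration


# The two tolerations from the example request.
tolerations = [
    make_toleration("node.kubernetes.io/not-ready", toleration_seconds=60),
    make_toleration("node.kubernetes.io/unreachable", toleration_seconds=60),
]
```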

Table 7 node_match_expressions parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
key | No | String | Node label key |
values | No | List<String> | Node label values to match |
operator | No | String | Operator |
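
The example request in this section leaves node_match_expressions empty, so the sketch below shows what a populated entry might look like. It assumes the fields follow the Kubernetes node affinity matchExpressions convention; the helper name, the label key kubernetes.io/os, and the operator In are illustrative choices, not values confirmed by this section.

```python
from typing import List, Optional


def make_node_match_expression(key: str, operator: str,
                               values: Optional[List[str]] = None) -> dict:
    """Build one node_match_expressions entry (hypothetical helper).

    ASSUMPTION: the entry mirrors a Kubernetes matchExpressions item,
    where values is omitted for operators that take no values.
    """
    expression = {"key": key, "operator": operator}
    if values:
        expression["values"] = list(values)
    return expression


# Illustrative only: schedule the add-on pods on Linux nodes.
node_match_expressions = [
    make_node_match_expression("kubernetes.io/os", "In", ["linux"]),
]
```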
Example Request
{
  "kind": "Addon",
  "apiVersion": "v3",
  "metadata": {
    "annotations": {
      "addon.install/type": "install"
    }
  },
  "spec": {
    "clusterID": "b78fb690-b82c-11ee-83cf-0255ac100b0f",
    "version": "1.18.48",
    "addonTemplateName": "npd",
    "values": {
      "basic": {
        "image_version": "1.18.48",
        "swr_addr": "***",
        "swr_user": "***",
        "rbac_enabled": true,
        "cluster_version": "v1.23"
      },
      "flavor": {
        "description": "custom resources",
        "name": "custom-resources",
        "replicas": 2,
        "resources": [
          {
            "limitsCpu": "100m",
            "limitsMem": "300Mi",
            "name": "node-problem-controller",
            "requestsCpu": "30m",
            "requestsMem": "100Mi"
          },
          {
            "limitsCpu": "100m",
            "limitsMem": "300Mi",
            "name": "node-problem-detector",
            "requestsCpu": "30m",
            "requestsMem": "100Mi"
          }
        ],
        "category": ["CCE", "Turbo"]
      },
      "custom": {
        "annotations": {},
        "common": {},
        "feature_gates": "",
        "multiAZBalance": false,
        "multiAZEnabled": false,
        "node_match_expressions": [],
        "npc": {
          "maxTaintedNode": "10%"
        },
        "tolerations": [
          {
            "key": "node.kubernetes.io/not-ready",
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": 60
          },
          {
            "key": "node.kubernetes.io/unreachable",
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": 60
          }
        ]
      }
    }
  }
}
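
A request body with the same shape can be assembled programmatically. The sketch below only builds and serializes the JSON; the function name is hypothetical, the nested values are abbreviated, and sending the request (endpoint, authentication) is out of scope because this section does not describe it.

```python
import json


def build_npd_install_request(cluster_id: str, version: str,
                              basic: dict, flavor: dict, custom: dict) -> dict:
    """Assemble an NPD add-on installation body (hypothetical helper).

    The fixed fields (kind, apiVersion, annotation, addonTemplateName)
    are copied from the example request in this section.
    """
    return {
        "kind": "Addon",
        "apiVersion": "v3",
        "metadata": {"annotations": {"addon.install/type": "install"}},
        "spec": {
            "clusterID": cluster_id,
            "version": version,
            "addonTemplateName": "npd",
            "values": {"basic": basic, "flavor": flavor, "custom": custom},
        },
    }


request_body = build_npd_install_request(
    cluster_id="b78fb690-b82c-11ee-83cf-0255ac100b0f",
    version="1.18.48",
    # Abbreviated: the full example also sets swr_addr, swr_user, and more.
    basic={"image_version": "1.18.48", "rbac_enabled": True,
           "cluster_version": "v1.23"},
    flavor={"description": "custom resources", "name": "custom-resources",
            "replicas": 2, "resources": []},
    custom={"multiAZEnabled": False, "node_match_expressions": [],
            "npc": {"maxTaintedNode": "10%"}, "tolerations": []},
)

# Serialize to the JSON text that would form the HTTP request body.
payload = json.dumps(request_body, indent=2)
```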