
Modifying Cluster Configurations

Cluster configuration parameters are underlying rules that define node behavior, resource allocation, communication rules, and scaling policies in a distributed system. They affect the cluster's performance, stability, scalability, and fault tolerance. You can customize the core components of a CCE cluster by adjusting these parameter settings. The following table lists cluster configuration parameters that you can adjust as needed.

Table 1 Cluster configuration parameters

Category

Description

kube-apiserver parameters (see Table 2)

Control resource scheduling, manage requests, ensure security, and toggle feature statuses to maintain efficient cluster operation and proper resource allocation.

NOTICE: Modifying kube-apiserver parameters will restart the cluster and terminate existing persistent connections. Exercise caution when performing this operation.

Scheduler parameters (see Table 3)

Manage and optimize resource scheduling, request control, and GPU resource allocation in clusters. You can dynamically adjust scheduling policies based on cluster loads and resource requirements to ensure efficient cluster running and maximize resource utilization.

kube-controller-manager parameters (see Table 4)

Control the behavior and synchronization frequency of different controllers in a cluster to optimize cluster resource management and task scheduling.

Networking component parameters for CCE Turbo clusters (see Table 5)

Control and optimize the management of network resources in clusters, especially in heavy-load and large-scale clusters, to ensure efficient network running and proper resource allocation.

Networking component parameters for clusters using VPC networks (see Table 6)

Specify the IP address range that does not require SNAT. This avoids unnecessary SNAT and optimizes network performance.

Extended controller parameters (see Table 7)

Restrict the resource usage in a namespace to ensure fair and reasonable resource allocation.

Modifying Cluster Configurations

  1. Log in to the CCE console. In the navigation pane, choose Clusters.
  2. Locate the target cluster, click ... to view more operations on the cluster, and choose Manage. This function allows you to modify parameter settings of Kubernetes native components and proprietary components.
  3. On the Manage Component page, change the values of the Kubernetes parameters listed in the following tables.

    Table 2 kube-apiserver configurations

    Item

    Parameter

    Description

    Value

    Toleration time for nodes in NotReady state

    default-not-ready-toleration-seconds

    Tolerance time during which containers can continue running before being automatically evicted if their node becomes unavailable. This setting applies to all containers by default. In a CCE cluster, you can configure separate tolerance policies for different pods for refined management. For details, see Configuring Tolerance Policies.

    Configuration suggestion: Unless otherwise specified, keep the default settings.

    Potential risks: If the specified tolerance time is too short, pods may be frequently migrated due to transient issues like network jitter. If the specified tolerance time is too long, services may remain interrupted for an extended period after a node failure.

    Default: 300s

    Toleration time for nodes in unreachable state

    default-unreachable-toleration-seconds

    Tolerance time during which containers can continue running before being automatically evicted if their node cannot be accessed. This setting applies to all containers by default. In a CCE cluster, you can configure separate tolerance policies for different pods for refined management. For details, see Configuring Tolerance Policies.

    Configuration suggestion: Unless otherwise specified, keep the default settings.

    Potential risks: If the specified tolerance time is too short, pods may be frequently migrated due to transient issues like network jitter. If the specified tolerance time is too long, services may remain interrupted for an extended period after a node failure.

    Default: 300s
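    For reference, the following is a minimal sketch of a per-pod tolerance policy that overrides the cluster-wide defaults described above for default-not-ready-toleration-seconds and default-unreachable-toleration-seconds. The pod name, image, and the 60-second value are illustrative only; node.kubernetes.io/not-ready and node.kubernetes.io/unreachable are the standard Kubernetes taints these parameters apply to.

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: critical-app        # illustrative name
    spec:
      containers:
      - name: app
        image: nginx:alpine     # illustrative image
      tolerations:
      # Evict this pod 60s after its node becomes NotReady, instead of the
      # cluster-wide default-not-ready-toleration-seconds value (300s by default).
      - key: node.kubernetes.io/not-ready
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 60
      # Same override for an unreachable node (default-unreachable-toleration-seconds).
      - key: node.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 60
    ```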

    Maximum Number of Concurrent Modification API Calls

    max-mutating-requests-inflight

    Maximum number of concurrent mutating requests. Any requests exceeding the specified value will be rejected. Value 0 indicates that there is no limit on the maximum number of concurrent mutating requests.

    Configuration suggestion: Retain the default setting.

    Potential risks: Increasing the value of this parameter may cause overload.

    Since clusters of v1.21, this parameter can no longer be configured manually. Its value is automatically set based on the cluster scale.

    • 200 for clusters with 50 or 200 nodes
    • 500 for clusters with 1000 nodes
    • 1000 for clusters with 2000 nodes

    Maximum Number of Concurrent Non-Modification API Calls

    max-requests-inflight

    Maximum number of concurrent non-mutating requests. Any requests exceeding the specified value will be rejected. Value 0 indicates that there is no limit on the maximum number of concurrent non-mutating requests.

    Configuration suggestion: Retain the default setting.

    Potential risks: Increasing the value of this parameter may cause overload.

    Since clusters of v1.21, this parameter can no longer be configured manually. Its value is automatically set based on the cluster scale.

    • 400 for clusters with 50 or 200 nodes
    • 1000 for clusters with 1000 nodes
    • 2000 for clusters with 2000 nodes

    NodePort port range

    service-node-port-range

    Port range for a NodePort Service. After changing the value, go to the security group page and update the allowed TCP/UDP port range 30000 to 32767 in the node security groups to match the new range. Otherwise, ports outside the default range cannot be accessed externally.

    Configuration suggestion: Retain the default setting.

    Potential risks: If the port number is smaller than 20106, it may conflict with the CCE health check port, which can make the cluster unavailable. If the port number is greater than 32767, it may conflict with the ports in net.ipv4.ip_local_port_range, which can degrade network performance.

    Default: 30000 to 32767

    Value range:

    Min > 20105

    Max < 32768
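    As a quick illustration, the sketch below declares a NodePort Service whose nodePort falls inside the default 30000 to 32767 range. The Service name, selector, and port numbers are hypothetical; if you change service-node-port-range, the nodePort value and the security group rules must stay inside the new range.

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: web-nodeport        # illustrative name
    spec:
      type: NodePort
      selector:
        app: web                # illustrative selector
      ports:
      - protocol: TCP
        port: 80                # in-cluster Service port
        targetPort: 8080        # container port
        nodePort: 30080         # must lie inside service-node-port-range
    ```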

    Request Timeout

    request-timeout

    Request timeout of the kube-apiserver component.

    Configuration suggestion: Retain the default setting to prevent frequent API timeouts and other exceptions.

    Applicable cluster version: This parameter is available only in clusters of v1.19.16-r30, v1.21.10-r10, v1.23.8-r10, v1.25.3-r10, or later.

    Default: 1m0s

    Value range:

    Min ≥ 1s

    Max ≤ 1 hour

    Overload Control

    support-overload

    Cluster overload control. After this function is enabled, concurrent requests will be dynamically controlled based on the resource demands received by master nodes, ensuring stable running of the master nodes and the cluster.

    Configuration suggestion: Enable this function. In scenarios like short-term request bursts, a cluster may still become overloaded even with overload control enabled. In such cases, you are advised to manage and control access to the cluster promptly.

    Applicable cluster version: This parameter is available only in clusters of v1.23 or later.

    • false: Overload control is disabled.
    • true: Overload control is enabled.

    Node Restriction Add-on

    enable-admission-plugin-node-restriction

    Restrict each kubelet to modifying only the pods on its own node. This prevents unauthorized operations and enhances isolation in high-security or multi-tenant scenarios.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: true

    Pod Node Selector Add-on

    enable-admission-plugin-pod-node-selector

    Allow cluster administrators to configure default node selectors through namespace annotations. In this way, pods run only on specific nodes and configurations are simplified.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: true
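    The namespace annotation involved here is the one read by the upstream PodNodeSelector admission plugin. A minimal sketch with a hypothetical namespace name and node label:

    ```yaml
    apiVersion: v1
    kind: Namespace
    metadata:
      name: team-a                # illustrative namespace
      annotations:
        # Pods created in this namespace are restricted to nodes carrying this label.
        scheduler.alpha.kubernetes.io/node-selector: "node-pool=team-a"
    ```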

    Pod Toleration Limit Add-on

    enable-admission-plugin-pod-toleration-restriction

    Allow cluster administrators to configure default pod toleration values and limits through namespaces. This enables fine-grained control over pod scheduling and protects key resources.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: false
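    This switch corresponds to the upstream PodTolerationRestriction admission plugin, which reads per-namespace default tolerations and a toleration whitelist from annotations. A minimal sketch with hypothetical values:

    ```yaml
    apiVersion: v1
    kind: Namespace
    metadata:
      name: critical-workloads    # illustrative namespace
      annotations:
        # Tolerations merged into every pod created in this namespace.
        scheduler.alpha.kubernetes.io/defaultTolerations: '[{"key": "dedicated", "operator": "Equal", "value": "critical", "effect": "NoSchedule"}]'
        # Only tolerations listed here may appear on pods in this namespace.
        scheduler.alpha.kubernetes.io/tolerationsWhitelist: '[{"key": "dedicated", "operator": "Equal", "value": "critical", "effect": "NoSchedule"}]'
    ```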

    API Audience Settings

    api-audiences

    Specify the audiences associated with a ServiceAccount token in service account token volume projection. For details, see the official document.

    Configuration suggestion: Retain the default setting. To ensure the proper running of the original service account authentication, add audiences instead of deleting existing ones when configuring this parameter.

    Potential risks: Deleting the original configuration that is still in use or setting it to an incorrect URL may result in a service account authentication failure.

    Applicable cluster version: This parameter is available only in clusters of v1.23.16-r0, v1.25.11-r0, v1.27.8-r0, v1.28.6-r0, v1.29.2-r0, or later.

    Default value: "https://kubernetes.default.svc.cluster.local"

    Multiple values can be configured, which are separated by commas (,).

    Service Account Token Issuer Identity

    service-account-issuer

    Identifier of the entity that issues service account tokens, which is the value of the iss field in the token payload.

    Configuration suggestion: Ensure the configured issuer URL can be accessed in the cluster and trusted by the authentication system in the cluster.

    Potential risks: If your specified issuer URL is untrusted or inaccessible, the authentication process based on the service account may fail.

    Applicable cluster version: This parameter is available only in clusters of v1.23.16-r0, v1.25.11-r0, v1.27.8-r0, v1.28.6-r0, v1.29.2-r0, or later.

    Default value: "https://kubernetes.default.svc.cluster.local"

    Multiple values can be configured, which are separated by commas (,).
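    Both api-audiences and service-account-issuer take effect in service account token volume projection. The sketch below requests a projected token whose audience must be one of the configured api-audiences and whose iss claim equals the configured service-account-issuer; the pod name, image, and expiration are illustrative.

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: token-demo            # illustrative name
    spec:
      serviceAccountName: default
      containers:
      - name: app
        image: nginx:alpine       # illustrative image
        volumeMounts:
        - name: projected-token
          mountPath: /var/run/secrets/tokens
          readOnly: true
      volumes:
      - name: projected-token
        projected:
          sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 3600
              # Must match one of the audiences configured in api-audiences.
              audience: "https://kubernetes.default.svc.cluster.local"
    ```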

    Table 3 Scheduler configurations

    Item

    Parameter

    Description

    Value

    Default cluster scheduler

    default-scheduler

    Select different schedulers as needed.

    • kube-scheduler: provides the standard scheduling capability of the community.
    • Volcano: compatible with kube-scheduler scheduling capabilities and provides enhanced scheduling capabilities. For details, see Volcano Scheduling.

    Default: kube-scheduler

    QPS for communicating with kube-apiserver

    kube-api-qps

    QPS for communicating with kube-apiserver.

    • If the number of nodes in a cluster is less than 1000, the default value is 100.
    • If the number of nodes in a cluster is 1000 or more, the default value is 200.

    Burst for communicating with kube-apiserver

    kube-api-burst

    Burst QPS for communicating with kube-apiserver.

    • If the number of nodes in a cluster is less than 1000, the default value is 100.
    • If the number of nodes in a cluster is 1000 or more, the default value is 200.

    Whether to enable GPU sharing

    enable-gpu-share

    Determine whether to enable GPU sharing as needed.

    • When disabled, ensure that pods in the cluster cannot use shared GPUs (no cce.io/gpu-decision annotation in pods) and that GPU virtualization is disabled.
    • When enabled, ensure that there is a cce.io/gpu-decision annotation on all pods that use GPU resources in the cluster.

    Applicable cluster version: This parameter is available only in clusters of v1.23.7-r10, v1.25.3-r0, or later.

    Default: true

    Table 4 kube-controller-manager configurations

    Item

    Parameter

    Description

    Value

    Number of concurrent processing of deployment

    concurrent-deployment-syncs

    Number of deployment objects that are allowed to sync concurrently

    Default: 5

    Concurrent processing number of endpoint

    concurrent-endpoint-syncs

    Number of service endpoint syncing operations that will be done concurrently

    Default: 5

    Concurrent number of garbage collectors

    concurrent-gc-syncs

    Number of garbage collector workers that can be synchronized concurrently

    Default: 20

    Number of job objects allowed to sync simultaneously

    concurrent-job-syncs

    Number of job objects that can be synchronized concurrently

    Default: 5

    Number of CronJob objects allowed to sync simultaneously

    concurrent-cron-job-syncs

    Number of scheduled jobs that can be synchronized concurrently

    Default: 5

    Number of concurrent processing of namespace

    concurrent-namespace-syncs

    Number of namespace objects that can be synchronized concurrently

    Default: 10

    Concurrent processing number of replicaset

    concurrent-replicaset-syncs

    Number of replica sets that can be synchronized concurrently

    Default: 5

    Number of concurrent processing of resource quota

    concurrent-resource-quota-syncs

    Number of resource quotas that can be synchronized concurrently

    Default: 5

    Service

    concurrent-service-syncs

    Number of services that can be synchronized concurrently

    Default: 10

    Concurrent processing number of serviceaccount-token

    concurrent-serviceaccount-token-syncs

    Number of service account token objects that can be synchronized concurrently

    Default: 5

    Concurrent processing of ttl-after-finished

    concurrent-ttl-after-finished-syncs

    Number of ttl-after-finished-controller workers that can be synchronized concurrently

    Default: 5

    RC

    concurrent_rc_syncs (used in clusters of v1.19 or earlier)

    concurrent-rc-syncs (used in clusters of v1.21 through v1.25.3-r0)

    Number of replication controllers that can be synchronized concurrently

    This parameter is deprecated in clusters of v1.25.3-r0 and later versions.

    Default: 5

    HPA

    concurrent-horizontal-pod-autoscaler-syncs

    Number of HPA auto scaling requests that can be concurrently processed

    Default: 1 for clusters earlier than v1.27; 5 for clusters of v1.27 or later

    Value range: 1 to 50

    Cluster elastic computing period

    horizontal-pod-autoscaler-sync-period

    Period for the horizontal pod autoscaler to perform auto scaling on pods. A smaller value will result in a faster auto scaling response and higher CPU load.

    Configuration suggestion: Retain the default setting.

    Potential risks: A lengthy period can cause the controller to respond slowly, while a short period may overload the cluster control plane.

    Default: 15 seconds

    Horizontal Pod Scaling Tolerance

    horizontal-pod-autoscaler-tolerance

    This parameter determines how sensitive the HPA is before it applies auto scaling policies. If the parameter is set to 0, auto scaling is triggered as soon as the related metrics deviate from the target values.

    Configuration suggestion: Configure this parameter based on service resource usage. If the service resource usage increases sharply over time, configure a tolerance to prevent unexpected auto scaling in short-term high-resource usage scenarios.

    Default: 0.1
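    To make the tolerance concrete, the sketch below defines a standard autoscaling/v2 HPA; the names and thresholds are illustrative. With the default tolerance of 0.1, the controller rescales only when the ratio of observed to target utilization moves outside 0.9 to 1.1, that is, roughly below 54% or above 66% average CPU for a 60% target.

    ```yaml
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: web-hpa               # illustrative name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web                 # illustrative workload
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            # Target 60%. With horizontal-pod-autoscaler-tolerance=0.1, no scaling
            # occurs while observed utilization stays between roughly 54% and 66%.
            averageUtilization: 60
    ```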

    HPA CPU Initialization Period

    horizontal-pod-autoscaler-cpu-initialization-period

    The built-in delay of the HPA for collecting CPU usage after pods start. You can use this parameter to filter out unstable CPU usage data during the early stage of pod startup. This helps prevent incorrect scaling decisions based on momentary peak values.

    Configuration suggestion: If HPA makes an incorrect scaling decision due to fluctuating CPU usage during pod startup, increase the value of this parameter.

    Potential risks: A small parameter value may trigger unnecessary scaling based on peak CPU usage, while a large value may cause delayed scaling.

    Applicable cluster version: This parameter is available only in clusters of v1.23.16-r0, v1.25.11-r0, v1.27.8-r0, v1.28.6-r0, v1.29.2-r0, or later.

    Default: 5 minutes

    HPA Initial Readiness Delay

    horizontal-pod-autoscaler-initial-readiness-delay

    The waiting time before the HPA starts automatic scaling based on pod readiness.

    Configuration suggestion: To prevent HPA misjudgment caused by pod readiness fluctuations after startup, increase the value of this parameter.

    Potential risks: If this parameter is set to a small value, an unnecessary scale-out may occur due to CPU data fluctuations when the pod is just ready. If it is set to a large value, the HPA may not respond quickly enough in situations requiring rapid scaling.

    Applicable cluster version: This parameter is available only in clusters of v1.23.16-r0, v1.25.11-r0, v1.27.8-r0, v1.28.6-r0, v1.29.2-r0, or later.

    Default: 30s

    QPS for communicating with kube-apiserver

    kube-api-qps

    QPS for communicating with kube-apiserver

    • If the number of nodes in a cluster is less than 1000, the default value is 100.
    • If the number of nodes in a cluster is 1000 or more, the default value is 200.

    Burst for communicating with kube-apiserver

    kube-api-burst

    Burst QPS for communicating with kube-apiserver

    • If the number of nodes in a cluster is less than 1000, the default value is 100.
    • If the number of nodes in a cluster is 1000 or more, the default value is 200.

    Maximum number of terminated pods that can be retained before the pod GC deletes them

    terminated-pod-gc-threshold

    Number of terminated pods that can exist in a cluster. When the number of terminated pods exceeds the expected threshold, the excess terminated pods will be automatically deleted.

    If this parameter is set to 0, all terminated pods will be retained.

    Default: 1000

    Value range: 10 to 12500

    If the cluster version is v1.21.11-r40, v1.23.8-r0, v1.25.6-r0, v1.27.3-r0, or later, the value range is changed to 0 to 100000.

    Unhealthy AZ Threshold

    unhealthy-zone-threshold

    If the number of not-ready nodes exceeds the specified threshold in a given AZ, that AZ will be marked as unhealthy. In such unhealthy AZs, the frequency of service migration for faulty nodes will be reduced to prevent further negative impacts caused by large-scale migrations during major fault scenarios.

    Configuration suggestion: Retain the default setting.

    Potential risks: If the parameter is set to a large value, pods in unhealthy AZs will be migrated in a large scale, which can lead to risks such as overloading the cluster.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: 0.55

    Value range: 0 to 1

    Node Eviction Rate

    node-eviction-rate

    The maximum number of pods that can be evicted per second when a node is faulty in a healthy AZ. The default value is 0.1, indicating that pods from at most one node can be evicted every 10 seconds.

    Configuration suggestion: Ensure that the number of pods migrated in each batch does not exceed 300. If the parameter is set to a large value, the cluster may be overloaded. Additionally, if too many pods are evicted, they cannot be rescheduled, which will slow down fault recovery.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: 0.1

    Secondary Node Eviction Rate

    secondary-node-eviction-rate

    The maximum number of pods that can be evicted per second when a node is faulty in an unhealthy AZ. The default value is 0.01, indicating that pods from at most one node can be evicted every 100 seconds.

    Configuration suggestion: Configure this parameter to be one-tenth of node-eviction-rate.

    Potential risks: For nodes in an unhealthy AZ, there is no need to set this parameter to a large value. Doing so may result in overloaded clusters.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: 0.01

    Large Cluster Threshold

    large-cluster-size-threshold

    The criterion for determining whether a cluster is a large-scale cluster. If the number of nodes in a cluster exceeds the value of this parameter, the cluster is considered a large-scale cluster.

    Configuration suggestion: For clusters with a large number of nodes, configure a value larger than the default for higher performance and faster controller responses. Retain the default value for small clusters. Before adjusting this parameter in a production environment, verify the impact of the change on cluster performance in a test environment.

    Potential risks: In a large-scale cluster, kube-controller-manager adjusts specific configurations to optimize the performance of the cluster. Setting an excessively small threshold for small clusters will deteriorate the cluster performance.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: 50

    Table 5 Network components (supported only by CCE Turbo clusters)

    Item

    Parameter

    Description

    Value

    The minimum number of network cards bound to the container at the cluster level

    nic-minimum-target

    Minimum number of container NICs bound to a node

    The parameter value must be a positive integer. The value 10 indicates that at least 10 container NICs must be bound to a node. If the number you specified exceeds the container NIC quota of the node, the NIC quota will be used.

    Default: 10

    Cluster-level node preheating container NIC upper limit check value

    nic-maximum-target

    After the number of NICs bound to a node exceeds the nic-maximum-target value, CCE will not proactively pre-bind NICs.

    Checking the upper limit of pre-bound container NICs is enabled only when the value of this parameter is greater than or equal to the minimum number of container NICs (nic-minimum-target) bound to a node.

    The parameter value must be a positive integer. The value 0 indicates that checking the upper limit of pre-bound container NICs is disabled. If the number you specified exceeds the container NIC quota of the node, the NIC quota will be used.

    Default: 0

    Number of NICs for dynamically warming up containers at the cluster level

    nic-warm-target

    The target number of NICs to be pre-bound to a node before it starts.

    When the sum of the nic-warm-target value and the number of NICs already bound to the node exceeds the nic-maximum-target value, CCE will pre-bind the number of NICs specified by the difference between the nic-maximum-target value and the current number of NICs bound to the node.

    Default: 2

    Cluster-level node warm-up container NIC recycling threshold

    nic-max-above-warm-target

    Pre-bound NICs on a node will only be unbound and reclaimed if the difference between the number of idle NICs and the nic-warm-target value exceeds the threshold. The value can only be a number.

    • A large value will accelerate pod startup but slow down the unbinding of idle container NICs and decrease the IP address usage. Exercise caution when performing this operation.
    • A small value will speed up the unbinding of idle container NICs and increase the IP address usage but will slow down pod startup, especially when a large number of pods increase instantaneously.

    Default: 2

    Pod Access to Metadata

    allow-metadata-network-access

    After this function is enabled, pods in a cluster can access the node's metadata.

    • For a pod created while this function is enabled, whether it can access metadata follows the current status of this function.
    • For a pod created while this function is disabled, or in a cluster of an earlier version that does not support this function, metadata cannot be accessed. To allow such a pod to access metadata, rebuild it while the function is enabled.

    Default: false
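    If you need to verify the behavior, a probe pod like the following can be used. This is only a sketch under assumptions: the pod name, image, and metadata path are illustrative, 169.254.169.254 is the usual ECS metadata endpoint, and the pod must be created while the function is enabled.

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: metadata-probe            # illustrative name
    spec:
      restartPolicy: Never
      containers:
      - name: probe
        image: curlimages/curl:8.8.0  # any image that provides curl; illustrative tag
        # Assumed metadata endpoint; the request succeeds only if the pod was
        # created while allow-metadata-network-access is enabled.
        command: ["curl", "-sS", "http://169.254.169.254/openstack/latest/meta_data.json"]
    ```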

    Table 6 Network component configurations (supported only by CCE clusters using VPC networks)

    Item

    Parameter

    Description

    Value

    Non-masqueraded CIDR blocks that retain the original pod IP address

    nonMasqueradeCIDRs

    In a CCE cluster using the VPC network model, if a container in the cluster needs to access external networks, the source pod IP address must be masqueraded as the IP address of the node where the pod runs through SNAT. After this parameter is configured, the node does not perform SNAT on IP addresses within the specified CIDR blocks by default.

    By default, nodes in a cluster do not perform SNAT on packets destined for the private CIDR blocks 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16. Instead, these packets are directly transferred using the upper-layer VPC. These three CIDR blocks are considered internal networks within the cluster and are reachable at Layer 3 by default.

    Applicable cluster version: This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later.

    Default: 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16

    NOTE:

    To enable cross-node pod access, the CIDR block of the node where the target pod runs must be added.

    Similarly, to enable cross-ECS pod access in a VPC, the CIDR block of the ECS where the target pod runs must be added.

    Table 7 Extended controller configurations (supported only by clusters of v1.21 or later)

    Item

    Parameter

    Description

    Value

    Enable resource quota management

    enable-resource-quota

    Determine whether to automatically create a ResourceQuota when creating a namespace. With quota management, you can control the number of workloads of each type and the upper limits of resources in a namespace or related dimensions.

    • false: Auto creation is disabled.
    • true: Auto creation is enabled. For details about the resource quota defaults, see Configuring Resource Quotas.

    Configuration suggestion: In high-concurrency scenarios (for example, batch creation of pods), resource quota management may cause some requests to fail due to conflicts. This function should not be enabled unless necessary. If you enable it, ensure that the request client has a retry mechanism.

    Default: false
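    For context, the object that quota management relies on is the standard Kubernetes ResourceQuota. The sketch below shows a hand-written quota with illustrative limits and a hypothetical namespace; the defaults that CCE creates automatically when enable-resource-quota is true are described in Configuring Resource Quotas.

    ```yaml
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-quota           # illustrative name
      namespace: dev                # illustrative namespace
    spec:
      hard:
        pods: "100"                 # at most 100 pods in the namespace
        requests.cpu: "10"          # total CPU requests capped at 10 cores
        requests.memory: 20Gi
        limits.cpu: "20"
        limits.memory: 40Gi
    ```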

  4. Click OK.
