Configuring Alarm Thresholds for Resource Monitoring
After a component is deployed in the Kubernetes environment, you can create threshold rules for the metrics of its key resources so that exceptions are detected and handled promptly.
- If the metric meets the threshold conditions within a specified period, the system sends a threshold alarm.
- If no metric is reported within a specified period, the system sends a data insufficiency event.
- If you cannot check threshold rule status changes on the ServiceStage console, for example outside business hours or during a business trip, you can enable the notification function to send the changes to the relevant personnel by SMS message or email.
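The first two outcomes above can be sketched as a small classification function. This is an illustrative sketch only, not the ServiceStage implementation: the function name `classify_period`, the state strings, and the fixed "average ≥ threshold" rule are all assumptions made for the example.

```python
from typing import Sequence


def classify_period(samples: Sequence[float], threshold: float) -> str:
    """Classify one period of metric samples (hypothetical helper).

    Assumes the rule is "average >= threshold"; in the console you choose
    the statistic method and the operator yourself.
    """
    if not samples:
        # No metric was reported in the period: data insufficiency event.
        return "insufficient_data"
    average = sum(samples) / len(samples)
    if average >= threshold:
        # The metric meets the threshold condition: threshold alarm.
        return "alarm"
    return "normal"
```

For example, `classify_period([], 80)` yields `"insufficient_data"`, while `classify_period([90, 85], 80)` yields `"alarm"`.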
Procedure
- Log in to ServiceStage.
- Use either of the following methods to go to the Threshold Alarms page.
  - On the Application Management page, click the application to which the component belongs, and click the target component in Component List. In the left navigation pane, choose O&M Configurations > Threshold Alarms.
  - On the Component Management page, click the target component. In the left navigation pane, choose O&M Configurations > Threshold Alarms.
- Click Set Threshold Rule and set threshold rule parameters by referring to Table 1. Parameters marked with an asterisk (*) are mandatory.
Table 1 Threshold rule parameters
Parameter | Description |
---|---|
*Threshold Name | Name of the threshold rule to be added. NOTE: The name must be unique and cannot be modified once specified. |
Description | Threshold rule description. |
Statistic Method | Method used to measure metrics. |
Statistical Periods | Interval at which metric data is collected. |
Metric | Select the metrics to be monitored. |
*Threshold Condition | Trigger of a threshold alarm. A threshold condition consists of two parts: an operator (≥, ≤, >, or <) and a threshold value. For example, if this parameter is set to ≥ 80, the system generates a threshold alarm when the metric is greater than or equal to 80. |
Consecutive Periods | When the metric meets the threshold condition for a specified number of consecutive periods, a threshold alarm will be generated. |
Alarm Severity | Severity of the threshold alarm. |
Send Notifications | Whether to send notifications. If you select Yes (recommended), SMN will send notifications to you when a threshold alarm is triggered. If you select No, you will not be notified. NOTE: SMN must have been deployed in the environment. Otherwise, Send Notifications is not displayed by default. |
*Topic Name | If you select Yes for Send Notifications, select a topic and click ▲. For details about how to create a topic, see Creating a Topic. |
*Trigger Condition | Trigger condition for sending a notification when Send Notifications is set to Yes. An alarm occurred: When a threshold alarm is generated, the system sends a notification to a specified user by email or SMS message. The alarm is cleared: When the alarm is cleared, the system sends a notification to a specified user by email or SMS message. |
- Click OK.
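To make the interaction between Threshold Condition and Consecutive Periods concrete, the following is a minimal sketch in Python. It is not the ServiceStage implementation or API: the `ThresholdRule` class, its field names, and the interpretation that the most recent N period values must all meet the condition are assumptions for illustration. Each value in the list is assumed to be the already-aggregated result of one statistical period.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# Map the four operators from the Threshold Condition parameter
# to Python comparison functions.
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


@dataclass
class ThresholdRule:
    """Hypothetical model of a threshold rule (names are assumptions)."""
    name: str
    op: str                    # one of ">=", "<=", ">", "<"
    threshold: float
    consecutive_periods: int

    def breached(self, value: float) -> bool:
        # True if a single period value meets the threshold condition.
        return OPERATORS[self.op](value, self.threshold)

    def should_alarm(self, period_values: List[float]) -> bool:
        # True only if the most recent `consecutive_periods` values
        # all meet the threshold condition.
        n = self.consecutive_periods
        if n <= 0 or len(period_values) < n:
            return False
        return all(self.breached(v) for v in period_values[-n:])


rule = ThresholdRule(name="cpu-high", op=">=", threshold=80, consecutive_periods=3)
print(rule.should_alarm([70, 85, 90, 80]))  # True: the last three values are all >= 80
print(rule.should_alarm([85, 90, 70]))      # False: the latest value is below 80
```

Requiring several consecutive periods trades alarm latency for fewer false alarms caused by short spikes.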
Follow-Up Operations
After a threshold rule is created, you can manage threshold alarms by referring to Table 2.
Operation | Description |
---|---|
Modify a Threshold Alarm | If the current threshold rule is not properly set, modify it to better meet your service requirements. |
Delete a Threshold Alarm | If the current threshold rule is no longer needed, delete it to release threshold rule resources. |
Search for Threshold Alarms | |
View Threshold-Crossing Alarms | If the metric meets the threshold conditions within a specified period, the system sends a threshold alarm. View the alarm in the threshold alarm list. |
View Alarm History | Click History in the Operation column of the threshold rule list to view historical alarms. |
View Data Insufficiency Events | If no metric is reported within a specified period, the system sends a data insufficiency event. You can view the event on the Event page. For details, see Viewing Component Running Events. |