[prometheus-users] Issue with resolved alerts not sending notifications

mohammad md Wed, 23 Oct 2024 09:22:32 -0700


I am running Prometheus to monitor system resources such as memory and CPU
usage, as well as other services on the infrastructure. I rely on
Alertmanager to send alerts to Telegram whenever a specific issue occurs
(such as high memory usage or a service stopping).

The problem I'm facing is that Alertmanager is not sending a notification
when an issue is resolved. The alerts I have defined are:

   - High CPU Usage: CPU usage exceeds 70%.
   - High Memory Usage: memory usage exceeds 85%.
   - Service Stopped: a service stops working.

Alerts are sent to Alertmanager, which then sends notifications via
Telegram when an issue arises.

The initial alert messages are received correctly when the problem occurs.
However, when the system returns to a normal state and the issue is
resolved, I do not receive a notification saying so. Instead, the same
alert message (the one for the original issue) is sent again.



Current Configuration:

Prometheus Configuration (file alerts.yml):

    groups:
      - name: CPU Usage Alert
        rules:
          - alert: HighCPUUsage
            expr: ceil(100 * (1 - (avg by (Host, Client) (rate(node_cpu_seconds_total{mode="idle"}[5m]))))) > 70
            for: 6m
            labels:
              severity: Critical
              Host: "{{ $labels.Host }}"
              Client: "{{ $labels.Client }}"
            annotations:
              summary: "High CPU usage on {{ $labels.Host }} for {{ $labels.Client }} ({{ $value }})"
              description: "CPU usage on {{ $labels.Host }} for {{ $labels.Client }} has exceeded 70% for 5 minutes."
              resolved: "CPU usage on {{ $labels.Host }} for {{ $labels.Client }} is back to normal ({{ $value }})."
      - name: Memory Usage Alert
        rules:
          - alert: HighMemory
            expr: floor(1 - (avg(node_memory_MemAvailable_bytes) by (Client, Host) / avg(node_memory_MemTotal_bytes) by (Client, Host))) * 100 > 85
            for: 6m
            labels:
              severity: Critical
              Host: "{{ $labels.Host }}"
              Client: "{{ $labels.Client }}"
            annotations:
              summary: "High Memory usage on {{ $labels.Host }} for {{ $labels.Client }} ({{ $value }})"
              description: "Memory usage on {{ $labels.Host }} for {{ $labels.Client }} has exceeded 85% for 5 minutes."
              resolved: "Memory usage on {{ $labels.Host }} for {{ $labels.Client }} is back to normal ({{ $value }}%)."
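
A side note on the HighMemory rule, offered as a sketch rather than a confirmed diagnosis: as transcribed, the expression applies floor() to a fraction between 0 and 1 and only then multiplies by 100, so it evaluates to 0 (or to 100 when no memory is available at all) and would almost never cross the 85 threshold. A variant that scales to a percentage before rounding, mirroring the HighCPUUsage rule, could look like the following. The restructured expression, the "6 minutes" wording (chosen to match for: 6m), and the removal of the explicit Host/Client labels are all assumptions, not part of the original configuration.

```yaml
groups:
  - name: Memory Usage Alert
    rules:
      - alert: HighMemory
        # Assumed rewrite: multiply by 100 inside floor() so the comparison
        # sees a percentage between 0 and 100 instead of 0 or 100.
        expr: |
          floor(100 * (1 -
            avg by (Client, Host) (node_memory_MemAvailable_bytes)
            / avg by (Client, Host) (node_memory_MemTotal_bytes)
          )) > 85
        for: 6m
        labels:
          # Host and Client are already present on the result of the
          # expression, so they are not repeated here.
          severity: Critical
        annotations:
          summary: "High Memory usage on {{ $labels.Host }} for {{ $labels.Client }} ({{ $value }}%)"
          description: "Memory usage on {{ $labels.Host }} for {{ $labels.Client }} has exceeded 85% for 6 minutes."
          # "resolved" is an ordinary annotation with no special meaning; it
          # only appears if the notification template prints it.
          resolved: "Memory usage on {{ $labels.Host }} for {{ $labels.Client }} is back to normal."
```

The {{ $value }} placeholder is left out of the resolved annotation here because annotations are rendered while the alert is firing, so a value embedded in that text would be expected to reflect the last firing evaluation rather than the recovered reading.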
   









Alertmanager Configuration (file alertmanager.yml):

    global:
      resolve_timeout: 5m
    route:
      receiver: telegram_receiver
      group_by: ["alertname", "Host"]
      group_wait: 15s
      group_interval: 15s
      repeat_interval: 24h
      routes:
        - receiver: 'telegram_receiver'
          matchers:
            - severity="Critical"
    receivers:
      - name: 'telegram_receiver'
        telegram_configs:
          - api_url: 'https://api.telegram.org'
            send_resolved: true
            bot_token: xxxxxxxxxxxxxx
            chat_id: yyyyyyyyyyyyyyyyyyyyyyyyy
            message: '{{ range .Alerts }}Alert⚠️: {{ printf "%s\n" .Labels.alertname }}{{ printf "%s\n" .Annotations.summary }}{{ printf "%s\n" .Annotations.description }}{{ end }}'
            parse_mode: 'HTML'
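
One thing the message template above does not do is look at the alert status: it prints the same "Alert⚠️" line, summary, and description for every alert in the group, so a resolved notification would look identical to the firing one. Alertmanager's template data exposes a per-alert .Status field with the value "firing" or "resolved", so a possible adjustment is to branch on it. The following is a minimal sketch, assuming the resolved annotation from alerts.yml is the text wanted for resolved alerts; the "Resolved ✅" wording is illustrative only.

```yaml
receivers:
  - name: 'telegram_receiver'
    telegram_configs:
      - api_url: 'https://api.telegram.org'
        send_resolved: true
        bot_token: xxxxxxxxxxxxxx
        chat_id: yyyyyyyyyyyyyyyyyyyyyyyyy
        parse_mode: 'HTML'
        # Branch on each alert's status so resolved alerts get their own text.
        # The "-}}" markers trim the line break after each control action.
        message: |-
          {{ range .Alerts -}}
          {{ if eq .Status "resolved" -}}
          Resolved ✅: {{ .Labels.alertname }}
          {{ .Annotations.resolved }}
          {{ else -}}
          Alert⚠️: {{ .Labels.alertname }}
          {{ .Annotations.summary }}
          {{ .Annotations.description }}
          {{ end -}}
          {{ end }}
```

With this shape, a group that contains both firing and resolved alerts produces one block per alert, each labelled with its own status.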
   
I would greatly appreciate any guidance or solutions to this issue.

