I plan to disable grouping only for the Opsgenie routes, and only for a
specific set of alerts. Here is an example of the current Alertmanager
configuration:
route:
  group_wait: 5m
  group_interval: 5m
  repeat_interval: 7h
  receiver: admin
  routes:
    - match_re:
        alertname: ".* Type1 Server is down.* "
      receiver: admingroup2
      routes:
        - match:
            team: support
            severity: critical
          receiver: opsgeniesupport
          group_wait: 1m
          group_interval: 5m
          repeat_interval: 6h
          continue: true
        - match:
            team: support
            severity: critical
          receiver: mailsupport
          group_wait: 1m
          group_interval: 1h
          repeat_interval: 12h
Q1: Is it possible to disable grouping for a specific type of alert (say,
alerts with the Type1 keyword in Alertmanager) only for the Opsgenie route?
I am looking for something like:
- match:
    team: support
    severity: critical
  receiver: opsgeniesupport
  group_by: [instance]
  group_wait: 1m
  group_interval: 5m
  repeat_interval: 6h
  continue: true
- match:
    team: support
    severity: critical
  receiver: mailsupport
  group_by: [instance]
  group_wait: 1m
  group_interval: 1h
  repeat_interval: 12h
Is this allowed by Alert Manager?
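
To make the intended effect concrete, here is a small Python model of what
grouping by a label does. It is only a sketch and not Alertmanager's own
code: the function name group_alerts, the sample label sets (based on the
payload quoted further down), and the handling of missing labels as empty
strings are all assumptions made for illustration.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the labels named in group_by.

    Simplified model: a label missing from an alert is treated as an
    empty string (an assumption, not documented Alertmanager behaviour).
    """
    groups = defaultdict(list)
    for labels in alerts:
        key = tuple((name, labels.get(name, "")) for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Hypothetical firing alerts, modelled on the two hosts in the quoted payload.
alerts = [
    {"alertname": "Type1 Server is down", "instance": "server1:9100",
     "team": "support", "severity": "critical"},
    {"alertname": "Type1 Server is down", "instance": "server2:9100",
     "team": "support", "severity": "critical"},
]

by_name = group_alerts(alerts, ["alertname"])      # both hosts share one group
by_instance = group_alerts(alerts, ["instance"])   # one group per host

print(len(by_name), len(by_instance))  # prints: 1 2
```

With one group per instance, each notification would carry exactly one
host, which is the outcome I am hoping for on the Opsgenie route.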
Q2: Is it possible to change the alert name in Prometheus before
Prometheus dispatches the alert to Alertmanager? The current rule is:
- alert: "Type1 down or process monitoring service is unreachable"
  expr: up{ SERVER_CATEGORY='Type1' } == 0
  for: 2m
  labels:
    severity: critical
    team: support
  annotations:
    summary: "{{ $labels.instance }} is not reachable"
    description: "{{ $labels.instance }} is not reachable"

and I would like the alert name to include the instance, something like:

- alert: " Type1 down or process monitoring service is unreachable - {{ $labels.instance}} "
Hopefully this will help, as I am unable to get the appropriate tags in
Opsgenie when grouping is used.
Having a host name tag would be helpful, because we could then see via the
JIRA integration how many incidents have occurred for a host in the past.
Regards
MG
On Saturday, July 27, 2024 at 9:09:57 PM UTC+5:30 mohan garden wrote:
> Hi Brian,
> Thank you for the suggestion.
> I was able to set up a Flask application to monitor the data sent by
> Alertmanager to Opsgenie, using the api_url endpoint.
> I had to create 3 endpoints:
> 1. POST for - /
> 2. PUT for /v2/alerts/message
> 3. PUT for /v2/alerts/description
>
>
> *POST:*
> {'alias': '<mangled>71c5c169a773796b467cc741f70457c4', 'message': 'Type1
> Server is down or node exporter is unreachable', 'description':
> 'server1:9100 server is down or prometheus is unable to query the node
> exporter service which should be up and running.\n\rserver2:9100 server is
> down or prometheus is unable to query the node exporter service which
> should be up and running.\n\r', 'details': {'SERVER_CATEGORY': 'Type1',
> 'SERVER_SITE': 'ind', 'alertname': 'Type1 Server is down or node exporter
> is unreachable', 'criteria': 'nodedown', 'job': 'default_nodeexporters',
> 'severity': 'critical', 'team': 'infrasupport'}, 'source': '
> http://alertmanager:9093/#/alerts?receiver=opsgenie_support', 'tags':
> ['SERVER_CATEGORY=Type1', 'SERVER_SITE=ind', 'criteria=nodedown',
> 'severity=critical', 'team=support', 'support', 'monitor',
> 'server1:9100', 'server2:9100'], 'priority': 'P1'}
> 10.73.6.210 - - [27/Jul/2024 07:32:04] "POST /v2/alerts HTTP/1.1" 200 -
>
> *First PUT:*
> {'message': 'Utility Server is down or node exporter is unreachable'}
> 10.73.6.210 - - [27/Jul/2024 07:32:04] "PUT /v2/alerts/<mangled>71c5c169a773796b467cc741f70457c4/message?identifierType=alias HTTP/1.1" 200 -
>
> *Second PUT:*
> {'description': 'server1:9100 server is down or prometheus is unable to
> query the node exporter service which should be up and
> running.\n\rserver2:9100 server is down or prometheus is unable to query
> the node exporter service which should be up and running.\n\r'}
> 10.73.6.210 - - [27/Jul/2024 07:32:04] "PUT /v2/alerts/<mangled>71c5c169a773796b467cc741f70457c4/description?identifierType=alias HTTP/1.1" 200 -
>
> It seems that Alertmanager would need to send another PUT request to
> update the Opsgenie tags.
>
> On Wednesday, April 3, 2024 at 9:59:06 PM UTC+5:30 Brian Candler wrote:
>
>> On Wednesday 3 April 2024 at 16:01:21 UTC+1 mohan garden wrote:
>>
>> Is there a way i can see the entire message which alert manager sends out
>> to the Opsgenie? - somewhere in the alertmanager logs or a text file?
>>
>>
>> You could try setting api_url to point to a webserver that you control.
>>
>