I wonder if it would also be possible to hear Julian's perspective on this.
Shall I bring the topic to the dev summit on Thursday?
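
To make the fall-back idea from the quoted messages below a bit more concrete, here is a rough sketch in Go. It is not Alertmanager's actual code: `Alert`, `fallbackTmpl`, and `renderWithFallback` are hypothetical names I made up for illustration, and it only uses the standard `text/template` package. It tries the user's template first and, if parsing or expansion fails, renders a fixed "safe" template that carries the original error message, so a notification text is always produced.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Alert is a hypothetical, heavily simplified stand-in for the data
// a notification template would receive.
type Alert struct {
	Name   string
	Labels map[string]string
}

// fallbackTmpl is a hypothetical "safe" template. It states explicitly
// that the intended template failed and includes the error message.
const fallbackTmpl = `[template failed: {{ .Err }}] alert {{ .Alert.Name }} is firing`

// renderWithFallback expands userTmpl for the given alert. If parsing or
// executing the user template fails, it renders fallbackTmpl instead and
// reports fellBack = true. It always returns some text.
func renderWithFallback(userTmpl string, a Alert) (text string, fellBack bool) {
	var buf bytes.Buffer

	t, err := template.New("user").Parse(userTmpl)
	if err == nil {
		err = t.Execute(&buf, a)
	}
	if err == nil {
		return buf.String(), false
	}

	// Discard any partial output written before the failure.
	buf.Reset()

	fb := template.Must(template.New("fallback").Parse(fallbackTmpl))
	data := struct {
		Err   string
		Alert Alert
	}{Err: err.Error(), Alert: a}
	if ferr := fb.Execute(&buf, data); ferr != nil {
		// Last resort: plain formatting that cannot fail.
		return fmt.Sprintf("alert %s is firing (template errors: %v; %v)", a.Name, err, ferr), true
	}
	return buf.String(), true
}
```

Calling `renderWithFallback("{{ .Nope }}", Alert{Name: "HighLatency"})` would take the fall-back path, because `Nope` is not a field of `Alert`. The other option raised below, rendering the error into only the fields that failed, would need per-field handling and is not covered by this sketch.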

On Thursday, February 9, 2023 at 5:48:12 PM UTC Matthias Rampke wrote:

> I agree that silently sending *no* alert is the worst possible outcome. I 
> wonder what would be "nicer" in case a template fails - send the alert with 
> the fields that did not fail to render (possibly render the error *into* 
> the fields that failed to make it very obvious?), or (as proposed) fall 
> back to a "safe" template?
>
> /MR
>
> On Thu, Feb 9, 2023 at 6:44 PM Bjoern Rabenstein <[email protected]> wrote:
>
>> On 07.02.23 05:57, 'George Robinson' via Prometheus Developers wrote:
>> > 
>> > While I appreciate the responsibility of writing correct templates is on
>> > the user, I have also been considering whether Alertmanager should be more
>> > tolerant of template errors, and attempt to send some kind of notification
>> > when this happens. For example, falling back to the default template that
>> > we have high confidence of being correct.
>>
>> I think that makes sense. The fall-back template could call out very
>> explicitly that the intended template failed to expand and therefore
>> you get a replacement, maybe even with the error message of the
>> attempt to expand the original template.
>>
>> But I'm not really an Alertmanager expert. And despite having a lot
>> of historical context about Prometheus in general, I don't remember
>> anything specific about error handling in alert templates.
>>
>> I only remember that trying out an alert "in production" is really
>> hard since you need to trigger it. And if the moment you notice that
>> your template doesn't work is also the moment when your alert is
>> supposed to fire, that's really bad.
>>
>> So better test tooling might help here, but even if we had that, I
>> think there should be a safe fall-back so that no alert is ever
>> swallowed because of a templating error.
>>
>> -- 
>> Björn Rabenstein
>> [PGP-ID] 0x851C3DA17D748D03
>> [email] [email protected]
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-developers/Y%2BUxD3QTKJbrLACk%40mail.rabenste.in
>> .
>>
>

