To add a little more information about ambiguous/inconsistent matchers, the 
current version of Alertmanager allows the following:

{foo=}} equivalent to {foo="}"}, but I think this should be an error
{{foo=} is an error because of two {{, unlike the above
{foo=~} could be either {foo=~""} or {foo="~"}, it's interpreted in current 
versions as {foo=~""}
{foo=,} is equivalent to {foo=""}, but I think should be an error as a 
comma with no value has a high likelihood of being human error
{foo=,,} is an error, unlike {foo=,} or {foo=}}
{foo= } is equivalent to {foo=""}
{foo= b} and {foo=b } are equivalent to {foo="b"}, but {foo=b b} is 
equivalent to {foo="b b"}

The changes proposed here would also restrict all of the examples above as 
1. trailing commas are rejected and 2. unquoted text must match the regular 
expression [a-zA-Z_:][a-zA-Z0-9_:]*.


On Tuesday, May 9, 2023 at 10:05:49 AM UTC+1 George Robinson wrote:

> Hi, all! 👋
>
> I've opened a pull request for the Alertmanager here 
> <https://github.com/prometheus/alertmanager/pull/3353> that adds support 
> for parsing UTF-8 matchers while also adding restrictions on 
> ambiguous matchers that are permitted in the current version of 
> Alertmanager.
>
> https://github.com/prometheus/alertmanager/pull/3353
>
> *Background*
>
> As explained in #3319 
> <https://github.com/prometheus/alertmanager/issues/3319>, Alertmanager 
> constrains label names to the characters ^[a-zA-Z_][a-zA-Z0-9_]*$ - the 
> same as Prometheus.
>
> However, because Alertmanager is such a well known and popular open source 
> project it makes sense to use it for other kinds of alert generators too, 
> and not just Prometheus. An example of this is Grafana, where Alertmanager 
> is used to manage alerts created from both Prometheus and non-Prometheus 
> like datasources, including Graphite, InfluxDB, SQL, Loki and more. The 
> issue with this though is that these other datasources do not share the 
> same constraints as Prometheus when it comes to label names, and so using 
> them with Alertmanager requires some kind of normalization to ensure label 
> matchers match the set of allowed characters before the alert can be sent 
> to the Alertmanager.
>
> *What this pull request does*
>
> @yuri-tceretian <https://github.com/yuri-tceretian> has proposed adding 
> support for UTF-8 characters in #3321 
> <https://github.com/prometheus/alertmanager/pull/3321> that updates the 
> existing regular expression to support double quoted UTF-8 sequences as 
> label names, while keeping unquoted label names the same.
>
> In this pull request I wanted to propose a different option to where we 
> would add a "simple" LL(1) parser to Alertmanager to parse matchers instead.
>
> *Motivation*
>
> The original motivation for writing this parser was to add support for 
> matching label names containing . and spaces to grafana/grafana 
> <https://github.com/grafana/grafana>. However, about the same time I 
> learned that Prometheus maintainers agreed to add support for UTF-8 
> labels in Alertmanager 
> <https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE/edit#bookmark=id.wm24eue9w2eg>,
>  
> and so I decided to further the work to see if it could be upstreamed to 
> Alertmanager instead.
>
> The original source code can be found at grobinson-grafana/matchers 
> <https://github.com/grobinson-grafana/matchers>.
>
> *Breaking changes*
>
> *Expressions must start and end with open and closing braces*
>
> All expressions must start and end with { and }, although this can be 
> relaxed if required. For example foo=bar is not valid, it must be 
> {foo=bar}.
>
> *Trailing commas are not permitted*
>
> Trailing commas are not permitted. For example {foo=bar,} is not valid, 
> it must be {foo=bar}.
>
> *All non [a-zA-Z_:][a-zA-Z0-9_:]* values must be double quoted*
>
> The set of unquoted characters is now the same on both sides of the 
> expression. In other words, both label names and label values without 
> double quotes must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*. 
> For example {foo=!bar} is not valid, it must be {foo="!bar"}. In current 
> versions of Alertmanager, unquoted label values can contain all UTF-8 code 
> points with the exception of comma, such as {foo=!bar}.
>
> There are two reasons for this:
>
>    1. 
>    
>    It's no longer possible to write ambiguous matchers which I feel is 
>    something Alertmanager should fix. For example is {foo=~} equivalent 
>    to {foo="~"} or {foo=~""}?
>    2. 
>    
>    If we restrict the =, !, ~ characters to double quotes we can keep the 
>    grammar LL(1). Without this restriction lookahead/backtrack is required to 
>    parse matchers such as {foo==~!=!~bar} which are valid in current 
>    versions of Alertmanager.
>    
> *Errors*
>
> One of the goals with this LL(1) parser is to provide better error 
> messages than what is possible using just a regular expression. This work 
> is still in progress as error messages are not as useful as I would like 
> for EOF, but here is an example of what it looks like for non-EOF cases:
> {foo=bar 🙂} 9:13: 🙂: invalid input: expected comma or closing '}' {foo 
> with spaces=bar with spaces} 5:9: unexpected with: expected an operator 
> such as '=', '!=', '=~' or '!~'
> *Benchmarks*
>
> I've also provided a number of benchmarks of both the LL(1) parser and 
> regex parser that supports UTF-8. These can be found at 
> grobinson-grafana/matchers-benchmarks 
> <https://github.com/grobinson-grafana/matchers-benchmarks>. However, to 
> run them go.mod must be updated to use the branch 
> https://github.com/grafana/prometheus-alertmanager/tree/yuri-tceretian/utf-8-label-names
>  here.
> BenchmarkMatchersSimple, BenchmarkPrometheusSimple {foo="bar"} 
> BenchmarkMatchersComplex, BenchmarkPrometheusComplex {foo="bar",bar="foo 
> 🙂","baz"!=qux,qux!="baz 🙂"} BenchmarkMatchersRegexSimple, 
> BenchmarkPrometheusRegexSimple {foo=~"[a-zA-Z_:][a-zA-Z0-9_:]*"} 
> BenchmarkMatchersRegexComplex, BenchmarkPrometheusRegexComplex 
> {foo=~"[a-zA-Z_:][a-zA-Z0-9_:]*",bar=~"[a-zA-Z_:]","baz"!~"[a-zA-Z_:][a-zA-Z0-9_:]*",qux!~"[a-zA-Z_:]"}
>  
>
> go test -bench=. -benchmem goos: darwin goarch: arm64 pkg: 
> github.com/grobinson-grafana/matchers-benchmarks 
> BenchmarkMatchersRegexSimple-8 
> <http://github.com/grobinson-grafana/matchers-benchmarksBenchmarkMatchersRegexSimple-8>
>  
> 488295 2425 ns/op 3248 B/op 49 allocs/op BenchmarkMatchersRegexComplex-8 
> 138081 9074 ns/op 11448 B/op 169 allocs/op BenchmarkPrometheusRegexSimple-8 
> 329244 3496 ns/op 3531 B/op 58 allocs/op BenchmarkPrometheusRegexComplex-8 
> 95188 12554 ns/op 12619 B/op 204 allocs/op BenchmarkMatchersSimple-8 
> 2888340 414.9 ns/op 56 B/op 2 allocs/op BenchmarkMatchersComplex-8 741590 
> 1628 ns/op 248 B/op 7 allocs/op BenchmarkPrometheusSimple-8 1919209 613.9 
> ns/op 233 B/op 8 allocs/op BenchmarkPrometheusComplex-8 425430 2803 ns/op 
> 1015 B/op 31 allocs/op PASS ok 
> github.com/grobinson-grafana/matchers-benchmarks 11.766s
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/97791d92-3c5d-4d54-8b96-f08852839e28n%40googlegroups.com.

Reply via email to