On Thu, 18 Jun 2020 at 18:09, Matt Layher <[email protected]> wrote:

> Hey folks,
>
> This is something myself and others have discussed in a few places, but I
> figure it's time to at least get the ball rolling on some sort of mailing
> list thread. I'd like to at least have a discussion about the possibility
> of adding CIDR notation matchers to PromQL, via a hypothetical "cidr_match"
> function.
>
> # The problem
>
> I'm developing a network daemon (https://github.com/mdlayher/corerad)
> which produces metrics with IP address prefix information (note: not
> individual IP addresses) partitioned by the advertising interface and
> prefix:
>
> prefix_autonomous{interface="eth0", prefix="fd9e:1a04:f01d::/64"} 1
> prefix_autonomous{interface="eth1", prefix="fd9e:1a04:f01d:10::/64"} 1
> prefix_autonomous{interface="eth0", prefix="2600:6c4a:7880:3200::/64"} 1
> prefix_autonomous{interface="eth1", prefix="2600:6c4a:7880:320a::/64"} 1
>
> In order to alert if either of my prefixes is unavailable or cannot be
> used for IPv6 stateless address autoconfiguration, I have an alert that
> ensures that a /64 prefix is present on each interface for both my IPv6
> global /56 and unique local /48. At the moment, I use a regex matcher:
>
> count by(instance, interface)
> (prefix_autonomous{prefix=~"2600:6c4a:7880:32.*|fd9e:1a04:f01d:.*"} == 1)
> != 2
>

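Hmm, this is incorrect. You want the bool modifier on the comparison here;
without it, == 1 filters out series whose value is 0, so when both are 0 the
expression returns nothing and the alert never fires.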

With the given example the regex matcher is also redundant, as all series
match it.
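
As a quick check of that claim, here is a minimal Python sketch that applies
the matcher's regex to the four prefix label values from the sample output.
It assumes Python's re.fullmatch is a fair stand-in for PromQL's fully
anchored regex matching; the variable names are illustrative only.

```python
import re

# Regex from the alert expression quoted above.
PATTERN = re.compile(r"2600:6c4a:7880:32.*|fd9e:1a04:f01d:.*")

# Values of the "prefix" label in the sample series.
PREFIXES = [
    "fd9e:1a04:f01d::/64",
    "fd9e:1a04:f01d:10::/64",
    "2600:6c4a:7880:3200::/64",
    "2600:6c4a:7880:320a::/64",
]

# PromQL regex matchers are fully anchored, so use fullmatch, not search.
matched = [p for p in PREFIXES if PATTERN.fullmatch(p)]
print(matched == PREFIXES)
```

Every sample value passes the matcher, so on this data it selects nothing
that the bare metric name would not.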


>
> Since my IPv6 /56 and /48 fall on 8-bit boundaries, it isn't a huge deal
> to write a regex that is good enough to match them both.
>
> However, a friend has a scenario where he wants to inspect metrics for an
> IPv4 /20 prefix, so things get considerably harder. Using the tool 'rgxg',
> we produce the following:
>
> $ rgxg cidr 192.0.2.0/20
> 192\.0\.(1[0-5]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])
>
> I'm not sure how to get rgxg to strip the period escapes, but after doing
> so, your PromQL query looks like:
>
>
> prefix_autonomous{prefix=~"192.0.(1[0-5]|[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"}
>

You need the period escapes for correctness: an unescaped . matches any
character, not just a literal dot.
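
A minimal Python sketch of the difference, again assuming re.fullmatch
approximates PromQL's anchored matching. The value "192x0y2z1" is made up
purely to show what the unescaped pattern lets through.

```python
import re

OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

# The rgxg output as generated, and the same pattern with escapes stripped.
escaped = re.compile(r"192\.0\.(1[0-5]|[0-9])\." + OCTET)
unescaped = re.compile(r"192.0.(1[0-5]|[0-9])." + OCTET)

def matches(pattern, value):
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None

for value in ["192.0.2.1", "192.0.16.1", "192x0y2z1"]:
    print(value, matches(escaped, value), matches(unescaped, value))
```

Both patterns accept 192.0.2.1 and reject 192.0.16.1, but only the unescaped
one accepts the made-up value, because each bare . stands for any character.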


>
> This is ugly but somewhat workable for IPv4. For IPv6, the problem is much
> worse. Consider a hypothetical prefix such as /63 which cannot be matched
> on a 4-bit hex digit boundary (meaning you can't do the trick I'm doing
> with ".*"):
>
> $ rgxg cidr "2001:db8::/63"
>
> 2001:0?[Dd][Bb]8((:(:0?0?0?[01])?(:[0-9A-Fa-f]{1,4}){1,4}|::|:0?0?0?0(:(:[0-9A-Fa-f]{1,4}){1,4}|::|:0?0?0?[01](:(:[0-9A-Fa-f]{1,4}){1,3}|::|:[0-9A-Fa-f]{1,4}(:(:[0-9A-Fa-f]{1,4}){1,2}|::|:[0-9A-Fa-f]{1,4}(::[0-9A-Fa-f]{1,4}|::|:[0-9A-Fa-f]{1,4}(::|:[0-9A-Fa-f]{1,4}))))))|(:(:0?0?0?[01])?(:[0-9A-Fa-f]{1,4}){0,2}|:0?0?0?0(:(:[0-9A-Fa-f]{1,4}){0,2}|:0?0?0?[01](:(:[0-9A-Fa-f]{1,4})?|:[0-9A-Fa-f]{1,4}(:|:[0-9A-Fa-f]{1,4})))):(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3})
>
>
> Needless to say, this is pretty ugly and hard to comprehend.
>

It looks to me like that regex is trying to match every valid rendering of an
IPv6 address. However, the set of renderings that would actually be exposed
is going to be much smaller: any :: is likely to be in only one place, usage
of leading 0s will be consistent, and there won't be a mix of dot and colon
notation. So this could likely be greatly simplified in practice.

An exporter could also make such regexes easier to write by sticking to one
explicit canonical form.
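
To illustrate, here is a minimal Python sketch in which a hypothetical
exporter renders every prefix in fully expanded lower-case form before
exposing it. The helper name canonical_prefix and the choice of the expanded
form are assumptions for the example; any single consistent form would do.

```python
import ipaddress
import re

def canonical_prefix(text):
    """Return a prefix in fully expanded, lower-case form."""
    # strict=False tolerates host bits being set in the input.
    return ipaddress.ip_network(text, strict=False).exploded

# Against the expanded form, 2001:db8::/63 needs only one character class:
# the fourth group is either 0000 or 0001.
SLASH_63 = re.compile(r"2001:0db8:0000:000[01]:.*")

samples = [
    "2001:DB8:0:1::/64",
    "2001:0DB8:0000:0001:0000:0000:0000:0000/64",
    "2001:db8:0:2::/64",
]
for sample in samples:
    label = canonical_prefix(sample)
    print(label, SLASH_63.fullmatch(label) is not None)
```

The first two samples render to the same label and match; the third falls
outside the /63 and does not. Note that this compares text only, so a shorter
prefix such as 2001:db8::/62 would also match; if that matters, the prefix
length needs its own check.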


>
> # A possible solution
>
> I mentioned this in IRC and Julien came up with a hypothetical PromQL
> function "cidr_match":
>
> cidr_match(http_access_log_per_ip_total, "ip", "192.168.0.0/24")
>
> This function would be very useful for me as well.
>
> Going back to my original data, let's say I want to only filter on my
> global /56 prefixes:
>
> cidr_match(prefix_autonomous, "prefix", "2600:6c4a:7880:3200::/56")
>
> prefix_autonomous{interface="eth0", prefix="2600:6c4a:7880:3200::/64"} 1
> prefix_autonomous{interface="eth1", prefix="2600:6c4a:7880:320a::/64"} 1
>
> Assuming there was a metric with a bare IP address such as "192.0.2.1",
> cidr_match could infer "192.0.2.1/32" (or "2001:db8::1" to
> "2001:db8::1/128" for IPv6).
>
> In my case, I just care about IP address prefixes, and am aware that
> individual IP addresses in labels can often cause a cardinality problem.
> However, I assume there are lots of users out there who are doing this
> anyway, regardless of Prometheus best practice recommendations.
>
> I understand that this will probably be the primary argument against this
> sort of functionality, but would like to start a discussion anyway just to
> see if others in our community would find this useful. Consider this less
> of an official proposal and more of throwing an idea out there to gauge
> interest.
>
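
To make the proposed semantics concrete, here is a minimal Python sketch of
what a function like the hypothetical cidr_match would have to do per series.
It is not PromQL and not an implementation proposal: the series are modelled
as (labels, value) pairs, and skipping unparsable label values is an
assumption the proposal does not spell out.

```python
import ipaddress

def cidr_match(series, label, cidr):
    """Keep the series whose `label` value lies inside `cidr`."""
    wanted = ipaddress.ip_network(cidr, strict=False)
    kept = []
    for labels, value in series:
        try:
            # A bare address parses as a /32 (IPv4) or /128 (IPv6) network.
            candidate = ipaddress.ip_network(labels[label], strict=False)
        except (KeyError, ValueError):
            continue  # label missing or not an address: treated as no match
        # subnet_of raises TypeError across IP versions, so check first.
        if candidate.version == wanted.version and candidate.subnet_of(wanted):
            kept.append((labels, value))
    return kept

series = [
    ({"interface": "eth0", "prefix": "fd9e:1a04:f01d::/64"}, 1.0),
    ({"interface": "eth1", "prefix": "fd9e:1a04:f01d:10::/64"}, 1.0),
    ({"interface": "eth0", "prefix": "2600:6c4a:7880:3200::/64"}, 1.0),
    ({"interface": "eth1", "prefix": "2600:6c4a:7880:320a::/64"}, 1.0),
]
result = cidr_match(series, "prefix", "2600:6c4a:7880:3200::/56")
print([labels["prefix"] for labels, _ in result])
```

On the sample data this keeps only the two series under the global /56,
which is the result shown in the quoted example.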

I think we should first consider what would be done if this were an opaque
string with no known structure, so that the general problem is looked at
rather than one very specific use case (a use case which, as demonstrated
above, PromQL already supports).
My first thought there would be an info metric that adds the information
needed to group the metrics. You might also add such a label directly to
the metrics, depending on the exact use case.
This is in addition to the approaches I suggested above for how the
exporter could expose this data to make regexes simpler to write.
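
As a rough illustration of the info metric idea, here is a minimal Python
sketch of an exporter working out which configured aggregate a prefix belongs
to and rendering that as an extra label. The metric name prefix_info, the
label name aggregate, and the names "global" and "ula" are all hypothetical.

```python
import ipaddress

# Hypothetical mapping from an operator-chosen name to the aggregate it covers.
AGGREGATES = {
    "global": ipaddress.ip_network("2600:6c4a:7880:3200::/56"),
    "ula": ipaddress.ip_network("fd9e:1a04:f01d::/48"),
}

def aggregate_name(prefix):
    """Return the name of the aggregate containing `prefix`, or "other"."""
    net = ipaddress.ip_network(prefix, strict=False)
    for name, aggregate in AGGREGATES.items():
        if net.version == aggregate.version and net.subnet_of(aggregate):
            return name
    return "other"

def info_line(prefix):
    """Render one sample of the hypothetical prefix_info metric."""
    return 'prefix_info{prefix="%s",aggregate="%s"} 1' % (
        prefix,
        aggregate_name(prefix),
    )

print(info_line("fd9e:1a04:f01d:10::/64"))
print(info_line("2600:6c4a:7880:320a::/64"))
```

With a label like this in place, queries can select or group on the
aggregate name with a plain equality matcher, and the CIDR arithmetic stays
in the exporter where the structure of the value is known.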


More generally, starting to add all sorts of additional matchers in the
form of functions does not sound like the best of ideas to me. We already
cover everything that can be matched with a regular grammar, so it
feels like cruft language-wise. In addition, it'd be quite a bit less
performant than a regex matcher.

-- 
Brian Brazil
www.robustperception.io

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/CAHJKeLq%2B-19Ddebj-%3DVfGj6N55%2BSrGrzA%2BRqKPXqjO11qO1f8w%40mail.gmail.com.
