The committed code in /common includes support for three escaping formats:

* All legacy-invalid characters are converted to underscores (cannot round-trip).
* Dots are escaped to _dot_ and underscores are converted to double underscores; all other legacy-invalid characters become single underscores (partial round trip).
* A new U__ encoding for Unicode code points, so names can be fully round-tripped.
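To make the difference between the three formats concrete, here is a minimal, self-contained sketch. It is not the code from /common: the function names (`escapeUnderscores`, `escapeDots`, `escapeValues`) are hypothetical, and the exact layout of the U__ form (code point in lowercase hex between underscores, literal underscores doubled) is an assumption for illustration. The sketch also does not special-case names that are already legacy-valid. The linked source is the authority on the real behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLegacyRune reports whether r is allowed at byte offset i of a legacy
// metric name, i.e. the name matches [a-zA-Z_:][a-zA-Z0-9_:]*.
func isLegacyRune(r rune, i int) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') ||
		r == '_' || r == ':' || (r >= '0' && r <= '9' && i > 0)
}

// escapeUnderscores (hypothetical name) replaces every legacy-invalid
// character with "_". Distinct inputs can collide, so it cannot round-trip.
func escapeUnderscores(name string) string {
	var b strings.Builder
	for i, r := range name {
		if isLegacyRune(r, i) {
			b.WriteRune(r)
		} else {
			b.WriteByte('_')
		}
	}
	return b.String()
}

// escapeDots (hypothetical name) maps "." to "_dot_" and "_" to "__";
// any other legacy-invalid character becomes a single "_", which is why
// only dots and underscores can be recovered (partial round trip).
func escapeDots(name string) string {
	var b strings.Builder
	for i, r := range name {
		switch {
		case r == '.':
			b.WriteString("_dot_")
		case r == '_':
			b.WriteString("__")
		case isLegacyRune(r, i):
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

// escapeValues (hypothetical name) prefixes the result with "U__" and
// writes each legacy-invalid rune as "_<hex code point>_" (assumed layout).
// Literal underscores are doubled so the encoding stays unambiguous.
func escapeValues(name string) string {
	var b strings.Builder
	b.WriteString("U__")
	for i, r := range name {
		switch {
		case r == '_':
			b.WriteString("__")
		case isLegacyRune(r, i):
			b.WriteRune(r)
		default:
			b.WriteByte('_')
			b.WriteString(strconv.FormatInt(int64(r), 16))
			b.WriteByte('_')
		}
	}
	return b.String()
}

func main() {
	name := "http.server_duration"
	fmt.Println(escapeUnderscores(name)) // http_server_duration
	fmt.Println(escapeDots(name))        // http_dot_server__duration
	fmt.Println(escapeValues(name))      // U__http_2e_server__duration
}
```

Note that under the first format "a.b" and "a_b" both become "a_b", which is the collision problem raised in the quoted thread below; the other two formats keep the two names distinct.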
https://github.com/prometheus/common/blob/1c9da3533702ae8a54609c7b6f5bdc1ff9b754c9/model/metric.go#L60-L77

On Mon, Jun 10, 2024 at 6:07 AM Bartłomiej Płotka <[email protected]> wrote:

> First of all, while it can be painful, I think it's ok to double check the
> already approved proposal
> <https://github.com/prometheus/proposals/blob/main/proposals/2023-08-21-utf8.md>
> for the big blockers. I don't think I see any big blocker, given this
> discussion.
>
> Just 2c, I wonder if it would be helpful to split metric/label naming
> discussion into:
>
> A) What protocol, storage and PromQL supports (more flexibility for future)
> B) What makes sense to use for everyone (recommendation or mandated for
> some systems)
>
> Essentially we can (and probably should) add versatile escaping
> mechanisms. Does it mean people should call their metrics
> "❤️🔥123🧠_total"? I don't think it's useful to accept those. It might be
> useful to accept (escaped) dots though given Otel users want (for example)
> without translation, use the same matchers for traces, logs, profiles and
> metrics and they cannot really change that decision.
>
> We could keep (recommend or ensure) the conversion schema for Otel metrics
> (move to _) AND move with generic UTF-8 support. Both could be done in
> parallel, as long as it's officially agreed and documented (:
>
> Kind Regards,
> Bartek Płotka (@bwplotka)
>
>
> On Wed, Jun 5, 2024 at 5:55 PM Bjoern Rabenstein <[email protected]>
> wrote:
>
>> On 05.06.24 18:07, 'Fabian Stäber' via Prometheus Developers wrote:
>> >
>> > So, is the preferred solution to keep things as they are, i.e. keep
>> > replacing dots with underscores?
>>
>> I don't think the purpose of the survey was to find a "preferred
>> solution". First of all, this is a technical decision, not a
>> democratic one. And even if it were, an online survey is inherently
>> biased.
>>
>> The idea behind the survey was (I hope) to get a broad idea what
>> people find surprising or annoying, what they expect, what they like,
>> ... and then we can use those inputs in a responsible fashion to
>> inform decisions.
>>
>> > > why allow two different separator characters if they have no
>> > > semantic difference (no true namespacing).
>> >
>> > This argument seems to resonate with the Prometheus team. If this is the
>> > main concern, we don't solve it by allowing dots in quotes. We solve this
>> > by replacing dots with underscores.
>>
>> As discussed before, this solution has issues because you might run
>> into name collisions, and it is hard to match a name from one side of
>> the conversion wall to the corresponding name on the other side.
>>
>> The previous discussion led to the conclusion that we want to allow all
>> of UTF-8, because OTel does, but that everything that is not a
>> valid conventional Prometheus name will require quoting.
>>
>> We kept open the option of later allowing more characters in the
>> unquoted names, after we have seen how the quoting goes.
>>
>> > From the survey it looks like most users prefer the current naming scheme
>> > as well:
>> >
>> > [image: screenshot_2024-06-05_18:04:05_908234003.png]
>> > [image: screenshot_2024-06-05_18:04:14_304430186.png]
>>
>> The people in the survey got confronted with the various quoting
>> schemas without providing any context. This can only give us some idea
>> about people's gut feeling, but not much more.
>>
>> > Shall we just drop the idea of adding UTF-8 support?
>>
>> I don't understand the jump to this conclusion. OTel still supports all
>> of UTF-8 in names. If somebody names a metric in Chinese or Cyrillic,
>> we cannot convert it to "______________". That's the whole point. We
>> need UTF-8 support _anyway_. So let's do it and see how it goes before
>> running the umpteenth reiteration of "can we just allow dots in metric
>> names".
>>
>> --
>> Björn Rabenstein
>> [PGP-ID] 0x851C3DA17D748D03
>> [email] [email protected]
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Prometheus Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/prometheus-developers/ZmCYjf9yXTLMqNbW%40mail.rabenste.in
>> .

