The committed code in /common includes support for three escaping formats:

* all legacy-invalid characters are converted to underscores (cannot
round-trip)
* dots are escaped to _dot_ and underscores are converted to double
underscores; all other legacy-invalid characters become single underscores
(partial round-trip)
* a new U__ encoding for Unicode code points, so names can be fully
round-tripped

https://github.com/prometheus/common/blob/1c9da3533702ae8a54609c7b6f5bdc1ff9b754c9/model/metric.go#L60-L77
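
To make the difference between the three formats concrete, here is a minimal
Go sketch. It is not the code from /common: the function names
(escapeUnderscores, escapeDots, escapeValues) are made up for illustration,
the hex formatting of code points is an assumption, and the U__ sketch always
adds the prefix, which is a simplification. The linked source is
authoritative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLegacyRune reports whether r is allowed at byte offset i of a legacy
// metric name, i.e. a name matching [a-zA-Z_:][a-zA-Z0-9_:]*.
func isLegacyRune(r rune, i int) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') ||
		r == '_' || r == ':' || (r >= '0' && r <= '9' && i > 0)
}

// escapeUnderscores (hypothetical name) turns every legacy-invalid rune
// into a single underscore. Lossy: "a.b" and "a_b" collide.
func escapeUnderscores(name string) string {
	var b strings.Builder
	for i, r := range name {
		if isLegacyRune(r, i) {
			b.WriteRune(r)
		} else {
			b.WriteRune('_')
		}
	}
	return b.String()
}

// escapeDots (hypothetical name) writes dots as "_dot_" and underscores as
// "__"; any other legacy-invalid rune still becomes "_". Dots and
// underscores survive, other characters do not.
func escapeDots(name string) string {
	var b strings.Builder
	for i, r := range name {
		switch {
		case r == '_':
			b.WriteString("__")
		case r == '.':
			b.WriteString("_dot_")
		case isLegacyRune(r, i):
			b.WriteRune(r)
		default:
			b.WriteRune('_')
		}
	}
	return b.String()
}

// escapeValues (hypothetical name) prefixes the name with "U__", doubles
// underscores, and writes each legacy-invalid rune as its code point in
// hex between underscores (hex formatting is assumed for this sketch).
func escapeValues(name string) string {
	var b strings.Builder
	b.WriteString("U__")
	for i, r := range name {
		switch {
		case r == '_':
			b.WriteString("__")
		case isLegacyRune(r, i):
			b.WriteRune(r)
		default:
			b.WriteString("_" + strconv.FormatInt(int64(r), 16) + "_")
		}
	}
	return b.String()
}

func main() {
	name := "http.server_duration"
	fmt.Println(escapeUnderscores(name)) // http_server_duration
	fmt.Println(escapeDots(name))        // http_dot_server__duration
	fmt.Println(escapeValues(name))      // U__http_2e_server__duration
}
```

In this sketch "a.b" and "a_b" both become "a_b" under the first format,
while the second keeps them apart ("a_dot_b" vs. "a__b") but still maps both
"a-b" and "a/b" to "a_b". Only the third keeps every character recoverable.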



On Mon, Jun 10, 2024 at 6:07 AM Bartłomiej Płotka <[email protected]>
wrote:

> First of all, while it can be painful, I think it's OK to double-check the
> already approved proposal
> <https://github.com/prometheus/proposals/blob/main/proposals/2023-08-21-utf8.md>
> for big blockers. I don't see any big blocker, given this discussion.
>
> Just my 2c: I wonder if it would be helpful to split the metric/label
> naming discussion into:
>
> A) What the protocol, storage and PromQL support (more flexibility for
> the future)
> B) What makes sense for everyone to use (recommended, or mandated for
> some systems)
>
> Essentially we can (and probably should) add versatile escaping
> mechanisms. Does it mean people should call their metrics
> "❤️🔥123🧠_total"? I don't think it's useful to accept those. It might be
> useful to accept (escaped) dots, though, given that OTel users want (for
> example) to use the same matchers for traces, logs, profiles and metrics
> without translation, and they cannot really change that decision.
>
> We could keep (recommend or ensure) the conversion schema for OTel metrics
> (move to _) AND move ahead with generic UTF-8 support. Both could be done
> in parallel, as long as it's officially agreed and documented (:
>
> Kind Regards,
> Bartek Płotka (@bwplotka)
>
>
> On Wed, Jun 5, 2024 at 5:55 PM Bjoern Rabenstein <[email protected]>
> wrote:
>
>> On 05.06.24 18:07, 'Fabian Stäber' via Prometheus Developers wrote:
>> >
>> > So, is the preferred solution to keep things as they are, i.e. keep
>> > replacing dots with underscores?
>>
>> I don't think the purpose of the survey was to find a "preferred
>> solution". First of all, this is a technical decision, not a
>> democratic one. And even if it were, an online survey is inherently
>> biased.
>>
>> The idea behind the survey was (I hope) to get a broad idea what
>> people find surprising or annoying, what they expect, what they like,
>> ... and then we can use those inputs in a responsible fashion to
>> inform decisions.
>>
>> > > why allow two different separator characters if they have no
>> > > semantic difference (no true namespacing).
>> >
>> > This argument seems to resonate with the Prometheus team. If this is the
>> > main concern, we don't solve it by allowing dots in quotes. We solve
>> > this by replacing dots with underscores.
>>
>> As discussed before, this solution has issues because you might run
>> into name collisions, and it is hard to match a name from one side of
>> the conversion wall to the corresponding name on the other side.
>>
>> The previous discussion led to the conclusion that we want to allow all
>> of UTF-8, because OTel does, but that everything that is not a
>> valid conventional Prometheus name will require quoting.
>>
>> We kept open the option of later allowing more characters in the
>> unquoted names, after we have seen how the quoting goes.
>>
>> > From the survey it looks like most users prefer the current naming
>> > scheme as well:
>> >
>> > [image: screenshot_2024-06-05_18:04:05_908234003.png]
>> > [image: screenshot_2024-06-05_18:04:14_304430186.png]
>>
>> The people in the survey were confronted with the various quoting
>> schemas without being given any context. This can only give us some idea
>> about people's gut feeling, but not much more.
>>
>> > Shall we just drop the idea of adding UTF-8 support?
>>
>> I don't understand the jump to this conclusion. OTel still supports all
>> of UTF-8 in names. If somebody names a metric in Chinese or Cyrillic,
>> we cannot convert it to "______________". That's the whole point. We
>> need UTF-8 support _anyway_. So let's do it and see how it goes before
>> running the umpteenth reiteration of "can we just allow dots in metric
>> names".
>>
>> --
>> Björn Rabenstein
>> [PGP-ID] 0x851C3DA17D748D03
>> [email] [email protected]
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Prometheus Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/prometheus-developers/ZmCYjf9yXTLMqNbW%40mail.rabenste.in
>> .
>>
