We've done at least one rewrite of the parser in the past, and we make
substantial changes to our subsystems all the time. For example, the
"stringlabels" work was a recent substantial change to the internals of
in-memory label storage.

The only things we want to avoid are breaking existing users and reducing
the correctness of the parser.

On Fri, Jun 23, 2023 at 9:35 AM 'Antoine Pultier' via Prometheus Developers
<[email protected]> wrote:

> Hi,
>
> I am parsing a large number of metrics, and I noticed that the Prometheus
> expfmt.TextParser takes a significant amount of CPU time on my machine.
>
> I also noticed that VictoriaMetrics has an entirely different parsing
> implementation that is faster on my machine. I have not conducted extensive
> benchmarking; I'm unsure if I want to. But you can find a small comparison
> at the end of the email with a small string to parse and a 5MB string full
> of metrics and labels to parse.
>
> I read both implementations, both open-source with the Apache 2.0 license,
> and I guess the main difference is the extensive use of strings.IndexByte
> in the VictoriaMetrics parser. Golang provides a fast implementation to
> look for a byte in a string, which is much faster than scanning and
> comparing byte per byte (on common CPU architectures).
> Example for arm64:
> https://github.com/golang/go/blob/e45202f2154839f713b603fd6e5f8a8ad8d527e0/src/internal/bytealg/indexbyte_arm64.s
> I discovered the existence of such optimisations while reading this article
> about ripgrep: https://blog.burntsushi.net/ripgrep/#literal-optimizations
>
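
To make the comparison above concrete, here is a minimal sketch of the two
ways of finding a byte. indexByteLoop is a hypothetical helper written only
for illustration; it is not taken from either parser. strings.IndexByte is
the standard library call referred to above.

```go
package main

import (
	"fmt"
	"strings"
)

// indexByteLoop scans s one byte at a time and returns the index of the
// first occurrence of c, or -1 if c is not present.
// Hypothetical helper for illustration only.
func indexByteLoop(s string, c byte) int {
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			return i
		}
	}
	return -1
}

func main() {
	line := `http_requests_total{method="GET",code="200"} 1027`

	// Both calls return the same index. strings.IndexByte is backed by
	// architecture-specific assembly on common platforms, which is where
	// the speed difference on long inputs would come from.
	fmt.Println(indexByteLoop(line, '{'), strings.IndexByte(line, '{'))
}
```
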
> I'm not a Prometheus developer, but I would guess that completely
> replacing the parser with another one is not on the table, while making
> some changes to the existing one could be possible.
>
> However, it seems to require significant changes to gain performance. I'm
> wondering whether the Prometheus project would welcome substantial changes
> inside the parser at this point. One change would be to load more data at
> once, perhaps the whole payload into a string in memory like VictoriaMetrics
> does, which has some implications. Another would be to use strings.IndexByte
> and slices instead of constructing many strings byte by byte. These changes
> will probably make the parser less elegant, but that may or may not be
> worth it.
>
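
A rough sketch of the "whole payload in memory plus slicing" idea described
above, under loud assumptions: splitSamples and the sample type are
hypothetical names, this is not how either parser is actually written, and
it ignores timestamps, escaping, and validation entirely. It only shows
that lines and fields can be cut out as substrings of the input instead of
being built byte by byte.

```go
package main

import (
	"fmt"
	"strings"
)

// sample holds substrings of the input; no bytes are copied.
type sample struct {
	name  string // metric name plus optional {labels}, left unparsed
	value string // raw value text, not yet converted to a float
}

// splitSamples is a hypothetical helper. It walks a whole payload held in
// memory and slices out one sample per line.
func splitSamples(data string) []sample {
	var out []sample
	for len(data) > 0 {
		// Cut off the next line with a single IndexByte call.
		line := data
		if n := strings.IndexByte(data, '\n'); n >= 0 {
			line, data = data[:n], data[n+1:]
		} else {
			data = ""
		}
		// Skip blank lines and comments such as # HELP and # TYPE.
		if line == "" || line[0] == '#' {
			continue
		}
		// Simplification: assume no trailing timestamp, so the value is
		// whatever follows the last space. Searching from the end avoids
		// being confused by spaces inside label values.
		sp := strings.LastIndexByte(line, ' ')
		if sp < 0 {
			continue
		}
		out = append(out, sample{name: line[:sp], value: line[sp+1:]})
	}
	return out
}

func main() {
	payload := "# TYPE up gauge\nup 1\nhttp_requests_total{code=\"200\"} 1027\n"
	for _, s := range splitSamples(payload) {
		fmt.Printf("%s => %s\n", s.name, s.value)
	}
}
```

One of the implications mentioned above is visible here: every returned
substring points into the original payload, so the whole payload stays in
memory for as long as any sample is retained.
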
> ---
> The tiny benchmark:
> ---
> goos: darwin
> goarch: arm64
> pkg: simple-bench
> BenchmarkPrometheusTextParserMinimal-8         416382        2798 ns/op
> BenchmarkVictoriaMetricsTextParserMinimal-8   3622894       296.1 ns/op
> BenchmarkPrometheusTextParserBig-8                  4   287416010 ns/op
> BenchmarkVictoriaMetricsTextParserBig-8           142     8374695 ns/op
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/31a41b4f-cbcb-40c7-9df8-f1deddd15a32n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-developers/31a41b4f-cbcb-40c7-9df8-f1deddd15a32n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

