We've done at least one rewrite of the parser in the past. We make substantial changes to our subsystems all the time. For example, the "stringlabels" work was a recent substantial change to the internals of in-memory label storage.
The only things we want to avoid are breaking existing users and reducing the correctness of the parser.

On Fri, Jun 23, 2023 at 9:35 AM 'Antoine Pultier' via Prometheus Developers <[email protected]> wrote:

> Hi,
>
> I am parsing a large number of metrics, and I noticed that the Prometheus
> expfmt.TextParser takes a significant amount of CPU time on my machine.
>
> I also noticed that VictoriaMetrics has an entirely different parsing
> implementation that is faster on my machine. I have not conducted extensive
> benchmarking; I'm unsure if I want to. But you can find a small comparison
> at the end of the email with a small string to parse and a 5MB string full
> of metrics and labels to parse.
>
> I read both implementations, both open-source with the Apache 2.0 license,
> and I guess the main difference is the extensive use of strings.IndexByte
> in the VictoriaMetrics parser. Golang provides a fast implementation to
> look for a byte in a string, which is much faster than scanning and
> comparing byte per byte (on common CPU architectures).
> Example for arm64:
> https://github.com/golang/go/blob/e45202f2154839f713b603fd6e5f8a8ad8d527e0/src/internal/bytealg/indexbyte_arm64.s
> I discovered the existence of such optimisations while reading this article
> about ripgrep: https://blog.burntsushi.net/ripgrep/#literal-optimizations
>
> I'm not a Prometheus developer, but I would guess that completely
> replacing the parser with another one is not on the table, but doing some
> changes to the existing one could be possible.
>
> However, it seems to require significant changes to gain performance. I'm
> wondering whether the Prometheus project would welcome substantial changes
> inside the parser at this point. One change would be to load more data at
> once. Perhaps the whole data into a string in memory like VictoriaMetrics
> does, which has some implications. And also the use of strings.IndexByte
> and slices instead of constructing many strings byte by byte. These changes
> will probably make the parser less elegant, but that may or may not be
> worth it.
>
> ---
> The tiny benchmark:
> ---
> goos: darwin
> goarch: arm64
> pkg: simple-bench
> BenchmarkPrometheusTextParserMinimal-8            416382         2798 ns/op
> BenchmarkVictoriaMetricsTextParserMinimal-8      3622894        296.1 ns/op
> BenchmarkPrometheusTextParserBig-8                     4    287416010 ns/op
> BenchmarkVictoriaMetricsTextParserBig-8              142      8374695 ns/op
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/31a41b4f-cbcb-40c7-9df8-f1deddd15a32n%40googlegroups.com
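
To make the strings.IndexByte idea from the quoted message concrete, here is a minimal sketch in Go. It is not code from Prometheus or VictoriaMetrics: splitSample is a hypothetical helper, and it assumes a simplified text format in which label values contain no '}' and no escaped quotes. It only shows the general shape of the technique: jump to each delimiter with strings.IndexByte and return sub-slices of the input rather than building strings byte by byte.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSample is a hypothetical helper for illustration only.
// It splits one sample line such as
//
//	http_requests_total{method="GET"} 1027
//
// into the metric name, the raw label block, and the raw value.
// Assumption: label values contain no '}' and no escaped quotes.
// The returned strings are sub-slices of line, so nothing is copied.
func splitSample(line string) (name, labels, value string, ok bool) {
	line = strings.TrimSpace(line)
	if line == "" || line[0] == '#' {
		return "", "", "", false // blank line or comment (HELP/TYPE)
	}

	var rest string
	if open := strings.IndexByte(line, '{'); open >= 0 {
		// Jump straight to the closing brace instead of scanning per byte.
		end := strings.IndexByte(line[open:], '}')
		if end < 0 {
			return "", "", "", false
		}
		end += open
		name = strings.TrimRight(line[:open], " \t")
		labels = line[open+1 : end]
		rest = strings.TrimLeft(line[end+1:], " \t")
	} else {
		sp := strings.IndexByte(line, ' ')
		if sp < 0 {
			return "", "", "", false
		}
		name = line[:sp]
		rest = strings.TrimLeft(line[sp+1:], " \t")
	}
	if name == "" || rest == "" {
		return "", "", "", false
	}

	// Keep only the value; an optional trailing timestamp is dropped.
	if sp := strings.IndexByte(rest, ' '); sp >= 0 {
		rest = rest[:sp]
	}
	return name, labels, rest, true
}

func main() {
	name, labels, value, ok := splitSample(`http_requests_total{method="GET",code="200"} 1027`)
	fmt.Println(name, labels, value, ok)
}
```

A real parser also has to handle escaping inside label values, exemplars, and malformed input, which is where the correctness concerns mentioned in the reply come in. Any speed-up from this style would need to be confirmed with benchmarks on representative data.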

