richardstartin commented on PR #8818: URL: https://github.com/apache/pinot/pull/8818#issuecomment-1154392815
Just circling back here having taken a quick look at `BenchmarkFuseRegexp.decreasing9Fusing`. I got similar results to Gonzalo:

```
Benchmark                              (_fstType)  Mode  Cnt   Score   Error  Units
BenchmarkFuseRegexp.decreasing9Fusing      LUCENE  avgt    5  32.525 ± 2.733  ms/op
BenchmarkFuseRegexp.decreasing9Fusing      NATIVE  avgt    5   0.487 ± 0.123  ms/op
BenchmarkFuseRegexp.decreasing9Fusing        null  avgt    5   0.845 ± 0.204  ms/op
```

Lucene spends virtually all of its time in _minimization_ (which I mistakenly referred to as _determinization_ earlier):

<img width="936" alt="Screenshot 2022-06-13 at 21 10 51" src="https://user-images.githubusercontent.com/16439049/173436813-a13a5830-ad7b-4a8c-8e5f-3cf8003640df.png">

Now if I enable minimization in the native implementation by reverting #8237, the scores change drastically:

```
Benchmark                              (_fstType)  Mode  Cnt    Score    Error  Units
BenchmarkFuseRegexp.decreasing9Fusing      LUCENE  avgt    5   32.399 ±  3.005  ms/op
BenchmarkFuseRegexp.decreasing9Fusing      NATIVE  avgt    5  366.520 ± 40.574  ms/op
BenchmarkFuseRegexp.decreasing9Fusing        null  avgt    5    0.819 ±  0.172  ms/op
```

As do the profiles. Without minimization, there is no clear single bottleneck:

<img width="879" alt="Screenshot 2022-06-13 at 21 14 47" src="https://user-images.githubusercontent.com/16439049/173437464-980859ce-e7c5-4e8f-b43e-ba609ab39396.png">

With minimization, there is a single bottleneck, and it's in minimization:

<img width="931" alt="Screenshot 2022-06-13 at 21 23 10" src="https://user-images.githubusercontent.com/16439049/173438712-5f8efc36-1f8d-4ea5-8bb6-b135d21ff4fc.png">

Whether this is cause for concern is beside the point: minimization is the root cause of the difference observed, not a quirk of the benchmark which needs to be analysed and removed.
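For readers unfamiliar with the term, minimization merges automaton states that no input can tell apart. The sketch below is only an illustration using Moore-style partition refinement on a complete DFA. It is not the code used by Lucene or by the native implementation, and the class and method names (`MinimizeSketch`, `minimizedStateCount`) are hypothetical. It assumes every state is reachable and every state has a transition for every symbol.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not Lucene's or Pinot's minimization code.
final class MinimizeSketch {

    // transitions[s][c] is the target of state s on symbol c (complete DFA);
    // accepting[s] marks final states. All states are assumed reachable.
    // Returns the number of states of the equivalent minimal DFA.
    static int minimizedStateCount(int[][] transitions, boolean[] accepting) {
        int n = transitions.length;
        int[] block = new int[n];
        // Initial partition: accepting vs. non-accepting states.
        for (int s = 0; s < n; s++) {
            block[s] = accepting[s] ? 1 : 0;
        }
        int blocks = 0; // sentinel: block count of the previous round
        while (true) {
            // Two states stay together only if they are in the same block
            // and all of their targets are in the same blocks.
            Map<List<Integer>, Integer> ids = new HashMap<>();
            int[] next = new int[n];
            for (int s = 0; s < n; s++) {
                List<Integer> signature = new ArrayList<>();
                signature.add(block[s]);
                for (int target : transitions[s]) {
                    signature.add(block[target]);
                }
                Integer id = ids.get(signature);
                if (id == null) {
                    id = ids.size();
                    ids.put(signature, id);
                }
                next[s] = id;
            }
            // Refinement only ever splits blocks, so an unchanged count
            // means the partition is stable.
            if (ids.size() == blocks) {
                return blocks;
            }
            blocks = ids.size();
            block = next;
        }
    }

    public static void main(String[] args) {
        // "Ends in a" over {a, b}, with a redundant third state.
        int[][] redundant = {{1, 2}, {1, 2}, {1, 2}};
        boolean[] redundantAccepting = {false, true, false};
        System.out.println(minimizedStateCount(redundant, redundantAccepting));
    }
}
```

Each refinement round in this sketch visits every state and every transition, and a large automaton can need many rounds, which gives some intuition for why a minimization pass can dominate a profile when the automaton produced for a regexp is large.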