mikemccand commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1498797871
> PKLookup seems varies a lot as well when there are no changes.

I wonder if `luceneutil` maybe has a bug where the `PKLookup` task is not using the specified random seed to derive which IDs it looks up? Indeed I have seen it be noisy in the past (not just for you)...

> In addition, I noticed adding `-Xbatch` JVM argument will actually make the -50% slow down go away (and also boost PKLookup's QPS):

Thanks for testing this. We've debated the merits of disabling background compilation (`-Xbatch`) in the past, but decided it's too risky: nobody actually runs this way in production, so the results would not necessarily reflect production impact. It is indeed an interesting data point and does seem to point to "hotspot compilation noise" as the source of the wide differences.

Though, I would also expect that as you vary the particular query (`apr`, `jan`, `may` on the negated clause) the gains should be quite different? It depends heavily on whether the postings fall into long runs in the index? Though, the line file docs for `luceneutil` are randomly sorted, so there should not be a correlation by time with Lucene's `docid`.

> In terms of code, PKLookup will execute this [section of modified code](https://github.com/apache/lucene/pull/12194/files#diff-900619bac18cb1e2e177533efe157e9b4707d0c855180f535051f0d955828306R530-R543) when its doing [doc enumeration](https://github.com/mikemccand/luceneutil/blob/2c8ccdf53e93622761a545c1a54377514c338caa/src/main/perf/PKLookupTask.java#L111), but reverting changes there didn't solve the issue.

OK thanks for testing. I think net/net we can conclude that this is all noise and should not block this great change! The speedups for some cases are astounding!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org