mikemccand commented on PR #12194:
URL: https://github.com/apache/lucene/pull/12194#issuecomment-1498797871

   > PKLookup seems varies a lot as well when there are no changes.
   
   I wonder whether `luceneutil` has a bug where the `PKLookup` task does not 
use the specified random seed to derive which IDs it looks up.  Indeed, I have 
seen it be noisy in the past (not just for you)...
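
   As a rough illustration of what "using the seed" would mean here, below is 
a minimal sketch.  The class and method names (`SeededPkIds`, `pickIds`) are 
hypothetical and are not the actual `luceneutil` code; the sketch only assumes 
that IDs are drawn from a `java.util.Random` constructed from the configured 
seed, so that two runs with the same seed look up the same IDs.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededPkIds {
  // Hypothetical helper, NOT the real luceneutil implementation:
  // every ID is derived from a Random built from the given seed.
  static int[] pickIds(long seed, int maxDoc, int count) {
    Random random = new Random(seed);
    int[] ids = new int[count];
    for (int i = 0; i < count; i++) {
      ids[i] = random.nextInt(maxDoc); // uniform in [0, maxDoc)
    }
    return ids;
  }

  public static void main(String[] args) {
    int[] first = pickIds(17L, 1_000_000, 5);
    int[] second = pickIds(17L, 1_000_000, 5);
    // Same seed, same IDs: prints "true"
    System.out.println(Arrays.equals(first, second));
  }
}
```

   If the task instead drew IDs from an unseeded source (for example 
`new Random()`), each run would look up a different set of IDs, which could 
show up as run-to-run noise.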
   
   > In addition, I noticed adding `-Xbatch` JVM argument will actually make 
the -50% slow down go away (and also boost PKLookup's QPS):
   
   Thanks for testing this.  We've debated the merits of disabling background 
compilation (`-Xbatch`) in the past, but decided it's too risky since nobody 
actually runs this way in production so the results would not necessarily 
reflect production impact.  It is indeed an interesting data point and does 
seem to point to "hotspot compilation noise" as the source of the wide 
differences.
   
   Though, I would also expect the gains to be quite different as you vary 
the particular query (`apr`, `jan`, `may` on the negated clause)?  They should 
depend heavily on whether the postings fall into long runs in the index.  Then 
again, the line file docs for `luceneutil` are randomly sorted, so there should 
be no correlation between time and Lucene's `docid`.
   
   > In terms of code, PKLookup will execute this [section of modified 
code](https://github.com/apache/lucene/pull/12194/files#diff-900619bac18cb1e2e177533efe157e9b4707d0c855180f535051f0d955828306R530-R543)
 when its doing [doc 
enumeration](https://github.com/mikemccand/luceneutil/blob/2c8ccdf53e93622761a545c1a54377514c338caa/src/main/perf/PKLookupTask.java#L111),
 but reverting changes there didn't solve the issue.
   
   OK thanks for testing.
   
   I think net/net we can conclude that this is all noise and should not block 
this great change!  The speedups for some cases are astounding!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

