gsmiller commented on PR #12141:
URL: https://github.com/apache/lucene/pull/12141#issuecomment-1426788802

   Thanks @uschindler for the alternate approach. It helped me understand your 
earlier suggestion to use streams, which I wasn't totally clear on (I thought 
you were originally suggesting to do away with prefix-encoding altogether and 
reference the streams directly inside the query implementations to iterate the 
terms, which was confusing).
   
   I'm not set up to re-run our internal benchmarks at the moment (where we see 
a large amount of time spent sorting terms), but I did run my simple 
"benchmark" test case, which times query initialization (see below). The 
results for this PR were as good as those for my initial proposal to share 
prefix-encoded terms. So, from a pure performance point of view, this appears 
to be just as efficient as what I'd come up with initially.
   
   Simple test case "benchmark":
   ```
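     public void testSortPerformance() {
       // Build 50,000 random terms of 10 to 20 characters each.
       int len = 50000;
       BytesRef[] terms = new BytesRef[len];
       for (int i = 0; i < len; i++) {
         String s = TestUtil.randomSimpleString(random(), 10, 20);
         terms[i] = new BytesRef(s);
       }
   
       // Warm-up: run the query construction untimed so the JIT can compile it.
       int iters = 300;
       for (int i = 0; i < iters; i++) {
         KeywordField.newSetQuery("foo", terms);
       }
   
       // Timed runs: keep the fastest run to reduce noise from GC and scheduling.
       long minTime = Long.MAX_VALUE;
       for (int i = 0; i < iters; i++) {
         long t0 = System.nanoTime();
         KeywordField.newSetQuery("foo", terms);
         minTime = Math.min(minTime, System.nanoTime() - t0);
       }
   
       // Report the best time, converted from nanoseconds to milliseconds.
       System.err.println("Time: " + minTime / 1_000_000 + " ms");
     }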
   ```
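
   For anyone who wants to try the same measurement pattern outside the Lucene 
test framework, here is a rough standalone sketch using only the JDK. It is not 
what was run for this PR: the class name `MinTimeBench`, the helpers `minNanos` 
and `randomTerms`, and the use of `Arrays.sort` on plain strings as a stand-in 
for query construction are all assumptions for illustration, and the iteration 
counts are smaller than in the test above to keep the run short.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical standalone harness; the names here are illustrative, not Lucene APIs.
public class MinTimeBench {

  /** Runs task untimed `warmup` times, then returns the fastest of `iters` timed runs (ns). */
  public static long minNanos(Runnable task, int warmup, int iters) {
    for (int i = 0; i < warmup; i++) {
      task.run();
    }
    long min = Long.MAX_VALUE;
    for (int i = 0; i < iters; i++) {
      long t0 = System.nanoTime();
      task.run();
      min = Math.min(min, System.nanoTime() - t0);
    }
    return min;
  }

  /** Generates `count` random lowercase strings with lengths in [minLen, maxLen]. */
  public static String[] randomTerms(Random r, int count, int minLen, int maxLen) {
    String[] terms = new String[count];
    for (int i = 0; i < count; i++) {
      int len = minLen + r.nextInt(maxLen - minLen + 1);
      char[] chars = new char[len];
      for (int j = 0; j < len; j++) {
        chars[j] = (char) ('a' + r.nextInt(26));
      }
      terms[i] = new String(chars);
    }
    return terms;
  }

  public static void main(String[] args) {
    String[] terms = randomTerms(new Random(42), 50_000, 10, 20);
    // Accumulate into a sink so the sorted result is observably used.
    long[] sink = new long[1];
    long best = minNanos(() -> {
      String[] copy = terms.clone(); // sort a fresh copy each run
      Arrays.sort(copy);             // stand-in for the term sorting being measured
      sink[0] += copy[0].length();
    }, 20, 50);
    System.err.println("Time: " + best / 1_000_000 + " ms (sink=" + sink[0] + ")");
  }
}
```

   Taking the minimum rather than the mean is a deliberate choice for a quick 
check like this: it approximates the best-case cost and is less sensitive to GC 
pauses and scheduling noise, though a proper harness such as JMH would be more 
trustworthy for anything beyond a rough comparison.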


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


