gf2121 commented on PR #12800:
URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820687175

   Thanks for the feedback, @mikemccand!
   
   > Hmm it looks like random got a bit slower in candidate? Flush time ~550 
ish ms in baseline and maybe ~650 ish ms in candidate?
   
   Ohhh! I rechecked, and it turns out I pasted the benchmark result in the wrong place, sorry!
   
   > And it's not quite random right -- it's sort of a sawtooth between 0 and 
14? Or am I reading the results backwards?
   
   Thanks for pointing this out; it was indeed not random enough. I changed RANDOM to use `java.util.Random` and renamed the original RANDOM to ROUND. With the new random distribution the improvement is a bit larger, 20+%.
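
To make the difference between the two orders concrete, here is a minimal sketch that prints the first few sort values each one produces. The class name `SortValueOrders` and the helper `roundValue` are made up for illustration, and the sketch uses `nextInt(15)` rather than the benchmark's `nextLong(15)` so it also compiles on older JDKs, which means the exact random values will not match the benchmark's.

```java
import java.util.Random;

public class SortValueOrders {
  // ROUND: the former RANDOM order, a deterministic sawtooth 0, 1, ..., 14, 0, 1, ...
  static long roundValue(int i) {
    return i % 15;
  }

  public static void main(String[] args) {
    // Same seed as the benchmark; nextInt(15) is an assumption of this sketch
    // (the benchmark itself calls random.nextLong(15)).
    Random random = new Random(4317849138248L);
    StringBuilder round = new StringBuilder("ROUND: ");
    StringBuilder rand = new StringBuilder("RANDOM:");
    for (int i = 0; i < 20; i++) {
      round.append(' ').append(roundValue(i));
      rand.append(' ').append(random.nextInt(15));
    }
    System.out.println(round); // values repeat with period 15
    System.out.println(rand);  // values in [0, 15) with no fixed period
  }
}
```

Both orders draw from the same 15 distinct values; only the arrival pattern differs, which is what the sawtooth remark was about.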
   
   <details><summary>Benchmark Detail</summary>
   
   **Baseline**
   ```
   Using order: RANDOM
   DWPT 0 [2023-11-21T10:49:03.532762Z; main]: flush time 1138.599958 ms
   DWPT 0 [2023-11-21T10:49:05.455861Z; main]: flush time 1059.658875 ms
   DWPT 0 [2023-11-21T10:49:07.224901Z; main]: flush time 991.167625 ms
   DWPT 0 [2023-11-21T10:49:08.847679Z; main]: flush time 850.281875 ms
   DWPT 0 [2023-11-21T10:49:10.468672Z; main]: flush time 848.403625 ms
   DWPT 0 [2023-11-21T10:49:12.104744Z; main]: flush time 861.536542 ms
   DWPT 0 [2023-11-21T10:49:13.731466Z; main]: flush time 851.316958 ms
   DWPT 0 [2023-11-21T10:49:15.350887Z; main]: flush time 847.584917 ms
   DWPT 0 [2023-11-21T10:49:16.963837Z; main]: flush time 843.849042 ms
   DWPT 0 [2023-11-21T10:49:18.579249Z; main]: flush time 843.092666 ms
   Using order: ROUND
   DWPT 1 [2023-11-21T10:49:20.051903Z; main]: flush time 648.484542 ms
   DWPT 1 [2023-11-21T10:49:21.472366Z; main]: flush time 643.642417 ms
   DWPT 1 [2023-11-21T10:49:22.889719Z; main]: flush time 644.994541 ms
   DWPT 1 [2023-11-21T10:49:24.307484Z; main]: flush time 642.117958 ms
   DWPT 1 [2023-11-21T10:49:25.727815Z; main]: flush time 642.38 ms
   DWPT 1 [2023-11-21T10:49:27.143574Z; main]: flush time 639.769875 ms
   DWPT 1 [2023-11-21T10:49:28.562387Z; main]: flush time 644.234375 ms
   DWPT 1 [2023-11-21T10:49:29.975009Z; main]: flush time 639.01125 ms
   DWPT 1 [2023-11-21T10:49:31.396969Z; main]: flush time 643.216 ms
   DWPT 1 [2023-11-21T10:49:32.810467Z; main]: flush time 639.049041 ms
   Using order: ASC
   DWPT 2 [2023-11-21T10:49:34.100537Z; main]: flush time 473.33425 ms
   DWPT 2 [2023-11-21T10:49:35.236826Z; main]: flush time 352.816167 ms
   DWPT 2 [2023-11-21T10:49:36.312917Z; main]: flush time 293.915917 ms
   DWPT 2 [2023-11-21T10:49:37.386792Z; main]: flush time 290.221458 ms
   DWPT 2 [2023-11-21T10:49:38.463960Z; main]: flush time 287.046708 ms
   DWPT 2 [2023-11-21T10:49:39.537561Z; main]: flush time 287.051709 ms
   DWPT 2 [2023-11-21T10:49:40.610809Z; main]: flush time 287.296375 ms
   DWPT 2 [2023-11-21T10:49:41.686863Z; main]: flush time 290.536083 ms
   DWPT 2 [2023-11-21T10:49:42.751377Z; main]: flush time 289.183375 ms
   DWPT 2 [2023-11-21T10:49:43.824249Z; main]: flush time 289.238584 ms
   Using order: DESC
   DWPT 3 [2023-11-21T10:49:45.039267Z; main]: flush time 394.276959 ms
   DWPT 3 [2023-11-21T10:49:46.203835Z; main]: flush time 365.40575 ms
   DWPT 3 [2023-11-21T10:49:47.359253Z; main]: flush time 364.55 ms
   DWPT 3 [2023-11-21T10:49:48.548749Z; main]: flush time 385.198 ms
   DWPT 3 [2023-11-21T10:49:49.715963Z; main]: flush time 366.247083 ms
   DWPT 3 [2023-11-21T10:49:50.881628Z; main]: flush time 372.473333 ms
   DWPT 3 [2023-11-21T10:49:52.037239Z; main]: flush time 367.666041 ms
   DWPT 3 [2023-11-21T10:49:53.192338Z; main]: flush time 364.390834 ms
   DWPT 3 [2023-11-21T10:49:54.346795Z; main]: flush time 367.417208 ms
   DWPT 3 [2023-11-21T10:49:55.506692Z; main]: flush time 374.948625 ms
   ```
   
   **Candidate**
   ```
   Using order: RANDOM
   DWPT 0 [2023-11-21T10:31:14.638348Z; main]: flush time 926.650958 ms
   DWPT 0 [2023-11-21T10:31:16.527778Z; main]: flush time 983.61375 ms
   DWPT 0 [2023-11-21T10:31:18.105650Z; main]: flush time 745.283416 ms
   DWPT 0 [2023-11-21T10:31:19.545346Z; main]: flush time 614.212208 ms
   DWPT 0 [2023-11-21T10:31:20.986866Z; main]: flush time 621.046833 ms
   DWPT 0 [2023-11-21T10:31:22.418842Z; main]: flush time 613.169292 ms
   DWPT 0 [2023-11-21T10:31:23.843488Z; main]: flush time 608.060375 ms
   DWPT 0 [2023-11-21T10:31:25.289972Z; main]: flush time 633.770083 ms
   DWPT 0 [2023-11-21T10:31:26.729025Z; main]: flush time 617.815 ms
   DWPT 0 [2023-11-21T10:31:28.152042Z; main]: flush time 606.253292 ms
   Using order: ROUND
   DWPT 1 [2023-11-21T10:31:29.546556Z; main]: flush time 540.889709 ms
   DWPT 1 [2023-11-21T10:31:30.891868Z; main]: flush time 534.34825 ms
   DWPT 1 [2023-11-21T10:31:32.235487Z; main]: flush time 529.94025 ms
   DWPT 1 [2023-11-21T10:31:33.585848Z; main]: flush time 538.600959 ms
   DWPT 1 [2023-11-21T10:31:34.926304Z; main]: flush time 535.212458 ms
   DWPT 1 [2023-11-21T10:31:36.261841Z; main]: flush time 529.868792 ms
   DWPT 1 [2023-11-21T10:31:37.612535Z; main]: flush time 532.926375 ms
   DWPT 1 [2023-11-21T10:31:38.950114Z; main]: flush time 531.968 ms
   DWPT 1 [2023-11-21T10:31:40.283548Z; main]: flush time 529.449208 ms
   DWPT 1 [2023-11-21T10:31:41.621569Z; main]: flush time 531.614458 ms
   Using order: ASC
   DWPT 2 [2023-11-21T10:31:42.931710Z; main]: flush time 466.0205 ms
   DWPT 2 [2023-11-21T10:31:44.110242Z; main]: flush time 361.563833 ms
   DWPT 2 [2023-11-21T10:31:45.270395Z; main]: flush time 344.598167 ms
   DWPT 2 [2023-11-21T10:31:46.391066Z; main]: flush time 297.298416 ms
   DWPT 2 [2023-11-21T10:31:47.508596Z; main]: flush time 292.465833 ms
   DWPT 2 [2023-11-21T10:31:48.619912Z; main]: flush time 294.3465 ms
   DWPT 2 [2023-11-21T10:31:49.733508Z; main]: flush time 294.211834 ms
   DWPT 2 [2023-11-21T10:31:50.844318Z; main]: flush time 292.396292 ms
   DWPT 2 [2023-11-21T10:31:51.957632Z; main]: flush time 294.951792 ms
   DWPT 2 [2023-11-21T10:31:53.060245Z; main]: flush time 293.81875 ms
   Using order: DESC
   DWPT 3 [2023-11-21T10:31:54.309892Z; main]: flush time 397.02825 ms
   DWPT 3 [2023-11-21T10:31:55.507951Z; main]: flush time 375.452125 ms
   DWPT 3 [2023-11-21T10:31:56.705769Z; main]: flush time 379.94275 ms
   DWPT 3 [2023-11-21T10:31:57.916353Z; main]: flush time 374.742583 ms
   DWPT 3 [2023-11-21T10:31:59.098488Z; main]: flush time 370.185083 ms
   DWPT 3 [2023-11-21T10:32:00.286668Z; main]: flush time 373.631208 ms
   DWPT 3 [2023-11-21T10:32:01.479051Z; main]: flush time 369.689833 ms
   DWPT 3 [2023-11-21T10:32:02.665413Z; main]: flush time 370.781 ms
   DWPT 3 [2023-11-21T10:32:03.841312Z; main]: flush time 372.006916 ms
   DWPT 3 [2023-11-21T10:32:05.019313Z; main]: flush time 374.449833 ms
   ```
   
   </details>
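
As a rough cross-check of the 20+% figure, the sketch below averages the last five flush times of each run (copied from the logs above and rounded to 0.1 ms) and computes the relative improvement. Using only the last five flushes to skip warm-up is my own assumption, not something stated in this thread, and the class and method names are hypothetical.

```java
public class FlushSpeedup {
  // Arithmetic mean of a set of flush times in milliseconds.
  static double mean(double[] ms) {
    double sum = 0;
    for (double v : ms) {
      sum += v;
    }
    return sum / ms.length;
  }

  // Relative reduction in flush time, as a percentage of the baseline.
  static double improvementPercent(double baselineMs, double candidateMs) {
    return 100.0 * (baselineMs - candidateMs) / baselineMs;
  }

  public static void main(String[] args) {
    // Last five "flush time" values per run, rounded to 0.1 ms.
    double[] baselineRandom = {861.5, 851.3, 847.6, 843.8, 843.1};
    double[] candidateRandom = {613.2, 608.1, 633.8, 617.8, 606.3};
    double[] baselineRound = {639.8, 644.2, 639.0, 643.2, 639.0};
    double[] candidateRound = {529.9, 532.9, 532.0, 529.4, 531.6};

    System.out.printf("RANDOM improvement: %.1f%%%n",
        improvementPercent(mean(baselineRandom), mean(candidateRandom)));
    System.out.printf("ROUND improvement: %.1f%%%n",
        improvementPercent(mean(baselineRound), mean(candidateRound)));
  }
}
```

With these inputs the improvement works out to roughly 27% for RANDOM and roughly 17% for ROUND, while ASC and DESC are close to unchanged in the logs.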
   
   <details><summary>Code</summary>
   
   ```
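   import java.io.IOException;
   import java.nio.file.Paths;
   import java.util.Random;
   import org.apache.lucene.analysis.standard.StandardAnalyzer;
   import org.apache.lucene.document.Document;
   import org.apache.lucene.document.Field;
   import org.apache.lucene.document.LongField;
   import org.apache.lucene.document.TextField;
   import org.apache.lucene.index.IndexWriter;
   import org.apache.lucene.index.IndexWriterConfig;
   import org.apache.lucene.search.Sort;
   import org.apache.lucene.search.SortedNumericSelector;
   import org.apache.lucene.store.Directory;
   import org.apache.lucene.store.FSDirectory;
   import org.apache.lucene.util.PrintStreamInfoStream;

   // The class name is a placeholder; the original snippet showed only the members.
   public class IndexSortFlushBenchmark {
     enum Order {
       RANDOM, // uniformly random sort values in [0, 15)
       ROUND, // deterministic sawtooth 0..14 (the former RANDOM)
       ASC,
       DESC
     }

     public static void main(String[] args) throws IOException {
       Random random = new Random(4317849138248L);
       for (Order order : Order.values()) {
         System.out.println("Using order: " + order.name());
         Directory dir = FSDirectory.open(Paths.get("/tmp/a"));
         IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
         cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
         // Print IndexWriter diagnostics, including the DWPT flush times, to stdout.
         cfg.setInfoStream(new PrintStreamInfoStream(System.out));
         // Flush by document count only: every 1,000,000 docs, so 10 flushes per order.
         cfg.setMaxBufferedDocs(1_000_000);
         cfg.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
         cfg.setIndexSort(
             new Sort(
                 LongField.newSortField("sort_field", false, SortedNumericSelector.Type.MIN)));
         IndexWriter w = new IndexWriter(dir, cfg);
         // Reuse one Document and its Field instances across all iterations.
         Document doc = new Document();
         LongField sortField = new LongField("sort_field", 0);
         doc.add(sortField);
         TextField stringField1 = new TextField("string_field", "", Field.Store.NO);
         doc.add(stringField1);
         TextField stringField2 = new TextField("string_field", "", Field.Store.NO);
         doc.add(stringField2);
         TextField stringField3 = new TextField("string_field", "", Field.Store.NO);
         doc.add(stringField3);
         for (int i = 0; i < 10_000_000; ++i) {
           long sortValue =
               switch (order) {
                 case RANDOM -> random.nextLong(15);
                 case ROUND -> i % 15;
                 case ASC -> i;
                 case DESC -> -i;
               };
           sortField.setLongValue(sortValue);
           stringField1.setStringValue(Integer.toBinaryString(i % 10));
           stringField2.setStringValue(Integer.toBinaryString(i % 100));
           stringField3.setStringValue(Integer.toBinaryString(i % 1000));
           w.addDocument(doc);
         }
         w.flush();
         w.commit();
         w.close();
       }
     }
   }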
   ```
   
   </details>
   
   
   

