gf2121 commented on PR #12800:
URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820564110

   I also ran the index script to measure flush time with this new approach: ~15% faster for random data and no regression on asc/desc :)
   
   <details><summary>Benchmark Detail</summary>
   
   **Baseline**
   ```
   Using order: RANDOM
   DWPT 0 [2023-11-21T09:26:53.204254Z; main]: flush time 825.381875 ms
   DWPT 0 [2023-11-21T09:26:54.820677Z; main]: flush time 759.469459 ms
   DWPT 0 [2023-11-21T09:26:56.294950Z; main]: flush time 685.044875 ms
   DWPT 0 [2023-11-21T09:26:57.643239Z; main]: flush time 562.0595 ms
   DWPT 0 [2023-11-21T09:26:58.977843Z; main]: flush time 555.718625 ms
   DWPT 0 [2023-11-21T09:27:00.310891Z; main]: flush time 554.023042 ms
   DWPT 0 [2023-11-21T09:27:01.645311Z; main]: flush time 556.796709 ms
   DWPT 0 [2023-11-21T09:27:02.984576Z; main]: flush time 552.745084 ms
   DWPT 0 [2023-11-21T09:27:04.327895Z; main]: flush time 550.138166 ms
   DWPT 0 [2023-11-21T09:27:05.667026Z; main]: flush time 552.581875 ms
   Using order: ASC
   DWPT 1 [2023-11-21T09:27:06.963483Z; main]: flush time 444.065625 ms
   DWPT 1 [2023-11-21T09:27:08.116839Z; main]: flush time 346.094542 ms
   DWPT 1 [2023-11-21T09:27:09.223496Z; main]: flush time 297.886417 ms
   DWPT 1 [2023-11-21T09:27:10.330354Z; main]: flush time 294.814708 ms
   DWPT 1 [2023-11-21T09:27:11.437615Z; main]: flush time 296.89775 ms
   DWPT 1 [2023-11-21T09:27:12.543693Z; main]: flush time 294.519125 ms
   DWPT 1 [2023-11-21T09:27:13.650317Z; main]: flush time 294.359458 ms
   DWPT 1 [2023-11-21T09:27:14.755599Z; main]: flush time 296.021333 ms
   DWPT 1 [2023-11-21T09:27:15.856800Z; main]: flush time 295.717875 ms
   DWPT 1 [2023-11-21T09:27:16.961871Z; main]: flush time 297.089042 ms
   Using order: DESC
   DWPT 2 [2023-11-21T09:27:18.189055Z; main]: flush time 388.857166 ms
   DWPT 2 [2023-11-21T09:27:19.378069Z; main]: flush time 380.438708 ms
   DWPT 2 [2023-11-21T09:27:20.566168Z; main]: flush time 379.276208 ms
   DWPT 2 [2023-11-21T09:27:21.765879Z; main]: flush time 380.318458 ms
   DWPT 2 [2023-11-21T09:27:22.955527Z; main]: flush time 380.022583 ms
   DWPT 2 [2023-11-21T09:27:24.149801Z; main]: flush time 381.808792 ms
   DWPT 2 [2023-11-21T09:27:25.335294Z; main]: flush time 379.536084 ms
   DWPT 2 [2023-11-21T09:27:26.522849Z; main]: flush time 379.696875 ms
   DWPT 2 [2023-11-21T09:27:27.708379Z; main]: flush time 378.25275 ms
   DWPT 2 [2023-11-21T09:27:28.888002Z; main]: flush time 376.322417 ms
   ```
   
   **Candidate**
   ```
   Using order: RANDOM
   DWPT 0 [2023-11-21T09:28:48.701620Z; main]: flush time 907.094125 ms
   DWPT 0 [2023-11-21T09:28:50.423239Z; main]: flush time 871.719292 ms
   DWPT 0 [2023-11-21T09:28:51.959144Z; main]: flush time 792.907334 ms
   DWPT 0 [2023-11-21T09:28:53.357324Z; main]: flush time 656.680334 ms
   DWPT 0 [2023-11-21T09:28:54.744187Z; main]: flush time 646.845625 ms
   DWPT 0 [2023-11-21T09:28:56.267555Z; main]: flush time 657.262625 ms
   DWPT 0 [2023-11-21T09:28:57.654678Z; main]: flush time 646.135625 ms
   DWPT 0 [2023-11-21T09:28:59.056354Z; main]: flush time 647.583209 ms
   DWPT 0 [2023-11-21T09:29:00.456401Z; main]: flush time 648.0325 ms
   DWPT 0 [2023-11-21T09:29:01.861058Z; main]: flush time 651.569625 ms
   Using order: ASC
   DWPT 1 [2023-11-21T09:29:03.110932Z; main]: flush time 439.131041 ms
   DWPT 1 [2023-11-21T09:29:04.217419Z; main]: flush time 341.144292 ms
   DWPT 1 [2023-11-21T09:29:05.280921Z; main]: flush time 300.120542 ms
   DWPT 1 [2023-11-21T09:29:06.333502Z; main]: flush time 293.290125 ms
   DWPT 1 [2023-11-21T09:29:07.386953Z; main]: flush time 293.251333 ms
   DWPT 1 [2023-11-21T09:29:08.445562Z; main]: flush time 294.627333 ms
   DWPT 1 [2023-11-21T09:29:09.500210Z; main]: flush time 294.796584 ms
   DWPT 1 [2023-11-21T09:29:10.551829Z; main]: flush time 292.79775 ms
   DWPT 1 [2023-11-21T09:29:11.610377Z; main]: flush time 297.584417 ms
   DWPT 1 [2023-11-21T09:29:12.673562Z; main]: flush time 294.24025 ms
   Using order: DESC
   DWPT 2 [2023-11-21T09:29:13.863704Z; main]: flush time 386.06825 ms
   DWPT 2 [2023-11-21T09:29:15.012659Z; main]: flush time 382.778625 ms
   DWPT 2 [2023-11-21T09:29:16.159515Z; main]: flush time 381.255458 ms
   DWPT 2 [2023-11-21T09:29:17.307500Z; main]: flush time 383.2325 ms
   DWPT 2 [2023-11-21T09:29:18.456641Z; main]: flush time 384.675875 ms
   DWPT 2 [2023-11-21T09:29:19.600958Z; main]: flush time 381.510875 ms
   DWPT 2 [2023-11-21T09:29:20.748250Z; main]: flush time 382.213333 ms
   DWPT 2 [2023-11-21T09:29:21.902089Z; main]: flush time 378.915958 ms
   DWPT 2 [2023-11-21T09:29:23.045118Z; main]: flush time 379.851583 ms
   DWPT 2 [2023-11-21T09:29:24.187161Z; main]: flush time 378.107708 ms
   ```
   
   </details>
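
To compare runs like the ones above without eyeballing the log, a small summarizer can average the flush times per DWPT and report the relative change. This is only a sketch: the class name `FlushTimeSummary`, the helper names, the two log-file arguments, and the choice to skip the first three flushes as warm-up are all assumptions for illustration, not part of the benchmark script. It relies only on the `DWPT <n> [...]: flush time <x> ms` line shape visible in the logs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or of the benchmark script.
class FlushTimeSummary {
  // Matches e.g. "DWPT 0 [2023-11-21T09:26:53.204254Z; main]: flush time 825.381875 ms"
  private static final Pattern FLUSH =
      Pattern.compile("^\\s*DWPT (\\d+) \\[.*\\]: flush time ([0-9.]+) ms\\s*$");

  /** Groups flush times (ms) by DWPT id, keeping ids in first-seen order. */
  static Map<Integer, List<Double>> parse(List<String> lines) {
    Map<Integer, List<Double>> byDwpt = new LinkedHashMap<>();
    for (String line : lines) {
      Matcher m = FLUSH.matcher(line);
      if (m.matches()) {
        byDwpt
            .computeIfAbsent(Integer.parseInt(m.group(1)), k -> new ArrayList<>())
            .add(Double.parseDouble(m.group(2)));
      }
    }
    return byDwpt;
  }

  /** Mean of the values after skipping the first {@code warmup} entries; NaN if none remain. */
  static double steadyStateMean(List<Double> times, int warmup) {
    return times.stream()
        .skip(warmup)
        .mapToDouble(Double::doubleValue)
        .average()
        .orElse(Double.NaN);
  }

  /** Change of candidate relative to baseline, in percent; negative means candidate is faster. */
  static double percentChange(double baselineMs, double candidateMs) {
    return (candidateMs - baselineMs) / baselineMs * 100.0;
  }

  public static void main(String[] args) throws IOException {
    int warmup = 3; // assumption: treat the first three flushes per DWPT as warm-up
    Map<Integer, List<Double>> baseline = parse(Files.readAllLines(Path.of(args[0])));
    Map<Integer, List<Double>> candidate = parse(Files.readAllLines(Path.of(args[1])));
    for (Map.Entry<Integer, List<Double>> e : baseline.entrySet()) {
      List<Double> cand = candidate.get(e.getKey());
      if (cand == null) {
        continue; // DWPT id missing from the candidate log
      }
      double b = steadyStateMean(e.getValue(), warmup);
      double c = steadyStateMean(cand, warmup);
      System.out.printf(
          "DWPT %d: baseline %.1f ms, candidate %.1f ms, change %+.1f%%%n",
          e.getKey(), b, c, percentChange(b, c));
    }
  }
}
```

With each run's output saved to its own file (file names here are placeholders), `java FlushTimeSummary.java baseline.log candidate.log` prints one line per DWPT; in the logs above, DWPT 0, 1 and 2 correspond to RANDOM, ASC and DESC.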
   
   <details><summary>Code</summary>
   
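   ```java
   import java.io.IOException;
   import java.nio.file.Paths;
   import org.apache.lucene.analysis.standard.StandardAnalyzer;
   import org.apache.lucene.document.Document;
   import org.apache.lucene.document.Field;
   import org.apache.lucene.document.LongField;
   import org.apache.lucene.document.TextField;
   import org.apache.lucene.index.IndexWriter;
   import org.apache.lucene.index.IndexWriterConfig;
   import org.apache.lucene.search.Sort;
   import org.apache.lucene.search.SortedNumericSelector;
   import org.apache.lucene.store.Directory;
   import org.apache.lucene.store.FSDirectory;
   import org.apache.lucene.util.PrintStreamInfoStream;

   // Wrapper class; the name is arbitrary.
   public class FlushBenchmark {
     enum Order {
       RANDOM,
       ASC,
       DESC;
     }

     public static void main(String[] args) throws IOException {
       for (Order order : Order.values()) {
         System.out.println("Using order: " + order.name());
         Directory dir = FSDirectory.open(Paths.get("/tmp/a"));
         IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
         cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
         // The info stream is where the "flush time" lines are printed.
         cfg.setInfoStream(new PrintStreamInfoStream(System.out));
         // Flush every 1M docs, never on RAM usage.
         cfg.setMaxBufferedDocs(1_000_000);
         cfg.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
         cfg.setIndexSort(
             new Sort(
                 LongField.newSortField("sort_field", false, SortedNumericSelector.Type.MIN)));
         IndexWriter w = new IndexWriter(dir, cfg);
         // One Document and its Field instances are reused for every add.
         Document doc = new Document();
         LongField sortField = new LongField("sort_field", 0);
         doc.add(sortField);
         TextField stringField1 = new TextField("string_field", "", Field.Store.NO);
         doc.add(stringField1);
         TextField stringField2 = new TextField("string_field", "", Field.Store.NO);
         doc.add(stringField2);
         TextField stringField3 = new TextField("string_field", "", Field.Store.NO);
         doc.add(stringField3);
         for (int i = 0; i < 10_000_000; ++i) {
           long sortValue =
               switch (order) {
                 case RANDOM -> i % 15;
                 case ASC -> i;
                 case DESC -> -i;
               };
           sortField.setLongValue(sortValue);
           stringField1.setStringValue(Integer.toBinaryString(i % 10));
           stringField2.setStringValue(Integer.toBinaryString(i % 100));
           stringField3.setStringValue(Integer.toBinaryString(i % 1000));
           w.addDocument(doc);
         }
         w.flush();
         w.commit();
         w.close();
         dir.close();
       }
     }
   }
   ```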
   
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

