easyice opened a new pull request, #12381:
URL: https://github.com/apache/lucene/pull/12381

   ### Description
   
   As in [pr-399](https://github.com/apache/lucene/pull/399), 
DocsWithFieldSet#add() can avoid creating a FixedBitSet instance in the dense 
case, so NumericDocValuesWriter#sortDocValues() can do the same thing. 
However, I'm not sure whether the way I distinguish dense from sparse is rigorous enough.
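
The dense-case idea above can be sketched as follows. This is a minimal, self-contained illustration and not the Lucene implementation: the class name `LazyDocSet` and its methods are hypothetical, and `java.util.BitSet` stands in for Lucene's `FixedBitSet`. The assumption is that doc IDs arrive in increasing order, so the set is dense exactly while each new doc ID equals the number of IDs added so far.

```java
import java.util.BitSet;

/**
 * Hypothetical illustration (not Lucene code): records increasing doc IDs and
 * allocates a bit set only once a gap shows that the field is sparse.
 */
class LazyDocSet {
  private BitSet bits; // stays null while every doc ID so far is contiguous from 0
  private int cardinality = 0; // number of doc IDs added
  private int lastDocId = -1;

  void add(int docId) {
    if (docId <= lastDocId) {
      throw new IllegalArgumentException("doc IDs must be added in increasing order: " + docId);
    }
    if (bits != null) {
      // Already sparse: just record the bit.
      bits.set(docId);
    } else if (docId != cardinality) {
      // First gap: doc IDs 0..cardinality-1 were all present, so back-fill them.
      bits = new BitSet(docId + 1);
      bits.set(0, cardinality);
      bits.set(docId);
    }
    // Dense case: nothing is allocated, the counter alone describes the set.
    lastDocId = docId;
    cardinality++;
  }

  boolean isDense() {
    return bits == null;
  }

  int cardinality() {
    return cardinality;
  }

  boolean contains(int docId) {
    if (docId < 0) {
      return false;
    }
    return bits == null ? docId < cardinality : bits.get(docId);
  }

  public static void main(String[] args) {
    LazyDocSet dense = new LazyDocSet();
    for (int i = 0; i < 1000; i++) {
      dense.add(i); // every document has a value
    }
    LazyDocSet sparse = new LazyDocSet();
    sparse.add(0);
    sparse.add(7); // gap forces the bit set to be created
    System.out.println("dense.isDense=" + dense.isDense());
    System.out.println("sparse.isDense=" + sparse.isDense());
  }
}
```

In the dense case no bit set is ever allocated, and that allocation is the cost the description proposes to avoid. Whether the same check is rigorous enough inside sortDocValues() is the open question raised above.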
   
   In a benchmark that writes ten SortedNumericDocValuesField fields per 
document, the optimization saves ~7% of commit time.
   
   ```
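   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.Paths;
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Locale;
   import org.apache.lucene.document.Document;
   import org.apache.lucene.document.SortedNumericDocValuesField;
   import org.apache.lucene.index.IndexWriter;
   import org.apache.lucene.index.IndexWriterConfig;
   import org.apache.lucene.index.NoMergePolicy;
   import org.apache.lucene.search.Sort;
   import org.apache.lucene.search.SortField;
   import org.apache.lucene.search.SortedNumericSortField;
   import org.apache.lucene.store.Directory;
   import org.apache.lucene.store.MMapDirectory;
   
   public class IndexBenchMarksNDV {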
   
     public static void main(final String[] args) throws Exception {
       doWriteNDV();
     }
   
     static void doWriteNDV() throws IOException {
       BenchMark benchMark = new BenchMark(5, 5, 1000000);
       benchMark.run();
     }
   
     static class BenchMark {
       final int warmup;
       final int numValues;
       final int loopCount;
   
       Directory dir;
       IndexWriter indexWriter;
   
       BenchMark(int warmup, int loopCount, int numValues) {
         this.warmup = warmup;
         this.numValues = numValues;
         this.loopCount = loopCount;
       }
   
       private void init() throws IOException {
         Path tempDir = Files.createTempDirectory(Paths.get("/Volumes/RamDisk"), "tmp");
         dir = MMapDirectory.open(tempDir);
         IndexWriterConfig iwc = new IndexWriterConfig(null);
         iwc.setMergePolicy(NoMergePolicy.INSTANCE);
         iwc.setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH);
         Sort indexSort = new Sort(new SortedNumericSortField("f1", SortField.Type.LONG));
         iwc.setIndexSort(indexSort);
         indexWriter = new IndexWriter(dir, iwc);
       }
   
       private void close() throws IOException {
         indexWriter.close();
         dir.close();
       }
   
       private long doWrite() throws IOException {
         for (int i = 0; i < numValues; i++) {
           Document document = new Document();
           for (int f = 0; f < 10; f++) {
             document.add(new SortedNumericDocValuesField("f" + f, i / 1000));
           }
           indexWriter.addDocument(document);
         }
         Document document = new Document();
         for (int f = 0; f < 10; f++) {
           document.add(new SortedNumericDocValuesField("f" + f, 1));
         }
         indexWriter.addDocument(document);
         long t0 = System.nanoTime();
         indexWriter.commit();
         return System.nanoTime() - t0;
       }
   
       void run() throws IOException {
         init();
         for (int i = 0; i < warmup; i++) {
           doWrite();
         }
         System.gc();
         List<Double> times = new ArrayList<>();
         for (int i = 0; i < loopCount; i++) {
           long took = doWrite();
           times.add(took / 1000000D);
         }
         double min = times.stream().mapToDouble(Number::doubleValue).min().getAsDouble();
         System.out.println("took(ms):" + String.format(Locale.ROOT, "%.2f", min));
         close();
       }
     }
   }
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

