rmuir commented on PR #13816:
URL: https://github.com/apache/lucene/pull/13816#issuecomment-2365228413

   this ID is generated for safety, and we only calculate this **once per 
segment** when initially flushing it. Doesn't even happen at merge. Really, it 
should be a few insns under lock, shouldn't need to be fast. That's why there 
isn't contention, your profiler is lying to you. Sorry, I don't buy 30ms.
   
   And if somehow you are writing THAT many segments concurrently, then you 
definitely need the safety (and we dont want thread visibility issues where 2 
segments end out with the same ID).
   
   If we want to improve it, IMO the best way is to remove custom BigDecimal 
and xor/shift logic, but instead swap in the OpenJDK implementation. This code 
was written before OpenJDK had decent PRNGs.
   
   This will clean up the code AND improve the performance.
   
   ```java
   // initialization
   var factory = RandomGeneratorFactory.of("Xoroshiro128PlusPlus");
   var generator = factory.create(seed);
   
   // generation
   byte[] result = new byte[ID_LENGTH];
   synchronized(idLock) {
     random.nextBytes(result);
   }
   return result;
   ```
   
   Instead of two BigDecimal operations and an allocation under the lock, it 
will just do some xors and shifts on two `long` variables behind the scenes. 
And I think I would trust it more since code is maintained in the JDK.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to