uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1843848755
   Hi Adrien,
   
   Thanks for the more insight!
   
   > @uschindler FYI this is what I'm getting: 
https://gist.github.com/jpountz/be81b1eb93c6118aac65c3679911f1d8. There are two 
files: baseline.txt for the default impl, and contender.txt for the optimized 
impl.
   
   This looks much better than the files posted before. This shows that both 
(the baseline and contender) methods are fully inlined (almost...). So the 
difference is indeed the internal implementation.
   
   > > In general I am not sure if we really need specialization in Mmapdir.
   > > We do not even implement readVInt for memory segments. Why should we 
implement it for this complex case?
   > 
   > To be clear, this specialization is not about doing the same kind of 
optimization that the JVM would do by inlining, it's about taking advantage of 
the fact that many of our Directory implementations can do unaligned random 
reads under the hood to decode group-varints more efficiently, fewer 
conditionals specifically.
   
   This information wasn't available to me.
   
   With unaligned random reads you mean that you read with positional reads 
from the area where the buffer is saved? With default DataInput you can only go 
forward, so you need to read everything and also in order.
   
   This explains, but I am really not happy to put this very special code into 
theeI/O layer of Lucene. This makes it unmaintainable, sorry.
   
   I have not much time at the moment, but could we for now leave out 
MMapDirectory (all variants) form the optimization and start small? I have some 
ideas to rewrite that code to not duplicate the same code with small changes 
over and over. What you need is some way to read the the values unaliged from 
an indexed (int for arrays and bytebuffers, long for memorysegments)? This is 
exactly what VarHandles do! We could write a generic implementation that uses 
VarHandles to access the underlying bytes. You can get Varhandles from byte 
arrays (see BitUtil class - you already use that) or from ByteBuffers (works 
the same way to get a VarHandle to access internals of ByteBuffers - P.S.: We 
could use that to get rid of the stupid LongBuffers in the legacy 
MMap-ByteBufferIndexInput - so I played around with it already). And finally 
the accessors of MemorySegment are Varhandles behind the scenes, too (just with 
long coordinate).
   
   So my idea would be an implementation that uses VarHandles and this one can 
be used for byte[], ByteBuffer[] and MemorySegment. The impl is basically the 
same (takes an offset for start of block and the receiver and then decodes 
everything after that offset).  All implementations would create an instance of 
that GroupVarInt decoder that knows the varhandles and the receiver object).
   
   I am a bit busy the next day, I can maybe look into that at the weekend. 
Maybe we get a good API without copypasting this horrible complex code that I 
don't under stand ten times into our repository. This is my main issue with 
this. I do much for performance but I won't copypaste code everywhere just to 
get a few percents. Sorry.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to