uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1843848755
Hi Adrien, Thanks for the more insight! > @uschindler FYI this is what I'm getting: https://gist.github.com/jpountz/be81b1eb93c6118aac65c3679911f1d8. There are two files: baseline.txt for the default impl, and contender.txt for the optimized impl. This looks much better than the files posted before. This shows that both (the baseline and contender) methods are fully inlined (almost...). So the difference is indeed the internal implementation. > > In general I am not sure if we really need specialization in Mmapdir. > > We do not even implement readVInt for memory segments. Why should we implement it for this complex case? > > To be clear, this specialization is not about doing the same kind of optimization that the JVM would do by inlining, it's about taking advantage of the fact that many of our Directory implementations can do unaligned random reads under the hood to decode group-varints more efficiently, fewer conditionals specifically. This information wasn't available to me. With unaligned random reads you mean that you read with positional reads from the area where the buffer is saved? With default DataInput you can only go forward, so you need to read everything and also in order. This explains, but I am really not happy to put this very special code into theeI/O layer of Lucene. This makes it unmaintainable, sorry. I have not much time at the moment, but could we for now leave out MMapDirectory (all variants) form the optimization and start small? I have some ideas to rewrite that code to not duplicate the same code with small changes over and over. What you need is some way to read the the values unaliged from an indexed (int for arrays and bytebuffers, long for memorysegments)? This is exactly what VarHandles do! We could write a generic implementation that uses VarHandles to access the underlying bytes. You can get Varhandles from byte arrays (see BitUtil class - you already use that) or from ByteBuffers (works the same way to get a VarHandle to access internals of ByteBuffers - P.S.: We could use that to get rid of the stupid LongBuffers in the legacy MMap-ByteBufferIndexInput - so I played around with it already). And finally the accessors of MemorySegment are Varhandles behind the scenes, too (just with long coordinate). So my idea would be an implementation that uses VarHandles and this one can be used for byte[], ByteBuffer[] and MemorySegment. The impl is basically the same (takes an offset for start of block and the receiver and then decodes everything after that offset). All implementations would create an instance of that GroupVarInt decoder that knows the varhandles and the receiver object). I am a bit busy the next day, I can maybe look into that at the weekend. Maybe we get a good API without copypasting this horrible complex code that I don't under stand ten times into our repository. This is my main issue with this. I do much for performance but I won't copypaste code everywhere just to get a few percents. Sorry. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org