uschindler commented on pull request #177: URL: https://github.com/apache/lucene/pull/177#issuecomment-860889941
After analyzing the heap dumps provided by JFR, I was able to figure out what the problem is. Basically, all native, VarHandle-backed methods are fast and optimize nicely, but all "bulk" read methods are slow and produce a lot of garbage on the heap. The problem is that we can't copy from a memory segment to a heap `byte[]` or heap `float[]` natively! The whole dance of wrapping the heap array with `MemorySegment.ofArray()`, applying offset and length, and then copying the memory over produces too much garbage, because it looks like Hotspot isn't able to eliminate the object allocation (sketched below).

I did a quick test:

- I removed the bulk `readLongs()` and `readFloats()` overrides from the source code and let the calls fall through to the simple readLong/readFloat loop of `DataInput`, and the slowdown suddenly went to zero!
- I also replaced the `readBytes()` code with a simple loop whenever the number of bytes to read is < 4096.

See this hack commit (not in this branch): https://github.com/uschindler/lucene/commit/9b328a74746a04351a99f33f82515122b06d5baa

Overall the speed is much better, but some tasks are now slower. That is expected: operations like copying or decompressing stored fields may now partially run value-by-value instead of as bulk copies.
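For illustration, here is a minimal sketch of the two patterns, assuming the JDK 16 incubator API (`jdk.incubator.foreign`). This is not the actual patch (that's in the commit linked above); the class and field names are made up:

```java
// Illustrative sketch only. Assumes JDK 16 with --add-modules jdk.incubator.foreign;
// `curSegment` and `curPosition` are hypothetical stand-ins for the real IndexInput state.
import jdk.incubator.foreign.MemoryAccess;
import jdk.incubator.foreign.MemorySegment;

class BulkReadSketch {
  private MemorySegment curSegment; // the mapped segment we read from
  private long curPosition;         // current read position in the segment

  // Slow path: to copy into a heap byte[] we must first wrap it as a
  // MemorySegment. That wrapper is a fresh heap allocation on every call,
  // and Hotspot apparently cannot eliminate it -- hence the garbage seen
  // in the JFR heap dumps. readLongs()/readFloats() had the same shape.
  void readBytesBulk(byte[] b, int offset, int len) {
    MemorySegment.ofArray(b)
        .asSlice(offset, len)
        .copyFrom(curSegment.asSlice(curPosition, len));
    curPosition += len;
  }

  // Fast path for small reads: a plain loop of single-value reads. Each
  // read is VarHandle-backed and allocation-free, so it optimizes nicely;
  // the hack commit falls back to this whenever len < 4096.
  void readBytesLoop(byte[] b, int offset, int len) {
    for (int i = 0; i < len; i++) {
      b[offset + i] = MemoryAccess.getByteAtOffset(curSegment, curPosition + i);
    }
    curPosition += len;
  }
}
```

The trade-off is visible in the benchmark: for large copies the bulk path still wins (the wrapper allocation is amortized), which is why only small reads take the loop and why tasks dominated by big copies, like stored-field decompression, can regress.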