msokolov commented on PR #14874: URL: https://github.com/apache/lucene/pull/14874#issuecomment-3059172791
> My understanding was that off heap document vectors helped by avoiding a copy back into the heap, plus avoiding the cost of reallocation and copy if some of them got garbage collected. But doesn't this change add a copy, by copying the byte[] queryVector from heap to the allocated off-heap segment? Also, since the query vector is only used during the lifetime of the query, I would've thought keeping it on heap should be okay?

It is confusing to me too. I think to understand it we need to decompile and look at the instructions that are generated after hotspot does its work. Maybe we are bypassing memory barriers that get applied to on-heap arrays? I am really not sure.

> I'm confused, if dotProductWTF and dotProduct are exactly identical, why did dotProductWTF fix the 'search after indexing' case?

The idea behind this was to create two separate code paths: one used during indexing (when both arrays are on-heap) and another used during search, when one array is on-heap and the other is off-heap (memory mapped from disk). This seems to enable hotspot to optimize these two code paths separately.

There is yet another mystery here: why, after adding this hotspot hackery, do we see *even faster* performance on the query path when it is preceded by an indexing workflow than when it is not (although even then it is still faster than the baseline)?
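To make the "two separate code paths" idea concrete, here is a minimal sketch, not the code in this PR. It assumes `java.nio.ByteBuffer` as a stand-in for the actual off-heap abstraction, and the class and method names (`DotProductPaths`, `dotProductHeap`, `dotProductMixed`) are hypothetical. The two methods have identical bodies; the only point is that each one is called with a single combination of buffer types, so hotspot may be able to profile and compile them independently.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: names and the use of ByteBuffer are assumptions.
final class DotProductPaths {

  // Indexing path (assumed): called only with two heap-backed buffers.
  static int dotProductHeap(ByteBuffer a, ByteBuffer b) {
    int sum = 0;
    for (int i = 0; i < a.limit(); i++) {
      sum += a.get(i) * b.get(i); // bytes are promoted to int before multiplying
    }
    return sum;
  }

  // Search path (assumed): same body, but called only with a heap-backed
  // query and an off-heap (direct) document vector.
  static int dotProductMixed(ByteBuffer query, ByteBuffer doc) {
    int sum = 0;
    for (int i = 0; i < query.limit(); i++) {
      sum += query.get(i) * doc.get(i);
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] q = {1, 2, 3, 4};
    byte[] d = {5, 6, 7, 8};

    ByteBuffer heapQuery = ByteBuffer.wrap(q);
    ByteBuffer heapDoc = ByteBuffer.wrap(d);

    // Direct buffer standing in for a memory-mapped document vector.
    ByteBuffer offHeapDoc = ByteBuffer.allocateDirect(d.length);
    offHeapDoc.put(d);
    offHeapDoc.flip();

    System.out.println(dotProductHeap(heapQuery, heapDoc));     // 70
    System.out.println(dotProductMixed(heapQuery, offHeapDoc)); // 70
  }
}
```

Both calls return the same value. Whether the split actually changes the generated machine code would still need to be confirmed by inspecting the JIT output, as suggested above.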