kaivalnp commented on PR #15341: URL: https://github.com/apache/lucene/pull/15341#issuecomment-3711964469
Sorry for the delay @mikemccand

> Maybe we can spin off the CFS case to a follow-on issue?

This makes sense to me. @uschindler, please let me know if you think otherwise.

IIUC the CFS format glues all files together, and aligns individual file contents to [8 bytes](https://github.com/apache/lucene/blob/f996e744c1787da16454777a4dadc2f3cd2fac0a/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90CompoundFormat.java#L123-L124), which is a multiple of all current alignment occurrences (4 bytes) -- so those alignments will hold for the contents of individual files.

However, in this PR we want the contents of one specific file to be aligned to 64 bytes -- which may not hold with CFS (the contents are only guaranteed to be 8-byte-aligned). I'm not sure of a clean way to do this, because we do not persist alignment information for each file in the index. One way could be to align _all_ files to 64 bytes in the CFS format -- but that may be excessive?

In any case, I don't think this PR will degrade performance of float vector search in indexes that use CFS (same alignment as before) -- so I've added a TODO for now.
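
To make the alignment argument concrete, here is a minimal sketch of the arithmetic in plain Java. It is not Lucene code: the class `AlignmentSketch`, its helper methods, and the offsets (70, 72, 64, 12) are hypothetical values chosen only for illustration. It assumes a sub-file whose start is rounded up to an 8-byte boundary inside the compound file, and shows that data which is 64-byte-aligned relative to its own file start is still 4- and 8-byte-aligned in the compound file, but not necessarily 64-byte-aligned.

```java
public class AlignmentSketch {

    // Rounds offset up to the next multiple of alignment.
    // alignment must be a positive power of two.
    public static long alignUp(long offset, int alignment) {
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0) {
            throw new IllegalArgumentException("alignment must be a power of two");
        }
        return (offset + alignment - 1) & -(long) alignment;
    }

    // Returns true if offset is a multiple of alignment (a power of two).
    public static boolean isAligned(long offset, int alignment) {
        return (offset & (alignment - 1)) == 0;
    }

    public static void main(String[] args) {
        // Hypothetical: the previous sub-file ends at byte 70 of the compound
        // file, so the next sub-file starts at the next 8-byte boundary.
        long subFileStart = alignUp(70, 8);

        // Hypothetical: inside its own file, the vector data begins at an
        // offset that is 64-byte-aligned relative to the file start.
        long offsetInOwnFile = 64;
        long absoluteOffset = subFileStart + offsetInOwnFile;

        // A 4-byte-aligned offset inside the same sub-file, for comparison.
        long fourByteAligned = subFileStart + 12;

        System.out.println("sub-file start:          " + subFileStart);
        System.out.println("4-byte alignment holds:  " + isAligned(fourByteAligned, 4));
        System.out.println("8-byte alignment holds:  " + isAligned(absoluteOffset, 8));
        System.out.println("64-byte alignment holds: " + isAligned(absoluteOffset, 64));

        // Aligning every sub-file start to 64 bytes would restore the
        // guarantee, at the cost of up to 63 padding bytes per sub-file.
        long paddedStart = alignUp(70, 64);
        System.out.println("64-byte alignment with padded start: "
            + isAligned(paddedStart + offsetInOwnFile, 64));
    }
}
```

The general point is that an alignment of A relative to the sub-file start survives concatenation only when the sub-file start is itself a multiple of A, which is why the existing 4-byte alignments are safe under an 8-byte-aligned start while a 64-byte one is not.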
