mikemccand commented on issue #13158: URL: https://github.com/apache/lucene/issues/13158#issuecomment-2012195070
This is a cool idea @msokolov. It is wasteful to lug those full-precision `float32` vectors out to the searchers in an NRT segment replication architecture.

In practice, they would consume disk space on the searchers and cost time to copy out. But since the OS would never load them at search time, their bytes would stay cold on disk and presumably not put much pressure on the OS's free RAM. The OS would only cache the disk pages in the index that are actually needed at search time. Still, it would be nice not to copy all that deadweight around ...

Probably the solution would have to be something like segment-to-segment? I.e. for each segment in the index, we would make a corresponding "read only" segment (stripped of the `float32` vectors). This way, as the normal index changes (gets new flushed/merged segments), we could also incrementally/NRT maintain the shadow read-only index.

I wonder if there are other things in a Lucene index today that are needed only during indexing?
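To make the segment-to-segment idea concrete, here is a rough Python sketch of the bookkeeping only. It is not Lucene code: `plan_sync` and `STRIPPED_EXTENSIONS` are hypothetical names, the file names are made up, and it assumes the full-precision vectors sit in their own per-segment files identified by a `.vec` extension. It also ignores that a real searcher would need codec support to open a segment whose raw vector file is missing.

```python
import os

# Assumption: raw float32 vector data lives in files with this extension.
# Adjust the set for whatever the codec in use actually writes.
STRIPPED_EXTENSIONS = frozenset({".vec"})


def needed_at_search_time(file_name, stripped=STRIPPED_EXTENSIONS):
    """Return True if the file should be copied to the searchers."""
    _, ext = os.path.splitext(file_name)
    return ext not in stripped


def plan_sync(primary, shadow):
    """Plan an incremental update of the shadow read-only index.

    primary: dict mapping segment name -> list of files in the normal index.
    shadow:  dict mapping segment name -> list of files already replicated.

    Returns (to_copy, to_drop): the stripped file lists for segments that are
    new in the primary (flushed or merged), and the shadow segments that no
    longer exist in the primary (for example, merged away).
    """
    to_copy = {
        segment: [f for f in files if needed_at_search_time(f)]
        for segment, files in primary.items()
        if segment not in shadow
    }
    to_drop = sorted(segment for segment in shadow if segment not in primary)
    return to_copy, to_drop
```

Because segments are write-once, each sync only has to look at segment names that appeared or disappeared since the last one, which is what would keep the shadow index cheap to maintain in near-real-time.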