mikemccand commented on issue #13403:
URL: https://github.com/apache/lucene/issues/13403#issuecomment-2144941653
> As an aside, this "wait to build the index" thing could also be done for HNSW. Tiny segments with quick flushes probably shouldn't even build HNSW graphs. Instead, they should probably store the float vectors flat (or the scalar-quantized vectors flat, since scalar quantization is effectively linear in runtime). Then, when a threshold is reached (it could be small, something like 1k or 10k?), we create the HNSW graphs.

Oooh -- +1 to exploring this idea as a precursor, separately from enabling/exploring dimensionality-reduction compression. Lucene's write-once segments really make such optimizations (different choices depending on a segment's size or the characteristics of the documents in each segment) possible and worthwhile! Maybe open a spinoff issue for this one?
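For concreteness, here's a minimal, self-contained Java sketch of the idea. The class name, method names, and the 10k threshold are purely illustrative (not real Lucene codec APIs): a tiny flush-time segment keeps its vectors flat and is searched exactly by brute force, which is cheap at that scale, while only segments past the threshold would pay for HNSW graph construction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/**
 * Illustrative sketch only: a per-segment policy that skips HNSW graph
 * construction for tiny segments and falls back to exact brute-force search
 * over the flat vectors. All names and the threshold are hypothetical.
 */
public class DeferredGraphPolicy {

  /** Segments below this many vectors skip graph construction (1k-10k was floated above). */
  static final int GRAPH_BUILD_THRESHOLD = 10_000;

  static boolean shouldBuildGraph(int numVectors) {
    return numVectors >= GRAPH_BUILD_THRESHOLD;
  }

  /** Exact top-k by dot product over flat vectors: the search path for graph-less segments. */
  static List<Integer> bruteForceTopK(float[][] flatVectors, float[] query, int k) {
    // Min-heap keyed on score so the worst current hit is evicted first.
    PriorityQueue<float[]> heap =
        new PriorityQueue<>(Comparator.comparingDouble((float[] hit) -> hit[1]));
    for (int doc = 0; doc < flatVectors.length; doc++) {
      float score = 0f;
      for (int i = 0; i < query.length; i++) {
        score += flatVectors[doc][i] * query[i];
      }
      heap.offer(new float[] {doc, score});
      if (heap.size() > k) {
        heap.poll();
      }
    }
    List<Integer> topDocs = new ArrayList<>();
    while (!heap.isEmpty()) {
      topDocs.add(0, (int) heap.poll()[0]); // min-heap pops worst first, so prepend
    }
    return topDocs;
  }

  public static void main(String[] args) {
    // A quick 1,200-doc flush stays flat; a 50k-doc merged segment builds HNSW.
    System.out.println("1200 docs  -> build graph? " + shouldBuildGraph(1_200));
    System.out.println("50000 docs -> build graph? " + shouldBuildGraph(50_000));

    float[][] tinySegment = {{1f, 0f}, {0f, 1f}, {0.7f, 0.7f}};
    System.out.println(
        "top-2 for query [1,0]: " + bruteForceTopK(tinySegment, new float[] {1f, 0f}, 2));
  }
}
```

In real Lucene terms this decision would presumably live down in the vector format/writer layer (around `KnnVectorsFormat`), with the deferred graph built when segments merge past the threshold, but that wiring is beyond this sketch.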