[ https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julie Tibshirani updated LUCENE-10592:
--------------------------------------
    Attachment: Screen Shot 2022-07-25 at 9.04.11 AM.png

> Should we build HNSW graph on the fly during indexing
> -----------------------------------------------------
>
>                 Key: LUCENE-10592
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10592
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Mayya Sharipova
>            Assignee: Mayya Sharipova
>            Priority: Minor
>             Fix For: 9.4
>
>         Attachments: Screen Shot 2022-07-25 at 9.04.11 AM.png
>
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> Currently, when we index vectors for KnnVectorField, we buffer those vectors
> in memory, and on flush, during segment construction, we build an HNSW graph.
> Because building an HNSW graph is very expensive, this makes the flush
> operation take a lot of time. It also makes overall indexing performance quite
> unpredictable (the number of flushes is determined by memory usage and the
> presence of concurrent searches): some indexing operations return almost
> instantly, while others that trigger a flush take a lot of time.
> Building the HNSW graph on the fly as we index vectors avoids this problem
> and spreads the load of HNSW graph construction evenly across indexing.
> This will also supersede LUCENE-10194.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org