mbrette commented on issue #12440: URL: https://github.com/apache/lucene/issues/12440#issuecomment-1702371365
One idea: instead of trying to merge the subgraphs, take their union. When we merge, we build a disconnected graph that is the union of all the segment graphs. At search time, we explore each subgraph to come up with results.

Done naively, this would be the same as exploring the segments in parallel. To improve on that, when we search this graph we run a greedy search like we do today, with two small tweaks:

* First, make sure you include candidates from every subgraph.
* Second, you don't want the candidate list to be dominated by one subgraph (for example, because it is the one you start exploring), so you need a strategy that gives each subgraph a fair share. Maybe explore not greedily by best score, but each subgraph in turn. Or maybe maintain your result and candidate lists so that the stopping criterion for adding candidates is based not only on a single best score across all candidates, but also takes into account the best score from each subgraph.

The advantages of this approach would be:

* You can explore each subgraph in parallel, as you would with many segments, except that it does not influence the segment strategy for the keyword indexes.
* You can somewhat limit the increase in the number of nodes you have to explore by using a stopping criterion that spans subgraphs.

Of course, this may become bloated once we have many disconnected subgraphs. At some point you'll want to do a real merge.