benwtrent commented on issue #14758: URL: https://github.com/apache/lucene/issues/14758#issuecomment-3607183575
> you have to be willing to traverse "dark" areas of the graph to get to the places you want to be.

ACORN only scores vectors that match the criteria, @msokolov, but that is only enforced on the bottom layer. We don't spend time scoring within "dark" areas of the graph. My concern is that graph administrivia (reading in vectors, binary search to find their offsets, etc.) is costing too much here. Maybe we can spend some space to make exploring the graph cheaper (I ain't talking about scoring, just reading in the neighbors and then iterating without scoring).

Also, seeding the bottom layer on restrictive filters is a known technique and can help significantly. I just didn't bother digging deeper on this, as ACORN was a big enough change as it is.

> The sorting idea should help cluster filtered nodes together, but that's not intrinsically part of ACORN,

I am not suggesting it was or ever would be. The "sorting idea" is to enforce new connections under an added requirement: nodes must be connected to other nodes within a particular ordinal range (in addition to their true nearest neighbors). This has nothing to do with BPV or reordering vectors according to other criteria when NOT sorted. I would expect the bi-partite stuff to be useful only when there is NOT an index sort, and I wouldn't expect it to help at all in the filter case. At least not directly; maybe it helps search generally.

> Personally I'm convinced that the benefit of precomputing filtered graphs is worth some pain, mostly because I just haven't seen any other approach that comes near to the performance (recall/latency).

This tells me that either graphs just ain't the right solution here at all, or that we have the opportunity to do something fundamentally different that works better with how Lucene does filters with graphs.
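To make the "traverse dark areas without scoring" point concrete, here is a toy sketch in Java. It is not Lucene's HNSW or ACORN code: the class `DarkTraversalSketch`, the `maxDarkHops` parameter, and the flat single-layer adjacency array are all assumptions made for illustration. It only shows the shape of the idea: nodes that fail the filter are walked through (their neighbor lists are read) but never scored, and only filter-matching nodes cost a distance computation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy single-layer sketch; hypothetical names, not Lucene's implementation. */
class DarkTraversalSketch {
  private static final class Scored {
    final int node;
    final float dist;
    Scored(int node, float dist) { this.node = node; this.dist = dist; }
  }

  private final float[][] vectors;   // one vector per node
  private final int[][] neighbors;   // adjacency lists of the graph
  private final boolean[] accepted;  // true if the node passes the filter
  private final float[] query;
  private final int k;               // assumed to be >= 1
  private final boolean[] visited;
  // Min-heap of scored nodes still to expand, closest first.
  private final PriorityQueue<Scored> candidates =
      new PriorityQueue<>(Comparator.comparingDouble((Scored s) -> s.dist));
  // Max-heap holding the best k scored nodes seen so far.
  private final PriorityQueue<Scored> results =
      new PriorityQueue<>(Comparator.comparingDouble((Scored s) -> s.dist).reversed());
  int scoredCount = 0;               // number of distance computations

  DarkTraversalSketch(float[][] vectors, int[][] neighbors, boolean[] accepted,
                      float[] query, int k) {
    this.vectors = vectors;
    this.neighbors = neighbors;
    this.accepted = accepted;
    this.query = query;
    this.k = k;
    this.visited = new boolean[vectors.length];
  }

  /** Single-use search; returns up to k accepted nodes, closest first. */
  List<Integer> search(int entry, int maxDarkHops) {
    visited[entry] = true;
    if (accepted[entry]) {
      score(entry);
    } else {
      expand(entry, maxDarkHops); // dark entry point: explore without scoring it
    }
    while (!candidates.isEmpty()) {
      Scored c = candidates.poll();
      if (results.size() >= k && c.dist > results.peek().dist) {
        break; // the closest remaining candidate cannot improve the top k
      }
      expand(c.node, maxDarkHops);
    }
    List<Integer> out = new ArrayList<>();
    while (!results.isEmpty()) {
      out.add(results.poll().node); // worst first
    }
    Collections.reverse(out);
    return out;
  }

  /** Computes squared Euclidean distance and records the node. */
  private void score(int node) {
    float d = 0f;
    for (int i = 0; i < query.length; i++) {
      float diff = vectors[node][i] - query[i];
      d += diff * diff;
    }
    scoredCount++;
    Scored s = new Scored(node, d);
    candidates.add(s);
    results.add(s);
    if (results.size() > k) {
      results.poll(); // drop the current worst
    }
  }

  /** Reads neighbor lists; scores accepted nodes, walks through dark ones. */
  private void expand(int node, int darkHopsLeft) {
    for (int nb : neighbors[node]) {
      if (visited[nb]) {
        continue;
      }
      if (accepted[nb]) {
        visited[nb] = true;
        score(nb);
      } else if (darkHopsLeft > 0) {
        visited[nb] = true;
        expand(nb, darkHopsLeft - 1); // traverse the dark node, never score it
      }
    }
  }
}
```

On a five-node line graph where only the two endpoints pass the filter, a search from node 0 for a query sitting on node 4 finds node 4 with two distance computations when `maxDarkHops` is 3, and gets stuck on node 0 when `maxDarkHops` is 1. Every dark node on the way still costs a neighbor-list read, which is the "administrivia" cost in question.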
