msokolov commented on PR #14167:
URL: https://github.com/apache/lucene/pull/14167#issuecomment-2619481888

   I was thinking of another approach based on pro-rating. On its own, 
pro-rating is deterministic and close to optimally efficient, but it risks 
missing the best results when the index is skewed. It seems to me that if the 
HNSW search could be made re-entrant, by preserving its state in the 
HnswSearcher (visited list, priority queues), then we could examine all the 
per-segment results after completing a pass through the graphs, and then 
revisit some segments more deeply if the results appear skewed. Basically, the 
information-sharing would be done in a sequential, periodic fashion.
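
A minimal sketch of that pass-then-revisit loop, under some loudly stated assumptions: "pro-rating" is taken here to mean giving each segment a share of k proportional to its size, a segment counts as "skewed" when its worst collected hit still makes the global top k, and `SegmentSearch` is a hypothetical stand-in for a resumable searcher (a sorted score array plays the role of the preserved visited list and queues). None of these names are Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ProRatedKnnSketch {

    /** Hypothetical resumable per-segment search; NOT a Lucene class. */
    static final class SegmentSearch {
        final int size;                      // number of vectors in the segment
        private final double[] sortedScores; // best first; stands in for graph traversal
        private int cursor = 0;              // preserved state between passes
        final List<Double> collected = new ArrayList<>();

        SegmentSearch(int size, double... scores) {
            this.size = size;
            this.sortedScores = Arrays.stream(scores).boxed()
                .sorted(Comparator.reverseOrder())
                .mapToDouble(Double::doubleValue).toArray();
        }

        /** Resume where the previous pass stopped and collect up to n more hits. */
        void collectMore(int n) {
            for (int i = 0; i < n && cursor < sortedScores.length; i++) {
                collected.add(sortedScores[cursor++]);
            }
        }

        boolean exhausted() {
            return cursor >= sortedScores.length;
        }
    }

    /** Assumed pro-rating rule: a share of k proportional to segment size, at least 1. */
    static int proRatedK(int k, int segmentSize, int totalSize) {
        return Math.max(1, (int) Math.ceil((double) k * segmentSize / totalSize));
    }

    /** Merge everything collected so far and keep the k best scores. */
    static List<Double> mergeTopK(List<SegmentSearch> segments, int k) {
        List<Double> all = new ArrayList<>();
        for (SegmentSearch s : segments) {
            all.addAll(s.collected);
        }
        all.sort(Comparator.reverseOrder());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    static List<Double> search(List<SegmentSearch> segments, int k) {
        int total = segments.stream().mapToInt(s -> s.size).sum();
        // First pass: every segment collects only its pro-rated share.
        for (SegmentSearch s : segments) {
            s.collectMore(proRatedK(k, s.size, total));
        }
        while (true) {
            List<Double> topK = mergeTopK(segments, k);
            double kth = topK.size() < k
                ? Double.NEGATIVE_INFINITY
                : topK.get(topK.size() - 1);
            boolean revisited = false;
            for (SegmentSearch s : segments) {
                if (s.exhausted()) {
                    continue;
                }
                // Assumed skew test: the segment's worst hit is still globally
                // competitive, so it may be hiding more good results.
                double worst = s.collected.get(s.collected.size() - 1);
                if (worst >= kth) {
                    s.collectMore(proRatedK(k, s.size, total));
                    revisited = true;
                }
            }
            if (!revisited) {
                return topK;
            }
        }
    }
}
```

In this toy model the information-sharing happens only between passes, so each pass stays deterministic; a real implementation would resume a graph traversal rather than walk a pre-sorted array, and would need its own stopping rule.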
   
   On Monday, January 27th, 2025 at 2:55 PM, Benjamin Trent ***@***.***> wrote:
   
   >> This made me wonder if it would be a better trade-off to let just one 
slice run on its own first, and then let all other N-1 slices run in parallel 
with one another,
   >
   > I really like this idea. For kNN search, it seems best to take the largest 
tiers, gather information from them, and then run the smaller tiers in parallel.
   >
   > The major downside of kNN is that there is no slicing at all. Every 
segment is just its own worker, which is sort of crazy. We should at a minimum 
combine all the tiny segments together into a single thread.
   >
   > What do you think, @mayya-sharipova (https://github.com/mayya-sharipova)? 
Slicing the segments, then picking the "largest" slice and searching it in the 
current thread, then using that information to help the later parallel threads?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

