msokolov commented on PR #15430:
URL: https://github.com/apache/lucene/pull/15430#issuecomment-3596775130

   OK, after finally ironing out the bugs, I have some results.  The 
situation is a little complicated, as the change here really doesn't help much 
with the typical "dense" index where every document has a vector. I think the 
reason is that any gains are masked by the additional cost of having a 
node->doc mapping that must be traversed.  On the other hand, in the "sparse" 
case where some documents have no vectors, we already have such a mapping, so 
we can see the impact of this change more clearly.  In that case we see 
improvements in search latency, increasing with index size: on indexes of 
1-2MM I see about a 5% improvement, and on 10MM about 10%.  As expected, 
`vex` files show a decrease in size (about 15%). There is also an increase in 
`vem`, since that is where we store the new node->doc mapping, but this is 
pretty small.  Merge times go up a lot: this metric varies quite a bit, but 
the increase seems to be about 100%.
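
   To make the dense/sparse distinction concrete, here is a minimal sketch of 
the indirection described above. It is not Lucene's implementation: the 
function `node_to_doc` and the example mappings are hypothetical and exist only 
to show where the extra lookup comes from.

```python
from typing import Optional, Sequence


def node_to_doc(node: int, mapping: Optional[Sequence[int]] = None) -> int:
    """Resolve a graph node ordinal to a document id (hypothetical helper).

    mapping=None models a dense index with no explicit mapping, where the
    node ordinal is the document id and no lookup is needed.
    """
    if mapping is None:
        return node
    return mapping[node]  # one extra indirection per resolved node


# Sparse index: only docs 0, 2, 5 and 9 have vectors, so a mapping is
# required whether or not the change is applied.
sparse_mapping = [0, 2, 5, 9]

# Dense index with an explicit mapping: every doc has a vector, but each
# lookup now pays for the indirection (illustrative permutation only).
dense_mapping = [3, 0, 2, 1]

print(node_to_doc(2))                  # dense, no mapping
print(node_to_doc(2, sparse_mapping))  # sparse
print(node_to_doc(0, dense_mapping))   # dense with explicit mapping
```

   In the sparse case the lookup is paid either way, which would be 
consistent with the latency gains showing up more clearly there.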
   
   It may be possible to reduce the merge times by tweaking the parameters of 
the BP execution to make it recurse less.  I'll see if I can do that while 
retaining the latency improvements.  Then it might be best to enable this only 
for sparse indexes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

