tveasey commented on PR #14420: URL: https://github.com/apache/lucene/pull/14420#issuecomment-2761706401
I discussed this a bit with Mayya. So the underlying issue is how we account for the gain for degree 1 vertices. We say that the gain contribution from any vertex is at least 2, i.e. $\max\left(2, \frac{|N_s(v)|}{4}\right)$. This forces us to insert degree 1 vertices which was intentional. The problem is we can get at most +1 contribution from them to $g_{tot}$. Mayya tried it and when we correct the gain to $\min\left(\max\left(2, \frac{|N_s(v)|}{4}\right), |N_s(v)|\right)$ we terminate before exhausting the heap, as expected. Arguably this is slightly more conceptually correct. However, it represents a change in behaviour and invalidates the testing Mayya's done. The current behaviour increases the relative size of $J$ for graphs with many degree one vertices. This seems conceptually ok. In the case the heap is exhausted we've just picked $J = V$ the set of all vertices and reverted to the original merge procedure which is clearly harmless. However, this is going to be an extreme edge case since we normally only have a small fraction of degree one vertices and we'll nearly always terminate on $g_{tot}<g_{exit}$. Therefore, my vote is to just go with Ben's fix and leave the $g_{tot}$ calculation as is. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org