mayya-sharipova commented on pull request #536: URL: https://github.com/apache/lucene/pull/536#issuecomment-1003458911
@msokolov

> I seem to remember that when I checked (you can use -fanout parameter to KnnGraphTester IIRC) most nodes were not fully populated; ie they had fewer than maxConn connections. Why? This seems counterintuitive

I checked fanout on two datasets using `KnnGraphTester`'s `-stats` option. Most nodes turned out to be fully connected, but some are not:

**glove-100-angular** M:16

Graph level=0 size=1183514, Fanout min=1, mean=15.90, max=16

Number of connections (column 1) and number of nodes with this number of connections (column 2):

<img src="https://user-images.githubusercontent.com/5738841/147839964-c8e57287-1a99-4b86-8eb6-4a1e90b61e3b.png" height="350">

Histogram:

| 0% | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 100% |
| ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 0 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 |

**sift-128-euclidean** M:16

Graph level=0 size=1000000, Fanout min=1, mean=15.52, max=16

Histogram:

| 0% | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 100% |
| ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 0 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 |
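For anyone who wants to reproduce this kind of summary on their own graph, here is a minimal sketch of how min, mean, max, and decile values could be computed from per-node neighbor counts. It is not `KnnGraphTester`'s code: the class `FanoutStats`, its method names, the nearest-rank percentile rule, and the sample `degrees` array are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: summarizes how many neighbors
// each node has on one graph level.
public class FanoutStats {

    // Arithmetic mean of the per-node neighbor counts.
    public static double mean(int[] degrees) {
        long total = 0;
        for (int d : degrees) {
            total += d;
        }
        return (double) total / degrees.length;
    }

    // Nearest-rank percentile over an ascending-sorted array.
    // pct = 0 yields the minimum and pct = 100 yields the maximum.
    public static int percentile(int[] sortedDegrees, int pct) {
        int n = sortedDegrees.length;
        int rank = (pct * n + 99) / 100;      // integer ceil(pct * n / 100)
        return sortedDegrees[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up neighbor counts for ten nodes with maxConn = 16.
        int[] degrees = {16, 16, 3, 16, 16, 1, 16, 16, 16, 16};
        int[] sorted = degrees.clone();
        Arrays.sort(sorted);

        System.out.printf("Fanout min=%d, mean=%.2f, max=%d%n",
                sorted[0], mean(degrees), sorted[sorted.length - 1]);
        for (int pct = 0; pct <= 100; pct += 10) {
            System.out.printf("%d%%: %d%n", pct, percentile(sorted, pct));
        }
    }
}
```

With a real index, the `degrees` array would be filled by counting the neighbors of each node on level 0. How that is read depends on the Lucene version, so it is left out here.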