benwtrent commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1644461632
@jmazanec15 I followed your steps with the same data (force-merging as well). Instead of using `dot_product` as it is, I focused on the non-negative case (which is what it would be if we supported this). So I used your piecewise transformation (negatives are mapped to between 0 and 1, and positives are unscaled scores of 1+).

This is what I got:

```
recall  latency  nDoc    fanout  maxConn  beamWidth  visited  index ms
0.989   2.74     400000  200     32       200        210      683712    1.00  post-filter
```

So, 0.989 recall at 2.7ms per query, taking `683712ms` to build the index. Not too shabby.

It's interesting how the scaling slightly changes the recall number. We should verify this is OK by feeding the docs in a random order. We might be getting lucky in the graph building.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
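
For reference, the piecewise transformation described in the comment (negatives squeezed into the 0-1 range, non-negatives left unscaled and shifted to 1+) might look roughly like the sketch below. The exact formula is not spelled out in the comment, so the `1 / (1 - dot)` mapping for negatives and the function name `scale_max_inner_product` are assumptions chosen to match that description, not a confirmed implementation.

```python
def scale_max_inner_product(dot: float) -> float:
    """Map a raw dot product to a strictly positive score, preserving order.

    Hypothetical helper: the formula is an assumption consistent with
    "negatives are between 0-1 and positives are unscaled scores of 1+".
    """
    if dot < 0:
        # Negative dot products are squeezed into the open interval (0, 1).
        return 1.0 / (1.0 - dot)
    # Non-negative dot products are shifted, not rescaled, into [1, inf).
    return dot + 1.0


raw = [-3.0, -0.5, 0.0, 0.5, 3.0]
scaled = [scale_max_inner_product(d) for d in raw]
print(scaled)
```

Because the mapping is strictly increasing, ranking by the scaled score gives the same order as ranking by the raw dot product, while every score stays positive.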