mikemccand commented on issue #13699:
URL: https://github.com/apache/lucene/issues/13699#issuecomment-2373996504
[Disclaimer: I work with @dungba88 on the Amazon Product Search team, and I also
suggested he open this issue ;)]
> Lucene has tens of parameters like this one, exposing them all would make
> our APIs look rather bad. I wonder how you envision users to tune this
> parameter, would it be good enough if we made it configurable through something
> like a system property rather than through the API?
This is indeed a problem/challenge for Lucene. In cases like this I like
the "simple things should be simple, complex things should be possible" mantra.
I.e. Lucene should have good defaults for interesting parameters like _g_, but
it really should also expose the option for complex apps to tune them?
I guess we can tackle this case by case... complex apps could always poach/fork
the Lucene implementation and then change anything, worst case.
> Here, g controls the greediness of non-competitive search and is some
> number less than 1. In effect, g is a free parameter we can use to control
> recall vs the speed up.
Hmm, the thing is, if I'm reading the blog post correctly, this parameter _g_
(greediness) is quite a critical/fundamental knob in this cool
cross-concurrent-segment search algo: it trades off how much the algorithm
prefers exploitation vs exploration? Why fuzz it up by adding a layer of
indirection ("desired recall")? Also, it seems likely the recall vs performance
tradeoff might be model dependent? Are all vectors really "created equal"? (I
honestly don't know how much variance in behavior there is among sets of
vectors...)
How about if we expose tuning _g_ through the API, somehow, but mark it
`@lucene.experimental` so we have the freedom to change/break it if we switch
to "desired recall" in the future? And mark it "expert"?
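
To make the proposal concrete, here is a rough sketch of what an "expert", `@lucene.experimental` knob with a good default could look like, next to the system-property alternative from the quoted question. Everything in it is hypothetical: the class name `KnnGreediness`, the property name `example.knn.greediness`, the placeholder default of 0.9 and the accepted range [0, 1) are assumptions for illustration, not existing Lucene API.

```java
/**
 * Hypothetical holder for the greediness knob (g) discussed above.
 * Not part of Lucene; the name, default and range are assumptions.
 *
 * <p>Expert: most users should rely on the default.
 *
 * @lucene.experimental
 */
final class KnnGreediness {
  /** Placeholder default, chosen only for this sketch. */
  static final float DEFAULT = 0.9f;

  /** Hypothetical system property name for the "no API change" alternative. */
  static final String SYS_PROP = "example.knn.greediness";

  private final float g;

  private KnnGreediness(float g) {
    // The issue describes g as "some number less than 1"; the lower bound
    // of 0 is an assumption. The negated form also rejects NaN.
    if (!(g >= 0f && g < 1f)) {
      throw new IllegalArgumentException("greediness must be in [0, 1), got " + g);
    }
    this.g = g;
  }

  /** Simple things stay simple: callers that do not care get the default. */
  static KnnGreediness defaults() {
    return new KnnGreediness(DEFAULT);
  }

  /** Complex things are possible: explicit tuning through the API. */
  static KnnGreediness of(float g) {
    return new KnnGreediness(g);
  }

  /** Alternative: read the value from a system property, else use the default. */
  static KnnGreediness fromSystemProperty() {
    String raw = System.getProperty(SYS_PROP);
    return raw == null ? defaults() : of(Float.parseFloat(raw));
  }

  float value() {
    return g;
  }
}
```

With something like this, a typical app would call `KnnGreediness.defaults()` and never see _g_, while an expert app could pass `KnnGreediness.of(0.5f)` per query. The system-property variant is process-wide, so it could not be tuned per query or per model.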
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]