Huh, so Lucene is actually doing what has commonly been described as impossible in Solr?

But is Solr trunk, as the OP seemed to report, still not aware of this, and raising an error on a sort on a multi-valued field, instead of just passing the sort to Lucene and going with Lucene's approach to sorting on a multi-valued field (which is, apparently, to use the largest value)?

If so... that kind of sounds like a bug/misfeature, no?

Also... if Lucene is already capable of sorting on a multi-valued field by choosing the largest value, then largest vs. smallest is presumably arbitrary, and there is presumably no performance implication to choosing the smallest instead. It just chooses the largest, according to Yonik.

So... if someone patched Lucene so that whether it chose the largest or smallest value was a parameter passed in (probably not a large patch, since Lucene, says Yonik, has already been enhanced to always choose the largest), and then patched Solr to take a param and pass it to Lucene for this purpose (which presumably also wouldn't be a large patch if Lucene supported it), then we'd have the feature the OP asked for.
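To make the idea concrete, here is a rough sketch in plain Python, not Lucene or Solr code. The `docs` data, the `sort_docs` function, and its `selector` parameter are all made up for illustration; the sketch also assumes every document has at least one value in the field. It only shows that picking the largest or the smallest value as the sort key is the same operation with a different selector.

```python
# Hypothetical data: doc id -> values of one multi-valued field.
# Assumption: every document has at least one value.
docs = {
    "doc1": [3, 9],
    "doc2": [5],
    "doc3": [1, 7],
}


def sort_docs(docs, selector=max, reverse=False):
    """Order doc ids by one representative value per document.

    `selector` picks that value from the document's list of values:
    `max` mimics the "largest value" behavior Yonik describes,
    `min` is the hypothetical alternative discussed above.
    """
    return sorted(docs, key=lambda doc_id: selector(docs[doc_id]), reverse=reverse)


# Ascending sort keyed on the largest value of each document.
by_largest = sort_docs(docs, selector=max)

# Ascending sort keyed on the smallest value of each document.
by_smallest = sort_docs(docs, selector=min)
```

Whether the real change would be this small depends on how Lucene builds its sort keys internally, which I haven't looked at.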

Based on Yonik's description (assuming I understand correctly and he's correct), it doesn't sound like a lot of code. But it's still beyond my unfamiliar-with-Lucene-code, not-so-great-at-Java abilities, and I don't need it for my own app at the moment. But if the OP or someone else has both the ability and the need... sounds like a plausible feature?

On 3/16/2011 6:00 PM, Yonik Seeley wrote:
On Wed, Mar 16, 2011 at 5:46 PM, Chris Hostetter
<hossman_luc...@fucit.org>  wrote:
: However, many of our multiValued fields are single valued for the majority
: of documents in our index so we may not have noticed the incorrect sorting
: behaviors.

that would make sense ... if you use a multiValued field as if it were
single valued, you would never encounter a problem.  if you had *some*
multivalued fields your results would be sorted extremely arbitrarily for
those docs that did have multiple values, unless you had more distinct
values than you had documents -- at which point you would get a hard crash
at query time.
AFAIK, not any more.  Since that behavior was very unreliable, it has
been removed and you can reliably sort by any multi-valued field in
lucene (with the sort order being defined by the largest value if
there are multiple).

-Yonik
http://lucidimagination.com
