First, sorting completely overrides scoring. So if you specify a sort,
the relevance score is essentially ignored for ordering. If you specify
more than one sort, they are applied in order. That is, any ties in the
first sort parameter are broken by the second sort parameter. If all sort
parameters specified tie, the internal document ID is used to break the
tie. This last step is just to make sure the ordering is repeatable. You
can even use score as a sort criterion, say as the 2nd or 3rd one.
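
To make the tie-breaking concrete, here is a small Python sketch that only
imitates the ordering rules described above. It is not Lucene code. The
field name upload_ts, the sample documents and the helper sort_docs are
made up for illustration. In Solr terms it corresponds roughly to a request
with sort=upload_dt desc, score desc.

```python
# Illustration only: mimics "first sort, then second sort, then doc ID".
# DOCS, upload_ts and sort_docs are hypothetical names, not Solr/Lucene APIs.
DOCS = [
    {"doc_id": 0, "upload_ts": 100, "score": 0.5},
    {"doc_id": 1, "upload_ts": 200, "score": 0.9},
    {"doc_id": 2, "upload_ts": 200, "score": 0.9},
    {"doc_id": 3, "upload_ts": 200, "score": 1.5},
    {"doc_id": 4, "upload_ts": 100, "score": 0.7},
]


def sort_docs(docs):
    """Order by upload_ts desc, then score desc, then doc_id asc."""
    return sorted(
        docs,
        key=lambda d: (-d["upload_ts"], -d["score"], d["doc_id"]),
    )


if __name__ == "__main__":
    for doc in sort_docs(DOCS):
        print(doc)
```

Documents 1 and 2 tie on both the timestamp and the score, so only the
document ID decides their relative order, which is what keeps the result
repeatable.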

Lucene (where the sorting happens) assembles a list of all the unique
*values* for a sort field and sorts the result set by comparing to that
list. It doesn't sort all the documents per se.
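
A rough way to picture this, as a simplified Python sketch and not Lucene's
actual implementation: build the sorted list of unique values once, give
each document the position (ordinal) of its value in that list, and then
order only the matching documents by comparing those ordinals. The function
name sort_by_ordinal and the sample data are assumptions made for the
example.

```python
# Simplified model of sorting via a list of unique field values.
# sort_by_ordinal is a hypothetical helper, not part of Lucene or Solr.
def sort_by_ordinal(field_values, matching_doc_ids, descending=False):
    """field_values[i] is the sort-field value of the document with ID i."""
    # One sorted list of the unique values across the whole index.
    unique_values = sorted(set(field_values))
    ordinal_of = {value: i for i, value in enumerate(unique_values)}
    # Per-document ordinal: an integer position into unique_values.
    doc_ordinals = [ordinal_of[value] for value in field_values]
    sign = -1 if descending else 1
    # Only the matching documents are ordered; ties fall back to doc ID.
    return sorted(
        matching_doc_ids,
        key=lambda doc_id: (sign * doc_ordinals[doc_id], doc_id),
    )


if __name__ == "__main__":
    values = ["b", "a", "c", "a", "b"]
    print(sort_by_ordinal(values, [0, 1, 2, 4]))
    print(sort_by_ordinal(values, [0, 1, 2, 4], descending=True))
```

The point of the sketch is that the cost of the value list grows with the
number of distinct values in the field, not with the number of hits.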

About the ordering: I think you've got it pretty much correct, but I have
to ask whether this is a curiosity question or whether there's some
behavior you want to see.

And a note about sorting on timestamps: you'll do yourself a favor if
you use the coarsest time granularity you can. As I mentioned above,
Lucene assembles a list of all unique values for a sort field. That list
can be much smaller if you can round to the hour (or day, or ...). Of
course, for a small index this won't matter much, but for a larger one it
can. What you do here depends on the use case you're satisfying, of course.
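
As a rough illustration of how much rounding can shrink that list, the
Python sketch below counts distinct values for some synthetic timestamps
at full, hour and day granularity. The data (one upload every 37 seconds)
and the helper names round_down and count_unique are invented for the
example. In a real setup the rounding would be applied to the field value
before it is indexed.

```python
from datetime import datetime, timedelta


def round_down(ts, granularity):
    """Truncate a datetime to the start of its hour or day."""
    if granularity == "second":
        return ts.replace(microsecond=0)
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")


def count_unique(timestamps, granularity):
    """Number of distinct values a sort field would hold after rounding."""
    return len({round_down(ts, granularity) for ts in timestamps})


# Synthetic data: 10,000 uploads, one every 37 seconds (hypothetical).
START = datetime(2012, 2, 2, 0, 0, 0)
TIMESTAMPS = [START + timedelta(seconds=37 * i) for i in range(10_000)]

if __name__ == "__main__":
    for granularity in ("second", "hour", "day"):
        print(granularity, count_unique(TIMESTAMPS, granularity))
```

The trade-off is that documents uploaded within the same hour (or day)
then tie on the timestamp, so their relative order falls to the next sort
criterion or to the internal document ID.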


Best
Erick

On Thu, Feb 2, 2012 at 3:09 AM, tiuser123 <tiuser1...@gmail.com> wrote:
> Hello new user here,
>
> Would just like to clarify the behavior of the solr/lucene sort param.
>
>
> In this post:
> http://lucene.472066.n3.nabble.com/Lucene-sort-performance-roots-tp3102493p3104294.html
> I somehow got the impression that solr would do the sort only on the top
> ranking documents taken from a priority queue (is number of documents in the
> priority queue is based on a config parameter? can someone help point where
> is this parameter in the solrconfig.xml?)
>
> However in this post:
> http://lucene.472066.n3.nabble.com/Boosting-for-most-recent-documents-tp499286p499296.html
> I got the impression that sort is actually done on all the documents. (may
> be wrong)
>
> What I'm trying to do is actually get the most recently uploaded documents
> (defined by a custom timestamp field).
> actual query I'm thinking of is to have a q=title:* sort=upload_dt desc
>
> If #1 behavior is correct then I can't use the sort parameter as it won't be
> able to sort the whole result set (Though what if all the documents have the
> same scores? would all of the result set be included in the priority
> queue?). I'm guessing I can boost the document using the date instead.
> Though if #2 is correct I guess I can use the sort parameter (though may
> have memory issues if whole result set is huge)
>
> Forgive me as I am more familiar with RDBs so I'm actually more familiar
> with sorting on the whole result set using order by.
>
> Side note: is there a documentation/quick note or someone who can explain
> the order solr/lucene executes the query params
> ie.
> 1 applies filter query
> 2 applies query
> 3 applies sort
> something like this?
> like in oracle where the "where" clause is applied first before "connect
> by".
>
> Thanks.
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-sort-param-behavior-clarification-tp3709248p3709248.html
> Sent from the Solr - User mailing list archive at Nabble.com.
