Thanks Erick,

The null problem was introduced when I copied the example below. I now have the nulls excluded using sortMissingLast="true", but in 1.5 with the suggested config below I'm still not seeing the desired behavior.

It seems to me that the default behavior of the Java Collator using the ROOT locale (PRIMARY or SECONDARY don't seem to matter in this example) is as follows:

empty string
symbols (by this I mean $, &, *, etc.)
numerics
alpha
leading spaces

My desire is:
alpha
numeric
symbols
leading spaces
empty string

I'm going to try a custom RuleBasedCollator to see if I can make this happen, as Shalin suggested.
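A rough sketch of what such a custom RuleBasedCollator could look like in plain Java, outside Solr. The rule string is built from scratch and only covers a-z, 0-9, the symbols $ & * and the space. The class and method names (CustomSortSketch, buildRules, titleOrder, sortTitles) are made up for illustration. A collator compares the empty string ahead of non-empty text, so the sketch pushes empty strings to the end with a small comparator wrapper. That wrapper is an assumption about how one might handle the case, not something the rules themselves express.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CustomSortSketch {

    // Hypothetical rule set: letters < digits < a few symbols < space.
    // Characters not listed here are left unmapped by the collator.
    static String buildRules() {
        StringBuilder rules = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            // "< a, A" : new primary position, case is only a tertiary difference
            rules.append("< ").append(c).append(", ")
                 .append(Character.toUpperCase(c)).append(' ');
        }
        for (char c = '0'; c <= '9'; c++) {
            rules.append("< ").append(c).append(' ');
        }
        // Punctuation and whitespace must be single-quoted in the rule syntax.
        for (char c : new char[] {'$', '&', '*'}) {
            rules.append("< '").append(c).append("' ");
        }
        rules.append("< ' '");
        return rules.toString();
    }

    // Comparator that uses the custom collator but sorts empty strings last.
    static Comparator<String> titleOrder() {
        final RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator(buildRules());
        } catch (ParseException e) {
            throw new IllegalStateException("bad collation rules", e);
        }
        return (a, b) -> {
            if (a.isEmpty() || b.isEmpty()) {
                // false sorts before true, so the empty string goes to the end
                return Boolean.compare(a.isEmpty(), b.isEmpty());
            }
            return collator.compare(a, b);
        };
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sortTitles(List<String> titles) {
        List<String> copy = new ArrayList<>(titles);
        copy.sort(titleOrder());
        return copy;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
            "", " leading", "$100 deal", "42 ways", "banana", "Apple");
        for (String t : sortTitles(titles)) {
            System.out.println("[" + t + "]");
        }
    }
}
```

Inside Solr the empty-string wrapper would not apply. There, only the rule string would carry over, and missing values would still need something like sortMissingLast.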

thanks
Joel



On Dec 31, 2009, at 11:11 AM, Erick Erickson wrote:

Have you tried setting sortMissingLast="true" in your schema.xml? Something
like...

<fieldType name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>

or perhaps in your individual field definition instead. The schema.xml
examples have additional information that you really should scan at
least....

HTH
Erick

On Thu, Dec 31, 2009 at 8:53 AM, Joel Nylund <jnyl...@yahoo.com> wrote:

Hi,

After some further investigation, it turns out that null fields were
sorting first, so if the title was null it was coming up first. This is true
even with 1.5 and collatedROOT. (I tried on last night's build.)

So let me change my question: how do I make items with null values sort
last?

thanks
Joel


On Dec 30, 2009, at 3:11 PM, Joel Nylund wrote:

Hi, so this is only available in 1.5?

I tried in 1.4 and got:

org.apache.solr.common.SolrException: Error loading class
'solr.CollationKeyFilterFactory'

Is there a way to do this in 1.4?

The link Shalin sent is a 1.5 link I think.

thanks
Joel

On Dec 25, 2009, at 10:52 PM, Robert Muir wrote:

Hello, as Shalin said, you might want to try CollationKeyFilterFactory.

Below is an example (using the multilingual root locale), where the
spaces will sort after the letters and numbers as you mentioned, but
it will still not be case-sensitive. This is because strength is
'secondary'.

But are you really sure you want the spaces sorted after the letters and
numbers? Or instead do you just want them ignored for sorting? If
this is the case, then try 'primary', so that spaces, punctuation,
accents and things like that in addition to case are ignored in the
sort: for example "Test-1234" and "   test1234" sort the same with
primary, but not with secondary (the one with leading spaces will sort
last).
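One way to check the primary versus secondary difference outside Solr is to call java.text.Collator directly with the root locale. This is only a minimal sketch: the class name StrengthSketch and the helper compareAt are made up for illustration, and it assumes the plain JDK collator behaves the same way as the filter configured with language="".

```java
import java.text.Collator;
import java.util.Locale;

class StrengthSketch {

    // Compare two strings with the root-locale collator at the given strength.
    static int compareAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.ROOT);
        collator.setStrength(strength);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String a = "Test-1234";
        String b = "   test1234";
        // 0 means the collator treats the two strings as equal at that strength.
        System.out.println("primary:   " + compareAt(Collator.PRIMARY, a, b));
        System.out.println("secondary: " + compareAt(Collator.SECONDARY, a, b));
    }
}
```

A result of zero at primary strength and non-zero at secondary strength would match the behavior described above.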

If all else fails, you can write custom rules for it too, as Shalin
mentioned.

<fieldType name="collatedROOT" class="solr.TextField">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.CollationKeyFilterFactory"
    language=""
    strength="secondary"
/>
</analyzer>
</fieldType>

On Fri, Dec 25, 2009 at 5:37 AM, Shalin Shekhar Mangar
<shalinman...@gmail.com> wrote:


On Thu, Dec 24, 2009 at 11:51 PM, Joel Nylund <jnyl...@yahoo.com>
wrote:

Update: I tried changing to datatype string, and it sorts the numerics
better, but the other sorts are not as good.

Is there a way to control sorting for special chars? For example, I
want blanks to sort after letters and numbers.


In the general case, CollationKeyFilterFactory will do the trick. You
could create a custom rule set which sorts spaces after letters and
numbers. See
http://wiki.apache.org/solr/UnicodeCollation


using alphaOnlySort - sorts nicely for alpha, but numbers don't work
string - sorts nicely for numbers and letters, but special chars like
blanks show up first in the list


alphaOnlySort has a PatternReplaceFilterFactory which removes all
characters except a-z. This is the reason behind those weird results.
You could try removing that filter and see if that's what you need.

--
Regards,
Shalin Shekhar Mangar.




--
Robert Muir
rcm...@gmail.com




