Hi,
We are indexing our objects into Solr and let users sort by different
fields. The sort field type is defined in schema.xml as follows:

    <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory" />
      </analyzer>
    </fieldType>

For a field of type "lowercase", if we have the field values APPLES,
ZUCCHINI, banana, BANANA, apples, zucchini and sort in ascending order,
Solr returns the results in the following order:
APPLES, apples, BANANA, banana, ZUCCHINI, zucchini.

However, we have another tool that displays the same information from a
database in the following sorted order:
apples, APPLES, banana, BANANA, zucchini, ZUCCHINI

The database tool uses the SQL query "select column1 from table1 order
by UPPER(column1) asc".

I could either change the SQL query to "select column1 from table1 order by
LOWER(column1) asc" or change the Solr field type to use
solr.UpperCaseFilterFactory instead of solr.LowerCaseFilterFactory, so that
both applications sort the same way.
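
As a rough illustration of the options above, here is a minimal Python sketch (not Solr or SQL itself). It assumes both systems compare only the case-folded key, in which case "APPLES" and "apples" tie and their relative order may depend on the tie-breaking rule rather than on whether the key is upper or lower case. The names by_lower, by_upper, upper_first and lower_first are made up for this example.

```python
values = ["APPLES", "ZUCCHINI", "banana", "BANANA", "apples", "zucchini"]

# Case-folded sort keys: "APPLES" and "apples" map to the same key,
# so only the tie-breaking rule decides which of the two comes first.
by_lower = sorted(values, key=str.lower)
by_upper = sorted(values, key=str.upper)

# Python's sort is stable, so tied values keep their input order and
# both keys give the same result here.
print(by_lower)
print(by_upper)

# An explicit secondary key makes the order deterministic.
# Uppercase variant first (raw string as tie-breaker):
upper_first = sorted(values, key=lambda s: (s.lower(), s))
# Lowercase variant first (case-swapped string as tie-breaker):
lower_first = sorted(values, key=lambda s: (s.lower(), s.swapcase()))

print(upper_first)  # ['APPLES', 'apples', 'BANANA', 'banana', 'ZUCCHINI', 'zucchini']
print(lower_first)  # ['apples', 'APPLES', 'banana', 'BANANA', 'zucchini', 'ZUCCHINI']
```

If this assumption holds for the real systems, switching between UPPER and LOWER alone might not be enough to align them, and a secondary sort on the raw value may be needed on both sides.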

But in general, when sorting a collection of string values, what is the
correct sort order? In ascending order, should the uppercase value
("APPLE") come before the lowercase value ("apple"), or the other way
around?
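
For comparison, a small Python sketch of what a plain code point (ordinal) comparison does with the same values. This is only one possible ordering, shown as a reference point; locale-aware collations often order mixed-case strings differently.

```python
values = ["APPLES", "ZUCCHINI", "banana", "BANANA", "apples", "zucchini"]

# Default string comparison in Python is by Unicode code point.
# 'A'..'Z' (65..90) all sort before 'a'..'z' (97..122), so every
# uppercase value precedes every lowercase value.
ordinal = sorted(values)
print(ordinal)  # ['APPLES', 'BANANA', 'ZUCCHINI', 'apples', 'banana', 'zucchini']

# Consequently "APPLE" sorts before "apple", and even "ZUCCHINI"
# sorts before "apples".
print("APPLE" < "apple")      # True
print("ZUCCHINI" < "apples")  # True
```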

Thanks,
Vasu
