As I mentioned before, you could use the string type if you just want
the title as it is. Or you can use a custom type to normalize the
indexed value, as long as you end up with a single token.

So, if you want to strip a leading A/An/The, you can use
KeywordTokenizerFactory, combined with whatever post-processing you
need. I would suggest LowerCaseFilterFactory and perhaps
PatternReplaceFilterFactory with a regex that strips off those leading
articles. You may need to iterate a couple of times on that specific
chain.
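
As a starting point, something along these lines might work. This is
only a sketch, not tested against your data: the type name
"title_sort" is made up, the regex is my first guess at the articles
you listed, and the s_title field is the one from your message,
switched from string to the new type.

```xml
<!-- Sketch only: "title_sort" is a hypothetical type name -->
<fieldType name="title_sort" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- Keep the whole title as one token so the field stays sortable -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase first, so the regex below only has to match lowercase -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop stray leading/trailing whitespace -->
    <filter class="solr.TrimFilterFactory"/>
    <!-- Strip one leading "a ", "an " or "the " (assumed pattern, adjust as needed) -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(a|an|the)\s+" replacement="" replace="first"/>
  </analyzer>
</fieldType>

<!-- Same field as in your message, now using the custom type -->
<field name="s_title" type="title_sort" indexed="true" stored="false"
       multiValued="false"/>
<copyField source="title" dest="s_title"/>
```

The order matters here: lowercasing before the regex keeps the
pattern simple. Once you settle on a chain, you will need to reindex
for s_title to pick it up.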

The good news is that you can define a couple of types with different
filters or filter order, reload the core (from the Cores screen of
the Web Admin UI), and run some of your sample titles through those
definitions in the Analysis screen, without having to reindex.

Regards,
  Alex.

----
Sign up for my Solr resources newsletter at http://www.solr-start.com/

On 17 February 2015 at 22:36, Simon Cheng <simonwhch...@gmail.com> wrote:
> Hi Alex,
>
> It's okay after I added in a new field "s_title" in the schema and
> re-indexed.
>
>    <field name="s_title" type="string" indexed="true" stored="false"
> multiValued="false"/>
>    <copyField source="title" dest="s_title"/>
>
> But how can I ignore the articles ("A", "An", "The") in the sorting. As you
> can see from the below example :
