Re: Internationalization

Erik Hatcher Wed, 17 Jan 2007 00:08:23 -0800

Way to go, Bess! This is great stuff you're sharing.


I have a question though...

On Jan 16, 2007, at 11:48 AM, Bess Sadler wrote:

Currently, we are assigning all fields, no matter what language, to type string, defined as
<fieldtype name="string" class="solr.StrField" sortMissingLast="true"/>
This does string matching very well, but doesn't do any stop words, or stemming, or anything fancy. We are toying with the idea of a custom Tibetan indexer to better break up the Tibetan into discrete words, but for this particular project (because it mostly has to do with proper names, not long passages of text) this hasn't been a problem yet, and the above solution seems to be doing the trick.

Why are you assigning all fields to a "string" type? That indexes each field as-is, with no tokenization at all. How are you using that field from the front-end? I'd think you'd want to copyField everything into a "text" field.
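
To make the copyField idea concrete, here is a rough schema.xml sketch. The field names "title" and "name" are made up for illustration, and the analyzer chain is just one plausible choice, not anything taken from the actual schema in question:

```xml
<!-- Hypothetical schema.xml fragment; field names are illustrative only -->

<!-- Untokenized type: the whole value is indexed as a single term -->
<fieldtype name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- Tokenized type: values are split into terms and lowercased -->
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>

<!-- Source fields stay as strings for exact matching and sorting -->
<field name="title" type="string" indexed="true" stored="true"/>
<field name="name" type="string" indexed="true" stored="true"/>

<!-- Catch-all search field; multiValued because several fields copy into it -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<copyField source="title" dest="text"/>
<copyField source="name" dest="text"/>
```

With something like this, the front-end could query "text" to match individual words while the string fields remain available for exact matches and sorting. The standard tokenizer may not segment Tibetan well, so a custom analyzer could be swapped in at the tokenizer line.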

Elizabeth (Bess) Sadler
Head, Technical and Metadata Services
Digital Scholarship Services
Box 400129
Alderman Library
University of Virginia
Charlottesville, VA 22904


Just two floors down.... what amazing folks we have on this!

        Erik
