On Tue, Aug 4, 2015, at 05:55 PM, Steven White wrote:
> Hi Everyone,
>
> I see Solr comes pre-configured with text analyzers for a list of
> supported languages, e.g.: "text_ar", "text_bg", "text_ca", "text_cjk",
> "text_ckb", "text_cz", etc.
>
> My questions are:
>
> 1) How well optimized are those languages for general usage? This is
> something I need help with because, other than English, I cannot judge
> how well the current pre-configured settings work for best quality. Yes,
> "quality" means different things for each customer, but still I'm
> curious to know if the out-of-the-box setting is optimal.
>
> 2) Is there a landing link that talks about each of the supported
> languages, what is available and how to tune that fieldType for the
> said language?
>
> 3) What do you do when a language I need is not on the list? The obvious
> answer is to write my own plug-in "fieldType" (or even customize one off
> an existing fieldType), but short of that, is there a "general" fieldType
> that can be used? Even if it means this fieldType will function as if it
> is SQL's LIKE feature.

I'm not aware of such a page. It would be a wonderful thing, though.

For unsupported languages, you can just use text_general, which is intended to be a catch-all field type. Or you can craft your own from the components already available; for example, the stopword filter and similar components are likely to work okay on most languages.

What unsupported languages are you concerned with?

Upayavira
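
P.S. A minimal sketch of what crafting your own field type from the available components could look like in the schema. The field type name (text_xx) and the stopword file (lang/stopwords_xx.txt) are hypothetical placeholders you would supply yourself. The tokenizer and filter classes are stock Solr components, but this particular combination is only an illustration, not a tuned configuration for any language:

```xml
<!-- Hypothetical field type: the name and the stopword file are placeholders. -->
<fieldType name="text_xx" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Unicode-aware word splitting -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Normalize case so matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop common words listed in a stopword file you provide -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_xx.txt"/>
  </analyzer>
</fieldType>
```

There is no stemmer in this chain, so matching is on whole lowercased words only. That is roughly the level of behaviour to expect from a generic type like text_general.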