Hi Steve,

This page may be useful: <https://cwiki.apache.org/confluence/display/solr/Language+Analysis#LanguageAnalysis-Language-SpecificFactories>

In most cases the configurations described there are the only OOTB alternative, so optimality isn’t discussed. I think the path most people take is to try those out, iterate with users who can provide feedback about quality, and then, if necessary, investigate alternative solutions, including commercial ones.

Steve
www.lucidworks.com

> On Aug 4, 2015, at 12:55 PM, Steven White <swhite4...@gmail.com> wrote:
>
> Hi Everyone,
>
> I see Solr comes pre-configured with text analyzers for a list of supported
> languages, e.g.: "text_ar", "text_bg", "text_ca", "text_cjk", "text_ckb",
> "text_cz", etc.
>
> My questions are:
>
> 1) How well optimized are those languages for general usage? This is
> something I need help with because, other than English, I cannot judge how
> well the current pre-configured setting works for best quality. Yes,
> "quality" means a different thing for each customer, but still I'm curious to
> know if the out-of-the-box setting is optimal.
>
> 2) Is there a landing link that talks about each of the
> supported languages, what is available, and how to tune that fieldType for
> the said language?
>
> 3) What do you do when a language I need is not on the list? The obvious
> answer is to write my own plug-in "fieldType" (or even customize one off
> an existing fieldType), but short of that, is there a "general" fieldType that
> can be used? Even if it means this fieldType will function as if it is
> SQL's LIKE feature.
>
> Thanks
>
> Steve