Thank you, Marco. I see debug output that looks like this:

<str name="rawquerystring">title_jpn:2001年</str>
<str name="querystring">title_jpn:2001年</str>
<str name="parsedquery">PhraseQuery(title_jpn:"2001 年")</str>
<str name="parsedquery_toString">title_jpn:"2001 年"</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>

Does this mean the standard query parser does send the raw query string to the Analyzer and (because the query yielded more than one token?) it uses a phrase query? I guess the cause of my problem is somewhere else.

On Mar 17, 2010, at 1:05 AM, Marco Martinez wrote:

> Hello,
>
> You can see what happens (which analyzers are used for this field and what
> the output of the analyzers is) for this search using the analysis page of the
> Solr default web page. I assume you are using the same analyzers and
> tokenizers in indexing and searching for this field in your schema.
>
> Regards,
>
> Marco Martínez Bautista
>
>
> 2010/3/17 Teruhiko Kurosaka <k...@basistech.com>
>
>> It seems that Solr's query parser doesn't pass a single-term query
>> to the Analyzer for the field. For example, if I give it
>> 2001年 (year 2001 in Japanese), the searcher returns 0 hits,
>> but if I enclose it in double quotes, it returns hits.
>> In this experiment, I configured schema.xml so that
>> the field in question uses the morphological Analyzer
>> my company makes, which is capable of splitting 2001年
>> into two tokens, 2001 and 年. I am guessing that this
>> Analyzer is called ONLY IF the term is a phrase.
>> Is my observation correct?
>>
>> If so, is there any configuration parameter that I can tweak
>> to force any query for the text fields to be processed by
>> the Analyzer?
>>
>> One might ask why users won't put a space between 2001 and 年.
>> Well, if they are clearly two separate words, people do that.
>> But 年 works more like a suffix in this case, and in many
>> Japanese speakers' minds, 2001年 seems like one token, so
>> many people won't. (Remember that Japanese speakers don't use
>> spaces in normal writing.) Forcing use of the Analyzer would also
>> be useful for the compound word handling often desirable
>> for languages like German.

----
Teruhiko "Kuro" Kurosaka
RLP + Lucene & Solr = powerful search for global contents