about analyzer and index
Lucene has ChineseAnalyzer and CJKAnalyzer, so I can search for Chinese keywords with it. Does Solr have them? If not, how can I add them?

If I use PHP + MySQL to build data.xml, do I then run post.sh data.xml? Is that the only way to index?

I remember that with Lucene 2.0 I must use the same analyzer for indexing and searching. What analyzer does Solr use, and how does it support user-defined analyzers (in case it does not support Chinese)?
Re: about analyzer and index
On Aug 27, 2006, at 3:27 AM, James liu wrote:
> Lucene has ChineseAnalyzer and CJKAnalyzer, so I can search for Chinese
> keywords with it. Does Solr have them? If not, how can I add them?

Those analyzers are not part of the core Solr distribution, but you can add them easily by getting the JAR file from Lucene (it'll be called lucene-analyzers-.jar) in the Lucene binary downloads. You'll then need to adjust your schema.xml to point at the analyzer you wish to use, something like this:

    <analyzer class="org.apache.lucene.analysis.snowball.SnowballAnalyzer"/>

> If I use PHP + MySQL to build data.xml, do I then run post.sh data.xml?
> Is that the only way to index?

No, not at all. Solr works off XML over HTTP, which is trivial to do from PHP and other environments. Check out the wiki here: wiki.apache.org/solr/SolPHP

Erik
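For anyone who wants to see what "XML over HTTP" amounts to without post.sh, here is a minimal sketch in Python rather than PHP. It assumes the update handler lives at http://localhost:8983/solr/update (the location used by the example setup; yours may differ), and the field names `id` and `name` are hypothetical and must match whatever your schema.xml defines. The helper names `build_add_xml` and `post_xml` are made up for this illustration.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Assumed location of the update handler; adjust host, port, and path
# for your own installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(docs):
    """Build a Solr <add> message from a list of {field: value} dicts."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            # quoteattr adds the surrounding quotes; escape handles &, <, >.
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def post_xml(xml, url=SOLR_UPDATE_URL):
    """POST an XML message to Solr and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical rows, standing in for the result of a MySQL query.
    rows = [{"id": "1", "name": "中文 & test"}]
    print(build_add_xml(rows))
    # These two calls need a running Solr instance, so they are left
    # commented out here:
    # post_xml(build_add_xml(rows))
    # post_xml("<commit/>")
```

The `<commit/>` message is what makes newly added documents visible to searches. The same two requests can be sent from PHP or any other environment that can issue an HTTP POST.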
Re: Possible bug in copyField
By looking at what is stored. Has this worked for others?

----- Original Message -----
From: Yonik Seeley <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org; jason rutherglen <[EMAIL PROTECTED]>
Sent: Friday, August 25, 2006 6:35:43 PM
Subject: Re: Possible bug in copyField

On 8/25/06, jason rutherglen <[EMAIL PROTECTED]> wrote:
> When doing a copyField into a text field that is supposed to be stemmed I'm
> not seeing the stemming occur.

How did you determine that stemming didn't occur?

-Yonik
Re: Possible bug in copyField
: By looking at what is stored. Has this worked for others?

the "stored" value of a field is always going to be the pre-analyzed text -- that's why the stored values in your "text" fields still have upper case characters and stop words. what matters is whether or not the "indexed" terms of your "text_stem" fields are really stemmed or not.

I certainly haven't noticed this problem ... using the fields/types you mentioned before, do you have an example of a doc you've indexed, and expected to get from a stemmed query that wasn't actually returned?

-Hoss
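To make the stored-versus-indexed distinction concrete, here is a toy sketch in Python. It is not how Solr or Lucene is implemented: `toy_analyze` is a deliberately crude stand-in for a real stemming analyzer, and `ToyIndex` is a hypothetical class invented for this illustration. The point is only that the stored text is kept verbatim while matching happens against the analyzed terms.

```python
STOP_WORDS = {"the", "a", "an", "is", "are", "of"}


def toy_analyze(text):
    """Lowercase, drop stop words, and strip a trailing 'ing' or 's'.

    A crude stand-in for a real stemming analyzer.
    """
    terms = []
    for token in text.lower().split():
        if token in STOP_WORDS:
            continue
        for suffix in ("ing", "s"):
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                token = token[: -len(suffix)]
                break
        terms.append(token)
    return terms


class ToyIndex:
    def __init__(self):
        self.stored = {}    # doc id -> original text, returned with results
        self.postings = {}  # analyzed term -> set of doc ids, used to match

    def add(self, doc_id, text):
        self.stored[doc_id] = text  # kept exactly as submitted
        for term in toy_analyze(text):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        """Analyze the query the same way, then return the stored text
        of every document that contains all of the query terms."""
        terms = toy_analyze(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.stored[i] for i in sorted(ids)]


if __name__ == "__main__":
    index = ToyIndex()
    index.add(1, "The Dogs are Running")
    print(index.stored[1])          # original text, capitals and stop words intact
    print(sorted(index.postings))   # the analyzed terms actually indexed
    print(index.search("dog"))      # matches through the stemmed term
```

Looking at the stored value here would also suggest that nothing was stemmed, yet a query for "dog" still finds the document. That is why the useful test is a query that should match only through stemming.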