Re: how to Index and Search non-English Text in Solr

2011-06-10 Thread Erick Erickson
Well, no. Specifying both indexed and stored as "false" is essentially a no-op; you'd never find anything! But even with indexed="true", this solution has problems: it essentially uses a single field to store text from different languages. The problem is that tokenization, stemming etc. behaves

Re: how to Index and Search non-English Text in Solr

2011-06-09 Thread Mohammad Shariq
Thanks Erick for your help. I have another silly question. Suppose I created multiple fieldTypes, e.g. news_English, news_Chinese, news_Japanese etc. After creating these fields, can I copy all of them to the copyField "defaultquery" like below: and my "defaultquery" looks like: Is this right

Re: how to Index and Search non-English Text in Solr

2011-06-09 Thread Erick Erickson
No, you'd have to create multiple fieldTypes, one for each language.

Best
Erick

On Thu, Jun 9, 2011 at 5:26 AM, Mohammad Shariq wrote:
> Can I specify multiple languages in the filter tag in schema.xml? Like below
>
> words="stopwords.txt" enablePositionIncrements="true"/
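As a rough illustration of "one fieldType per language", a schema.xml fragment might look like the sketch below. The type names (text_en, text_cjk) and the choice of filters are assumptions made for illustration, not taken from the thread; the field names reuse those mentioned in the thread, and solr.CJKTokenizerFactory is only available in some Solr versions, so check it against your release.

```xml
<!-- Sketch only: one fieldType per language. Type names are hypothetical. -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- English stop words; the file name matches the one quoted in the thread -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <!-- English stemming; this filter would be wrong for Chinese or Japanese -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Bigram-based tokenizer for Chinese/Japanese/Korean text (version dependent) -->
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- One field per language, each bound to its own analysis chain -->
<field name="news_English" type="text_en" indexed="true" stored="true"/>
<field name="news_Chinese" type="text_cjk" indexed="true" stored="true"/>
```

Each language gets its own analysis chain this way, so stop words and stemming rules for one language are never applied to text in another.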

Re: how to Index and Search non-English Text in Solr

2011-06-09 Thread Mohammad Shariq
Can I specify multiple languages in the filter tag in schema.xml? Like below

On 8 June 2011 18:47, Erick Erickson wrote:
> This page is a handy reference for individual languages...
> http://wiki.apache.org/solr/LanguageAnalysis
>
> But the usual approa

Re: how to Index and Search non-English Text in Solr

2011-06-08 Thread Erick Erickson
This page is a handy reference for individual languages... http://wiki.apache.org/solr/LanguageAnalysis But the usual approach, especially for Chinese/Japanese/Korean (CJK), is to index the content in different fields with language-specific analyzers, then spread your search across the language-spec
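One possible way to spread a search across per-language fields is a dismax request handler in solrconfig.xml, sketched below. The handler name /multilang is hypothetical, and the field names are the ones used later in the thread; this is an assumption about how the setup could look, not the configuration Erick had in mind.

```xml
<!-- Sketch only: a handler that queries several language-specific fields at once. -->
<requestHandler name="/multilang" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- dismax runs the user's query against every field listed in qf,
         analyzing it with each field's own analyzer -->
    <str name="defType">dismax</str>
    <str name="qf">news_English news_Chinese news_Japanese</str>
  </lst>
</requestHandler>
```

With a handler like this, the query text is analyzed separately for each field, so the client does not need to know which language a document was indexed in.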