Hi, I am prototyping language search using Solr 1.3. I have 3 fields in the schema: id, content, and language. I am indexing 3 PDF files; the languages are Foroyo, Chinese, and Japanese. I use xpdf to convert the content of each PDF to text and push the text to Solr in the content field. Which analyzer do I need to use for these languages? Using the default text analyzer and posting this content to Solr, I am not getting any results. Does Solr support stemming for these languages?

Regards,
Sujatha
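For the Chinese and Japanese content, one option is a bigram-based field type. The sketch below assumes the solr.CJKTokenizerFactory bundled with Solr 1.3; it indexes overlapping character pairs rather than stems, since CJK languages are not stemmed in the Snowball sense:

    <!-- schema.xml sketch: a CJK-friendly field type (assumes
         solr.CJKTokenizerFactory is available, as in Solr 1.3) -->
    <fieldType name="text_cjk" class="solr.TextField">
      <analyzer>
        <!-- emits overlapping character bigrams (C1C2, C2C3, ...),
             so queries can match without word segmentation -->
        <tokenizer class="solr.CJKTokenizerFactory"/>
      </analyzer>
    </fieldType>

    <field name="content_cjk" type="text_cjk" indexed="true" stored="true"/>

Because the same analyzer runs at index and query time, searches against content_cjk are bigrammed the same way as the indexed text.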
On 12/18/08, Feak, Todd <todd.f...@smss.sony.com> wrote:
>
> Don't forget to consider scaling concerns (if there are any). There are
> strong differences in the number of searches we receive for each
> language. We chose to create a separate schema and config per language
> so that we can throw servers at a particular language (or set of
> languages) if we needed to. We see two orders of magnitude difference
> between our most popular language and our least popular.
>
> -Todd Feak
>
> -----Original Message-----
> From: Julian Davchev [mailto:j...@drun.net]
> Sent: Wednesday, December 17, 2008 11:31 AM
> To: solr-user@lucene.apache.org
> Subject: looking for multilanguage indexing best practice/hint
>
> Hi,
> From my study of Solr and Lucene so far, it seems that I will use a
> single schema; at least I don't see a scenario where I'd need more than
> that. So the question is how to approach multilanguage indexing and
> multilanguage searching. Will it really make sense to just search a
> word, or should I supply a lang param to the search as well?
>
> I see there are those filters and have already been advised on them,
> but I guess the question is more one of best practice:
> solr.ISOLatin1AccentFilterFactory, solr.SnowballPorterFilterFactory
>
> So the solution I see is, using copyField, to have the same field in
> different languages, or something using a distinct filter.
> Cheers
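A minimal sketch of the copyField approach Julian describes, using the two filter factories he names; the per-language fields content_en and content_de are hypothetical examples, not fields from the original schemas:

    <!-- schema.xml sketch: one source field copied into per-language
         fields, each analyzed with its own Snowball stemmer -->
    <fieldType name="text_en" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English"/>
      </analyzer>
    </fieldType>
    <fieldType name="text_de" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German"/>
      </analyzer>
    </fieldType>

    <!-- documents are posted only to "content"; copyField fans the
         text out into every per-language field at index time -->
    <field name="content" type="string" indexed="false" stored="true"/>
    <field name="content_en" type="text_en" indexed="true" stored="false"/>
    <field name="content_de" type="text_de" indexed="true" stored="false"/>

    <copyField source="content" dest="content_en"/>
    <copyField source="content" dest="content_de"/>

A lang parameter on the search side then simply selects which field to query (q=content_en:word versus q=content_de:wort), while Todd's alternative of one schema and config per language trades this single-schema convenience for the ability to scale each language's servers independently.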