Thanks Grant. The requirement from the user end is to search only within a particular language, not across languages.

Also, going forward we will be adding more languages, so if I have a separate field for each language, we would need to change the schema every time, and that will not scale well. So there are two options: use dynamic fields or use multiple cores. Please advise which is better in terms of scaling and optimum use of existing resources (the available RAM is about 4 GB, shared across several Solr instances). If we use multiple cores, will search speed degrade? Any pointers would be helpful.
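For the dynamic-field option, this is a minimal schema.xml sketch of what I have in mind, assuming the CJKTokenizerFactory that ships with Solr 1.3 for the Chinese/Japanese fields; the field names and the text_cjk type name are my own illustrations, not from an existing schema:

    <!-- In <types>: a bigram analyzer for Chinese/Japanese text -->
    <fieldType name="text_cjk" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.CJKTokenizerFactory"/>
      </analyzer>
    </fieldType>

    <!-- In <fields>: fixed fields plus one dynamic field per language suffix -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="language" type="string" indexed="true" stored="true"/>
    <dynamicField name="*_zh" type="text_cjk" indexed="true" stored="true"/>
    <dynamicField name="*_ja" type="text_cjk" indexed="true" stored="true"/>
    <dynamicField name="*_txt" type="text" indexed="true" stored="true"/>

Each document would put its text into the matching field (content_zh, content_ja, content_txt, ...), and the application would query only that field. One caveat I can see: a language that needs a genuinely new analyzer still requires a new fieldType and a new dynamicField entry, so this reduces schema changes rather than eliminating them.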
Regards,
Sujatha

On 12/19/08, Grant Ingersoll <gsing...@apache.org> wrote:
>
> On Dec 18, 2008, at 6:25 AM, Sujatha Arun wrote:
>
>> Hi,
>> I am prototyping language search using Solr 1.3. I have 3 fields in the
>> schema: id, content, and language.
>>
>> I am indexing 3 PDF files; the languages are Foroyo, Chinese, and
>> Japanese.
>>
>> I use xpdf to convert the content of each PDF to text and push the text
>> to Solr in the content field.
>>
>> What analyzer do I need to use for the above?
>>
>> Using the default text analyzer and posting this content to Solr, I am
>> not getting any results.
>>
>> Does Solr support stemming for the above languages?
>
> I'm not familiar with Foroyo, but there should be tokenizers/analysis
> available for Chinese and Japanese. Are you putting all three languages
> into the same field? If that is the case, you will need some type of
> language detection piece that can choose the correct analyzer.
>
> How are your users searching? That is, do you know the language they want
> to search in? If so, then you can have a field for each language.
>
> -Grant
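P.S. To make the field-per-language idea concrete, this is a hedged sketch of the update XML I would post for one Japanese PDF (field names assumed to match the schema sketch above; the text is a placeholder):

    <add>
      <doc>
        <field name="id">ja_doc_1</field>
        <field name="language">ja</field>
        <!-- text extracted from the PDF with xpdf; indexing it in the
             language-specific field runs the matching analyzer -->
        <field name="content_ja">... extracted Japanese text ...</field>
      </doc>
    </add>

A search restricted to that field, e.g. q=content_ja:TERM, would then never match documents in the other languages.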