Just one clarification: when you say ICUFilterFactory, am I correct in thinking
it's ICUFoldingFilterFactory?
Thanks,
Rishi.
-Original Message-
From: Tom Burton-West
To: solr-user
Sent: Wed, Feb 25, 2015 4:33 pm
Subject: Re: Basic Multilingual search capability
Hi Rishi,
As
Hi Alex,
>
> Thanks for the suggestions. These steps will definitely help out with our
> use case.
> Thanks for the idea about the lengthFilter to protect our system.
>
> Thanks,
> Rishi.
>
> -Original Message-
> From: Alexandr
: Re: Basic Multilingual search capability
forward to trying out once it's integrated to main.
Thanks,
Rishi.
-Original Message-
From: Trey Grainger
To: solr-user
Sent: Tue, Feb 24, 2015 1:40 am
Subject: Re: Basic Multilingual search capability
Hi Rishi,
I don't generally recommend a language-insensitive approach excep
Given the limited needs, I would probably do something like this:
1) Put a language identifier in the UpdateRequestProcessor chain
during indexing and route out at least known problematic languages,
such as Chinese, Japanese, Arabic into individual fields
2) Put everything else together into one f
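
As a rough illustration of step 1 above, the routing idea can be sketched in standalone Python. This is not Solr's own language identification, which would run inside the UpdateRequestProcessor chain. The field names text_cjk, text_ar and text_general are hypothetical, and the check on Unicode character names is only a crude script heuristic assumed here in place of a real language detector (it cannot, for example, tell Chinese from Japanese kanji).

```python
import unicodedata
from collections import Counter

# Hypothetical field names; a real schema would define its own.
SCRIPT_FIELDS = {
    "CJK": "text_cjk",        # Han ideographs (Chinese, Japanese kanji)
    "HIRAGANA": "text_cjk",   # Japanese kana
    "KATAKANA": "text_cjk",
    "ARABIC": "text_ar",
}
DEFAULT_FIELD = "text_general"  # "everything else" goes into one field


def pick_field(text: str) -> str:
    """Pick a target field from the dominant script of the letters in text."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, whitespace
        # The first word of the Unicode name is usually the script name,
        # e.g. "ARABIC LETTER MEEM" or "CJK UNIFIED IDEOGRAPH-4E2D".
        prefix = unicodedata.name(ch, "").split(" ", 1)[0]
        counts[SCRIPT_FIELDS.get(prefix, DEFAULT_FIELD)] += 1
    if not counts:
        return DEFAULT_FIELD
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for sample in ["hello world", "здравствуйте.pdf", "مرحبا بالعالم", "こんにちは世界"]:
        print(sample, "->", pick_field(sample))
```

A document routed this way would have its text copied into the chosen field at index time, so that each field can carry its own analyzer.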
t if it had capability to tokenize email addresses
> (ex: he...@aol.com; I think StandardTokenizer already does this),
> filenames (здравствуйте.pdf), but maybe we can use filters to accomplish
> that.
> >
> > Thanks,
> > Rishi.
> >
> > -Original Message-
solr-user
Sent: Mon, Feb 23, 2015 11:17 pm
Subject: Re: Basic Multilingual search capability
It isn’t just complicated, it can be impossible.
Do you have content in Chinese or Japanese? Those languages (and some others) do
not separate words with spaces. You cannot even do word search without a
>
> Thanks,
> Rishi.
>
> -Original Message-
> From: Alexandre Rafalovitch
> To: solr-user
> Sent: Mon, Feb 23, 2015 5:49 pm
> Subject: Re: Basic Multilingual search capability
>
>
> Which languages are you expecting to deal with? Multilingual suppor
Subject: Re: Basic Multilingual search capability
Which languages are you expecting to deal with? Multilingual support
is a complex issue. Even if you think you don't need much, it is
usually a lot more complex than expected, especially around relevancy.
Regards,
Alex.
xt documents from our end users,
> which can be in any language (sometimes combination) and we cannot determine
> the language of the incoming text. Language detection at index time is not
> necessary.
>
> Which analyzer is recommended to achieve basic multilingual search capability
>
nd we cannot determine the language
of the incoming text. Language detection at index time is not necessary.
Which analyzer is recommended to achieve basic multilingual search capability
for a use case like this?
I have read a bunch of posts about using a combination StandardTokenizer or
ICUtoke