Is there a way to set an attribute in the tokenizer and carry it over to the
document, so that I can search by both the word and this attribute?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Implementing-custom-analyzer-for-multi-language-stemming-tp4150156p4159594.html
Sent from the Solr - User mailing list archive at Nabble.com.
... develop that.
In another project, I am following the same approach to develop an
AutoAnalyzer for Lucene without using Solr. So, let me know if you want
directions on how to do it.
Regards
Ameer
Yes, each token could have a LanguageAttribute on it, just like
ScriptAttributes. I didn't *think* a span would be necessary.
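
As a rough illustration of the per-token language idea above, here is a minimal plain-Java sketch. It deliberately does not use the Lucene attribute API: Token, STEMMERS, and stemAll are hypothetical names invented for this example, and the suffix-stripping rules are toy stand-ins for real stemmers. It only shows how a language tag carried on each token could select the stemmer applied to that token.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Plain-Java sketch, NOT the Lucene API. All names here are hypothetical.
public class LanguageTaggedTokens {

    /** A token's text plus a language code (stand-in for a per-token language attribute). */
    public static final class Token {
        public final String term;
        public final String lang;

        public Token(String term, String lang) {
            this.term = term;
            this.lang = lang;
        }
    }

    // Toy suffix-stripping rules keyed by language code; real stemmers would replace these.
    static final Map<String, UnaryOperator<String>> STEMMERS = new HashMap<>();
    static {
        STEMMERS.put("eng", t -> (t.endsWith("s") && t.length() > 3)
                ? t.substring(0, t.length() - 1) : t);
        STEMMERS.put("deu", t -> (t.endsWith("en") && t.length() > 4)
                ? t.substring(0, t.length() - 2) : t);
    }

    /** Stems each token with the rule registered for its language; unknown languages pass through. */
    public static List<Token> stemAll(List<Token> tokens) {
        List<Token> out = new ArrayList<>();
        for (Token tok : tokens) {
            UnaryOperator<String> stemmer =
                    STEMMERS.getOrDefault(tok.lang, UnaryOperator.identity());
            out.add(new Token(stemmer.apply(tok.term), tok.lang));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> input = new ArrayList<>();
        input.add(new Token("analyzers", "eng"));
        input.add(new Token("sprachen", "deu"));
        input.add(new Token("die", "deu"));
        for (Token t : stemAll(input)) {
            System.out.println(t.term + " [" + t.lang + "]");
        }
    }
}
```

Tokens whose language code has no registered rule are left unchanged. In an actual Lucene TokenFilter the language tag would presumably live in a custom attribute on the token stream rather than in a separate Token class; that wiring is not shown here.
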
I would also add a multivalued "lang" field to the document. Searching
English documents for "die" might look like: "q=die&lang=eng". The "lang"
param could tell the Reques
On 8/5/14, 8:36 AM, Rich Cariens wrote:
Of course this is extremely primitive and basic, but I think it would be
possible to write a CharFilter or TokenFilter that inspects the entire
TokenStream to guess the language(s), perhaps even noting where languages
change. Language and position informat
I've started a GitHub project to try out some cross-lingual analysis ideas
(https://github.com/whateverdood/cross-lingual-search). I haven't played
over there for about 3 months, but plan on restarting work there shortly.
In a nutshell, the interesting component
("SimplePolyGlotStemmingTokenFilter
On 7/30/14, 10:47 AM, Eugene wrote:
Hello, fellow Solr and Lucene users and developers!
In our project we receive text from users in different languages. We
detect language automatically and use Google Translate APIs a lot (so
having arbitrary number of languages in our system doesn't
From: "Eugene"
Sent: Wednesday, July 30, 2014 1:48 PM
To: solr-user@lucene.apache.org
Subject: Implementing custom analyzer for multi-language stemming
Hello, fellow Solr and Lucene users and developers!
In our project we receive text from users in different languages. We
detect the language automatically and use the Google Translate APIs a lot (so
having an arbitrary number of languages in our system doesn't concern us).
However, we need to be able