Re: Basic Multilingual search capability

2015-02-26 Thread Rishi Easwaran
. Just one clarification: when you say ICUFilterFactory, am I correct in thinking it's ICUFoldingFilterFactory? Thanks, Rishi. -Original Message- From: Tom Burton-West To: solr-user Sent: Wed, Feb 25, 2015 4:33 pm Subject: Re: Basic Multilingual search capability Hi Rishi, As

Re: Basic Multilingual search capability

2015-02-25 Thread Tom Burton-West
Hi Rishi, As others have indicated, multilingual search is very difficult to do well. At HathiTrust we've been using the ICUTokenizer and ICUFilterFactory to deal with having materials in 400 languages. We also added the CJKBigramFilter to get better precision on CJK queries. We don'
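
To illustrate the bigram idea mentioned in the message above, here is a minimal Python sketch of overlapping character bigrams. It is only an illustration of the concept, not Lucene's CJKBigramFilter; the function name `cjk_bigrams` is hypothetical, and the sketch assumes the input is already a run of CJK characters.

```python
def cjk_bigrams(text: str) -> list[str]:
    """Return overlapping two-character tokens for a run of text.

    Illustration only: a real analyzer would first split by script and
    would leave non-CJK tokens untouched.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # A single character (or nothing) cannot form a bigram.
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


print(cjk_bigrams("東京都庁"))  # ['東京', '京都', '都庁']
```

Because each token spans two characters, a query has to match adjacent character pairs rather than any single character, which is one plausible reason bigrams help precision compared with single-character tokens.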

Re: Basic Multilingual search capability

2015-02-25 Thread Rishi Easwaran
: Re: Basic Multilingual search capability Given the limited needs, I would probably do something like this: 1) Put a language identifier in the UpdateRequestProcessor chain during indexing and route out at least known problematic languages, such as Chinese, Japanese, Arabic into individual fields

Re: Basic Multilingual search capability

2015-02-25 Thread Rishi Easwaran
forward to trying it out once it's integrated into main. Thanks, Rishi. -Original Message- From: Trey Grainger To: solr-user Sent: Tue, Feb 24, 2015 1:40 am Subject: Re: Basic Multilingual search capability Hi Rishi, I don't generally recommend a language-insensitive approach excep

Re: Basic Multilingual search capability

2015-02-24 Thread Alexandre Rafalovitch
Given the limited needs, I would probably do something like this: 1) Put a language identifier in the UpdateRequestProcessor chain during indexing and route out at least known problematic languages, such as Chinese, Japanese, Arabic into individual fields 2) Put everything else together into one f
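
The routing step described in the message above could look roughly like the following Python sketch. In Solr this logic would live in the update processor chain, so the code only illustrates the decision itself. The field names (`text_zh`, `text_general`), the language set, and the function `target_field` are all assumptions made for the example.

```python
# Assumption: languages treated as "problematic" and given their own field.
PROBLEM_LANGS = {"zh", "ja", "ar"}


def target_field(lang_code, base="text"):
    """Pick the field a document's text should be indexed into.

    lang_code is whatever the language identifier returned, e.g. "zh-cn".
    Unknown or missing codes fall through to the shared catch-all field.
    """
    lang = (lang_code or "").lower().split("-")[0]
    if lang in PROBLEM_LANGS:
        return f"{base}_{lang}"      # hypothetical per-language field
    return f"{base}_general"         # hypothetical catch-all field


print(target_field("zh-cn"))  # text_zh
print(target_field("fr"))     # text_general
```

Keeping the fallback explicit means a failed or low-confidence detection still lands the document in a searchable field.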

Re: Basic Multilingual search capability

2015-02-23 Thread Trey Grainger
n use index-time language detection to map to the appropriate fields/analyzers if you are otherwise unaware of the languages of the content from your application layer. The third option requires custom code (included in the large Multilingual Search chapter of Solr in Action <http://solrinaction.com

Re: Basic Multilingual search capability

2015-02-23 Thread Rishi Easwaran
solr-user Sent: Mon, Feb 23, 2015 11:17 pm Subject: Re: Basic Multilingual search capability It isn’t just complicated, it can be impossible. Do you have content in Chinese or Japanese? Those languages (and some others) do not separate words with spaces. You cannot even do word search without a

Re: Basic Multilingual search capability

2015-02-23 Thread Walter Underwood
> > Thanks, > Rishi. > > -Original Message- > From: Alexandre Rafalovitch > To: solr-user > Sent: Mon, Feb 23, 2015 5:49 pm > Subject: Re: Basic Multilingual search capability > > > Which languages are you expecting to deal with? Multilingual suppor

Re: Basic Multilingual search capability

2015-02-23 Thread Rishi Easwaran
Subject: Re: Basic Multilingual search capability Which languages are you expecting to deal with? Multilingual support is a complex issue. Even if you think you don't need much, it is usually a lot more complex than expected, especially around relevancy. Regards, Alex. Sign up for my

Re: Basic Multilingual search capability

2015-02-23 Thread Alexandre Rafalovitch
xt documents from our end users, > which can be in any language (sometimes combination) and we cannot determine > the language of the incoming text. Language detection at index time is not > necessary. > > Which analyzer is recommended to achieve basic multilingual search capability >

Basic Multilingual search capability

2015-02-23 Thread Rishi Easwaran
nd we cannot determine the language of the incoming text. Language detection at index time is not necessary. Which analyzer is recommended to achieve basic multilingual search capability for a use case like this? I have read a bunch of posts about using a combination standardtokenizer or ICUtoke

Re: multilingual search

2014-07-04 Thread Paul Libbrecht
> 1. Modify the qf parameter directly by either adding the "_xx" language > suffix to each field in qf, or replacing the "xx" for any qf fields that > already have an "_xx" suffix. > 2. Have separate "qf_xx" parameters which are customized for specific > languages and then copy the language-spec

Re: multilingual search

2014-07-04 Thread Jack Krupansky
" parameters which are customized for specific languages and then copy the language-specific "qf_xx" parameter to the main qf parameter based on the language that is detected. -- Jack Krupansky -Original Message- From: Paul Libbrecht Sent: Friday, July 4, 2014 11:36 AM T
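
The "copy the language-specific qf_xx parameter to qf" idea in the message above can be sketched in Python as below. This is an illustration on a plain dictionary of request parameters, not Solr code; the function name `select_qf` and the sample field names are assumptions.

```python
def select_qf(params: dict[str, str], lang: str) -> dict[str, str]:
    """Return a copy of params with qf replaced by qf_<lang> when present.

    If no language-specific parameter exists, the original qf is kept.
    """
    out = dict(params)               # do not mutate the caller's dict
    key = f"qf_{lang}"
    if key in params:
        out["qf"] = params[key]
    return out


params = {
    "q": "maison",
    "qf": "title body",
    "qf_fr": "title_fr^2 body_fr",   # hypothetical French field list
}
print(select_qf(params, "fr")["qf"])  # title_fr^2 body_fr
print(select_qf(params, "de")["qf"])  # title body
```

Falling back to the default qf when no language-specific list is defined keeps queries working for languages that have not been configured.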

Re: multilingual search

2014-07-04 Thread Paul Libbrecht
s to use for the "qf" parameter. > > -- Jack Krupansky > > -Original Message- From: benjelloun > Sent: Friday, July 4, 2014 10:52 AM > To: solr-user@lucene.apache.org > Subject: multilingual search > > Hello, > > what i need to do i

Re: multilingual search

2014-07-04 Thread Jack Krupansky
2014 10:52 AM To: solr-user@lucene.apache.org Subject: multilingual search Hello, What I need to do is detect the language of my fields. Then, when I search with the "/select" RequestHandler, how can I make the search detect the language of the query words and choose which field_langid to use? my con

multilingual search

2014-07-04 Thread benjelloun
or on language. I don't want to add stopwords_fr to stopwords_en. What I want is to detect the language before the select search, then choose the field_langid for the search. Best regards, Anass BENJELLOUN -- View this message in context: http://lucene.472066.n3.nabble.com/multilingual-search-tp4145639.html

Re: Language specific tokenizer for purpose of multilingual search in single-core solr,

2012-02-15 Thread Chris Hostetter
: I want to do multilingual search in single-core Solr. That requires : defining language-specific tokenizers in schema.xml. Say, for example, I have : two tokenizers, one for English ("en") and one for simplified Chinese : ("zh-cn"). Can I just put following definit

Re: Language specific tokenizer for purpose of multilingual search in single-core solr,

2012-02-14 Thread Paul Libbrecht
only one field element? There should be two, shouldn't there? One for each language. paul On 14 Feb 2012 at 07:34, bing wrote: > > Hi, all, > > I want to do multilingual search in single-core Solr. That requires > defining language-specific tokenizers in schema.xml. Say for exampl

Closed -- Re: Multilingual search in multicore solr

2012-02-01 Thread bing
Hi, Erick, Thanks for commenting on this thread, and I think my problem has been solved. I might start another thread raising technical questions about using SolrJ. Thank you again. Best Regards, Bing -- View this message in context: http://lucene.472066.n3.nabble.com/Multilingual

Re: Multilingual search in multicore solr

2012-02-01 Thread Erick Erickson
find many start-up tutorials about that, thus would be grateful if > any suggestions and hints brought about. > > Best > Bing > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3705556.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multilingual search in multicore solr

2012-01-31 Thread bing
e.com/Multilingual-search-in-multicore-solr-tp3698969p3705556.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multilingual search in multicore solr

2012-01-31 Thread Erick Erickson
core, but still is a concern. > Thanks. > > Best Regards, > Ni Bing > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3702041.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multilingual search in multicore solr

2012-01-30 Thread bing
, returned with a set of scores. Is it confident to conclude that the highest score gives the most confidence of the results? Thanks. Best Regards, Ni Bing -- View this message in context: http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3702041.html Sent from the

Re: Multilingual search in multicore solr

2012-01-30 Thread Erick Erickson
to stick on multicore solr and want to > see whether the problems can be solved. Meanwhile, I am aware that > multilingual search does not necessarily need multicore solr, which I have > learned in previous thread. > http://lucene.472066.n3.nabble.com/Tika0-10-language-identifier-in-Solr3-5-0-

Multilingual search in multicore solr

2012-01-29 Thread bing
Hi, all, I am going to do multilingual search in multicore Solr. Specifically, the design of the Solr server is as follows: I have several cores corresponding to different languages, where each core has its own configuration files and data. I have the following questions: 1. While indexing a document, I use

Re: Multilingual - Search against the appropriate field

2010-07-01 Thread Saïd Radhouani
Hi Jan, I totally agree with what you said. In a), you talked about boosting. I guess you meant to boost at the client side, right? I still have a question: >> does Solr choose the appropriate analysis for the query. i.e., if a query is >> compared to a document having English free text (tex

Re: Multilingual - Search against the appropriate field

2010-07-01 Thread Jan Høydahl / Cominvent
Hi, I have chosen the same approach as you, indexing content into text_ fields with custom analysis, and it works great. Solr does not have any overhead with this even if there are hundreds of languages, due to the schema-less nature of Lucene. And if you know which language is being searched,

Multilingual - Search against the appropriate field

2010-07-01 Thread Saïd Radhouani
Hi, I know this topic has been treated many times in the (distant) past, but I wonder whether there are new better practices/tendencies. In my application, I'm dealing with documents in different languages. Each document is monolingual; it has some fields containing free text and a set of fiel

Re: MultiLingual Search

2008-05-17 Thread Otis Gospodnetic
/Malayalam_script Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Sachit P. Menon <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Saturday, May 17, 2008 8:44:06 AM > Subject: RE: MultiLingual Search > > Hi A

RE: MultiLingual Search

2008-05-17 Thread Sachit P. Menon
Hi All, I would like to know if Indian regional languages (like Malayalam, Kannada, Tamil, etc.) can also be indexed through Solr. Thanks and Regards Sachit P. Menon DISCLAIMER: This message (including attachment if any) is confidential and may be privileged. If you have received this message

Re: MultiLingual Search

2008-05-12 Thread Norberto Meijome
On Mon, 12 May 2008 16:16:28 +0530 "Sachit P. Menon" <[EMAIL PROTECTED]> wrote: > My project requires having the same content (mostly) in multiple languages. hi Sachit, please search the archives of the list. this topic seems to come up twice a week or thereabouts :) You are of course encoura

Re: MultiLingual Search

2008-05-12 Thread Alexander Ramos Jardim
> from > the front end. > > > > Can anyone tell me the steps to implement multilingual search in Solr as > I'm > very new to Solr? > > > > > > Thanks, > > Sachit > > > > DISCLAIMER: > This message (including attachment if any) is co

MultiLingual Search

2008-05-12 Thread Sachit P. Menon
to implement multilingual search in Solr as I'm very new to Solr? Thanks, Sachit DISCLAIMER: This message (including attachment if any) is confidential and may be privileged. If you have received this message by mistake please notify the sender by return e-mail and delete this me

Re: Multilingual Search

2008-05-09 Thread Grant Ingersoll
Yes. Solr handles UTF-8 and has many analyzers for non-English languages. -Grant On May 9, 2008, at 7:23 AM, Sachit P. Menon wrote: Can we have a multilingual search using Solr? Thanks and Regards Sachit P. Menon| Programmer Analyst| MindTree Ltd. |West Campus, Phase-1, Global Village

Multilingual Search

2008-05-09 Thread Sachit P. Menon
Can we have a multilingual search using Solr? Thanks and Regards Sachit P. Menon| Programmer Analyst| MindTree Ltd. |West Campus, Phase-1, Global Village, RVCE Post, Mysore Road, Bangalore-560 059, INDIA |Voice +91 80 26264000 |Extn 65377|Fax +91 80 26264100 | Mob : +91 9986747356