Hi,
I think guessing the language based purely on the query string is OK *if* you accept that it will not be very accurate and find ways to work around that, say by giving users the option to switch to another language easily, letting them select a default language for future searches, etc.

Otis
----
Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html


----- Original Message -----
> From: nibing <nibing_...@hotmail.com>
> To: solr-user@lucene.apache.org
> Cc:
> Sent: Thursday, January 19, 2012 10:35 PM
> Subject: RE: Tika0.10 language identifier in Solr3.5.0
>
> Hi, Jan Høydahl
>
> You are right. I am hoping to detect the language of a query, so that
> the search can be done according to the detected language. Since people
> often type only a few words, which is too little text to detect from,
> this is hard to do.
>
> Let me describe the Solr server in my design a little. It consists of
> several cores, one per language, which are built during indexing. Since
> language detection during indexing can be done with the Tika identifier,
> we are currently OK there. The problem is searching. I want to detect
> the language first, before searching in the individual cores. If the
> detection result is ambiguous and several languages are returned, we
> would probably return one set of results per language and let the user
> decide which set they want to look into. In general, this is similar to
> the language support offered by Google.
>
> Do you have any suggestions for achieving the multilingual search
> described above? Thank you.
>
> Best Regards
>
> Ni, Bing
>
>> Subject: Re: Tika0.10 language identifier in Solr3.5.0
>> From: jan....@cominvent.com
>> Date: Thu, 19 Jan 2012 12:31:01 +0100
>> To: solr-user@lucene.apache.org
>>
>> Hi,
>>
>> You may use the string as you choose, for instance for filtering
>> (fq=language_s:en) or for faceting (facet.field=language_s). What are
>> you looking to do?
>>
>> What would you like to detect on the query side? The language of the
>> search string? That is very hard since people type very few words into
>> the search box.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> Solr Training - www.solrtraining.com
>>
>> On 19. jan. 2012, at 09:22, nibing wrote:
>>
>> > Hi, all,
>> >
>> > I am using Solr3.5.0, which applies Tika0.10 to do language
>> > detection, and I have a couple of questions about this function.
>> >
>> > 1. I can see the outcome of the language detection in a field
>> > "language_s". But what action will be taken according to the
>> > different language codes? How is this configured?
>> >
>> > 2. Currently the language detection only happens during indexing.
>> > Is it possible to use the function in searching as well? How is
>> > this configured?
>> >
>> > Many thanks.
>> >
>> > Best Regards
>> >
>> > Ni, Bing
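
For what it is worth, the query-side idea discussed in this thread (detect the language, search a single core when the guess is confident, otherwise fall back to the user's default language and/or several cores) could be sketched roughly as below. This is only an illustration under assumptions: the core names, the base URL, the thresholds and the helper names are all made up, and the detector itself is left out (the sketch just takes a dict of language code -> confidence from whatever detector is used).

```python
from urllib.parse import urlencode

# Hypothetical mapping from language code to Solr core name (one core per language).
CORES = {"en": "core_en", "de": "core_de", "zh": "core_zh"}

# Hypothetical base URL of the Solr server.
SOLR_BASE = "http://localhost:8983/solr"


def pick_languages(scores, user_default=None, min_score=0.8, margin=0.2):
    """Return the language codes to search, most preferred first.

    scores: dict of language code -> confidence in [0, 1], as produced by
    any language detector. The thresholds are arbitrary illustration values.
    """
    # Keep only languages we have a core for, best score first.
    ranked = sorted(
        ((lang, score) for lang, score in scores.items() if lang in CORES),
        key=lambda item: item[1],
        reverse=True,
    )
    if not ranked:
        # Nothing usable detected: use the user's default, else every core.
        return [user_default] if user_default in CORES else sorted(CORES)

    best_lang, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= margin:
        # Confident, unambiguous guess: search one core only.
        return [best_lang]

    # Ambiguous: keep every language scoring close to the best one,
    # and put the user's default language (if any) in front.
    candidates = [lang for lang, score in ranked if best_score - score < margin]
    if user_default in CORES:
        candidates = [user_default] + [c for c in candidates if c != user_default]
    return candidates


def select_urls(query, languages):
    """Build one /select URL per chosen language core."""
    params = urlencode({"q": query})
    return [f"{SOLR_BASE}/{CORES[lang]}/select?{params}" for lang in languages]
```

With this, a short ambiguous query would yield several URLs, one per candidate core, and the UI could show one result set per language and let the user pick, as Ni Bing describes. Remembering the user's choice as user_default is what would make later searches less dependent on the detector.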