Re: Automatic Language Identification

2016-07-01 Thread William Bell
ge- > From: Markus Jelsma [mailto:markus.jel...@openindex.io] > Sent: Wednesday, June 22, 2016 8:27 AM > To: solr-user@lucene.apache.org; solr-user > Subject: RE: Automatic Language Identification > > Hello, > > I recommend using the langdetect language detector, it supports

RE: Automatic Language Identification

2016-07-01 Thread Allison, Timothy B.
ssage- From: Markus Jelsma [mailto:markus.jel...@openindex.io] Sent: Wednesday, June 22, 2016 8:27 AM To: solr-user@lucene.apache.org; solr-user Subject: RE: Automatic Language Identification Hello, I recommend using the langdetect language detector, it supports many more languages and has

RE: Automatic Language Identification

2016-06-22 Thread Markus Jelsma
ct: Re: Automatic Language Identification > > In both cases, the issue seems to be related to the library not being > loaded. For Tika identifier, I believe it is > solr-langid-.jar, for the sia.* it is whatever the book > recommended. > > Are you running SolrCloud? Additiona

Re: Automatic Language Identification

2016-06-22 Thread Alexandre Rafalovitch
used for > "automatic language identification" but when they failed to make a > collection in his process: > > 1. The automatic language identification > > > ERROR: Failed to create collection 'coba' due to: > org.apache.solr.client.solrj.impl.HttpSolrC

Automatic Language Identification

2016-06-22 Thread Hardika Catur S
Hi, I will make the collection in the collection solrcloud and used for "automatic language identification" but when they failed to make a collection in his process: 1. The automatic language identification ERROR: Failed to create collection

RE: language identification during solrj indexing

2015-07-02 Thread Markus Jelsma
https://wiki.apache.org/solr/LanguageDetection -Original message- > From:Alessandro Benedetti > Sent: Thursday 2nd July 2015 11:06 > To: solr-user@lucene.apache.org > Subject: Re: language identification during solrj indexing > > SolrJ is simply a java client to ac

Re: language identification during solrj indexing

2015-07-02 Thread Alessandro Benedetti
" endpoint. Have you checked if you have your custom chain configured for that ? Cheers 2015-07-02 9:07 GMT+01:00 vineet yadav : > Hi, > > I want to identify language identification during solrj indexing. I have > made configuration changes required for language ident

language identification during solrj indexing

2015-07-02 Thread vineet yadav
Hi, I want to identify language identification during solrj indexing. I have made configuration changes required for language identification on the basis of solr wiki (https://cwiki.apache.org/confluence/display/solr/Detecting+Languages+During+Indexing). language detection update chain is

Re: Language Identification and Stemming

2013-03-02 Thread Jan Høydahl
> > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Language-Identification-and-Stemming-tp4044116p4044132.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: Language Identification and Stemming

2013-03-01 Thread vybe3142
this message in context: http://lucene.472066.n3.nabble.com/Language-Identification-and-Stemming-tp4044116p4044132.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Language Identification and Stemming

2013-03-01 Thread Jan Høydahl
Hi, Q1. You use langid for the detection, and your chosen field(s) can be mapped to new names such as title->title_en or title_de. Thus you need to configure your schema with a separate fieldType for every language you want to support if you'd like to use language specific stemming and stopwords e

Language Identification and Stemming

2013-03-01 Thread Vinay B,
As I understand, SOLR allows us to plug in language detection processors: http://wiki.apache.org/solr/LanguageDetection Given that our use case involves a collection of mixed language documents, Q1: Assume that we plug in language detection, will this affect the stemming and other language specifi

Re: Language Identification in index time

2013-01-20 Thread Jack Krupansky
, January 20, 2013 10:36 AM To: solr-user@lucene.apache.org Subject: Language Identification in index time Hi all I am very new to solr and nutch. Currently I have a requirement to develop a small search engine for local movie websites. Because non standard encoding system currently using on many of

Re: Language Identification

2012-04-23 Thread Jan Høydahl
> -- >> Jan Høydahl, search solution architect >> Cominvent AS - www.cominvent.com >> Solr Training - www.solrtraining.com >> >> On 20. apr. 2012, at 21:49, Bai Shen wrote: >> >>> I'm working on using Shuyo's work to improve the language

Re: Language Identification

2012-04-23 Thread Robert Muir
On Mon, Apr 23, 2012 at 1:27 PM, Bai Shen wrote: > I was under the impression that solr does Tika and the language identifier > that Shuyo did. The page at > http://wiki.apache.org/solr/LanguageDetection lists them both. > > class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProc

Re: Language Identification

2012-04-23 Thread Bai Shen
r Training - www.solrtraining.com > > On 20. apr. 2012, at 21:49, Bai Shen wrote: > > > I'm working on using Shuyo's work to improve the language identification > of > > our search. Apparently, it's been moved from Nutch to Solr. Is there a > > rea

Re: Language Identification

2012-04-20 Thread Jan Høydahl
. apr. 2012, at 21:49, Bai Shen wrote: > I'm working on using Shuyo's work to improve the language identification of > our search. Apparently, it's been moved from Nutch to Solr. Is there a > reason for this? > > http://code.google.com/p/language-detection/issues/

Language Identification

2012-04-20 Thread Bai Shen
I'm working on using Shuyo's work to improve the language identification of our search. Apparently, it's been moved from Nutch to Solr. Is there a reason for this? http://code.google.com/p/language-detection/issues/detail?id=34 I would prefer to have the processing done in Nu