https://wiki.apache.org/solr/LanguageDetection
-Original message-
> From:Alessandro Benedetti
> Sent: Thursday 2nd July 2015 11:06
> To: solr-user@lucene.apache.org
> Subject: Re: language identification during solrj indexing
>
> SolrJ is simply a java client to ac
SolrJ is simply a Java client for accessing Solr's REST API.
This means that "indexing through SolrJ" doesn't exist as a separate mechanism.
You simply need to add the proper chain to the update request handler you
are using.
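
As a rough illustration of the point above, here is a minimal JDK-only sketch that builds the URL a client would post documents to on Solr's standard "/update" endpoint. The host, the core name `mycore`, and the chain name `langid` are assumptions made up for this example; `update.chain` is the request parameter that selects an update chain per request. Instead of passing it on every request, the chain can also be configured as a default on the handler in solrconfig.xml.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UpdateUrlSketch {
    // Builds the URL of the "/update" endpoint for a core and selects
    // an update chain by name through the update.chain request parameter.
    public static String updateUrl(String baseUrl, String core, String chain) {
        String encodedChain = URLEncoder.encode(chain, StandardCharsets.UTF_8);
        return baseUrl + "/" + core + "/update?update.chain=" + encodedChain;
    }

    public static void main(String[] args) {
        // Hypothetical host, core and chain names; adjust to your setup.
        System.out.println(updateUrl("http://localhost:8983/solr", "mycore", "langid"));
    }
}
```

Whether the request is sent by SolrJ or by any other HTTP client, the language detection itself runs inside Solr once the chain is attached to the handler.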
Looking at the code, by default SolrJ's UpdateRequest refers to the
"/update" endpoint.
Have you
In addition to the text_lang fields you can of course have a text_general
field which is unstemmed, where you put documents that you don't yet have
language specific handling for.
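
The fallback idea can be sketched on the client side as follows. This is only an illustration of the naming scheme, not how Solr's langid processor is implemented: the set of supported languages (`en`, `fr`, `ru`) and the class and method names are hypothetical, and the detected language code is assumed to come from whatever detector you use.

```java
import java.util.Set;

public class LangFieldSketch {
    // Languages that have a dedicated text_<lang> field in the (hypothetical) schema.
    private static final Set<String> SUPPORTED = Set.of("en", "fr", "ru");

    // Returns the field the body text should go into for a detected language code.
    public static String targetField(String detectedLang) {
        if (detectedLang != null && SUPPORTED.contains(detectedLang)) {
            return "text_" + detectedLang;
        }
        // No language-specific handling yet: use the unstemmed catch-all field.
        return "text_general";
    }

    public static void main(String[] args) {
        System.out.println(targetField("fr"));
        System.out.println(targetField("ja"));
    }
}
```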
One potential issue of multi language search is detecting the language of the
query itself.
Sometimes your search pag
From your response, I gather that there's no way to maintain a single set of
fields for multiple languages, i.e. I can't use a field "text" for the body
text. Instead, I would have to define text_en, text_fr, text_ru etc., each
mapped to its specific language.
Hi,
Q1. You use langid for the detection, and your chosen field(s) can be mapped to
new names such as title->title_en or title_de. Thus you need to configure
your schema with a separate fieldType for every language you want to support
if you'd like to use language specific stemming and stopwords e
It sounds like you want an update request processor:
http://wiki.apache.org/solr/UpdateRequestProcessor
But, it also sounds like you should probably be normalizing the encoding
before sending the data to Solr.
-- Jack Krupansky
-Original Message-
From: Yewint Ko
Sent: Sunday, Januar
I think nothing has "moved". We just offer Solr users the option of doing language
detection inside Solr, using either of these two libs. If you choose to do language
detection on the client side instead, using either of them, what is stopping you?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominv
On Mon, Apr 23, 2012 at 1:27 PM, Bai Shen wrote:
> I was under the impression that solr does Tika and the language identifier
> that Shuyo did. The page at
> http://wiki.apache.org/solr/LanguageDetection lists them both.
>
> class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProc
I was under the impression that solr does Tika and the language identifier
that Shuyo did. The page at
http://wiki.apache.org/solr/LanguageDetection lists them both.
Again, I'm just trying to understand why it was moved to solr.
On Fri, Apr 20, 2012 at 6:02 PM, Jan Høydahl wrote:
> Hi,
>
>
Hi,
Solr just reuses Tika's language identifier. But you are of course free to do
your language detection on the Nutch side if you choose and not invoke the one
in Solr.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
On 20. apr.