Hi Alexandre,
The information you provided is very useful.
Many thanks and best regards,
*Rui Pimentel*
*DINSD - Departamento de Informática / SPA Digital*
Av. Duque de Loulé, 31 - 1069-153 Lisboa PORTUGAL
*T* (+351) 21 359 44 36 / (+351) 21 359 44 00 *F* (+351) 21 353 02 57
informat...@spautores.pt
<http://www.spautores.pt/>www.SPAutores.pt
<https://www.facebook.com/spautores>
<https://www.youtube.com/user/SPAutores1925><https://plus.google.com/107542947146636584118><https://www.linkedin.com/company/spautores>
Please consider the environment before printing this email
This electronic mail transmission including any attachment hereof,
contains information that is PRIVATE, CONFIDENTIAL and PROTECTED FROM
DISCLOSURE, and it is only for the use of the person and the e-mail
address above indicated. If you have received this electronic mail
transmission in error, please destroy it and notify us immediately
through the telephone number +351 21 359 44 00 or at the e-mail address:
ge...@spautores.pt
On 2020-10-15 15:33, Alexandre Rafalovitch wrote:
Solr now has a package manager, and DIH is one of the packages. This
reflects the fact that its development cycle is not locked to Solr's,
and it reduces the size of the core download. Tika may be heading the
same way, as running Tika inside the Solr process can cause memory
issues with complex PDFs.
As for other ways to pre-process and load data into Solr, there
are things like:
1) Apache Camel https://camel.apache.org/
2) Apache NiFi https://nifi.apache.org/
Commercial solutions also exist, such as:
3) StreamSets
https://streamsets.com/documentation/datacollector/latest/help//datacollector/UserGuide/Destinations/Solr.html
And, of course, you can always roll your own with SolrJ.
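As a rough illustration of the roll-your-own route: SolrJ itself is the Java client, so the sketch below swaps in plain HTTP against Solr's JSON update handler, using only the Python standard library. The base URL, the collection name, and the helper names build_update_request and index_documents are assumptions made for this example, not part of Solr or SolrJ.

```python
import json
import urllib.request


def build_update_request(solr_url, collection, docs, commit=True):
    """Build (but do not send) a POST request for Solr's JSON update handler.

    solr_url and collection are placeholders for your own deployment,
    e.g. "http://localhost:8983/solr" and "mycollection".
    """
    url = f"{solr_url.rstrip('/')}/{collection}/update"
    if commit:
        # Ask Solr to commit right away so the documents become searchable.
        url += "?commit=true"
    # The update handler accepts a JSON array of documents.
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_documents(solr_url, collection, docs):
    """Send the documents to Solr and return its parsed JSON response."""
    request = build_update_request(solr_url, collection, docs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Any pre-processing (reading from a database, extracting text with Tika in a separate process, and so on) would happen before the call, for example index_documents("http://localhost:8983/solr", "mycollection", [{"id": "1", "title": "First document"}]). Keeping that work outside Solr is the same reason given above for moving DIH and Tika out of the core.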
Regards,
Alex.
On Thu, 15 Oct 2020 at 10:08, DINSD | SPAutores
<informat...@spautores.pt.invalid> wrote:
Hi
According to this article, there are two ways to index documents in
Solr: https://lucidworks.com/post/indexing-with-solrj/
Quote:
"Two popular methods of indexing existing data are the Data Import
Handler (DIH) and Tika (Solr Cell)/ExtractingRequestHandler"
Now that DIH has been discontinued and is only supported as a
community package, are there any other options?
Best regards
*Rui Pimentel*