Hi Alexandre,

The information you provided is very useful.

Many thanks and best regards,
*Rui Pimentel*

*DINSD - Departamento de Informática / SPA Digital*
Av. Duque de Loulé, 31 - 1069-153 Lisboa PORTUGAL
*T* (+351) 21 359 44 36 / (+351) 21 359 44 00 *F* (+351) 21 353 02 57
informat...@spautores.pt
www.SPAutores.pt <http://www.spautores.pt/>

This electronic mail transmission including any attachment hereof, contains information that is PRIVATE, CONFIDENTIAL and PROTECTED FROM DISCLOSURE, and it is only for the use of the person and the e-mail address above indicated. If you have received this electronic mail transmission in error, please destroy it and notify us immediately through the telephone number +351 21 359 44 00 or at the e-mail address: ge...@spautores.pt
On 2020-10-15 15:33, Alexandre Rafalovitch wrote:
Solr now has a package manager, and DIH has become one of its packages. This reflects the fact that DIH's development cycle is not locked to Solr's, and it also reduces the size of the core download. Tika may be heading the same way, as running Tika inside the Solr process could cause memory issues with complex PDFs.

In terms of other ways to pre-process and load data into Solr, there are things like:
1) Apache Camel https://camel.apache.org/
2) Apache NiFi https://nifi.apache.org/

Other commercial solutions also exist, such as StreamSets:
3) https://streamsets.com/documentation/datacollector/latest/help//datacollector/UserGuide/Destinations/Solr.html

And, of course, you can always roll your own with SolrJ.
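
As a rough illustration of what "rolling your own" can look like, here is a minimal sketch. It does not use SolrJ; to stay dependency-free it posts JSON to Solr's update handler over plain HTTP with the Python standard library. The Solr URL, the collection name, the field names and the helper functions (preprocess, build_update_request, index_records) are all assumptions made up for the example.

```python
import json
import urllib.request


def preprocess(record):
    """Trim string values and drop empty fields before indexing."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        if value not in ("", None):
            cleaned[key] = value
    return cleaned


def build_update_request(solr_url, collection, records):
    """Build a POST request for the JSON update handler of a collection.

    solr_url and collection are hypothetical values supplied by the caller,
    for example "http://localhost:8983/solr" and "documents".
    """
    docs = [preprocess(record) for record in records]
    url = f"{solr_url.rstrip('/')}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def index_records(solr_url, collection, records):
    """Send the records to Solr and return the HTTP status code."""
    request = build_update_request(solr_url, collection, records)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Against a running Solr instance, something like index_records("http://localhost:8983/solr", "documents", [{"id": "1", "title": "Hello"}]) would send one document and commit it. Committing on every request is only reasonable for small batches; a SolrJ-based loader would follow the same shape of pre-process, batch, send and commit.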

Regards,
  Alex.



On Thu, 15 Oct 2020 at 10:08, DINSD | SPAutores <informat...@spautores.pt.invalid> wrote:

    Hi

    According to this document, there are two ways to index documents on the
    Solr platform: https://lucidworks.com/post/indexing-with-solrj/

    Quote:
    "Two popular methods of indexing existing data are the Data Import
    Handler (DIH) and Tika (Solr Cell)/ExtractingRequestHandler"

    Now that DIH has been discontinued and is only supported as a community
    package, are there any other options?

    Best regards
    *Rui Pimentel*

