bq. Does your post mean that functionality for indexing documents in Solr using
ExtractRequestHandler doesn't provide the option of indexing plain data
Frankly I don't know. It's just that if you plan to eventually offload
the Tika parsing onto a client (or use a service), does it make sense
to spe
Thanks Erick.
I do use this strategy for indexing data from DB. It is very flexible for
me.
I work in a company where .NET is the main dev platform, so it is even more
important to separate things.
Does your post mean that functionality for indexing documents in Solr using
ExtractRequestHandler doesn't
While ERH is fine for getting started, as you go toward production
you'll want to consider parsing the data outside of Solr for the
reasons (and example) outlined here:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
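
To make the "parse outside Solr" idea concrete, here is a minimal sketch in Python (the linked article uses SolrJ; this is not its code). It assumes the client has already extracted the plain text, for example with Tika running on the client side, and it only builds the JSON update request. The field names "id" and "content" and the helper name build_update_request are assumptions; adjust them to your schema.

```python
import json
from urllib.request import Request


def build_update_request(core_url, doc_id, plain_text):
    """Build a JSON update request carrying only the fields we choose.

    The field names "id" and "content" are assumptions, not taken from
    any schema in this thread.
    """
    doc = {"id": doc_id, "content": plain_text}
    body = json.dumps([doc]).encode("utf-8")
    return Request(
        f"{core_url}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The text would come from a parser run on the client, not from Solr.
req = build_update_request(
    "http://localhost:8983/solr/document",
    "doc-1",
    "Plain text extracted by the client.",
)
# Passing req to urllib.request.urlopen would send it to a running Solr.
```

Because the client decides exactly which fields go into the document, no Tika metadata reaches the index unless you add it yourself.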
Best,
Erick
On Wed, Nov 14, 2018 at 6:46 AM Sergio García Maroto wrote:
>
Thanks a lot Jan.
That works very well.
I am now trying to index the doc in Solr by removing the extractOnly parameter,
but I can't find any similar option to get the data indexed as plain text. I
am getting the metadata as well.
This is my request.
http://localhost:8983/solr/document/update/extract?it
Have you tried specifying &extractFormat=text?
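
As a rough illustration of that suggestion, the sketch below builds such a request URL in Python. The core URL is the one from the original question; the helper name build_extract_url is made up for this example. As far as I know, extractFormat only takes effect together with extractOnly=true, so this returns the extracted text for inspection rather than indexing it.

```python
from urllib.parse import urlencode


def build_extract_url(core_url, extract_format="text"):
    """Build an extract-only URL for Solr's extracting request handler."""
    params = {
        "extractOnly": "true",            # return the extraction, do not index
        "extractFormat": extract_format,  # "text" instead of the default XML
    }
    return f"{core_url}/update/extract?{urlencode(params)}"


url = build_extract_url("http://localhost:8983/solr/document")
print(url)
```

The file itself still has to be sent as the request body (for example as a multipart POST), which is left out here.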
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> On 14 Nov 2018, at 12:09, marotosg wrote:
>
> Hi all,
>
> Currently I am trying to index documents of different kinds with Solr
> and Tika. It's working fine, but when s
Hi all,
Currently I am trying to index documents of different kinds with Solr
and Tika. It's working fine, but when Solr returns the content of the
document, it doesn't return plain text; it comes back with some
metadata as well.
For instance, this is my request:
http://localhost:8983/solr/document