Use SolrJ if you end up developing an indexer in Java to send documents to
Solr. It's been a long time since I used DIH, but you can give it a try first;
otherwise, as Walter suggested, developing an external indexer is best.
On Sun, Jul 9, 2017 at 6:46 PM, Walter Underwood
wrote:
> 4. Write an external progra
Thank you guys for your advice!
I would rather take advantage as much as possible of the existing
handlers/processors.
I just realised that nested entities in DIH are extremely slow: I fixed that
with a view on the DB (that does a join between 2 tables).
The other thing I have to do is chain th
I did this at Netflix with Solr 1.3, read stuff out of various databases and
sent it all to Solr. I’m not sure DIH even existed then.
At Chegg, we have a slightly more elaborate system because we have so many
collections and data sources. Each content owner writes an “extractor” that
makes a JSON
>4. Write an external program that fetches the file, fetches the metadata,
>combines them, and sends them to Solr.
I've done this with some custom crawls. Thanks to Erick Erickson, this is a
snap:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
With the caveat that Tika should really be i
4. Write an external program that fetches the file, fetches the metadata,
combines them, and sends them to Solr.
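
A minimal sketch of what such an external program might look like, using only
the Python standard library and Solr's JSON update handler. The URL, the
collection name "mycollection", the field name "content", and the helper
names are all assumptions for illustration; fetching the file and the
metadata is left to the caller.

```python
import json
import urllib.request

# Hypothetical endpoint: adjust host, port, and collection name for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update?commit=true"


def combine(doc_id, file_text, metadata):
    """Merge extracted file text and metadata into one Solr document."""
    doc = dict(metadata)        # copy so the caller's dict is not mutated
    doc["id"] = doc_id
    doc["content"] = file_text  # "content" is an assumed field name
    return doc


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST carrying a JSON array of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_solr(docs, url=SOLR_UPDATE_URL):
    """Send a batch of documents; raises urllib.error.URLError on failure."""
    with urllib.request.urlopen(build_update_request(docs, url)) as resp:
        return resp.status
```

Typical use would be to call combine() once per file and pass the resulting
documents to send_to_solr() in batches. In practice it is probably better to
commit once at the end, or rely on autoCommit, than to commit on every request
as the URL above does.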
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 9, 2017, at 3:03 PM, Giovanni De Stefano wrote:
>
> Hello all,
>
> I have to index