Re: How to "chain" import handlers: import from DB and from file system

2017-07-10 Thread Susheel Kumar
Use SolrJ if you end up developing an indexer in Java to send documents to Solr. It's been a long time since I used DIH, but you can give it a try first; otherwise, as Walter suggested, developing an external indexer is best. On Sun, Jul 9, 2017 at 6:46 PM, Walter Underwood wrote: > 4. Write an external progra

Re: How to "chain" import handlers: import from DB and from file system

2017-07-10 Thread Giovanni De Stefano
Thank you guys for your advice! I would rather take advantage as much as possible of the existing handlers/processors. I just realised that nested entities in DIH are extremely slow: I fixed that with a view on the DB (that does a join between 2 tables). The other thing I have to do is chain th
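To illustrate the view-based fix mentioned above: a nested DIH entity typically runs its inner query once per row of the outer entity, whereas a view lets the database do the join once and hands DIH a single flat result set. The sketch below uses an in-memory SQLite database purely for demonstration; the table, column, and view names (`documents`, `authors`, `documents_flat`) are hypothetical and not taken from the thread.

```python
import sqlite3

# In-memory database, for illustration only. All table, column, and view
# names here are hypothetical stand-ins for the poster's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE authors (doc_id INTEGER, name TEXT);

INSERT INTO documents VALUES (1, 'Report A'), (2, 'Report B');
INSERT INTO authors VALUES (1, 'Alice'), (2, 'Bob');

-- The view performs the join once, inside the database, so the importer
-- can read one flat row per document instead of issuing a sub-query per row.
CREATE VIEW documents_flat AS
  SELECT d.id, d.title, a.name AS author
  FROM documents d
  JOIN authors a ON a.doc_id = d.id;
""")

# A single query against the view replaces the outer/inner entity pair.
rows = conn.execute(
    "SELECT id, title, author FROM documents_flat ORDER BY id"
).fetchall()
```

With a view like this in place, the DIH configuration would need only one entity whose query selects from the view.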

Re: How to "chain" import handlers: import from DB and from file system

2017-07-10 Thread Walter Underwood
I did this at Netflix with Solr 1.3: read stuff out of various databases and sent it all to Solr. I’m not sure DIH even existed then. At Chegg, we have a slightly more elaborate system because we have so many collections and data sources. Each content owner writes an “extractor” that makes a JSON

RE: How to "chain" import handlers: import from DB and from file system

2017-07-10 Thread Allison, Timothy B.
>4. Write an external program that fetches the file, fetches the metadata, >combines them, and sends them to Solr. I've done this with some custom crawls. Thanks to Erick Erickson, this is a snap: https://lucidworks.com/2012/02/14/indexing-with-solrj/ With the caveat that Tika should really be i

Re: How to "chain" import handlers: import from DB and from file system

2017-07-09 Thread Walter Underwood
4. Write an external program that fetches the file, fetches the metadata, combines them, and sends them to Solr. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jul 9, 2017, at 3:03 PM, Giovanni De Stefano wrote: > > Hello all, > > I have to index
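A minimal sketch of the external-program approach in option 4, assuming plain-text files and Solr's JSON update handler. The thread itself recommends SolrJ for a Java indexer; this version uses only the Python standard library to show the same fetch, combine, send shape. The collection name, URL, field names, and the `fetch_metadata` stub are all assumptions made for illustration, and binary formats (PDF, Word) would need a text-extraction step that is not shown here.

```python
import json
import urllib.request
from pathlib import Path

# Assumed Solr location and collection name; adjust for a real deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update?commit=true"


def fetch_metadata(doc_id):
    """Hypothetical stand-in for the database lookup (e.g. a SELECT by id)."""
    return {"id": doc_id, "title": f"Document {doc_id}"}


def build_document(metadata, file_path):
    """Combine the metadata fields with the file's text into one Solr document."""
    doc = dict(metadata)  # copy so the caller's dict is not modified
    # 'content' is an assumed field name; it must exist in the Solr schema.
    doc["content"] = Path(file_path).read_text(encoding="utf-8")
    return doc


def send_to_solr(docs, url=SOLR_UPDATE_URL):
    """POST a list of documents to Solr as a JSON array and return the HTTP status."""
    body = json.dumps(docs).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A run might then look like `send_to_solr([build_document(fetch_metadata("42"), "files/42.txt")])`, batching many documents per request rather than sending them one at a time.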