Depending on how complicated you need this to be, you can just write your own in SolrJ, see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/

You haven't said a lot about the characteristics of your situation. Are you talking 1B rows from the DB? 1M? What is the pain point? Because until one gets to massive amounts of data, 9 times out of 10 poor indexing performance is the result of a slow DB query.

Before jumping to a solution, it'd be good to know
1> why you're dissatisfied with DIH, i.e. what is the problem you're seeing
2> some information about your situation: size of DB, how fast DIH works now, etc.

This latter is important, 'cause it's a totally different question if, say, your problem statement is "it takes 8 hours to import 1,000,000,000 rows and the docs are 1M long" vs. "it takes 8 hours to import 100,000 rows that are 1K each".

Until there are answers to questions like that, it's not clear at all you even _have_ a problem that's solvable by any of the suggestions so far.

Best,
Erick

On Thu, Jan 31, 2019 at 12:34 PM Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
> Apache NiFi may also be something of interest: https://nifi.apache.org/
>
> Regards,
>    Alex.
>
> On Thu, 31 Jan 2019 at 11:15, Mikhail Khludnev <m...@apache.org> wrote:
> >
> > Hello,
> >
> > I did this deck some time ago. It might be useful for choosing one.
> > https://docs.google.com/presentation/d/e/2PACX-1vQzi3QOZAwLh_t3zs1gH9EGCB2HKUgiN3WJRGHpULyA-GleCrQ41dIOINa18h_XG64BX5D_ZG6jKmXL/pub?start=false&loop=false&delayms=3000
> > Note, as far as I understand Lucidworks' answer to this is Spark.
> >
> >
> > On Thu, Jan 31, 2019 at 2:15 PM Srinivas Kashyap <srini...@bamboorose.com>
> > wrote:
> >
> > > Hello,
> > >
> > > As we all know DIH is single threaded and has its own issues while
> > > indexing.
> > >
> > > Got to know that we can write our own APIs to pull data from DB and push
> > > it into solr. One such I heard was Apache Kafka being used for the
> > > purpose.
> > >
> > > Can any of you send me the links and guides to use apache kafka to pull
> > > data from DB and push into solr?
> > >
> > > If there are any other alternatives please suggest.
> > >
> > > Thanks and Regards,
> > > Srinivas Kashyap
> > > ________________________________
> > > DISCLAIMER:
> > > E-mails and attachments from Bamboo Rose, LLC are confidential.
> > > If you are not the intended recipient, please notify the sender
> > > immediately by replying to the e-mail, and then delete it without making
> > > copies or using it in any way.
> > > No representation is made that this email or any attachments are free of
> > > viruses. Virus scanning is recommended and is the responsibility of the
> > > recipient.
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
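
To put rough numbers on the two problem statements in Erick's reply at the top of this thread, here is a small sketch of the arithmetic. It assumes "1M" and "1K" mean 1,000,000 and 1,000 bytes per doc. The class and method names are made up for illustration, and nothing here touches Solr or DIH.

```java
// Back-of-envelope throughput for an import job.
// Assumption: "1M" and "1K" doc sizes are read as 1,000,000 and 1,000 bytes.
// ImportThroughput and its methods are hypothetical names, not part of Solr.
final class ImportThroughput {

    /** Rows imported per second for a job that moved `rows` rows in `hours` hours. */
    static double rowsPerSecond(long rows, double hours) {
        return rows / (hours * 3600.0);
    }

    /** Raw bytes per second implied by the row rate and an average doc size. */
    static double bytesPerSecond(long rows, long bytesPerDoc, double hours) {
        return rowsPerSecond(rows, hours) * bytesPerDoc;
    }

    public static void main(String[] args) {
        // Case A: 1,000,000,000 rows of ~1M each in 8 hours.
        System.out.printf("A: %.1f rows/s, %.0f bytes/s%n",
                rowsPerSecond(1_000_000_000L, 8),
                bytesPerSecond(1_000_000_000L, 1_000_000L, 8));

        // Case B: 100,000 rows of ~1K each in 8 hours.
        System.out.printf("B: %.1f rows/s, %.0f bytes/s%n",
                rowsPerSecond(100_000L, 8),
                bytesPerSecond(100_000L, 1_000L, 8));
    }
}
```

The first case works out to roughly 35,000 rows per second and the second to about 3.5 rows per second, which is why the two would call for very different answers.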