Are Kafka and SQS interchangeable? (The latter does not seem to be free.)

@Wunder: I'm assuming that an update to Solr would fail if Solr is unavailable, not only when posting via, say, a DB trigger, but also when posting through SolrJ (which is what I'm using for now)? So even with SolrJ, would it be a good idea to use queuing software?
Thanks

On Fri, Mar 17, 2017 at 10:12 PM, vishal jain <jain02...@gmail.com> wrote:

> Streaming the data through kafka would be a good option if near real time
> data indexing is the key requirement.
> In our application the RDBMS data is populated by an ETL job periodically
> so we don't need real time data indexing for now.
>
> Cheers,
> Vishal
>
> On Fri, Mar 17, 2017 at 10:30 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > Or set a trigger on your RDBMS's main table to put the relevant
> > information in a different table (call it EVENTS) and have your SolrJ
> > consult the EVENTS table periodically. Essentially you're using the
> > EVENTS table as a queue where the trigger is the producer and the
> > SolrJ program is the consumer.
> >
> > It's a polling solution though, so not event-driven. There's no
> > mechanism that I know of to have, say, your RDBMS push an event to DIH
> > for instance.
> >
> > Hmmm, I do wonder if anyone's done anything with queueing (e.g. Kafka)
> > for this kind of problem..
> >
> > Best,
> > Erick
> >
> > On Fri, Mar 17, 2017 at 8:41 AM, Alexandre Rafalovitch
> > <arafa...@gmail.com> wrote:
> > > One assumes by hooking into the same code that updates RDBMS, as
> > > opposed to reverse engineering the changes from looking at the DB
> > > content. This would be especially the case for Delete changes.
> > >
> > > Regards,
> > > Alex.
> > > ----
> > > http://www.solr-start.com/ - Resources for Solr users, new and experienced
> > >
> > >
> > > On 17 March 2017 at 11:37, OTH <omer.t....@gmail.com> wrote:
> > >>>
> > >>> Also, solrj is good when you want your RDBMS updates made immediately
> > >>> available in solr.
> > >>
> > >> How can SolrJ be used to make RDBMS updates immediately available?
> > >> Thanks
> > >>
> > >> On Fri, Mar 17, 2017 at 2:28 PM, Sujay Bawaskar <sujaybawas...@gmail.com>
> > >> wrote:
> > >>
> > >>> Hi Vishal,
> > >>>
> > >>> As per my experience DIH is the best for RDBMS to solr index. DIH with
> > >>> caching has best performance. DIH nested entities allow you to define
> > >>> simple queries.
> > >>> Also, solrj is good when you want your RDBMS updates made immediately
> > >>> available in solr. DIH full import can be used to index all data the
> > >>> first time or to restore the index in case the index is corrupted.
> > >>>
> > >>> Thanks,
> > >>> Sujay
> > >>>
> > >>> On Fri, Mar 17, 2017 at 2:34 PM, vishal jain <jain02...@gmail.com> wrote:
> > >>>
> > >>> > Hi,
> > >>> >
> > >>> > I am new to Solr and am trying to move data from my RDBMS to Solr. I know
> > >>> > the available options are:
> > >>> > 1) Post Tool
> > >>> > 2) DIH
> > >>> > 3) SolrJ (as ours is a J2EE application).
> > >>> >
> > >>> > I want to know what is the recommended way for Data import in production
> > >>> > environment.
> > >>> > Will sending data via SolrJ in batches be faster than posting a csv using
> > >>> > POST tool?
> > >>> >
> > >>> > Thanks,
> > >>> > Vishal
> > >>>
> > >>> --
> > >>> Thanks,
> > >>> Sujay P Bawaskar
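
For reference, here is a minimal sketch of the EVENTS-table idea Erick describes in the quoted thread, and of why a queue helps when Solr is unavailable. Everything in it is an assumption made for illustration: the class and method names (EventsTablePoller, ChangeEvent, Indexer, pollOnce) are hypothetical, the EVENTS table is simulated with an in-memory deque instead of JDBC, and the Indexer interface stands in for whatever SolrJ calls the real consumer would make. The point it tries to show is that rows are removed from the queue only after indexing succeeds, so a Solr outage just delays the update.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: an EVENTS table drained by a polling consumer. */
class EventsTablePoller {

    /** One row of the hypothetical EVENTS table written by the DB trigger. */
    static final class ChangeEvent {
        final long id;        // EVENTS primary key
        final String docId;   // key of the changed RDBMS row
        final boolean delete; // true for a delete, false for an add/update

        ChangeEvent(long id, String docId, boolean delete) {
            this.id = id;
            this.docId = docId;
            this.delete = delete;
        }
    }

    /** Stand-in for the SolrJ side; a real version would wrap a Solr client. */
    interface Indexer {
        void apply(List<ChangeEvent> batch) throws Exception;
    }

    // In-memory stand-in for the EVENTS table, oldest row first.
    private final Deque<ChangeEvent> eventsTable = new ArrayDeque<>();
    private final Indexer indexer;
    private final int batchSize;

    EventsTablePoller(Indexer indexer, int batchSize) {
        this.indexer = indexer;
        this.batchSize = batchSize;
    }

    /** Producer side: what the trigger would do, i.e. append a row. */
    void enqueue(ChangeEvent event) {
        eventsTable.addLast(event);
    }

    /** Number of events still waiting to be indexed. */
    int pending() {
        return eventsTable.size();
    }

    /**
     * Consumer side: one polling pass. Reads up to batchSize of the oldest
     * rows, hands them to the indexer, and deletes them only on success.
     * Returns the number of events indexed in this pass.
     */
    int pollOnce() {
        List<ChangeEvent> batch = new ArrayList<>();
        for (ChangeEvent event : eventsTable) {
            if (batch.size() == batchSize) {
                break;
            }
            batch.add(event);
        }
        if (batch.isEmpty()) {
            return 0;
        }
        try {
            indexer.apply(batch);
        } catch (Exception e) {
            // Indexing failed (e.g. Solr is down): keep the rows and
            // let the next polling pass retry them.
            return 0;
        }
        for (int i = 0; i < batch.size(); i++) {
            eventsTable.removeFirst();
        }
        return batch.size();
    }
}
```

In a real setup pollOnce would be called on a timer (for example from a scheduled executor), enqueue would be replaced by the trigger's INSERT, and the deque reads and removals would become SELECT and DELETE statements against the EVENTS table. The same "acknowledge only after success" rule applies if the table is swapped for Kafka or SQS.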