If you absolutely want to use Kafka after trying other mechanisms, I would
suggest Kafka Connect. Jeremy Custenborder has a good Kafka Connect sink
connector for Solr. You can define your own Avro schemas on the Kafka topic
that adhere to your Solr schema, which gives you that degree of control.
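For example, here is a minimal sketch of defining such a schema programmatically with Avro's SchemaBuilder; the record name, namespace, and fields are made-up placeholders meant to mirror whatever your Solr schema actually defines:

    import org.apache.avro.Schema;
    import org.apache.avro.SchemaBuilder;

    public class ProductTopicSchema {
        // Build an Avro schema whose fields line up with the Solr schema.
        public static Schema build() {
            return SchemaBuilder.record("Product")
                    .namespace("com.example.index")   // assumed namespace
                    .fields()
                    .requiredString("id")             // matches the Solr uniqueKey
                    .requiredString("title")          // matches a Solr text field
                    .optionalDouble("price")          // matches a Solr pdouble field
                    .optionalLong("created_ts")       // matches a Solr date stored as long
                    .endRecord();
        }
    }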
We have us
Depending on how complicated you need this to be, you can just write
your own indexer in SolrJ; see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
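For illustration, a minimal sketch of that approach: read rows over JDBC, turn each into a SolrInputDocument, and send them in batches. The JDBC URL, table, columns, and collection name below are placeholders, not taken from this thread:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class DbToSolr {
        public static void main(String[] args) throws Exception {
            try (SolrClient solr = new HttpSolrClient.Builder(
                         "http://localhost:8983/solr/mycollection").build();
                 Connection db = DriverManager.getConnection(
                         "jdbc:postgresql://localhost/mydb", "user", "pass");
                 Statement stmt = db.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, title, price FROM products")) {

                List<SolrInputDocument> batch = new ArrayList<>();
                while (rs.next()) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", rs.getString("id"));
                    doc.addField("title", rs.getString("title"));
                    doc.addField("price", rs.getDouble("price"));
                    batch.add(doc);
                    if (batch.size() == 1000) {   // send in batches, not one doc at a time
                        solr.add(batch);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    solr.add(batch);
                }
                solr.commit();                    // a single commit at the end
            }
        }
    }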
You haven't said a lot about the characteristics of your situation.
Are you talking 1B rows
from the DB? 1M? What is the pain point? Because until on
Apache NiFi may also be something of interest: https://nifi.apache.org/
Regards,
Alex.
On Thu, 31 Jan 2019 at 11:15, Mikhail Khludnev wrote:
Hello,
I did this deck some time ago. It might be useful for choosing one.
https://docs.google.com/presentation/d/e/2PACX-1vQzi3QOZAwLh_t3zs1gH9EGCB2HKUgiN3WJRGHpULyA-GleCrQ41dIOINa18h_XG64BX5D_ZG6jKmXL/pub?start=false&loop=false&delayms=3000
Note, as far as I understand Lucidworks' answer to this
I recommend looking at the underlying problem that you are trying to solve. Writing your
own loader requires thorough technical design (e.g. recoverability in case of
errors, stopping when the user requests it, proper multithreading without
overloading the cluster, etc.) - I have not seen many that were well
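On the multithreading point in particular, SolrJ's ConcurrentUpdateSolrClient is one way to get parallel indexing with back-pressure built in, since adds block once the bounded queue is full. A minimal sketch; the URL, queue size, thread count, and fields are all assumptions for illustration:

    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BoundedLoader {
        public static void main(String[] args) throws Exception {
            try (ConcurrentUpdateSolrClient solr =
                     new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycollection")
                         .withQueueSize(10000)   // documents buffered before add() blocks
                         .withThreadCount(4)     // parallel update streams to Solr
                         .build()) {

                for (int i = 0; i < 1_000_000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", Integer.toString(i));
                    doc.addField("title", "row " + i);
                    solr.add(doc);               // blocks when the queue is full (back-pressure)
                }
                solr.blockUntilFinished();       // drain queued updates
                solr.commit();
            }
        }
    }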
Hello,
As we all know, DIH is single-threaded and has its own issues while indexing.
I got to know that we can write our own APIs to pull data from the DB and push it
into Solr. One such approach I heard of uses Apache Kafka.
Can any of you send me the links and guides to use Apache Kafka for this?