I don't think the Solr Data Import Handler has a Cassandra plugin (entity
processor) yet, so the most straightforward approach is to write a Java app
that reads from Cassandra, reads the corresponding RDBMS data, combines the
two, and then uses SolrJ to add the documents to Solr.
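
A rough sketch of the combine step, using only the Java standard library.
Everything named here is an assumption for illustration: the class
MergeSketch, the field names "id" and "text", and the in-memory maps that
stand in for the real Cassandra and MySQL reads. The SolrJ call is left as
a comment so the sketch runs without a Solr instance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MergeSketch {

    /**
     * Combine the full text read from Cassandra with the structured row
     * read from the RDBMS into one flat set of fields for a Solr document.
     * The field names "id" and "text" are hypothetical; use whatever the
     * Solr schema actually defines.
     */
    static Map<String, Object> mergeDocument(String id, String fullText,
                                             Map<String, Object> structuredRow) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("text", fullText);
        if (structuredRow != null) {
            for (Map.Entry<String, Object> e : structuredRow.entrySet()) {
                // Never let an RDBMS column overwrite the id or text fields.
                doc.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the Cassandra read: ID -> full text.
        Map<String, String> cassandraRows = new LinkedHashMap<>();
        cassandraRows.put("42", "full text of document 42");

        // Hypothetical stand-in for the MySQL read: ID -> structured columns.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("category", "news");
        row.put("year", 2014);
        Map<String, Map<String, Object>> mysqlRows = new LinkedHashMap<>();
        mysqlRows.put("42", row);

        for (Map.Entry<String, String> e : cassandraRows.entrySet()) {
            Map<String, Object> doc =
                mergeDocument(e.getKey(), e.getValue(), mysqlRows.get(e.getKey()));
            // In the real app, copy each entry into a SolrJ SolrInputDocument
            // with addField(name, value) and send it with the client's add().
            System.out.println(doc);
        }
    }
}
```

In practice you would probably fetch the RDBMS rows in batches of IDs
rather than one query per document, and add documents to Solr in batches
as well, but that depends on your data volume.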
Your best bet is to get that RDBMS data moved to Cassandra or DSE ASAP. All
you have until then is a stopgap measure rather than a robust architecture.
-- Jack Krupansky
-----Original Message-----
From: Yavar Husain
Sent: Tuesday, July 22, 2014 2:22 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cassandra MySQL Best Practice Indexing
Thanks, Jack, for your guidance on DSE. However, it would be great if somebody
could help me solve my use case:
So my full text data lies on Cassandra along with an ID. Now I have a lot
of structured data linked to the ID which lies on an RDBMS (read MySQL). I
need this structured data as it would help me with my faceting and other
needs. What is the best practice for indexing in this scenario?
I will think about incremental indexing for the new records later.
Bit confused. Any help would be appreciated.
On Mon, Jul 21, 2014 at 6:51 PM, Jack Krupansky <j...@basetechnology.com>
wrote:
Solandra is not a supported product. DataStax Enterprise (DSE) supersedes
it. With DSE, just load your data into a Solr-enabled Cassandra data center
and it will be indexed automatically in the embedded Solr within DSE, as
per a Solr schema that you provide. Then use any of the nodes in that
Solr-enabled Cassandra data center just the same as with normal Solr.
-- Jack Krupansky
-----Original Message-----
From: Yavar Husain
Sent: Monday, July 21, 2014 8:37 AM
To: solr-user@lucene.apache.org
Subject: Solr Cassandra MySQL Best Practice Indexing
So my full text data lies on Cassandra along with an ID. Now I have a lot
of structured data linked to the ID which lies on an RDBMS (read MySQL). I
need this structured data as it would help me with my faceting and other
needs. What is the best practice for indexing in this scenario?
My thoughts (maybe weird):
1. Read the data from Cassandra and, for each ID read, fetch the corresponding
row from MySQL, build an XML document on the fly (one per ID), and send
it to Solr for indexing without storing anything in between.
2. I do not know much about Solandra. However, even if I use it, I will
still have to go to MySQL to fetch the structured data.
3. Consolidate the data by moving everything from Cassandra into MySQL or
vice versa, but that would mean duplicating data.
I will think about incremental indexing for the new records later.
Bit confused. Any help would be appreciated.