Hi,
I am not sure if this is the right forum for this question, but it would be
great if I could be pointed in the right direction. We have been using a
combination of MySQL and Solr for all our company's full-text and query needs.
But as our customer base has grown, so has the amount of data, and MySQL is
just not proving to be the right option for storing and querying it.
I have been looking at SolrCloud and it looks really impressive, but I am not
sure if we should give up our separate storage system. I have also been
exploring DataStax, but a commercial option is out of the question. So we were
thinking of using HBase to store the data and, at the same time, indexing the
data into SolrCloud, but for many reasons this design doesn't seem convincing
(I have also looked at the basics of Lily).
1) Would it be recommended to just use SolrCloud with multiple replicas, or
does HBase plus Solr seem like the better option?
2) How much strain would it be to keep both a Solr shard and an HBase node on
the same machine?
3) Is there a calculation for what kind of machine configuration I would need
to store 500-1000 million records, and how many shards? Most of these will be
social data (Twitter/Facebook/blogs etc.).
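To make question 3 concrete, this is the kind of back-of-envelope calculation
I have in mind. It is only a sketch: the average document size, the index
overhead factor, and the per-shard size limit are placeholder assumptions, not
measurements from our data, and the function name is just for illustration.

```python
import math

def estimate_shards(num_docs, avg_doc_bytes, index_overhead, max_shard_bytes):
    """Return (total_index_bytes, shard_count) for a rough sizing estimate."""
    # Raw data size scaled by an assumed factor for index structures.
    total_index_bytes = num_docs * avg_doc_bytes * index_overhead
    # Round up so the last partial shard is still counted.
    shard_count = math.ceil(total_index_bytes / max_shard_bytes)
    return total_index_bytes, shard_count

# All inputs below are placeholder guesses, not measured values.
total, shards = estimate_shards(
    num_docs=1_000_000_000,        # upper end of the 500-1000 million range
    avg_doc_bytes=2_000,           # assumed average size of one social record
    index_overhead=1.5,            # assumed index size relative to raw data
    max_shard_bytes=50 * 10**9,    # assumed comfortable size per shard
)
print(f"index ~ {total / 10**12:.1f} TB across {shards} shards")
```

I would like to know whether this is a sensible way to approach it, and what
realistic values for those inputs would be.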
Regards,
Ayush