Shawn, this seems to be a great implementation. I am trying to do something similar, but I am planning to use SolrCloud, which manages shards and replication automatically; please correct me if I am wrong. I also explored MongoDB and Cassandra, and MongoDB was good enough to use, but the main concern was creating and maintaining the index.
The same issue still persists with SolrCloud, but the DataImportHandler comes to the rescue. My database is very large, around 2-3 TB, and is ACL-based. I was able to create the initial full index with a 4-shard configuration on a 10-machine cluster using the DataImportHandler.

But the real issue starts now: how do I keep the index up to date with the database, given that my search needs to be near real time? There will be a few million updates a day, and the index needs to reflect them. So I have planned to implement a JMS interface to update the index asynchronously alongside the DB. This seems to be a very heavy task considering the update counts mentioned above. The root cause of this heavy load is the per-document ACL implementation in Solr: each Solr doc carries a list of the users who have access to it, which incurs a heavy update load on the index.

Can you please suggest an alternative for keeping the index up to date, and also give your input on how to manage ACLs?

Thanks a lot.

--
View this message in context: http://lucene.472066.n3.nabble.com/hot-shard-concept-tp4016676p4017491.html
Sent from the Solr - User mailing list archive at Nabble.com.