Hi, I am a newbie SolrCloud enthusiast. My goal is to implement an infrastructure for text analysis (clustering, classification, information extraction, sentiment analysis, etc.).
My development environment is a single machine with a quad-core processor, 16 GB of RAM and a 1 TB hard disk. I have started implementing a pipeline with Apache Flume, using Twitter as the source and SolrCloud (running within JBoss AS 7) as the sink. I am using ZooKeeper (5 servers) to upload the configuration and manage the cluster. The pseudo-distributed cluster consists of one collection with three shards, each with three replicas.

Everything runs smoothly for a while. After 50,000 tweets have been committed (CloudSolrServer actually commits after every batch of 500 documents), SolrCloud randomly starts logging exceptions: Lucene file not found, IndexWriter cannot be opened, replication unsuccessful and the like. Recovery starts but keeps failing until the replica goes down.

I have tried different Solr versions (4.10.2, 4.9.1 and lastly 4.8.1) with the same results. I looked everywhere for help before writing this email. My guess right now is that the problem lies in the connection between SolrCloud and ZooKeeper, although I haven't seen any exception pointing to that.

Any reference or help will be welcome.

Cheers,
B.
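
P.S. In case it helps to see the commit pattern described above, here is a minimal sketch of it. This is a simplified illustration, not the actual sink code: BatchIndexer and DocumentSink are hypothetical names, and DocumentSink only stands in for the add and commit calls that the sink makes on CloudSolrServer. Documents are plain strings here to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchIndexer {

    /** Hypothetical stand-in for the client: one call to send documents, one to commit. */
    public interface DocumentSink {
        void add(List<String> docs);
        void commit();
    }

    private final DocumentSink sink;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    public BatchIndexer(DocumentSink sink, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Buffers one document and flushes as soon as a full batch has accumulated. */
    public void index(String doc) {
        buffer.add(doc);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends the buffered documents and issues an explicit commit; does nothing if the buffer is empty. */
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.add(new ArrayList<>(buffer)); // copy so the sink never sees later mutations
        sink.commit();
        buffer.clear();
    }
}
```

With a batch size of 500, this pattern issues one explicit commit per 500 tweets, so roughly 100 commits have happened by the time the exceptions start appearing.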