On 9/11/2013 1:07 PM, Deepak Konidena wrote:
> Are you suggesting a multi-core setup, where all the cores share the same
> schema, and the cores lie on different disks?
>
> Basically, I'd like to know if I can distribute shards/segments on a single
> machine (with multiple disks) without the use of zookeeper.

Basically, I'd like to know if I can distribute shards/segments on a single
machine (with multiple disks) without the use of zookeeper.

Sure, you can do it all manually. At that point you would not be using SolrCloud at all, because the way to enable SolrCloud is to tell Solr where zookeeper lives.

Without SolrCloud, there is no cluster automation at all. There is no "collection" paradigm; you just have cores. You have to send updates to the correct core; they will not be redirected for you. Similarly, queries will not be load balanced automatically. For Java clients, the CloudSolrServer object can work seamlessly when servers go down. If you're not using SolrCloud, you can't use CloudSolrServer.
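To make "send updates to the correct core" concrete: without SolrCloud you need a deterministic rule that maps a document's unique key to one of your shard cores, and every indexing client must use the same rule. A minimal sketch (the core names, host, and hash scheme here are all hypothetical illustrations, not what SolrCloud itself uses):

```java
import java.util.List;

public class ManualRouter {
    // Hypothetical shard cores on a single host; adjust to your own layout.
    static final List<String> SHARD_CORES = List.of(
            "http://localhost:8983/solr/shard1",
            "http://localhost:8983/solr/shard2",
            "http://localhost:8983/solr/shard3");

    // Pick a core from the document's unique key. Any stable hash works,
    // as long as every client that indexes documents uses the same one;
    // otherwise the same key can end up duplicated across shards.
    static String coreForDoc(String uniqueKey) {
        int bucket = Math.floorMod(uniqueKey.hashCode(), SHARD_CORES.size());
        return SHARD_CORES.get(bucket);
    }

    public static void main(String[] args) {
        // The same key must always route to the same core.
        System.out.println("doc-42 -> " + coreForDoc("doc-42"));
        System.out.println("doc-43 -> " + coreForDoc("doc-43"));
    }
}
```

Once a document is routed, you point your client (for example, a SolrJ HttpSolrServer constructed with that core's URL) at the chosen core for the add and commit.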

You would be in charge of creating the shards parameter yourself. On my own index, I handle this with a "broker" core that has no index of its own, but whose solrconfig.xml includes the shards and shards.qt parameters in all the request handler definitions. You can also include the parameter with the query itself.
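As an illustration of that broker-core setup (the core names and host below are hypothetical examples, not a prescribed layout), the broker's solrconfig.xml might bake the distributed-search parameters into the handler defaults like this:

```xml
<!-- In the broker core's solrconfig.xml; shard core names are examples. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="shards">localhost:8983/solr/shard1,localhost:8983/solr/shard2,localhost:8983/solr/shard3</str>
    <str name="shards.qt">/select</str>
  </lst>
</requestHandler>
```

With this in place, clients query the broker core normally and it fans the request out to the listed shard cores.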

You would also have to handle redundancy yourself, either with replication or with independently updated indexes. I use the latter method, because it offers a lot more flexibility than replication.

As mentioned in another reply, setting up RAID across a lot of disks may work better than trying to split your index across separate filesystems, each on its own disk. I would recommend RAID10 for Solr, and it works best as hardware RAID with a controller that has battery-backed (or NVRAM) cache.

Thanks,
Shawn
