On 9/7/2013 2:25 PM, mike st. john wrote:
> yes the collections api ignored it, what i ended up doing, was just
> building out some fairness in regards to creating the cores and calling
> coreadmin to create the cores, seemed to work ok. Only issue i'm having
> now, and i'm still investigating is subsequent queries are returning
> different counts.

Every time I have seen distributed queries return different counts on different runs, it is because documents with the same value in the UniqueKey field exist in more than one shard. If you are letting SolrCloud route your documents automatically, this shouldn't happen ... but if you are using distrib=false or a router that doesn't route documents automatically, then it could.

The Collections API doesn't support the dataDir parameter. I suspect this is because you could pass in an absolute path, which would break things because every core would be trying to use the same dataDir. If you want a directory other than ${instanceDir}/data for dataDir, then you will need to create each core individually with CoreAdmin rather than use the Collections API.

Java does have the capability to determine whether a path is relative or absolute, but it is safer to just ignore that parameter, especially given that a single cloud is usually spread across many servers, and there's no reason those servers can't be running wildly different operating systems. Half your cloud could be on a Linux/UNIX OS and half of it could be on Windows.

I personally find it better to let the Collections API do its thing and use the default.

Thanks,
Shawn
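
To check whether the same UniqueKey value lives in more than one shard, here is a minimal sketch in Python. It assumes you have already collected the key values from each shard separately (for example by querying each core directly with distrib=false and fl=id). The function name and the sample data are hypothetical and not part of Solr.

```python
from collections import defaultdict


def find_cross_shard_duplicates(ids_by_shard):
    """Return {key: sorted shard names} for keys present in more than one shard.

    ids_by_shard maps a shard name to an iterable of UniqueKey values
    that were fetched from that shard alone.
    """
    shards_for_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_for_id[doc_id].add(shard)
    # Keep only keys that were seen in at least two different shards.
    return {
        doc_id: sorted(shards)
        for doc_id, shards in shards_for_id.items()
        if len(shards) > 1
    }


# Hypothetical sample data standing in for per-shard query results.
sample = {
    "shard1": ["a", "b", "c"],
    "shard2": ["c", "d"],
    "shard3": ["e", "a"],
}

for doc_id, shards in find_cross_shard_duplicates(sample).items():
    print(f"id {doc_id} exists in: {', '.join(shards)}")
```

Any id this reports is one that a distributed query may count once or twice depending on which replicas answer, which would account for counts that vary between runs.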