Jan Høydahl, my problem is intimately connected to Solr. It is not a batch job for Hadoop; it is a distributed real-time query scheme. I hate to add yet another complex framework if a Solr RP can do the job simply.
For this problem, I can transform a Solr query into a subset query on each shard, and then let the SolrCloud mechanism merge the results. I am well aware of the 'zoo' of alternatives, and I will be evaluating them if I can't get what I want from Solr.

On Mon, Apr 9, 2012 at 9:34 AM, Jan Høydahl <jan....@cominvent.com> wrote:
> Hi,
>
> Instead of using Solr, you may want to have a look at Hadoop or another
> framework for distributed computation, see e.g.
> http://java.dzone.com/articles/comparison-gridcloud-computing
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 9. apr. 2012, at 13:41, Benson Margulies wrote:
>
>> I'm working on a prototype of a scheme that uses SolrCloud to, in
>> effect, distribute a computation by running it inside of a request
>> processor.
>>
>> If there are N shards and M operations, I want each node to perform
>> M/N operations. That, of course, implies that I know N.
>>
>> Is that fact available anyplace inside Solr, or do I need to just
>> configure it?
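A note on the quoted question: inside a SolrCloud request processor, N can in principle be read from the cluster state kept in ZooKeeper (roughly, the number of slices registered for the collection; the exact API varies by Solr version, so treat that as an assumption). Once a node knows N and its own shard index, a round-robin split gives each shard roughly M/N of the operations. A minimal self-contained sketch of that split, with hypothetical names (ShardPartition, operationsForShard, shardIndex are illustrations, not Solr APIs):

```java
import java.util.ArrayList;
import java.util.List;

public class ShardPartition {
    /**
     * Assign each shard its slice of M operations. Operation i goes to
     * shard (i % numShards), so every shard ends up with either
     * floor(M/N) or ceil(M/N) operations.
     */
    static List<Integer> operationsForShard(int totalOps, int numShards, int shardIndex) {
        List<Integer> mine = new ArrayList<>();
        for (int op = 0; op < totalOps; op++) {
            if (op % numShards == shardIndex) {
                mine.add(op);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // 10 operations across 3 shards: shard 0 gets 4, shards 1 and 2 get 3.
        for (int shard = 0; shard < 3; shard++) {
            System.out.println("shard " + shard + " -> "
                    + operationsForShard(10, 3, shard));
        }
    }
}
```

In a real request processor, shardIndex would be derived from the node's own position in the collection (e.g. by sorting the shard names and finding this node's shard), so that the N slices are disjoint and cover all M operations.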