Hi Dan,

This feels a bit like a buzzword soup... with mushrooms. :)
MR jobs, at least the ones in Hadoopland, are very batch oriented, so that wouldn't be very suitable for most search applications. There are some technologies like Riak that combine MR and search. Let me use this funny little link: http://lmgtfy.com/?q=riak%20mapreduce%20search

Sure, you can put indices on HDFS (but don't expect searches to be fast). Sure you can create indices using MapReduce, we've done that successfully for customers bringing long indexing jobs from many hours to minutes by using, yes, a cluster of machines (actually EC2 instances).

But when you say "more into SOLR on the cloud (e.g. HDFS + MR + cloud of commodity machines)", I can't actually picture what precisely you mean...

Otis
---
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Dmitry Kan <dmitry....@gmail.com>
> To: solr-user@lucene.apache.org
> Cc: Upayavira <u...@odoko.co.uk>
> Sent: Fri, March 25, 2011 8:26:33 AM
> Subject: Re: solr on the cloud
>
> Hi, Upayavira
>
> Probably I'm confusing the terms here. When I say "distributed faceting" I'm
> more into SOLR on the cloud (e.g. HDFS + MR + cloud of commodity machines)
> rather than into traditional multicore/sharded SOLR on a single or multiple
> servers with non-distributed file systems (is that what you mean when you
> refer to "distribution of facet requests across hosts"?)
>
> On Fri, Mar 25, 2011 at 1:57 PM, Upayavira <u...@odoko.co.uk> wrote:
>
> > On Fri, 25 Mar 2011 13:44 +0200, "Dmitry Kan" <dmitry....@gmail.com>
> > wrote:
> > > Hi Yonik,
> > >
> > > Oh, this is great. Is distributed faceting available in the trunk? What is
> > > the basic server setup needed for trying this out, is it cloud with HDFS and
> > > SOLR with zookepers?
> > > Any chance to see the related documentation? :)
> >
> > Distributed faceting has been available for a long time, and is
> > available in the 1.4.1 release.
> >
> > The distribution of facet requests across hosts happens in the
> > background. There's no real difference (in query syntax) between a
> > standard facet query and a distributed one.
> >
> > i.e. you don't need SolrCloud nor Zookeeper for it. (they may provide
> > other benefits, but you don't need them for distributed faceting).
> >
> > Upayavira
> >
> > > On Fri, Mar 25, 2011 at 1:35 PM, Yonik Seeley
> > > <yo...@lucidimagination.com>wrote:
> > >
> > > > On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > > > Basically, of high interest is checking out the Map-Reduce for distributed
> > > > > faceting, is it even possible with the trunk?
> > > >
> > > > Solr already has distributed faceting, and it's much more performant
> > > > than a map-reduce implementation would be.
> > > >
> > > > I've also seen a product use the term "map reduce" incorrectly... as in,
> > > > we "map" the request to each shard, and then "reduce" the results to a
> > > > single list (of course, that's not actually map-reduce at all ;-)
> > > >
> > >
> > > :) this sounds pretty strange to me as well. It was only my guess, that if
> > > you have MR as computational model and a cloud beneath it, you could
> > > naturally map facet fields to their counts inside single documents (no
> > > matter, where they are, be it shards or "single" index) and pass them onto
> > > reducers.
> > >
> > > > -Yonik
> > > > http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> > > > 25-26, San Francisco
> > >
> > > --
> > > Regards,
> > >
> > > Dmitry Kan
> >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source
>
> --
> Regards,
>
> Dmitry Kan
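
For anyone following the thread, here is a rough sketch of the point made above that a distributed facet query uses the same syntax as a standard one. It assumes Solr's `shards` request parameter is what turns a plain facet request into a distributed one; the host names, the `category` field, and both helper functions are hypothetical. The merge step is a simplified illustration of combining per-shard counts into a single list, not Solr's actual implementation.

```python
from collections import Counter
from urllib.parse import urlencode

# Hypothetical shard addresses; substitute your own hosts and cores.
SHARDS = ["host1:8983/solr", "host2:8983/solr"]


def facet_params(field, shards=None):
    """Build parameters for a facet request (hypothetical helper).

    The facet parameters are identical in both cases; passing `shards`
    is the only change needed to make the request distributed.
    """
    params = {"q": "*:*", "rows": 0, "facet": "true", "facet.field": field}
    if shards:
        params["shards"] = ",".join(shards)
    return params


def merge_facet_counts(per_shard_counts):
    """Sum per-shard facet counts into one list, highest count first.

    Simplified illustration only: it just adds up the counts that each
    shard reported for each facet value.
    """
    total = Counter()
    for counts in per_shard_counts:
        total.update(counts)
    return total.most_common()


standard = facet_params("category")
distributed = facet_params("category", SHARDS)

# Query strings that would be appended to a /select URL.
standard_query = urlencode(standard)
distributed_query = urlencode(distributed)

# Made-up counts standing in for what two shards might report.
merged = merge_facet_counts([{"books": 2, "music": 1}, {"books": 1, "film": 5}])
```

The only difference between `standard` and `distributed` is the extra `shards` entry, which is why no SolrCloud or ZooKeeper setup appears anywhere in the sketch.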