Re: solr on the cloud

Otis Gospodnetic Fri, 25 Mar 2011 13:28:42 -0700

Hi Dan,

This feels a bit like a buzzword soup.... with mushrooms. :)


MR jobs, at least the ones in Hadoopland, are very batch oriented, so they 
wouldn't be very suitable for most search applications.  There are some 
technologies like Riak that combine MR and search.  Let me use this funny 
little link: http://lmgtfy.com/?q=riak%20mapreduce%20search
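
To make the batch nature concrete, here is a minimal sketch in plain Python (no 
Hadoop involved) of the map/reduce style of facet counting discussed in the 
quoted thread below.  The documents, the "category" field and all function names 
are made up for illustration; this is not how Solr computes facets.

```python
from collections import defaultdict

# Hypothetical documents; in a real job these would be read from HDFS.
DOCS = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": "music"},
    {"id": 3, "category": "books"},
]

def map_doc(doc, facet_field):
    """Map step: emit (facet value, 1) for a single document."""
    if facet_field in doc:
        yield doc[facet_field], 1

def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per facet value."""
    totals = defaultdict(int)
    for value, count in pairs:
        totals[value] += count
    return dict(totals)

def facet_counts(docs, facet_field):
    # Every document is scanned on every run, which is what makes this a
    # batch job rather than something to run per search request.
    emitted = (pair for doc in docs for pair in map_doc(doc, facet_field))
    return reduce_counts(emitted)
```

Calling facet_counts(DOCS, "category") walks the whole document set each time, 
which is fine for an offline job but not for answering a query interactively.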


Sure, you can put indices on HDFS (but don't expect searches to be fast).  Sure, 
you can create indices using MapReduce.  We've done that successfully for 
customers, bringing long indexing jobs from many hours down to minutes by using, 
yes, a cluster of machines (actually EC2 instances).
But when you say "more into SOLR on the cloud (e.g. HDFS + MR + cloud of 
commodity machines)", I can't actually picture what precisely you mean...
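
For comparison, the distributed faceting described in the quoted thread below 
needs nothing cloud-specific.  As far as I know, it is an ordinary facet query 
with a "shards" parameter listing the hosts to fan out to.  A minimal sketch 
that only builds the request URL; the host names, the /solr path, the 
"category" field and the helper name are placeholders, not a real setup:

```python
from urllib.parse import urlencode

def facet_query_url(base_url, facet_field, shards=None):
    """Build a Solr facet query URL; pass shards to make it distributed."""
    params = [
        ("q", "*:*"),          # match all documents
        ("rows", "0"),         # only facet counts are wanted, no hits
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    if shards:
        # Assumed shard format: host:port/path entries, comma separated.
        params.append(("shards", ",".join(shards)))
    return base_url + "/select?" + urlencode(params)

# Single-index facet query (hypothetical host).
single = facet_query_url("http://localhost:8983/solr", "category")

# The same query fanned out over two hypothetical shards.
distributed = facet_query_url(
    "http://localhost:8983/solr",
    "category",
    shards=["host1:8983/solr", "host2:8983/solr"],
)
```

The only difference between the two URLs is the extra shards parameter; the 
facet parameters themselves stay the same.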


Otis
---
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Dmitry Kan <dmitry....@gmail.com>
> To: solr-user@lucene.apache.org
> Cc: Upayavira <u...@odoko.co.uk>
> Sent: Fri, March 25, 2011 8:26:33 AM
> Subject: Re: solr on the cloud
> 
> Hi, Upayavira
> 
> Probably I'm confusing the terms here. When I say "distributed faceting" I'm
> more into SOLR on the cloud (e.g. HDFS + MR + cloud of commodity machines)
> rather than into traditional multicore/sharded SOLR on a single or multiple
> servers with non-distributed file systems (is that what you mean when you
> refer to "distribution of facet requests across hosts"?)
> 
> On Fri, Mar 25, 2011 at 1:57 PM, Upayavira <u...@odoko.co.uk> wrote:
> 
> >
> >
> > On Fri, 25 Mar 2011 13:44 +0200, "Dmitry Kan" <dmitry....@gmail.com>
> > wrote:
> > > Hi Yonik,
> > >
> > > Oh, this is great. Is distributed faceting available in the trunk? What is
> > > the basic server setup needed for trying this out? Is it cloud with HDFS
> > > and SOLR with ZooKeeper?
> > > Any chance to see the related documentation? :)
> >
> > Distributed faceting has been available for a long time, and is
> > available in the 1.4.1 release.
> >
> > The distribution of facet requests across hosts happens in the
> > background. There's no real difference (in query syntax) between a
> > standard facet query and a distributed one.
> >
> > i.e. you don't need SolrCloud or ZooKeeper for it. (They may provide
> > other benefits, but you don't need them for distributed faceting.)
> >
> > Upayavira
> >
> > > On Fri, Mar 25, 2011 at 1:35 PM, Yonik Seeley
> > > <yo...@lucidimagination.com> wrote:
> > >
> > > > On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan <dmitry....@gmail.com>
> > > > wrote:
> > > > > Basically, of high interest is checking out the Map-Reduce for
> > > > > distributed faceting, is it even possible with the trunk?
> > > >
> > > > Solr already has distributed faceting, and it's much more performant
> > > > than a map-reduce implementation would be.
> > > >
> > > > I've also seen a product use the term "map reduce" incorrectly... as in,
> > > > we "map" the request to each shard, and then "reduce" the results to a
> > > > single list (of course, that's not actually map-reduce at all ;-)
> > > >
> > > >
> > > :) this sounds pretty strange to me as well. It was only my guess that
> > > if you have MR as the computational model and a cloud beneath it, you could
> > > naturally map facet fields to their counts inside single documents (no
> > > matter where they are, be it shards or a "single" index) and pass them
> > > on to reducers.
> > >
> > >
> > > > -Yonik
> > > > http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> > > > 25-26, San Francisco
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Dmitry Kan
> > >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source
> >
> >
> 
> 
> -- 
> Regards,
> 
> Dmitry Kan
>
