Re: Loading an index (generated by map reduce) in SolrCloud

KNitin Tue, 23 Sep 2014 08:41:01 -0700

Thanks for all the responses. I will try copying the corresponding segments
to the corresponding shards.
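
For reference, a rough sketch of what that copy could look like. It assumes
the job wrote one directory per shard and that each core keeps its index
under data/index. Every path and name here (part-0000N,
<collection>_shardN_replica1, the function names) is a placeholder, not
something confirmed in this thread, and the part-to-shard mapping has to
match the routing the job used. Keep the target cores unloaded (or Solr
stopped) while copying.

```shell
# Copy one MapReduce-built index directory into the matching core's index dir.
# All paths are hypothetical placeholders; adjust them to your layout.
copy_shard_index() {
  local src="$1"   # e.g. <mr-output>/part-00000/data/index
  local dest="$2"  # e.g. <solr-home>/mycollection_shard1_replica1/data/index
  [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
  mkdir -p "$dest"
  # Copy only the index files; core.properties and tlog directories live
  # outside data/index and are deliberately left alone.
  cp -R "$src"/. "$dest"/
}

# Assumed mapping: MapReduce output part i (0-based) goes to shard i+1.
copy_all_shards() {
  local mr_out="$1"
  local solr_home="$2"
  local collection="$3"
  local num_shards="$4"
  local i=0
  while [ "$i" -lt "$num_shards" ]; do
    copy_shard_index \
      "$(printf '%s/part-%05d/data/index' "$mr_out" "$i")" \
      "$solr_home/${collection}_shard$((i + 1))_replica1/data/index" || return 1
    i=$((i + 1))
  done
}
```

Usage for the 6-shard case would be something like
copy_all_shards /path/to/mr-output /path/to/solr-home mycollection 6
after creating a dummy shard to confirm where the data actually lands.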


On Wed, Sep 17, 2014 at 8:26 PM, ralph tice <ralph.t...@gmail.com> wrote:

> If you are updating or deleting from your indexes I don't believe it is
> possible to get a consistent copy of the index from the file system
> directly without monkeying with hard links.  The safest thing is to use the
> ADDREPLICA command in the Collections API and then an UNLOAD from the
> CoreAdmin API if you want to take the data offline.  If you don't care to
> use additional servers/JVMs, you can use the replication handler to make a
> backup instead.
>
> This older discussion covers most any backup strategy I can think of:
> http://grokbase.com/t/lucene/solr-user/12c37h0g18/backing-up-solr-4-0
>
> On Wed, Sep 17, 2014 at 9:01 PM, shushuai zhu <ss...@yahoo.com.invalid>
> wrote:
>
> > Hi, my case is a little simpler. For example, I have 100 collections now
> > in my SolrCloud, and I want to back up 20 of them so I can restore them
> > later. I think I can just copy the index and log for each shard/core to
> > another location, then delete the collections. Later, I can create new
> > collections (likely with different names), then copy the index and log
> > back to the right directory structure on the node. After that, I can
> > reload either the collection or the core.
> >
> > However, some testing shows this does not work. I could not reload the
> > collection or core. I have not tried restarting SolrCloud. Can someone
> > point out the best way to achieve the goal? I would prefer not to restart
> > SolrCloud.
> >
> > Shushuai
> >
> >
> > ________________________________
> >  From: ralph tice <ralph.t...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Wednesday, September 17, 2014 6:53 PM
> > Subject: Re: Loading an index (generated by map reduce) in SolrCloud
> >
> >
> > FWIW, I do a lot of moving Lucene indexes around and as long as the core
> > is unloaded it's never been an issue for Solr to be running at the same
> > time.
> >
> > If you move a core into the correct hierarchy for a replica, you can call
> > the Collections API's CREATESHARD action with the appropriate params
> > (make sure you use createNodeSet to point to the right server) and Solr
> > will load the index appropriately.  It's easier to create a dummy shard
> > and see where data lands on your installation than to try to guess.
> >
> > Ex:
> > PORT=8983
> > SHARD=myshard
> > COLLECTION=mycollection
> > SOLR_HOST=box1.mysolr.corp
> > curl "http://${SOLR_HOST}:${PORT}/solr/admin/collections?action=CREATESHARD&shard=${SHARD}&collection=${COLLECTION}&createNodeSet=${SOLR_HOST}:${PORT}_solr"
> >
> > One file to watch out for if you are moving cores across machines/JVMs is
> > the core.properties file, which you don't want to duplicate to another
> > server/location when moving a data directory.  I don't recommend trying
> > to move transaction logs around either.
> >
> >
> >
> >
> >
> > On Wed, Sep 17, 2014 at 5:22 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> > > Details please. You say MapReduce. Is this the
> > > MapReduceIndexerTool? If so, you can use
> > > the --go-live option to auto-merge them. Your
> > > Solr instances need to be running over HDFS
> > > though.
> > >
> > > If you don't have Solr running over HDFS, you can
> > > just copy the results for each shard "to the right place".
> > > What that means is that you must ensure that the
> > > shards produced via MRIT get copied to the corresponding
> > > Solr local directory for each shard. If you put the wrong
> > > one in the wrong place you'll have trouble with multiple
> > > copies of documents showing up when you re-add any
> > > doc that already exists in your Solr installation.
> > >
> > > BTW, I'd surely stop all my Solr instances while copying
> > > all this around.
> > >
> > > Best,
> > > Erick
> > >
> > > On Wed, Sep 17, 2014 at 1:41 PM, KNitin <nitin.t...@gmail.com> wrote:
> > > > Hello
> > > >
> > > >  I have generated a Lucene index (with 6 shards) using Map Reduce. I
> > > > want to load this into a SolrCloud cluster inside a collection.
> > > >
> > > > Is there any out of the box way of doing this?  Any ideas are much
> > > > appreciated
> > > >
> > > > Thanks
> > > > Nitin
> > >
> >
>
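
For the backup question quoted above, a rough sketch of the requests ralph
describes (ADDREPLICA, then UNLOAD, or the replication handler as the
alternative). Host names, core names, and the backup path are hypothetical
placeholders; the core name Solr assigns to a new replica in particular
should be checked rather than assumed. The script only builds and prints the
URLs, so each one can be sent by hand with curl once the previous step has
finished.

```shell
# All values below are hypothetical placeholders.
SOLR_HOST=box1.mysolr.corp          # node that currently hosts the shard
NEW_NODE_HOST=box2.mysolr.corp      # node that should receive the extra replica
PORT=8983
COLLECTION=mycollection
SHARD=shard1
SRC_CORE=mycollection_shard1_replica1  # assumed name of the existing core
NEW_CORE=mycollection_shard1_replica2  # assumed name of the new replica's core
BACKUP_DIR=/backups/solr               # assumed directory on the Solr node

# 1. Collections API: add a replica of the shard on the new node.
ADDREPLICA_URL="http://${SOLR_HOST}:${PORT}/solr/admin/collections?action=ADDREPLICA&collection=${COLLECTION}&shard=${SHARD}&node=${NEW_NODE_HOST}:${PORT}_solr"

# 2. CoreAdmin API: once the replica has caught up, unload it on its own node
#    to take that copy of the data offline.
UNLOAD_URL="http://${NEW_NODE_HOST}:${PORT}/solr/admin/cores?action=UNLOAD&core=${NEW_CORE}"

# Alternative without extra servers/JVMs: replication handler backup.
BACKUP_URL="http://${SOLR_HOST}:${PORT}/solr/${SRC_CORE}/replication?command=backup&location=${BACKUP_DIR}"

# Print the requests for review; nothing is sent to Solr here.
for url in "$ADDREPLICA_URL" "$UNLOAD_URL" "$BACKUP_URL"; do
  echo "$url"
done
```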
