On 03/28/2010 02:54 PM, Blargy wrote:
I was hoping someone could explain to me how your Solr multicore process currently operates. This is what I am thinking about and I was hoping I could get some ideas/suggestions. I have a master/slave setup where the master will be doing all the indexing via DIH. I'll be doing a full-import every day or two, with delta-imports being run throughout the day. I want to have an offline core that will be responsible for the full-importing, and when finished it will be swapped with the live core. While the full-import may take a few hours on the offline core, I'll have delta-imports running on the live core. All slaves will be replicating from the master live core. Any comments on this logic?
What's the purpose of the full import if you will also be doing delta imports? Won't the live core end up the same as the offline core that got the full import? I'm sure you have a reason, just not following...
Ok, now to the implementation. I've been playing around with the core admin all day today but I'm still unsure of the best way to accomplish the above process. I'm guessing first I need to create a new core. Then I'll have to issue a DIH full-import against this new core. Then I'll run a swap command against the offline and live cores, which should switch the cores. This sounds about right, but then I'll have a core named live which will not actually be live anymore, right? Is there any way around this?
Hmm...this is not really true. The core that is accessed by hitting /live will always be the live core (though the underlying SolrCore object will change) if that is the access path you use for live traffic - see below.
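To make the three steps you describe concrete, here is a rough sketch that only builds the request URLs and sends nothing. The host and port are the usual localhost:8983 defaults, the instanceDir value "items-offline" is just borrowed from your directory example, and the /dataimport path assumes that is where the DataImportHandler is registered in your solrconfig.xml. The helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust to your own host, port and webapp path.
SOLR = "http://localhost:8983/solr"


def core_admin_url(action, **params):
    """Build a CoreAdmin request URL (hypothetical helper, sends nothing)."""
    query = urlencode({"action": action, **params})
    return f"{SOLR}/admin/cores?{query}"


def dih_url(core, command):
    """Build a DIH request URL, assuming the handler lives at /dataimport."""
    return f"{SOLR}/{core}/dataimport?{urlencode({'command': command})}"


steps = [
    # 1. Create the offline core (the instanceDir value is an assumption).
    core_admin_url("CREATE", name="offline", instanceDir="items-offline"),
    # 2. Run the full-import against the offline core.
    dih_url("offline", "full-import"),
    # 3. Swap the two names so /live serves the freshly built index.
    core_admin_url("SWAP", core="live", other="offline"),
]

for url in steps:
    print(url)
```

As far as I know DIH starts the import in the background and returns right away, so you would want to poll dataimport?command=status and wait for the import to finish before issuing the SWAP.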
When setting up the new core, what should I use for my instanceDir and dataDir? At first I had something like this:
home/items/data/live/index
home/items/data/offline/index
but I don't think this is right. Should I have something like this?
home/items/data/index
home/items-offline/data/index
Yes - like this, with the index dir under the data dir. But you should only create the data dir yourself - the core will create the index dir when it does not find one. You will have issues if you make an empty index dir: seeing the dir, the core won't create it, and so the index will never get created inside it.
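A small sketch of that directory setup, in case it helps. It runs against a temporary directory standing in for your Solr home, and the function name is hypothetical. The point is simply that it stops at data and never creates data/index.

```python
import tempfile
from pathlib import Path


def prepare_core_dirs(solr_home, instance_name):
    """Create <solr_home>/<instance_name>/data, deliberately without data/index."""
    data_dir = Path(solr_home) / instance_name / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


# A temporary directory stands in for the real Solr home here.
with tempfile.TemporaryDirectory() as home:
    for name in ("items", "items-offline"):
        data_dir = prepare_core_dirs(home, name)
        # The index dir is left for the core to create on first use.
        print(data_dir.name, "exists:", data_dir.is_dir(),
              "| index exists:", (data_dir / "index").exists())
```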
When creating a new core from an existing core do the index files get copied?
I'm not sure what you mean here. I'm guessing you mean the swap command you reference above?
Swap will simply change what path references which core. So to start, localhost:8983/solr/live will hit one core, and localhost:8983/solr/offline will hit another core. You will direct all traffic to /live. Once you do the swap(live,offline), the live URL will actually hit the other core, and the offline URL will hit the previously live core. So there is no move or copy of files - it simply swaps which name accesses which core. Same thing if you are using solrj - it just changes which access name brings back a given underlying core.
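If it helps to picture it, here is a toy model of the name lookup. This is not Solr's actual code; the class and field names are invented purely to illustrate that a swap exchanges two name-to-core references and touches no files.

```python
class CoreRegistry:
    """Toy name -> core lookup table (hypothetical, not a Solr class)."""

    def __init__(self):
        self.cores = {}

    def register(self, name, core):
        self.cores[name] = core

    def swap(self, a, b):
        # Only the references are exchanged; nothing is moved or copied.
        self.cores[a], self.cores[b] = self.cores[b], self.cores[a]

    def get(self, name):
        return self.cores[name]


registry = CoreRegistry()
registry.register("live", {"index_dir": "home/items/data/index"})
registry.register("offline", {"index_dir": "home/items-offline/data/index"})

registry.swap("live", "offline")

# The name "live" now resolves to the core that was built offline.
print(registry.get("live")["index_dir"])
print(registry.get("offline")["index_dir"])
```

After the swap, anything asking for "live" gets the core whose index sits under items-offline, and each index directory stays exactly where it was on disk.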
Can someone please explain this whole process to me? Thanks!
-- - Mark http://www.lucidimagination.com