Solr core swap after rebuild in HA-setup / High-traffic

2012-03-14 Thread KeesSchepers
Hello everybody,

I am designing a new Solr architecture for one of my clients. This Solr
architecture is for a high-traffic website with millions of visitors, but I am
facing some design problems where I hope you guys could help me out.

In my situation there are 4 Solr servers running: 1 master and 3
slaves. They are running Solr version 1.4.

I use two cores, 'live' and 'rebuild', and I use the Solr DIH to rebuild a core,
which goes like this:

1. I wipe the rebuild core
2. I run the DIH over the complete dataset (4 million documents) in pieces of
20,000 records (to prevent very long MySQL locks)
3. After the DIH is finished (2 hours) we also have to update the
rebuild core with the changes from the last two hours; this is the problem
4. After updating is done and the core is no more than a few seconds behind,
we want to SWAP the cores.
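
The four steps above can be sketched as a sequence of HTTP requests. This is only a rough illustration, not the poster's actual setup: the base URL, the `/dataimport` handler path, the `/admin/cores` admin path, and the `offset`/`limit` request parameters are all assumptions. The last two would only have an effect if the query in data-config.xml reads them (for example via `${dataimporter.request.offset}`). The code only builds the URLs; it does not send them.

```python
from urllib.parse import urlencode

# Assumed base URL of the master; adjust to the real host and port.
SOLR = "http://localhost:8983/solr"


def dih_url(core, command, **params):
    """Build a DataImportHandler request URL for one core."""
    query = {"command": command}
    query.update(params)
    return f"{SOLR}/{core}/dataimport?{urlencode(query)}"


def swap_url(core, other):
    """Build the CoreAdmin SWAP request that exchanges two cores."""
    query = {"action": "SWAP", "core": core, "other": other}
    return f"{SOLR}/admin/cores?{urlencode(query)}"


def batches(total, size):
    """Yield (offset, limit) windows that together cover `total` rows."""
    for offset in range(0, total, size):
        yield offset, min(size, total - offset)


def rebuild_plan(total_docs=4_000_000, batch_size=20_000):
    """Return the ordered list of request URLs for one rebuild cycle."""
    plan = []
    for i, (offset, limit) in enumerate(batches(total_docs, batch_size)):
        # clean=true only on the first batch, so the wipe (step 1) happens once.
        plan.append(dih_url("rebuild", "full-import",
                            clean="true" if i == 0 else "false",
                            offset=offset, limit=limit))
    # Step 3: catch up on changes made while the full import was running.
    plan.append(dih_url("rebuild", "delta-import"))
    # Step 4: exchange the two cores.
    plan.append(swap_url("live", "rebuild"))
    return plan
```

With the numbers from the post this gives 200 full-import batches, followed by one delta-import and one SWAP. The sketch does not address step 3 itself, which is the open question of this thread.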

Everything goes well except for step 3. The rebuild and the core swap are both
fine.

Because the website changes every minute, we cannot pause the
delta-import on the live core and let it fall behind for 2 hours. The problem is
that I can't figure out a watertight approach that does not delay the live core
too long and still uses the DIH instead of requiring a lot of custom code.

Did anyone face this problem before or could give me some tips?

Thanks!


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-core-swap-after-rebuild-in-HA-setup-High-traffic-tp3826461p3826461.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr core swap after rebuild in HA-setup / High-traffic

2012-03-14 Thread KeesSchepers
Well, the point is as follows.

We have a MySQL table where all the changes are tracked, something very
similar to your situation. The first problem is that the delta-import on
the live core needs to update this table to mark a record as done. I do
this in a very ugly way right now, within a script transformer; of course DIH
isn't designed for this.

The second thing is that while the rebuild is running on the rebuild core, we
want to run a delta-import on this new core so that it falls less far behind the
live core. But while the rebuild is in progress, the delta-import on the live
core also keeps running every minute.

The second problem is that the delta-import on the live core has already set
these rows to status 'processed', so the delta-update after the rebuild
won't pick up these changes anymore.

There are some solutions, but I can't figure out a clean way to solve this
architecture problem. Maybe there isn't a clean solution.

I am curious how other developers have dealt with this.
