Thanks, Otis. Responses inline.
Hi,
We're using the new replication and it's working pretty well. There's
one detail I'd like to get some more information about.
As the replication works, it creates versions of the index in the data
directory. Originally we had index/, but now there are dated versions
such as index.20100127044500/, which are the replicated versions.
Each copy is about 65GB. With our current hard drive it's fine to have
two around, but three gets a little dicey. Sometimes we find that the
replication doesn't clean up after itself. I would like to understand
this better, or to keep it from happening. It could be a configuration
issue.
Some more specific questions:
- Is it safe to remove the index/ directory (the one without the date
on it)? I think I tried this once and the whole thing broke, though
maybe something else was wrong at the time.
No, that's the real, live index; you don't want to remove that one.
Yeah... I tried it once and remember things breaking.
However, nothing in this directory has been modified for over a week
(since the last replication initialization), and I'm still sitting on
130GB of data for what is only 65GB on the master.
- Is there a way to know which one is the current one? (I'm looking at
the file index.properties, and it seems to be correct, but sometimes
there's a newer version in the directory, which is later removed.)
I think the "index" one is always current, no? If not, I imagine
the admin replication page will tell you, or even the Statistics page.
e.g.
reader :
SolrIndexReader{this=46a55e,r=readonlysegmentrea...@46a55e,segments=1}
readerDir : org.apache.lucene.store.NIOFSDirectory@/mnt/solrhome/cores/foo/data/index
reader :
SolrIndexReader{this=5c3aef1,r=readonlydirectoryrea...@5c3aef1,refCnt=1,segments=9}
readerDir : org.apache.lucene.store.NIOFSDirectory@/home/solr/solr_1.4/solr/data/index.20100127044500
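For what it's worth, here is a minimal sketch of how one might check
which directory a slave treats as current. It assumes index.properties
in the data directory is a Java-style properties file carrying an
`index=<dirname>` entry, and that a missing file or key means plain
index/ is in use. The function names are made up for illustration. It
only reports and deletes nothing; the readerDir shown on the Statistics
page remains the authoritative answer.

```python
import os


def current_index_dir(data_dir):
    """Return the directory name that index.properties points at.

    Assumption: the file holds an `index=<dirname>` entry; if the file
    or the key is absent, fall back to the plain "index" directory.
    """
    props_path = os.path.join(data_dir, "index.properties")
    if not os.path.isfile(props_path):
        return "index"
    with open(props_path, encoding="latin-1") as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines and properties-file comments.
            if not line or line[0] in "#!":
                continue
            key, sep, value = line.partition("=")
            if sep and key.strip() == "index":
                return value.strip()
    return "index"


def other_index_dirs(data_dir):
    """List index or index.<timestamp> directories that are not current."""
    current = current_index_dir(data_dir)
    found = []
    for name in sorted(os.listdir(data_dir)):
        is_index = name == "index" or name.startswith("index.")
        full_path = os.path.join(data_dir, name)
        if is_index and name != current and os.path.isdir(full_path):
            found.append(name)
    return found
```

Calling `other_index_dirs("/path/to/data")` gives the candidates that
are taking up extra disk space; whether any of them is safe to remove
still needs to be confirmed against the running reader.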
- Could it be that the index does not finish replicating within the
poll interval I give it? What happens if, say, there's a poll interval
X and replicating the index sometimes takes longer than X? (Our current
poll interval is 45 minutes, and every time I've watched it, it has
completed in time.)
I think only 1 replication will/should be happening at a time.
Whew, that's comforting.
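The "only one at a time" behaviour can be pictured with a non-blocking
lock: a poll that fires while a previous replication is still running
is skipped rather than queued. This is a generic sketch of that
pattern, not Solr's actual code; `poll_once` and `replicate` are
hypothetical names.

```python
import threading

# One lock shared by every poll tick.
_replication_lock = threading.Lock()


def poll_once(replicate):
    """Run replicate() unless a previous run is still in progress.

    Returns True if this tick ran the replication, False if skipped.
    """
    # Non-blocking acquire: a tick that finds the lock held gives up
    # immediately instead of waiting its turn.
    if not _replication_lock.acquire(blocking=False):
        return False
    try:
        replicate()
        return True
    finally:
        _replication_lock.release()
```

Under this scheme a replication that runs longer than the poll interval
just causes the overlapping ticks to be dropped, and the next tick after
it finishes picks up whatever is newest.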