I think it comes with some caveats, but it is now workable (although it may not give great performance), assuming you're using Lucene 2.3 (or possibly 2.2, I'm not sure) or later. I would definitely search the Lucene mailing list archives for NFS, paying particular attention to Mike McCandless' comments.

On Jun 30, 2008, at 1:08 PM, Bill Au wrote:

Isn't using Lucene over NFS *not* recommended?

Bill

On Mon, Jun 30, 2008 at 4:27 AM, Nico Heid <[EMAIL PROTECTED]> wrote:

Hey, I'm looking for some feedback on the following setup.
Due to the architect's decision, I will be working with NFS rather than
Solr's own distribution scripts.

A few Solr indexing machines use Multicore to divide the 300.000 users
into 1000 shards.
For several reasons we have to go with per-user sharding (as you can see,
300 users per shard). Updates come in at about 166 updates per hour on
each shard, so that is not a problem.
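
For a sense of scale, the figures above work out roughly as follows. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and it assumes users and updates are spread evenly across the shards, which the message does not state.

```python
# Back-of-the-envelope load estimate from the figures quoted above.
# Assumption: users and updates are spread evenly across all shards.
users = 300_000
shards = 1_000
updates_per_shard_per_hour = 166

users_per_shard = users // shards
total_updates_per_hour = updates_per_shard_per_hour * shards
total_updates_per_second = total_updates_per_hour / 3600

print(users_per_shard)                     # 300
print(total_updates_per_hour)              # 166000
print(round(total_updates_per_second, 1))  # 46.1
```

So the per-shard rate is low, but the aggregate across all masters is on the order of a few dozen updates per second.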

The question lies more in this concept: I set up a few query slaves using
read-only NFS mounts.
I do not use the index directory for the read-only slaves. I patched the
slaves to use the most recent snapshot directory to avoid all the nasty
NFS issues (only a quick and dirty hack for testing). At a not yet defined
interval I take a snapshot on the masters and send an HTTP commit to the
slaves, so a new reader is opened on the fresh snapshot.
This seems to work without trouble so far, but I have not done extensive
testing.
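
The "most recent snapshot directory" selection described above could look something like the sketch below. It is not the actual patch, which is not shown in this thread. It assumes snapshots live directly under the data directory and are named `snapshot.` followed by a 14-digit timestamp (the convention I believe Solr's snapshooter script uses); the function name `latest_snapshot` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme: snapshot.<yyyymmddHHMMSS>, e.g. snapshot.20080630132700
SNAPSHOT_NAME = re.compile(r"^snapshot\.(\d{14})$")


def latest_snapshot(data_dir: str) -> Optional[Path]:
    """Return the newest snapshot directory under data_dir, or None if there is none."""
    candidates = []
    for entry in Path(data_dir).iterdir():
        match = SNAPSHOT_NAME.match(entry.name)
        if match and entry.is_dir():
            candidates.append((match.group(1), entry))
    if not candidates:
        return None
    # Fixed-width timestamps sort chronologically as plain strings.
    return max(candidates, key=lambda pair: pair[0])[1]
```

A slave would call this when the commit arrives and open its new reader on the returned path. Anything that does not match the pattern, including the live `index` directory and half-written temporary directories, is ignored.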

To take this a step further (only an idea so far): I let the slaves work
on the real index as long as I do not optimize. Because the directory
structure does not change as long as I do not optimize, I can send commits
to the slaves.
Before I optimize, I take a snapshot, send the slaves a special "commit"
to make them fall back to the most recent snapshot dir, optimize the
index, and send them a real commit when done.
Even though this is a little trickier, the query slaves would be more up
to date.
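
The ordering of that optimize procedure can be written down as a small sketch. All four callables are hypothetical hooks standing in for whatever the master and slaves actually do; in particular, the special fall-back "commit" is part of the idea above, not an existing Solr feature.

```python
from typing import Callable, List


def optimize_with_fallback(
    take_snapshot: Callable[[], None],
    send_fallback_commit: Callable[[], None],
    optimize_index: Callable[[], None],
    send_real_commit: Callable[[], None],
) -> None:
    """Run the four steps in the order described above (all hooks are hypothetical)."""
    take_snapshot()         # 1. capture a copy of the index on the master
    send_fallback_commit()  # 2. slaves reopen their reader on that snapshot
    optimize_index()        # 3. optimize while no slave reads the live index
    send_real_commit()      # 4. slaves switch back to the live index


# Usage with stand-in hooks that only record the call order.
log: List[str] = []
optimize_with_fallback(
    lambda: log.append("snapshot"),
    lambda: log.append("fallback commit"),
    lambda: log.append("optimize"),
    lambda: log.append("real commit"),
)
print(log)  # ['snapshot', 'fallback commit', 'optimize', 'real commit']
```

If the optimize step raises, the real commit is never sent, so in this sketch the slaves simply stay on the snapshot until someone intervenes.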

So if you have any design comments or see major or minor flaws, feedback
would be very welcome.

I do not use live data yet, this is the experimental stage. But I'll give
feedback on how it performs and what issues I run into. There's also the
faint chance of letting this setup (or a "fixed" one) run on the real user
data, which would be roughly 20TB of usable data for indexing. This would
be really interesting :-)

Have a nice week
Nico




--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ






