On 4/12/2011 6:21 AM, stockii wrote:
> Hello.
> When I start an optimize (which takes more than 4 hours), no updates from
> DIH are possible.
> I thought Solr copies the whole index and then runs the optimize on the
> copy, rather than locking the index and optimizing it in place ... =(
> Is there any way to do both at the same time?

You can't index and optimize at the same time, and I'm pretty sure that
there isn't any way to make it possible that wouldn't involve a major
rewrite of Lucene, and possibly Solr. The devs would have to say
differently if my understanding is wrong.
The optimize takes place at the Lucene level. I can't give you much
in-depth information, but I can give you a high-level overview. What
it's doing is equivalent to a merge, down to one segment. This is not
the same as a straight file copy. It must read the entire Lucene data
structure and build a new one from scratch. The process removes deleted
documents and will also upgrade the version number of the index if it
was written with an older version of Lucene. It's very likely that the
reading side of the process is nearly as comprehensive as the CheckIndex
program, but it also has to write out a new index segment.
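
To make the "merge down to one segment" idea concrete, here is a toy sketch in Python. It is not Lucene code and does not reflect Lucene's real on-disk format; the Segment class and the optimize function are hypothetical names invented for this illustration. It only shows why every live document has to be read and rewritten, and why deleted documents vanish along the way.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an index segment (hypothetical, not Lucene's format)."""
    docs: dict                                   # doc id -> stored content
    deleted: set = field(default_factory=set)    # ids marked as deleted


def optimize(segments):
    """Read every segment and write one new segment holding only live docs."""
    merged = {}
    for seg in segments:
        for doc_id, content in seg.docs.items():
            # Deleted documents are skipped, so they are absent from the result.
            if doc_id not in seg.deleted:
                merged[doc_id] = content
    return [Segment(docs=merged)]


segments = [
    Segment({1: "a", 2: "b"}, deleted={2}),
    Segment({3: "c"}),
    Segment({4: "d", 5: "e"}, deleted={4}),
]
optimized = optimize(segments)
print(len(optimized), sorted(optimized[0].docs))
```

Even in this simplified form, the cost scales with the total size of the index rather than with the number of recent changes, which is why it is so much heavier than a file copy of a few new segments.
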
The net result -- the process gives your CPU and especially your I/O
subsystem a workout, simultaneously. If you were to make your I/O
subsystem faster, you would probably see a major improvement in your
optimize times.
On my installation, it takes about 11 minutes to optimize one of my 16GB
shards, each of which holds 9 million docs. These live in virtual machines
that are stored on a six-drive RAID10 array of 7200RPM SATA disks. One of
my pie-in-the-sky upgrade dreams is to replace that with a four-drive
RAID10 array of SSDs, with the other two drives being regular SATA disks
holding a mirrored OS partition.
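
For a rough sense of scale, the 16GB-in-11-minutes figure can be turned into an average read rate. This is back-of-the-envelope arithmetic only: it assumes 1GB = 1024MB, that the whole index is read exactly once, and it ignores the write side entirely. The function name is made up for this example.

```python
def read_rate_mb_per_s(index_gb, minutes):
    """Average read rate if the whole index is read once in the given time."""
    return index_gb * 1024 / (minutes * 60)


rate = read_rate_mb_per_s(16, 11)
print(f"{rate:.1f} MB/s")
```

Plugging in your own index size and optimize time gives a comparable number for your hardware.
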
Thanks,
Shawn