Sorry, I should have given more background. We currently have 3.8 million documents averaging 0.7MB each, so we have extremely large shards. We build about 400,000 documents per shard, resulting in roughly 200GB/shard. We also use LVM snapshots to maintain a snapshot of the shard, which we serve from while we continue to build.
In order to optimize the building shard of around 200GB, we need 400GB of
disk space to allow for the 2x size increase during optimization. Due to the
nature of snapshotting, the volume containing the snapshot has to be as large
as the build volume, i.e. 400GB, so each 200GB shard effectively ties up
800GB of disk.
If we could write the optimized build shard elsewhere instead of "in 
place" we could avoid the need for the serving volume to match the size 
of the building volume.
We'd like to avoid the need to have 200GB+ hanging around just to 
optimize.
The responses we got on whether we could optimize by writing "elsewhere"
make it clear that's not a solution.
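
For what it's worth, the kind of thing we were imagining at the raw Lucene
level is roughly the following. This is only a sketch against the old
Lucene 2.x-era API, the paths are made up, and as noted above it isn't
something Solr itself supports:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class OptimizeElsewhereSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical paths: the live build index and a scratch volume.
            Directory buildIndex = FSDirectory.getDirectory("/solr/data/index");
            Directory scratch    = FSDirectory.getDirectory("/scratch/optimized");

            // Open a writer on the scratch volume and merge the existing
            // index into it. addIndexes() writes the merged segments into
            // this (new) index, so the big merged segment lands on the
            // scratch volume rather than alongside the source segments.
            IndexWriter writer = new IndexWriter(scratch, new StandardAnalyzer(), true);
            writer.addIndexes(new Directory[] { buildIndex });

            // Force the result down to a single segment on the scratch volume.
            writer.optimize();
            writer.close();
        }
    }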
I posted another question to the list a bit ago asking whether
mergeFactor=1 would give us a single-segment index that is always
optimized, so that we don't have the 2x overhead.
However, running a build with mergeFactor=1 shows that lots of segments
still get created and merged, and that the index grows in size but also
shrinks at intervals to some degree. It is not clear how big the index is
at any point in time.
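
To make the in-place behaviour concrete, here is a rough sketch against the
2.x-era Lucene IndexWriter API (the path is made up, and newer Lucene has
since replaced these methods). The point is just that merged segments,
including the single segment optimize() produces, are written into the same
directory as the segments they replace, which is where the transient 2x
disk usage comes from:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class InPlaceOptimizeSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical path to the building shard's index.
            IndexWriter writer = new IndexWriter(
                    FSDirectory.getDirectory("/solr/data/index"),
                    new StandardAnalyzer(),
                    false); // open the existing index

            // mergeFactor controls how many same-sized segments accumulate
            // before Lucene merges them; the javadoc of that era says it
            // must never be less than 2, so even a very low setting still
            // leaves multiple segments between merges -- the index is not
            // kept permanently at one segment.
            writer.setMergeFactor(2);

            // ... addDocument() calls happen here during the build ...

            // optimize() merges everything down to a single segment, but
            // the new segment is written into the SAME directory; the old
            // segments are only deleted after the merge completes, hence
            // the temporary need for roughly twice the index size in free
            // space.
            writer.optimize();
            writer.close();
        }
    }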

Chris Hostetter wrote:
: Is it possible to tell Solr or Lucene, when optimizing, to write the files
: that constitute the optimized index to somewhere other than
: SOLR_HOME/data/index or is there something about the optimize that requires
: the final segment to be created in SOLR_HOME/data/index?

        For what purpose?

http://people.apache.org/~hossman/#xyproblem
XY Problem

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341




-Hoss