Ha! Searching "partial optimize" on
http://www.lucidimagination.com/search, we discover SOLR-603, which
adds a 'maxSegments' option to the <optimize> command. The text there
does not include the word 'partial'.
It's documented on http://wiki.apache.org/solr/UpdateXmlMessages. The
option takes the number of Lucene segments you want the index merged down to.
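A minimal sketch of what that update message looks like, posted to the
usual /update handler (the target of 10 segments is just an illustration,
not a recommendation):

    <optimize maxSegments="10"/>

Instead of merging everything down to a single segment, this asks Lucene
to stop once the index is at 10 segments or fewer, which is the "partial
optimize" behavior being asked about.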
I've heard there is a new "partial optimize" feature in Lucene, but it
is not mentioned in the Solr or Lucene wikis so I cannot advise you
how to use it.
On a previous project we had a 500GB index for 450m documents. It took
14 hours to optimize. We found that Solr worked well, given enough RAM ...
I've now worked on three different search engines and they all have a
3X worst
case on space, so I'm familiar with this case. --wunder
On Oct 1, 2009, at 7:15 AM, Mark Miller wrote:
Nice one ;) It's not technically a case where optimize requires > 2x,
though, in case the user asking gets confused. ...
bq. and reindex without any merges.
That's actually quite a hoop to jump as well - though if you're determined
and you have tons of RAM, it's somewhat doable.
Mark Miller wrote:
> Nice one ;) It's not technically a case where optimize requires > 2x,
> though, in case the user asking gets confused. It's a ...
Nice one ;) It's not technically a case where optimize requires > 2x,
though, in case the user asking gets confused. It's a case unrelated to
optimize that can grow your index. Then you need < 2x for the optimize,
since you won't copy the deletes.
It also requires that you jump hoops to delete everything and reindex
without any merges.
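For reference, the "delete everything" step is normally done with a
delete-by-query update message; a sketch, assuming the stock XML update
handler:

    <delete><query>*:*</query></delete>
    <commit/>

After the commit the documents are only marked as deleted; the space is
not reclaimed until those segments are merged away, which is exactly why
the old full-size index hangs around in that scenario.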
Here is how you need 3X. First, index everything and optimize. Then
delete everything and reindex without any merges.
You have one full-size index containing only deleted docs, one full-size
index containing the reindexed docs, and you need that much space again
for the third index that the optimize writes.
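To put rough numbers on it, using Phillip's 200GB shard purely as an
illustration: 200GB of old, all-deleted segments + 200GB of freshly
reindexed segments + up to 200GB for the merged output of the optimize
comes to roughly 600GB of peak disk, i.e. about 3X the live index size.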
Honestly, disk is cheap.
Whoops - the way I have mail come in, it's not easy to tell if I'm replying
to the Lucene or Solr list ;)
The way Solr works with Searchers and reopen, it shouldn't run into a
situation that requires greater than 2x to optimize. I won't guarantee
it ;) But based on what I know, it shouldn't happen under normal
circumstances.
Phillip Farber wrote:
> I am trying to automate a build process that adds documents to 10
> shards over 5 machines and need to limit the size of a shard to no
> more than 200GB because I only have 400GB of disk available to
> optimize a given shard.
>
> Why does the size (du) of an index typically ...
It may take some time before resources are released and garbage
collected, so that may be part of the reason why things hang around
and du doesn't report much of a drop.
On Oct 1, 2009, at 8:54 AM, Phillip Farber wrote:
I am trying to automate a build process that adds documents to 10
shards ...