On Wed, Oct 7, 2009 at 3:56 PM, Mark Miller wrote:
> I guess you can't guarantee 2x though, as if you have queries coming in
> that take a while, a commit opening a new Reader will not guarantee the
> old Reader is quite ready to go away. Might want to wait a short bit
> after the commit.
Right -
Yonik Seeley wrote:
> On Wed, Oct 7, 2009 at 3:31 PM, Mark Miller wrote:
>
>> I can't tell why calling a commit or restarting is going to help
>> anything
>>
>
> Depends on what scenarios you consider, and what you are taking 2x of.
>
> 1) Open reader on index
> 2) Open writer and add two
On Wed, Oct 7, 2009 at 3:31 PM, Mark Miller wrote:
> I can't tell why calling a commit or restarting is going to help
> anything
Depends on what scenarios you consider, and what you are taking 2x of.
1) Open reader on index
2) Open writer and add two documents... the first causes a large
merge,
Okay - I think I've got you - you're talking about the case of adding a
bunch of docs, not calling commit, and then trying to optimize. I keep
coming at it from a cold optimize. Making sense to me now.
Mark Miller wrote:
> I can't tell why calling a commit or restarting is going to help
> anything -
I can't tell why calling a commit or restarting is going to help
anything - or why you need more than 2x in any case. The only reason I
can see this being is if you have turned on auto-commit. Otherwise the
Reader is *always* only referencing what would have to be around anyway.
You're likely to jus
On Wed, Oct 7, 2009 at 3:16 PM, Phillip Farber wrote:
> Wow, this is weird. I commit before I optimize. In fact, I bounce tomcat
> before I optimize just in case. It makes sense, as you say, that then "the
> open reader can only be holding references to segments that wouldn't be
> deleted until
Oops, send before finished. "Partial Optimize" aka "maxSegments" is a
recent Solr 1.4/Lucene 2.9 feature.
As to 2x vs. 3x, the general wisdom is that an optimize on a "simple"
index takes at most 2x disk space, and on a "compound" index takes at
most 3x. "Simple" is the default (*). At Divvio we
Wow, this is weird. I commit before I optimize. In fact, I bounce
tomcat before I optimize just in case. It makes sense, as you say, that
then "the open reader can only be holding references to segments that
wouldn't be deleted until the optimize is complete anyway".
But we're still exceedin
On Wed, Oct 7, 2009 at 1:34 PM, Shalin Shekhar Mangar
wrote:
> On Wed, Oct 7, 2009 at 10:45 PM, Jason Rutherglen <
> jason.rutherg...@gmail.com> wrote:
>
>> It would be good to be able to commit without opening a new
>> reader however with Lucene 2.9 the segment readers for all
>> available segmen
On Wed, Oct 7, 2009 at 1:50 PM, Phillip Farber wrote:
> So this implies that for a "normal" optimize, in every case, due to the
> Searcher holding open the existing segment prior to optimize that we'd
> always need 3x even in the normal case.
>
> This seems wrong since it is repeatedly stated that i
To be clear, the SRs created by merges don't have the term index
loaded which is the main cost. One would need to use
IndexReaderWarmer to load the term index before the new SR becomes a
part of SegmentInfos.
On Wed, Oct 7, 2009 at 10:34 AM, Shalin Shekhar Mangar
wrote:
> On Wed, Oct 7, 2009 at
Yonik Seeley wrote:
Does this mean that there's always a lucene IndexReader holding segment
files open so they can't be deleted during an optimize so we run out of disk
space > 2x?
Yes.
A feature could probably now be developed that avoids opening a
reader until it's requested.
That wa
I think that argument requires auto commit to be on and opening readers
after the optimize starts? Otherwise, the optimized version is not put
into place until a commit is called, and a Reader won't see the newly
merged segments until then - so the original index is kept around in
either case - hav
On Wed, Oct 7, 2009 at 10:45 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> It would be good to be able to commit without opening a new
> reader however with Lucene 2.9 the segment readers for all
> available segments are already created and available via
> getReader which manages the
It would be good to be able to commit without opening a new
reader however with Lucene 2.9 the segment readers for all
available segments are already created and available via
getReader which manages the reference counting internally.
Using reopen redundantly creates SRs that are already held
inte
On Wed, Oct 7, 2009 at 12:51 PM, Phillip Farber wrote:
>
> In a separate thread, I've detailed how an optimize is taking > 2x disk
> space. We don't use solr distribution/snapshooter. We are using the default
> deletion policy = 1. We can't optimize a 192G index in 400GB of space.
>
> This thread