Thanks, this is helpful.

I do indeed periodically update or delete just about every doc in the index, so it makes sense that optimization might be necessary even post-1.4 -- but I'm still on 1.4, so I'll add this to the list of things to look into, rather than assume, after I upgrade.

Indeed, I was aware that it would trigger a pretty complete index replication, but since it seemed to greatly improve performance (in 1.4), so it goes. But yes, I'm STILL only updating once a day, even with all that. (And in fact, I'm only replicating once a day too, ha.)

On 7/25/2011 10:50 AM, Erick Erickson wrote:
Yeah, the 1.4 code base is "older". That is, optimization will have more
effect on that vintage code than on 3.x and trunk code.

I should have been a bit more explicit in that other thread. In the case
where you add a bunch of documents, optimization doesn't buy you all
that much currently. If you delete a bunch of docs (or update a bunch of
existing docs), then optimization will reclaim resources. So you *could*
have a case where the size of your index shrank drastically after
optimization (say you updated the same 100K documents 10 times then
optimized).

But even that is "it depends" (tm). The new segment merging, as I remember, will possibly reclaim deleted resources, but I'm parroting people who actually know, so you might want to verify that if it matters to you.

Optimization will almost certainly trigger a complete index replication to any
slaves configured, though.

So the usual advice is to optimize maybe once a day or week during off hours as a starting point, unless and until you can verify that your particular situation warrants optimizing more frequently.
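As a sketch of the "once a day during off hours" idea: the 3 AM window and the delay helper below are illustrative assumptions, and the actual SolrJ optimize() call is left as a placeholder comment so the sketch runs standalone with just the JDK.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NightlyOptimize {
    // Delay from 'now' until the next occurrence of 'target' (e.g. 3 AM).
    static Duration delayUntil(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's window has already passed
        }
        return Duration.between(now, next);
    }

    public static void main(String[] args) {
        Duration first = delayUntil(LocalDateTime.now(), LocalTime.of(3, 0));
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        // Run once per day, starting at the next 3 AM window.
        ses.scheduleAtFixedRate(
                () -> { /* a SolrJ server.optimize() call would go here */ },
                first.toMinutes(), TimeUnit.DAYS.toMinutes(1), TimeUnit.MINUTES);
        ses.shutdown(); // demo only: don't leave the scheduler running
    }
}
```

The same effect is often achieved with a cron job hitting the update handler; the point is just to keep the heavy optimize off the peak-traffic window.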

Best
Erick

On Fri, Jul 22, 2011 at 11:53 AM, Jonathan Rochkind<rochk...@jhu.edu>  wrote:
How old is 'older'?  I'm pretty sure I'm still getting much faster performance 
on an optimized index in Solr 1.4.

This could be due to the nature of my index and queries (which include some medium-sized stored fields, and extensive faceting -- faceting on up to a dozen fields in every request, where each field can include millions of unique values. Amazing I can do this with good performance at all!).

It's also possible I'm wrong about that faster performance; I haven't done robustly valid benchmarking on a clone of my production index yet. But it really looks that way to me, from what investigation I have done.

If the answer is that optimization is believed to be no longer necessary on versions LATER than 1.4, that might be the simplest explanation.
________________________________________
From: Pierre GOSSE [pierre.go...@arisem.com]
Sent: Friday, July 22, 2011 10:23 AM
To: solr-user@lucene.apache.org
Subject: RE: commit time and lock

Hi Mark

I've read that in a thread titled "Weird optimize performance degradation", where Erick Erickson states that "Older versions of Lucene would search faster on an optimized index, but this is no longer necessary.", and more recently in a thread you initiated a month ago, "Question about optimization".

I'd also be very interested if anyone has more precise ideas/data on the benefits and tradeoffs of optimize vs. merge...

Pierre


-----Original Message-----
From: Marc SCHNEIDER [mailto:marc.schneide...@gmail.com]
Sent: Friday, July 22, 2011 15:45
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Hello,

Pierre, can you tell us where you read that?
"I've read here that optimization is not always a requirement to have an
efficient index, due to some low level changes in lucene 3.xx"

Marc.

On Fri, Jul 22, 2011 at 2:10 PM, Pierre GOSSE<pierre.go...@arisem.com>wrote:

Solr will respond to searches during optimization, but commits will have to wait until the end of the optimization process.

During optimization a new index is generated on disk by merging every segment of the current index into one big segment, so your server will be busy, especially regarding disk access. This may alter your response time and has a very negative effect on the replication of the index if you have a master/slave architecture.

I've read here that optimization is not always a requirement for an efficient index, due to some low-level changes in Lucene 3.x, so maybe you don't really need optimization. What version of Solr are you using? Maybe someone can point toward a relevant link about optimization other than the Solr wiki:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations

Pierre


-----Original Message-----
From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
Sent: Friday, July 22, 2011 12:45
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Thanks for clarity.

One more thing I want to know about optimization.

Right now I am planning to optimize the server every 24 hours. Optimization also takes time (last time it took around 13 minutes), so I want to know:

1. When optimization is in progress, will the Solr server respond or not?
2. If the server will not respond, how can I make the optimization faster, or is there another way to do it so our users will not have to wait for the optimization process to finish?

regards
Jonty



On Fri, Jul 22, 2011 at 2:44 PM, Pierre GOSSE<pierre.go...@arisem.com> wrote:
Solr still responds to search queries during a commit; only new indexing requests will have to wait (until the end of the commit?). So I don't think your users will experience increased response times during commits (unless your server is badly undersized).

Pierre

-----Original Message-----
From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
Sent: Thursday, July 21, 2011 20:27
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Actually I am worried about the response time. I am committing around 500 docs every 5 minutes. As far as I know (correct me if I am wrong), at commit time the Solr server stops responding. My concern is how to minimize the response time so users do not need to wait, or whether some other logic will be required for my case. Please suggest.

regards
jonty

On Tuesday, June 21, 2011, Erick Erickson<erickerick...@gmail.com>
wrote:
What is it you want help with? You haven't told us what the
problem you're trying to solve is. Are you asking how to
speed up indexing? What have you tried? Have you
looked at: http://wiki.apache.org/solr/FAQ#Performance?

Best
Erick

On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods<jonty.rh...@gmail.com>
wrote:
I am using SolrJ to index the data. I have around 50000 docs indexed. At the time of commit, due to the lock, the server stops giving responses, so I was measuring the commit time:

double starttemp = System.currentTimeMillis();
server.add(docs);
server.commit();
System.out.println("total time in commit = " + (System.currentTimeMillis() - starttemp)/1000);

It takes around 9 seconds to commit 5000 docs with 15 fields. However, I am not sure when the index lock starts: from server.add(docs), or only from server.commit().

If I change the above to the following:

server.add(docs);
double starttemp = System.currentTimeMillis();
server.commit();
System.out.println("total time in commit = " + (System.currentTimeMillis() - starttemp)/1000);

then the commit time becomes less than 1 second. I am not sure which one is right.
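The difference between the two measurements is that the first one also counts the time spent in server.add(docs). A minimal sketch of timing the two phases separately (the Indexer interface is a hypothetical stand-in for the real SolrJ server, so the example runs without a live Solr instance):

```java
public class CommitTiming {
    // Hypothetical stand-in for a SolrJ server: add() would be
    // server.add(docs) and commit() would be server.commit().
    interface Indexer {
        void add() throws Exception;
        void commit() throws Exception;
    }

    // Returns { millis spent in add(), millis spent in commit() }.
    static long[] time(Indexer s) throws Exception {
        long t0 = System.currentTimeMillis();
        s.add();
        long t1 = System.currentTimeMillis();
        s.commit();
        long t2 = System.currentTimeMillis();
        return new long[] { t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) throws Exception {
        long[] t = time(new Indexer() {
            public void add() throws Exception { Thread.sleep(80); }    // simulated add
            public void commit() throws Exception { Thread.sleep(20); } // simulated commit
        });
        System.out.println("add=" + t[0] + "ms commit=" + t[1] + "ms");
    }
}
```

Measured this way it becomes clear whether the 9 seconds is dominated by the add phase or by the commit itself.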

please help.

regards
Jonty
