Re: Synchronizing commit and optimize

2006-04-28 Thread Marcus Stratmann
Yonik Seeley wrote:
>I think you are probably right about Jetty timing out the request.
>Solr doesn't implement timeouts for requests, and I haven't seen this
>behavior with Solr running on Resin.
>
>You could try another app server like Tomcat, or perhaps figure out if
>the Jetty timeout is configurable.

You were right, it's a Jetty issue.
In Jetty's configuration in jetty.xml I changed the parameter
maxIdleTime, which seems to be in milliseconds (I wasn't able to
find documentation for this anywhere). Increasing this value to
3600000 (1 hour) did the trick for me. The line is
3600000

The default value in the example installation is 30000. Maybe
it would be a good idea to increase this, too.
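
For anyone who wants to double-check the number: a minimal sketch of the conversion, assuming maxIdleTime really is in milliseconds as it seems to be. The class and method names are only illustrative and have nothing to do with Jetty's or Solr's own code.

```java
import java.util.concurrent.TimeUnit;

final class IdleTimeout {
    // Converts a timeout given in whole hours to milliseconds.
    static long hoursToMillis(long hours) {
        return TimeUnit.HOURS.toMillis(hours);
    }

    public static void main(String[] args) {
        // One hour expressed in milliseconds.
        System.out.println(hoursToMillis(1)); // prints 3600000
    }
}
```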

Thanks,
Marcus



Re: Java heap space

2006-04-28 Thread Chris Hostetter
: I'm currently testing a large index with more than 10 million
: documents and 24 fields, using the example installation with
: Jetty.
: When deleting or updating documents from the index or doing
: search queries I get "Java heap space" error messages like
: this (in this case while trying to delete documents):
...
: In both cases obviously the server ran out of heap space.
: I'm wondering what I can do to prevent this. I started the
: server using the java options "-server -Xmx1024m".
:
: Does anybody else have problems with heap space using a large
: index? Is there anything I can do about this?

How big is your physical index directory on disk?

Generally speaking, there isn't much you can do to reduce the memory
needs of a Lucene index that Solr isn't already doing: reusing a single
IndexSearcher as much as possible.

Your best bet is to allocate as much RAM to the server as you can.
Depending on how full your caches are, and what hit ratios you are getting
(the "STATISTICS" link from the Admin screen will tell you) you might want
to make some of them smaller to reduce the amount of RAM Solr uses for
them.

From an actual index standpoint, if you don't care about doc/field boosts
or lengthNorms, then the omitNorms="true" option on your fields (or
fieldtypes) will help save one byte per document per field you use it on.
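
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the 10 million documents and 24 fields mentioned in the question, and it assumes (pessimistically) that every one of those fields is indexed with norms. The class and method names are made up for illustration.

```java
final class NormsEstimate {
    // One norm byte per document per field that keeps norms.
    static long normsBytes(long docCount, int fieldsWithNorms) {
        return docCount * fieldsWithNorms;
    }

    public static void main(String[] args) {
        // Assumed figures: 10 million docs, all 24 fields carrying norms.
        long bytes = normsBytes(10_000_000L, 24);
        System.out.println(bytes + " bytes of norms"); // prints 240000000 bytes of norms
    }
}
```

Under those assumptions the norms alone come to about 240 million bytes, which is not negligible next to a 1024m heap. Treat it as an upper bound: fields that are not indexed, or that already omit norms, don't count.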


-Hoss



Re: Synchronizing commit and optimize

2006-04-28 Thread Chris Hostetter

: The default value in the example installation is 30000. Maybe
: it would be a good idea to increase this, too.

The example is really just designed to make it easy to see Solr in action
really fast ... not to demonstrate any recommended configuration of
Jetty -- if we make too many tweaks to the Jetty config, people might
misunderstand.

But we should definitely add something to the FAQ about timeouts, I'll do
that now.


-Hoss



Re: Java heap space

2006-04-28 Thread Marcus Stratmann

Chris Hostetter wrote:

> How big is your physical index directory on disk?

It's about 2.9G now.
Is there a direct connection between size of index and usage of ram?


> Your best bet is to allocate as much RAM to the server as you can.
> Depending on how full your caches are, and what hit ratios you are getting
> (the "STATISTICS" link from the Admin screen will tell you) you might want
> to make some of them smaller to reduce the amount of RAM Solr uses for
> them.

Hm, after disabling all caches I still get OutOfMemoryErrors.
All I do currently while testing is to delete documents. No searching or 
inserting. Typically after deleting about 20,000 documents the server 
throws the first error message.



> From an actual index standpoint, if you don't care about doc/field boosts
> or lengthNorms, then the omitNorms="true" option on your fields (or
> fieldtypes) will help save one byte per document per field you use it on.

That is something I could test, though I think this won't significantly
change the size of the index.


One thing that appears suspicious to me is that everything went fine as 
long as the number of documents was below 10 million. Problems started 
when this limit was exceeded. But maybe this is just a coincidence.


Marcus


Re: Java heap space

2006-04-28 Thread Chris Hostetter
: > How big is your physical index directory on disk?
: It's about 2.9G now.
: Is there a direct connection between size of index and usage of ram?

Generally yes.  Lucene loads a lot of your index into memory ... not
necessarily the "stored" fields, but quite a bit of the index structure
needed for searching ... not to mention things like the FieldCache if you
start sorting by specific fields.

You may want to consult the Lucene user list to get more advice about the
minimum amount of RAM recommended just for using a Lucene index of size
___G in a JVM ... personally, I've always made sure my JVM was
allocated a max heap size at least twice as big as my actual index ...
mainly a paranoia thing on my part.
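
As a rough illustration of that rule of thumb (and it is only a personal rule of thumb, not a documented Lucene requirement), here is a small sketch that turns an index size into a -Xmx value. The 2.9G figure is the one quoted above; the factor of 2 and all of the names are assumptions for the example.

```java
final class HeapRuleOfThumb {
    // Suggested max heap in megabytes: index size (in GB) times a safety factor.
    static long suggestedHeapMb(double indexSizeGb, double factor) {
        return Math.round(indexSizeGb * 1024 * factor);
    }

    public static void main(String[] args) {
        // 2.9 GB index, heap at least twice the index size.
        long mb = suggestedHeapMb(2.9, 2.0);
        System.out.println("-Xmx" + mb + "m"); // prints -Xmx5939m
    }
}
```

By that measure a 2.9G index would call for a heap of nearly 6G, compared with the -Xmx1024m currently in use.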

: Hm, after disabling all caches I still get OutOfMemoryErrors.
: All I do currently while testing is to delete documents. No searching or
: inserting. Typically after deleting about 20,000 documents the server
: throws the first error message.

interesting .. are you getting the OutOfMemory on an actual delete
operation or when doing a commit after executing some deletes?

part of the problem may be that under the covers, any delete involves
doing a query (even if you are deleting by uniqueKey, that's implemented
as a delete by Term, which requires iterating over a TermEnum to find the
relevant document), and if your index is big enough, loading that TermEnum
may be the cause of your OOM.

: One thing that appears suspicious to me is that everything went fine as
: long as the number of documents was below 10 million. Problems started
: when this limit was exceeded. But maybe this is just a coincidence.

Maybe, maybe not ... what options are you using in your solrconfig.xml's
indexDefaults and mainIndex blocks? ... 10 million documents could be the
magic point at which your mergeFactor triggers the merging of several
large segments into one uber segment -- which may be big enough to cause
an OOM when the IndexReader tries to open it.
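
To show why a round number like 10 million could be such a magic point, here is a deliberately simplified model of mergeFactor-style merging. It assumes a new segment is flushed every bufferedDocs documents, that mergeFactor equal-sized segments are always merged into one, and that there are no deletes. The values 10 and 1000 are hypothetical settings, not ones taken from the configuration in question, and the class is not Lucene code.

```java
final class MergeSketch {
    // Size (in docs) of the largest segment created by merging at the moment
    // the index reaches docCount documents, or 0 if nothing is merged then.
    static long largestMergeAt(long docCount, int mergeFactor, long bufferedDocs) {
        if (mergeFactor < 2 || bufferedDocs < 1) {
            throw new IllegalArgumentException("need mergeFactor >= 2 and bufferedDocs >= 1");
        }
        if (docCount <= 0 || docCount % bufferedDocs != 0) {
            return 0; // no segment is flushed at this point
        }
        long segments = docCount / bufferedDocs; // segments flushed so far
        long size = bufferedDocs;
        long largest = 0;
        // Whenever a level fills up with mergeFactor segments, they cascade
        // into a single segment that is mergeFactor times larger.
        while (segments % mergeFactor == 0) {
            size *= mergeFactor;
            largest = size;
            segments /= mergeFactor;
        }
        return largest;
    }

    public static void main(String[] args) {
        // Hypothetical settings: mergeFactor=10, 1000 buffered docs per flush.
        System.out.println(largestMergeAt(9_000_000L, 10, 1000));
        System.out.println(largestMergeAt(10_000_000L, 10, 1000));
    }
}
```

In this model the 9 millionth document only triggers a merge into a 1 million document segment, while the 10 millionth merges everything into a single 10 million document segment, which is the kind of jump described above.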

(Doug, Yonik, and Erik understand the underlying Lucene memory usage
better than I do, hopefully they'll chime in with some advice)


-Hoss