Could you give us a dump of http://localhost:port/solr/admin/luke?
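For example, something like (8983 being the default example port; substitute your own):

curl 'http://localhost:8983/solr/admin/luke?numTerms=0'

Passing numTerms=0 skips the per-field top-terms lists, which keeps the dump small.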

A huge maxFieldLength combined with random terms across 2000 2 MB files is going to be a bit of a resource hog :)

Can you explain why you are doing that? You will have *so* many unique terms...

I can't remember offhand whether you can set it in Solr, but there is a way in Lucene to lessen how much RAM the terms take (the term index interval, I believe?).
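If it is exposed in Solr, it would look something like this in solrconfig.xml (a sketch worth verifying against your version; 1024 is just an illustrative value, the Lucene default being 128):

<indexDefaults>
    <!-- Larger values keep fewer term-index entries in RAM,
         at the cost of slightly slower term lookups. -->
    <termIndexInterval>1024</termIndexInterval>
</indexDefaults>

In raw Lucene the equivalent knob is IndexWriter.setTermIndexInterval(int).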

- Mark

Gargate, Siddharth wrote:
Hi all,
        I am testing indexing with 2000 text documents of 2 MB
each. These documents contain words made up of random characters. I
observed that the Tomcat memory usage keeps increasing slowly. I tried
removing all the cache configuration, but memory usage still
increases. Once the memory reaches the specified max heap, commits
appear to block until the memory is freed. With larger documents, I
see some OOMEs.
        Below are a few properties set in solrconfig.xml:

<mainIndex>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>128</ramBufferSizeMB>
    <mergeFactor>25</mergeFactor>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>2147483647</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>10000</commitLockTimeout>

   <lockType>single</lockType>
   <unlockOnStartup>false</unlockOnStartup>
</mainIndex>

<autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>7000</maxTime>
</autoCommit>

<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>10</maxWarmingSearchers>

Where does the memory get used, and how can I avoid it?

Thanks,
Siddharth



--
- Mark

http://www.lucidimagination.com


