Could you give us a dump of http://localhost:port/solr/admin/luke ?
A huge max field length and random terms across 2000 2 MB files are going
to be a bit of a resource hog :)
Can you explain why you are doing that? You will have *so* many unique
terms...
I can't remember if you can set it in Solr, but there is a way in Lucene
to lessen how much RAM terms take (the term index interval, I believe?).
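If Solr does expose it, I'd expect it alongside the other index settings
in solrconfig.xml; a rough sketch from memory (the element name and the
value 256 are my guesses here, 128 being the Lucene default):

<mainIndex>
  <!-- A larger interval shrinks the in-memory term index (less RAM),
       at the cost of slower term lookups. -->
  <termIndexInterval>256</termIndexInterval>
</mainIndex>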
- Mark
Gargate, Siddharth wrote:
Hi all,
I am testing indexing with 2000 text documents of 2 MB each. These
documents contain words made up of random characters. I observed that the
Tomcat memory usage keeps increasing slowly. I tried removing all the
cache configuration, but memory usage still increases. Once the memory
reaches the specified max heap, commits appear to block until memory is
freed. With larger documents, I see some OOMEs.
Below are a few of the properties set in solrconfig.xml:
<mainIndex>
<useCompoundFile>false</useCompoundFile>
<ramBufferSizeMB>128</ramBufferSizeMB>
<mergeFactor>25</mergeFactor>
<maxMergeDocs>2147483647</maxMergeDocs>
<maxFieldLength>2147483647</maxFieldLength>
<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>
<lockType>single</lockType>
<unlockOnStartup>false</unlockOnStartup>
</mainIndex>
<autoCommit>
<maxDocs>10000</maxDocs>
<maxTime>7000</maxTime>
</autoCommit>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>10</maxWarmingSearchers>
Where is the memory being used, and how can I avoid it?
Thanks,
Siddharth
--
- Mark
http://www.lucidimagination.com