Hi Shawn,

Thanks for your valuable input.

For your information, we are using SQL Server.

Also, we will try using a JOIN instead of the cached entity and check the results.

Regards
P.Yuvaraj Kumar
--------------------------------------------
On Wed, 9/7/14, Shawn Heisey <s...@elyograg.org> wrote:

 Subject: Re: Getting OutOfMemoryError: Java heap space in Solr
 To: solr-user@lucene.apache.org
 Date: Wednesday, 9 July, 2014, 9:24 PM
 
 On 7/9/2014 6:02 AM, yuvaraj ponnuswamy wrote:
 > Hi,
 >
 > I am often getting "java.lang.OutOfMemoryError: Java heap space" in
 > production because a particular TreeMap is taking up too much memory
 > in the JVM.
 >
 > When I looked into the config files, I have an entity called
 > UserQryDocument where I am fetching data from certain tables.
 > It has a sub-entity called "UserLocation" where I am using the
 > CachedSqlEntityProcessor to get the fields from cache. It seems to
 > hold about 200,000 records in total.
 >
 > processor="CachedSqlEntityProcessor"
 > cacheKey="user_pin"
 > cacheLookup="UserQueryDocumentNonAuthor.DocKey">
 >
 > I have some other entities like this, where I am also using the
 > CachedSqlEntityProcessor in a sub-entity.
 >
 > When I looked into the heap dump (java_pid57.hprof) I can see that a
 > TreeMap is causing the problem, but I am not able to find which
 > entity is responsible. I am using the IBM Heap Analyzer to look into
 > the dump.
 >
 > Can you please let me know if there is any other way to find out
 > which entity is causing this issue, or any other tool to analyse and
 > debug the OutOfMemory issue down to the exact entity?
 >
 > I have attached the entity from dataconfig.xml and a heap analyzer
 > screenshot.
 
 JDBC drivers have a habit of loading the entire result set into RAM.
 Also, you are using the cached processor, which will effectively do
 the same thing.  With millions of DB rows, this is going to require a
 LOT of heap memory.  You'll want to change your JDBC connection so
 that it doesn't load the entire result set, and you may also need to
 turn off entity caching in Solr.  You didn't mention what database
 you're using.  Here's how to fix MySQL and SQL Server so they don't
 load the entire result set; the requirements for another database are
 likely to be different:
 
 
https://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
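 
 For example, with the Microsoft JDBC driver for SQL Server you would
 enable a server-side cursor on the connection URL.  This is only a
 sketch (host, database name, and credentials are placeholders):
 
   <dataSource type="JdbcDataSource"
               driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
               url="jdbc:sqlserver://dbhost;databaseName=mydb;selectMethod=cursor;responseBuffering=adaptive"
               user="solr"
               password="secret"/>
 
 With the jTDS driver the equivalent URL parameter is useCursors=true,
 and for MySQL, batchSize="-1" on the dataSource tells DIH to stream
 the result set instead of buffering it.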
 
 The best way to make DIH perform well is to use JOIN so that you can
 get all your data with one entity and one SELECT query.  Let the
 database do all the heavy lifting instead of having Solr send millions
 of queries.  GROUP_CONCAT on the SQL side and a RegexTransformer
 'splitBy' can sometimes be used to get multiple values into a field.
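 
 A hypothetical sketch of that approach (the table and column names
 here are invented, and GROUP_CONCAT is MySQL syntax; on SQL Server
 you would use STRING_AGG or FOR XML PATH instead):
 
   <entity name="UserDoc" transformer="RegexTransformer"
           query="SELECT u.doc_key, u.title,
                         GROUP_CONCAT(l.location) AS locations
                  FROM user_docs u
                  JOIN user_locations l ON l.user_pin = u.doc_key
                  GROUP BY u.doc_key, u.title">
     <field column="locations" splitBy=","/>
   </entity>
 
 One entity, one query, and the multivalued locations field is split
 apart by the RegexTransformer instead of by a cached sub-entity.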
 
 Thanks,
 Shawn
 
