I have a hierarchical data schema, i.e. an entity with a number of attributes. For a small subset of the data (about 300 MB) I can do the import with 3 GB of memory, but with the entire 4 GB dataset the import fails even with 9 GB of memory. I am using the SqlEntityProcessor as below:
<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost\MSSQLSERVER;databaseName=SolrDB;user=solrusr;password=solrusr;"/>
  <document>
    <entity name="Entity" query="SELECT EntID, Image FROM ENTITY_TABLE">
      <field column="EntID" name="EntID" />
      <field column="Image" name="Image" />

      <entity name="EntityAttribute1"
              query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=1"
              cacheKey="EntID"
              cacheLookup="Entity.EntID"
              processor="SqlEntityProcessor"
              cacheImpl="SortedMapBackedCache">
        <field column="AttributeValue" name="EntityAttribute1" />
      </entity>

      <entity name="EntityAttribute2"
              query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=2"
              cacheKey="EntID"
              cacheLookup="Entity.EntID"
              processor="SqlEntityProcessor"
              cacheImpl="SortedMapBackedCache">
        <field column="AttributeValue" name="EntityAttribute2" />
      </entity>
    </entity>
  </document>
</dataConfig>

What is the best way to import this data? Doing it without a cache results in a very large number of SQL queries; with the cache, I run out of memory. I am also curious why a 4 GB dataset cannot fit entirely in a 9 GB heap. One thing I should mention is that I have about 400 to 500 attributes.

Thanks in advance for any helpful advice.

O. O.
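P.S. For reference, the uncached variant I tried looks roughly like this (a sketch only, using the same table and field names as above). The child query references the parent row through the standard DIH variable ${Entity.EntID}, so every parent row fires one query per attribute entity, i.e. roughly 400 to 500 queries per entity:

      <!-- Uncached sub-entity: re-queries ATTR_TABLE for each parent row -->
      <entity name="EntityAttribute1"
              processor="SqlEntityProcessor"
              query="SELECT AttributeValue FROM ATTR_TABLE WHERE AttributeID=1 AND EntID='${Entity.EntID}'">
        <field column="AttributeValue" name="EntityAttribute1" />
      </entity>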