On 3/31/2012 4:30 AM, Suneel wrote:
Hello friends,

I am using DIH for Solr indexing. I have 60 million records in SQL which
need to be uploaded to Solr. I started caching; at first it works smoothly and
memory consumption is normal, but after some time memory consumption climbs
steadily and the process reaches more than 6 GB. That is the reason I am not
able to cache my data.
Please advise me if anything needs to be done in the configuration or in the
Tomcat configuration.

I saw your later message about virtual memory and the directoryFactory - most of the time it is best to go with the default (solr.StandardDirectoryFactory), which you can do by specifying it explicitly or by leaving that configuration out.
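
As a rough sketch, spelling out that default explicitly in solrconfig.xml would look something like the fragment below. The element and attribute names follow the stock example config shipped with Solr 3.x, so check them against your own solrconfig.xml before copying anything.

```xml
<!-- Explicitly select the default directory implementation.
     Leaving this element out entirely gives the same result. -->
<directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>
```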

When you talk about caching, are you talking about Solr's caches or OS/process memory and disk cache? If you are talking about the caches that you can configure in solrconfig.xml (filterCache, queryResultCache, and documentCache), you should not be trying to cache large portions of your index there. I have over 11 million documents in each of my index shards (68 million for the whole index) and my sizes for those three caches are 64, 512, and 16384. The autowarm counts are 4 and 32 for the first two; there is no third number because the documentCache doesn't directly support warming.
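
For illustration only, here is roughly how those numbers could be written in the <query> section of solrconfig.xml. The cache classes are assumptions taken from the stock Solr 3.x example config, not necessarily what is running in production, and initialSize is left at its default.

```xml
<query>
  <!-- Sizes and autowarm counts are the ones quoted above;
       the class names are assumed from the stock example config. -->
  <filterCache      class="solr.FastLRUCache" size="64"  autowarmCount="4"/>
  <queryResultCache class="solr.LRUCache"     size="512" autowarmCount="32"/>
  <!-- No autowarmCount here: the documentCache cannot be autowarmed. -->
  <documentCache    class="solr.LRUCache"     size="16384"/>
</query>
```

The point is the scale: these are small working-set caches, nowhere near the 60 million documents in the index.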

If you are talking about how much memory Windows says the Java process is taking up, take a look at the replies you have already gotten on your Virtual Memory message. As Erick and Michael told you, if you are using the latest version (3.5) with the standard directoryFactory config, most of the memory that you are seeing there is because the OS is memory mapping your entire on-disk index, taking advantage of the OS disk cache to speed up disk access without actually allocating the memory involved. This is a good thing, even though the process numbers look bad. JConsole or another Java memory tool can show you the true picture.

With 60 million records, even if those records are small, your Solr index will probably grow to several gigabytes. For the best performance, your server must have enough memory so that the entire index can fit into RAM, after discounting memory usage for the OS itself and the java process that contains Solr. If you can get MOST of the index into RAM, performance will likely still be acceptable.

Your message implies that 6GB worries you very much, so I am guessing that your server has somewhere in the range of 4GB to 8GB of RAM, but your index is very much larger than this. You don't actually say whether you lose performance. Do you, or are you just worried about the memory usage? If Solr's query times start increasing, that is usually a good indicator that it is not healthy.

Thanks,
Shawn
