Hi Yonik,

Thanks again for the detailed input.
Let me try to re-confirm my understanding:

1. If sorting is requested on a field, that field's value from every indexed document is held in memory in un-inverted form. So if I have a String field of, say, 20 characters (assuming all ASCII, no multibyte characters), then for 200M documents I need at least 20 x 200 MB, i.e. 4 GB of memory.

2. So if I want to sort on 2 such fields, I need to allocate at least 8 GB of memory.

3. Another case: if 2 search requests hit the server concurrently, each sorting on the same 20-character date field, that would also need 2 x 2 GB of memory. So if I know I need to support at least 4 concurrent search requests, I need to start the JVM with at least an 8 GB heap.

Please let me know if my understanding is correct.

Regards,
Sourav

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Monday, November 24, 2008 6:03 PM
To: solr-user@lucene.apache.org
Subject: Re: Sorting and JVM heap size ....

On Mon, Nov 24, 2008 at 8:48 PM, souravm <[EMAIL PROTECTED]> wrote:
> I have around 200M documents in index. The field I'm sorting on is a date
> string (containing date and time in dd-mmm-yyyy hh:mm:yy format) and the
> field is part of the search criteria.
>
> Also please note that the number of documents returned by the search criteria
> is much less than 200M. In fact even in case of 0 hit I found jvm out of
> memory exception.

Right... that's just how the Lucene FieldCache used for sorting works right now. The entire field is un-inverted and held in memory. 200M docs is a *lot*... you might try indexing your date fields as integer types that would take only 4 bytes per doc - and that will still take up 800M. Given that 2 searchers can overlap, that still adds up to more than your heap - you will need to up that.
The other option is to split your index across multiple nodes and use distributed search. If you want to do any faceting in the future, or sort on multiple fields, you will need to do this anyway.

-Yonik
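
The memory figures in this thread can be reproduced with a back-of-envelope calculation. The Java sketch below is only an illustration: the class and method names are hypothetical and are not part of Solr or Lucene. It assumes the cost is simply bytes per document times document count, times the number of sort fields, times the number of overlapping searchers, and it ignores JVM object overhead and the actual in-memory layout of the FieldCache.

```java
// Back-of-envelope estimate only. Real usage also includes JVM object
// overhead and depends on how each field type is actually stored.
// FieldCacheEstimate and estimateBytes are hypothetical names.
final class FieldCacheEstimate {

    /** Raw bytes = docs * bytesPerDoc * sortFields * overlappingSearchers. */
    static long estimateBytes(long docs, int bytesPerDoc,
                              int sortFields, int overlappingSearchers) {
        long perField = Math.multiplyExact(docs, (long) bytesPerDoc);
        long multiplier = (long) sortFields * overlappingSearchers;
        return Math.multiplyExact(perField, multiplier);
    }

    public static void main(String[] args) {
        long docs = 200_000_000L; // 200M documents, as in the thread

        // One 20-character ASCII string field, counted at 1 byte per character.
        System.out.println("20-byte string field: "
                + estimateBytes(docs, 20, 1, 1) + " bytes");

        // The same field indexed as a 4-byte integer type.
        System.out.println("4-byte int field: "
                + estimateBytes(docs, 4, 1, 1) + " bytes");

        // The int field while 2 searchers overlap.
        System.out.println("4-byte int field, 2 searchers: "
                + estimateBytes(docs, 4, 1, 2) + " bytes");
    }
}
```

Under these assumptions the string field comes to 4,000,000,000 bytes and the integer field to 800,000,000 bytes, matching the 4 GB and 800M figures quoted above.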