You can't calculate it precisely, but you can test it. This is where EC2 is handy.
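For example, here is a minimal SolrJ sketch of that kind of test (the core URL, field names, and batch sizes below are placeholders, not from this thread -- substitute your real schema and a production-shaped query): keep indexing in fixed increments on the candidate instance type and time a representative query after each step. The point where QTime or GC behavior degrades is your practical per-shard ceiling on that hardware.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class ShardSizeProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder URL -- point this at a scratch core on the EC2
        // instance type you want to qualify.
        CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
        for (int step = 1; step <= 40; step++) {
            // Add one more batch of a million synthetic documents.
            for (int i = 0; i < 1000000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", step + "-" + i);
                doc.addField("field_a", "sample text " + i); // placeholder field
                server.add(doc);
            }
            server.commit();
            // Time a query shaped like production traffic.
            SolrQuery q = new SolrQuery("field_a:(sample OR text)");
            QueryResponse rsp = server.query(q);
            System.out.println(step + "M docs: QTime=" + rsp.getQTime()
                    + "ms, hits=" + rsp.getResults().getNumFound());
        }
    }
}

Watch the QTime trend and the JVM (GC logs, heap usage) as the index grows; wherever latency knees upward on a 4GB box is the shard size to stay under.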
Otis
Solr & ElasticSearch Support
http://sematext.com/

On Jun 11, 2013 2:16 AM, "gururaj kosuru" <gururaj.kos...@gmail.com> wrote:

> How can one calculate an ideal max shard size for a Solr core instance if I
> am running a cloud with multiple systems of 4GB?
>
> Thanks
>
> On 11 June 2013 11:18, Walter Underwood <wun...@wunderwood.org> wrote:
>
> > An index does not need to fit into the heap. But a 4GB machine is almost
> > certainly too small to run Solr with 40 million documents.
> >
> > wunder
> >
> > On Jun 10, 2013, at 10:36 PM, gururaj kosuru wrote:
> >
> > > Hi Walter,
> > > Thanks for replying. Do you mean that it is necessary for the
> > > index to fit into the heap? If so, will a heap size that is greater
> > > than the actual RAM size slow down the queries?
> > >
> > > Thanks,
> > > Gururaj
> > >
> > > On 11 June 2013 10:36, Walter Underwood <wun...@wunderwood.org> wrote:
> > >
> > >> 2GB is a rather small heap. Our production systems run with 8GB heaps
> > >> and smaller indexes than that. Our dev and test systems run with 6GB
> > >> heaps.
> > >>
> > >> wunder
> > >>
> > >> On Jun 10, 2013, at 9:52 PM, gururaj kosuru wrote:
> > >>
> > >>> Hello,
> > >>> I have recently started using Solr 3.4 and have a standalone
> > >>> system deployed with 40,000,000 data rows and 3 indexed fields,
> > >>> totalling around 9 GB. I have given it a heap size of 2GB and run the
> > >>> instance on Tomcat on an i7 system with 4 GB of RAM. My queries
> > >>> search the indexed fields, using the keyword 'OR' to match multiple
> > >>> words in a single field and the keyword 'AND' to intersect matches
> > >>> from different fields. The problem I face is an out-of-memory
> > >>> exception after the Solr core has been queried for a long time, and I
> > >>> am forced to restart the Solr instance. My primary questions are:
> > >>>
> > >>> 1. Can I make any changes to my query? I have noticed that if I
> > >>> divide my query into parts, e.g. for the query "A AND B AND C",
> > >>> executing only A returns 75,000 results. However, if I run the whole
> > >>> query, I get an out-of-memory error in SegmentNorms.java at line 156,
> > >>> which allocates a new byte array of size count.
> > >>>
> > >>> 2. Does my index need to fit into RAM in one go?
> > >>>
> > >>> 3. Will moving to SolrCloud solve the out-of-memory issue, and if
> > >>> so, what ratio of RAM size to shard size should I use?
> > >>>
> > >>> Thanks a lot,
> > >>> Gururaj
> > >>
> > >> --
> > >> Walter Underwood
> > >> wun...@wunderwood.org
>
> --
> Walter Underwood
> wun...@wunderwood.org
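On question 1 in the quoted thread: one common restructuring (a sketch using the same placeholder field names as above, and not a guaranteed fix for the norms allocation) is to keep only the relevance-ranked clause in q and move the AND-ed restrictions into fq filter queries, which Solr caches in the filterCache and does not score:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FilterQueryExample {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
        // Instead of q = field_a:(foo OR bar) AND field_b:baz AND field_c:qux,
        // score only the "A" part and filter on the rest.
        SolrQuery q = new SolrQuery("field_a:(foo OR bar)"); // the "A" part
        q.addFilterQuery("field_b:baz"); // the "B" part: cached, not scored
        q.addFilterQuery("field_c:qux"); // the "C" part
        QueryResponse rsp = server.query(q);
        System.out.println("hits=" + rsp.getResults().getNumFound());
    }
}

The OutOfMemoryError itself points at norms loading: in Lucene 3.x, SegmentNorms allocates one byte per document for each normed field (roughly 40 MB per field at 40M documents), so on a 2GB heap a few normed fields plus caches leave little headroom. Setting omitNorms="true" in schema.xml on fields that don't need length normalization (and reindexing), or simply running with a larger heap on a bigger machine, addresses that directly.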