Hi all,

Thanks for all your suggestions. It looks like I will eventually have to add a lot of RAM or use an SSD to hold my index data. For now, I am trying to reduce the size of the index by removing unnecessary fields and setting stored="false" for some fields.

Thanks,
Po-Yu

On Mon, Dec 1, 2014 at 10:20 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> Po-Yu,
>
> To add what others have said:
> * Your query cache is clearly not serving its purpose, so you are just
>   wasting your heap on it. Consider disabling it.
> * That's a pretty big index. Do your queries really always have to go
>   against the whole index? Are there multiple "tenants" in this index
>   that would let you break up the index into multiple smaller indices?
>   Can you segment your index by time? Maybe by doing that some indices
>   will be hotter and some colder, and the OS could do a better job
>   caching.
> * You didn't say anything about your queries. Maybe they can be
>   tightened to pull less data off disk?
> * Add RAM :)
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Sat, Nov 29, 2014 at 12:59 AM, Po-Yu Chuang <ratbert.chu...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I am using Solr 4.9 with Tomcat. Thanks to the suggestions from Yonik
> > and Dmitry about the slow start up. Everything works fine now, but I
> > noticed that the load average of the server is high because there is
> > constantly heavy disk read access. Please point me in the right
> > direction.
> >
> > Some numbers about my system:
> > RAM: 18G
> > swap space: 2G
> > number of documents: 27 million
> > Solr home: 185G
> > disk read access constantly 40-60M/s
> > document cache size: 16K entries
> > document cache hit ratio: 0.65
> > query cache size: 16K
> > query cache hit ratio: 0.03
> >
> > At first, I wondered if the disk read comes from swap, so I decreased
> > the swappiness from 60 to 10, but the disk read is still there, which
> > means that the disk read access does not result from swapping in.
> >
> > Then, I tried different document cache sizes and different query
> > cache sizes. The effect of changing the query cache size is not
> > obvious. I tried 512, 16K, 256K entries and the hit ratio is between
> > 0.01 and 0.03.
> >
> > For the document cache, a larger cache size did improve the hit
> > ratio (I tried 512, 16K, 256K, 512K, 1024K and the hit ratio is
> > between 0.58 and 0.87), but the disk read is still high.
> >
> > Is adjusting the document cache size a reasonable direction? Or
> > should I just increase the physical memory? Is there any method to
> > estimate the right size of the document cache (or other caches) and
> > to estimate the size of physical memory needed?
> >
> > Thanks,
> > Po-Yu
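
As a rough way to think about the "how much physical memory" question, here is a back-of-the-envelope sketch in Python. It only compares the RAM left over for the OS page cache with the on-disk index size. The 18G RAM and 185G Solr home figures come from the thread; the 8G JVM heap and 1G for other processes are assumptions made up for illustration, and the function name is hypothetical. It also treats the whole Solr home as index data, which may overstate the index size.

```python
def page_cache_coverage(ram_gb, jvm_heap_gb, other_gb, index_gb):
    """Return the fraction of the on-disk index that could fit in the
    OS page cache, given what is left after the JVM heap and other
    processes. This is a crude upper bound, not a measurement."""
    free_for_cache = max(ram_gb - jvm_heap_gb - other_gb, 0.0)
    return min(free_for_cache / index_gb, 1.0)


# 18G RAM and 185G Solr home are from the thread.
# The 8G heap and 1G for everything else are ASSUMED values.
coverage = page_cache_coverage(ram_gb=18, jvm_heap_gb=8, other_gb=1, index_gb=185)
print(f"roughly {coverage:.1%} of the index could stay cached by the OS")
```

With only a small fraction of the index cacheable, most reads are likely to go to disk no matter how the Solr caches are tuned, which is consistent with the suggestions to shrink the index, split it, or add RAM.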