Sounds good! So the takeaway here is to remember cache pre-warming, and of course to keep track of RAM allocation :)
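
For what it's worth, the RAM side can be checked with a back-of-the-envelope calculation. The Python sketch below uses the numbers from the original question (12G server, 3G max heap per shard, indexes of 6GB and 2.7GB). The function names are made up for illustration, and the model is a simplifying assumption rather than anything Solr reports: it treats all RAM not claimed by the JVM heaps as available to the OS page cache for index files, so it gives an optimistic upper bound.

```python
# Rough memory budget for the two-shard setup described in this thread.
# Assumption (not from the thread): whatever RAM is not claimed by the JVM
# heaps is available to the OS page cache, which keeps index files fast to
# read. Real servers also need RAM for the OS and other processes, so treat
# the result as an optimistic upper bound.


def page_cache_budget_gb(physical_ram_gb, heap_sizes_gb):
    """RAM left for the OS page cache once every JVM heap is at its maximum."""
    return physical_ram_gb - sum(heap_sizes_gb)


def index_fits_in_page_cache(physical_ram_gb, heap_sizes_gb, index_sizes_gb):
    """True if the combined on-disk index size fits in the leftover RAM."""
    budget = page_cache_budget_gb(physical_ram_gb, heap_sizes_gb)
    return sum(index_sizes_gb) <= budget


if __name__ == "__main__":
    # Numbers from the original question: 12G server, two shards with a
    # 3G max heap each, shard indexes of 6 GB and 2.7 GB.
    budget = page_cache_budget_gb(12, [3, 3])
    fits = index_fits_in_page_cache(12, [3, 3], [6.0, 2.7])
    print(f"page cache budget: {budget} GB, index fits: {fits}")
```

With the original numbers the two indexes add up to 8.7GB against roughly 6GB left over after both heaps, so under this simple model they could not both stay cached, which is one plausible reason the first facet call was so slow.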

On Tue, Jan 17, 2012 at 11:23 PM, Daniel Bruegge <daniel.brue...@googlemail.com> wrote:

> Ok, I have now changed the static warming in the solrconfig.xml using
> first- and newSearcher.
> "Content" is my field to facet on. Now the commits take longer, which is OK
> for me, but the searches are really faster right now. I also reduced the
> number of documents on my shards to 15mio/shard. So the index is about
> 3.5G, which fits also in my memory I hope.
>
> <listener event="newSearcher" class="solr.QuerySenderListener">
>   <arr name="queries">
>     <lst>
>       <str name="q">*:*</str>
>       <str name="facet">true</str>
>       <str name="facet.field">content</str>
>       <str name="facet.limit">1</str>
>       <str name="facet.mincount">1</str>
>     </lst>
>   </arr>
> </listener>
> <listener event="firstSearcher" class="solr.QuerySenderListener">
>   <arr name="queries">
>     <lst>
>       <str name="q">*:*</str>
>       <str name="facet">true</str>
>       <str name="facet.field">content</str>
>       <str name="facet.limit">1</str>
>       <str name="facet.mincount">1</str>
>     </lst>
>   </arr>
> </listener>
>
>
> On Tue, Jan 17, 2012 at 2:36 PM, Daniel Bruegge <daniel.brue...@googlemail.com> wrote:
>
> > Evictions are 0 for all cache types.
> >
> > Your server max heap space with 12G is pretty huge. Which is good I think.
> > The CPU on my server is a 8-Core Intel i7 965.
> >
> > Commit frequency is low, because shards are added and old shards exist for
> > historical reasons. Old shards will be then cleaned after couple of months.
> >
> > I will try to add maximum 15mio per shard and see what will happen here.
> >
> > This thing is, that I will add more shards over time, so that I can handle
> > maybe 500-800mio documents. Maybe more. It depends.
> >
> > On Tue, Jan 17, 2012 at 2:14 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> >
> >> Hi Daniel,
> >>
> >> My index is 6,5G. I'm sure it can be bigger. facet.limit we ask for is
> >> beyond 100 thousand. It is sub-second speed. I run it with -Xms1024m
> >> -Xmx12000m under tomcat, it currently takes 5,4G of RAM. Amount of docs is
> >> over 6,5 million.
> >>
> >> Do you see any evictions in your caches? What kind of server is it, in
> >> terms of CPU and OS? How often do you commit to the index?
> >>
> >> Dmitry
> >>
> >> On Tue, Jan 17, 2012 at 3:01 PM, Daniel Bruegge <daniel.brue...@googlemail.com> wrote:
> >>
> >> > Hi Dmitry,
> >> >
> >> > I had everything on one Solr Instance before, but this got to heavy and I
> >> > had the same issue here, that the 1st facet.query was really slow.
> >> >
> >> > When querying the facet:
> >> > - facet.limit = 100
> >> >
> >> > Cache settings are like this:
> >> >
> >> > <filterCache class="solr.FastLRUCache"
> >> >              size="16384"
> >> >              initialSize="4096"
> >> >              autowarmCount="4096"/>
> >> >
> >> > <queryResultCache class="solr.LRUCache"
> >> >                   size="512"
> >> >                   initialSize="512"
> >> >                   autowarmCount="0"/>
> >> >
> >> > <documentCache class="solr.LRUCache"
> >> >                size="512"
> >> >                initialSize="512"
> >> >                autowarmCount="0"/>
> >> >
> >> > How big was your index? Did it fit into the RAM which you gave the Solr
> >> > instance?
> >> >
> >> > Thanks
> >> >
> >> >
> >> > On Tue, Jan 17, 2012 at 1:56 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> >> >
> >> > > I had a similar problem for a similar task. And in my case merging the
> >> > > results from two shards turned out to be a culprit. If you can logically
> >> > > store your data just in one shard, your faceting should become faster.
> >> > > Size wise it should not be a problem for SOLR.
> >> > >
> >> > > Also, you didn't say anything about the facet.limit value, cache
> >> > > parameters, usage of filter queries. Some of these can be interconnected.
> >> > >
> >> > > Dmitry
> >> > >
> >> > > On Tue, Jan 17, 2012 at 2:49 PM, Daniel Bruegge <daniel.brue...@googlemail.com> wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > I have 2 Solr-shards. One is filled with approx. 25mio documents (local
> >> > > > index 6GB), the other with 10mio documents (2.7GB size).
> >> > > > I am trying to create some kind of 'word cloud' to see the frequency of
> >> > > > words for a *text_general* field.
> >> > > > For this I am currently using a facet over this field and I am also
> >> > > > restricting the documents by using some other filters in the query.
> >> > > >
> >> > > > The performance is really bad for the first call and then pretty fast for
> >> > > > the following calls.
> >> > > >
> >> > > > The maximum Java heap size is 3G for each shard. Both shards are running on
> >> > > > the same physical server which has 12G RAM.
> >> > > >
> >> > > > Question: Should I reduce the documents in one shard, so that the index is
> >> > > > equal or less the Java Heap size for this shard? Or is
> >> > > > there another method to avoid this slow calls?
> >> > > >
> >> > > > Thank you
> >> > > >
> >> > > > Daniel
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Regards,
> >> > >
> >> > > Dmitry Kan
> >> > >
> >> >
> >>
> >>
> >> --
> >> Regards,
> >>
> >> Dmitry Kan
> >>
> >
>

--
Regards,

Dmitry Kan