Re: really slow performance when trying to get facet.field

Daniel Bruegge Tue, 17 Jan 2012 13:24:33 -0800

OK, I have now changed the static warming in solrconfig.xml using the
firstSearcher and newSearcher listeners.
"content" is the field I facet on. Commits now take longer, which is OK
for me, but searches are much faster. I also reduced the
number of documents per shard to 15 million, so each index is about
3.5G, which I hope also fits in memory.


    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
            <str name="q">*:*</str>
            <str name="facet">true</str>
            <str name="facet.field">content</str>
            <str name="facet.limit">1</str>
            <str name="facet.mincount">1</str>
        </lst>
      </arr>
    </listener>
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
            <str name="q">*:*</str>
            <str name="facet">true</str>
            <str name="facet.field">content</str>
            <str name="facet.limit">1</str>
            <str name="facet.mincount">1</str>
        </lst>
      </arr>
    </listener>
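
A related pair of settings, sketched here as an assumption rather than
something tested on this setup, controls what happens while the warming
queries run. Both elements belong in the existing <query> section of
solrconfig.xml; the values are illustrative only.

    <query>
      <!-- Assumption: with false, requests block until the firstSearcher
           warming queries have finished, so no request is served by an
           unwarmed searcher after startup. -->
      <useColdSearcher>false</useColdSearcher>

      <!-- Illustrative value: upper limit on searchers warming at the
           same time. Longer warming makes overlapping commits more
           likely to reach this limit. -->
      <maxWarmingSearchers>2</maxWarmingSearchers>
    </query>

With useColdSearcher set to true instead, the first requests after startup
would be answered right away, but would pay the cold facet cost themselves.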


On Tue, Jan 17, 2012 at 2:36 PM, Daniel Bruegge <
daniel.brue...@googlemail.com> wrote:

> Evictions are 0 for all cache types.
>
> Your server's 12G max heap is pretty large, which is good, I think.
> The CPU on my server is an 8-core Intel i7 965.
>
> Commit frequency is low, because new shards are added and old shards are
> kept for historical reasons. Old shards will be cleaned up after a couple
> of months.
>
> I will try a maximum of 15 million documents per shard and see what happens.
>
> The thing is that I will add more shards over time, so that I can handle
> maybe 500-800 million documents. Maybe more. It depends.
>
> On Tue, Jan 17, 2012 at 2:14 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
>
>> Hi Daniel,
>>
>> My index is 6.5G. I'm sure it can be bigger. The facet.limit we ask for is
>> over 100 thousand, and responses are sub-second. I run it with -Xms1024m
>> -Xmx12000m under Tomcat; it currently takes 5.4G of RAM. The number of docs
>> is over 6.5 million.
>>
>> Do you see any evictions in your caches? What kind of server is it, in
>> terms of CPU and OS? How often do you commit to the index?
>>
>> Dmitry
>>
>> On Tue, Jan 17, 2012 at 3:01 PM, Daniel Bruegge <
>> daniel.brue...@googlemail.com> wrote:
>>
>> > Hi Dmitry,
>> >
>> > I had everything on one Solr instance before, but this got too heavy, and
>> > I had the same issue there: the first facet query was really slow.
>> >
>> > When querying the facet:
>> > - facet.limit = 100
>> >
>> > Cache settings are like this:
>> >
>> >    <filterCache class="solr.FastLRUCache"
>> >                 size="16384"
>> >                 initialSize="4096"
>> >                 autowarmCount="4096"/>
>> >
>> >    <queryResultCache class="solr.LRUCache"
>> >                     size="512"
>> >                     initialSize="512"
>> >                     autowarmCount="0"/>
>> >
>> >    <documentCache class="solr.LRUCache"
>> >                   size="512"
>> >                   initialSize="512"
>> >                   autowarmCount="0"/>
>> >
>> > How big was your index? Did it fit into the RAM which you gave the Solr
>> > instance?
>> >
>> > Thanks
>> >
>> >
>> > On Tue, Jan 17, 2012 at 1:56 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
>> >
>> > > I had a similar problem with a similar task. In my case, merging the
>> > > results from two shards turned out to be the culprit. If you can
>> > > logically store your data in just one shard, your faceting should
>> > > become faster. Size-wise it should not be a problem for Solr.
>> > >
>> > > Also, you didn't say anything about the facet.limit value, cache
>> > > parameters, or usage of filter queries. Some of these can be
>> > > interconnected.
>> > >
>> > > Dmitry
>> > >
>> > > On Tue, Jan 17, 2012 at 2:49 PM, Daniel Bruegge <
>> > > daniel.brue...@googlemail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I have 2 Solr shards. One holds approx. 25 million documents (local
>> > > > index 6GB), the other 10 million documents (2.7GB).
>> > > > I am trying to create a kind of 'word cloud' to see the frequency of
>> > > > words in a text_general field.
>> > > > For this I am currently faceting on this field, and I am also
>> > > > restricting the documents with some other filters in the query.
>> > > >
>> > > > The performance is really bad for the first call and then pretty fast
>> > > > for the following calls.
>> > > >
>> > > > The maximum Java heap size is 3G for each shard. Both shards are running
>> > > > on the same physical server, which has 12G RAM.
>> > > >
>> > > > Question: Should I reduce the number of documents in a shard, so that
>> > > > the index size is equal to or less than the Java heap size for this
>> > > > shard? Or is there another way to avoid these slow calls?
>> > > >
>> > > > Thank you
>> > > >
>> > > > Daniel
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Regards,
>> > >
>> > > Dmitry Kan
>> > >
>> >
>>
>>
>>
>> --
>> Regards,
>>
>> Dmitry Kan
>>
>
>
