On Thu, Apr 21, 2011 at 9:44 AM, Ofer Fort <o...@tra.cx> wrote:
> Not sure I fully understand.
> If "facet.method=enum steps over all terms in the index for that field",
> then what does setting q=field:subset do? If I set q=*:*, then how
> do I get the frequency only on my subset?

It's an implementation detail.  Faceting *does* give you counts only
for the documents that match q=field:subset.  How it does it is a
different matter (i.e. for facet.method=enum, it must step over all
terms in the field), so the cost is closer to O(nterms in field) than
to O(ndocs in base set).
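
To make the cost argument concrete, here is a rough Python sketch of the
idea. It is not Solr's actual code: the function name enum_facet, the
dict-of-sets index, and the stats counters are all made up for
illustration. It walks every term in the field, skips terms whose docfreq
is too low to matter, and only then intersects with the base set.

```python
import heapq


def enum_facet(postings, base_set, limit=10, mincount=1):
    """Toy model of term-enumeration faceting (hypothetical, not Solr code).

    postings: dict mapping term -> set of doc ids containing that term
    base_set: set of doc ids matching the base query (q)
    Returns (list of (term, count) sorted by count desc, stats dict).
    """
    stats = {"terms_visited": 0, "intersections": 0}

    # If mincount exceeds the base set size, no term can ever qualify,
    # so enumeration is skipped altogether.
    if mincount > len(base_set):
        return [], stats

    heap = []  # min-heap of (count, term), holds at most `limit` entries
    for term, docs in postings.items():
        stats["terms_visited"] += 1  # paid for every term in the field

        # docfreq is an upper bound on the count in the base set. If it
        # cannot beat the smallest queued count (or reach mincount),
        # skip the intersection for this term.
        threshold = mincount
        if len(heap) == limit:
            threshold = max(mincount, heap[0][0] + 1)
        if len(docs) < threshold:
            continue

        stats["intersections"] += 1
        count = len(docs & base_set)
        if count < mincount:
            continue
        if len(heap) < limit:
            heapq.heappush(heap, (count, term))
        elif count > heap[0][0]:
            heapq.heapreplace(heap, (count, term))

    top = sorted(heap, key=lambda ct: (-ct[0], ct[1]))
    return [(term, count) for count, term in top], stats


if __name__ == "__main__":
    index = {"a": {1, 2, 3, 4}, "b": {2, 3}, "c": {5}, "d": {1, 2, 3, 4, 5, 6}}
    print(enum_facet(index, {1, 2, 3}, limit=2))
    # mincount equal to the base set size: terms are still enumerated,
    # but fewer intersections are needed.
    print(enum_facet(index, {1, 2, 3}, limit=2, mincount=3))
```

In this model terms_visited is the same for any base set, which is the
O(nterms in field) part; only the intersections counter moves with
mincount and with how full the queue is.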

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


> Ofer
>
> On Thu, Apr 21, 2011 at 4:40 PM, Yonik Seeley <yo...@lucidimagination.com>
> wrote:
>>
>> On Thu, Apr 21, 2011 at 9:24 AM, Ofer Fort <o...@tra.cx> wrote:
>> > Another strange behavior is that the QTime seems pretty stable, no
>> > matter how many objects match my query. 200K and 20K both take about
>> > 17s.
>> > I would have guessed that, since the time is spent going over all the
>> > terms of all the subset documents, more documents would mean more
>> > time.
>>
>> facet.method=enum steps over all terms in the index for that field...
>> that takes time regardless of how many documents are in the base set.
>>
>> There are also short-circuit methods that avoid looking at the docs
>> for a term if its docfreq is low enough that it couldn't possibly
>> make it into the priority queue.  Because of this, it can actually be
>> faster to facet on a larger base set (try *:* as the base query).
>>
>> Actually, it might be interesting to see the query time if you set
>> facet.mincount equal to the number of docs in the base set - that will
>> test pretty much just the time to enumerate over the terms without
>> doing any set intersections at all.  Be careful not to set mincount
>> greater than the number of docs in the base set though - Solr will
>> short-circuit that too and skip enumeration altogether.
>>
>> The work on the bulkpostings branch should definitely speed up your
>> case even more - but I have no idea when it will "land" on trunk.
>>
>>
>> -Yonik
>
>
