Facets are really only useful when you want counts for several values of a 
field (e.g., "eldudearino", "ladudearina"). Since you are already filtering on 
a single author, I'd suggest leaving all the facet parameters off that query; 
the numFound returned in the response should give you the count you want.
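
As a rough sketch of that suggestion (not tested against your server): the 
helper names build_count_url and num_found below are made up for illustration, 
and the sample response assumes the usual wt=xml layout, where numFound is an 
attribute of the <result> element. The field names, terms, and host are taken 
from your query.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_count_url(base_url, text_terms, author, time_filter=None):
    """Build a select URL with no facet.* parameters at all (hypothetical helper)."""
    quoted_terms = " ".join('"%s"' % term for term in text_terms)
    params = [
        ("q", "text:(%s)" % quoted_terms),
        ("fq", 'author_username:("%s")' % author),
        ("rows", "0"),   # only the count is needed, not the documents
        ("wt", "xml"),
    ]
    if time_filter:
        # Passed through unchanged, e.g. whatever created_at filter you use.
        params.append(("fq", time_filter))
    return base_url + "?" + urlencode(params)


def num_found(xml_text):
    """Read numFound from a Solr XML response body."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    return int(result.get("numFound"))


url = build_count_url(
    "http://myserver:8080/solr/select/",
    ["dude", "thedude", "#anotherdude"],
    "@eldudearino",
)

# A hand-written sample response, only to show where numFound lives.
sample = '<response><result name="response" numFound="42" start="0"/></response>'
count = num_found(sample)
```

Because the author is already pinned by the fq, numFound is the same number 
the facet would have reported for that author.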

The slowness may be due to the facet cache needing to be regenerated (which 
should only happen if you've done a commit since the last time you ran the 
query). Regardless of what time slice you use, behind the scenes Solr has to 
read the author_username value for every document in the index and load those 
values into an in-memory data structure. This can be quite slow, especially if 
there are many distinct values for that field.
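
To illustrate why the time slice makes no difference, here is a deliberately 
simplified model of the behaviour described above. It is not Solr's actual 
code; build_field_cache and facet_counts are invented names, and the "index" 
is just a list of dicts.

```python
from collections import Counter


def build_field_cache(index, field):
    """One pass over EVERY document in the index, independent of any query.

    This is the expensive step, and it has to be redone after a commit.
    """
    return [doc.get(field) for doc in index]


def facet_counts(cache, matching_doc_ids, mincount=1):
    """Count field values for the matching documents only (the cheap step)."""
    counts = Counter(
        cache[doc_id] for doc_id in matching_doc_ids if cache[doc_id] is not None
    )
    return {value: n for value, n in counts.items() if n >= mincount}


# Toy index: four documents, one without the field.
index = [
    {"author_username": "eldudearino"},
    {"author_username": "ladudearina"},
    {"author_username": "eldudearino"},
    {},
]

cache = build_field_cache(index, "author_username")   # touches all 4 docs
counts = facet_counts(cache, matching_doc_ids=[0, 2, 3])
```

In this model the cache is built from all documents even though only three 
of them match, which is why narrowing the query does not make the first 
faceted request any faster.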

-Michael

-----Original Message-----
From: kevinlieb [mailto:ke...@politear.com] 
Sent: Monday, October 08, 2012 4:27 PM
To: solr-user@lucene.apache.org
Subject: Funny behavior in facet query on large dataset

I am doing a facet query in Solr (3.4) and getting very bad performance. 
This is in a Solr shard with 22 million records, but I am specifically querying 
a small time slice. However, even if I take the time slice query out, it takes 
the same amount of time, so it seems to be searching the entire data set.

I am trying to find all documents that contain the word "dude" or "thedude"
or "anotherdude" and count how many of these were written by "eldudearino"
(of course names are changed here to protect the innocent...).

My query is like this: 

http://myserver:8080/solr/select/?fq=created_at:NOW-5MINUTES&q=(+(text:(%22dude%22+%22thedude%22+%22%23anotherdude%22))+)&facet=true&indent=on&facet.mincount=1&wt=xml&version=2.2&rows=0&fl=author_username,author_id&facet.field=author_username&fq=author_username:(%22@eldudearino%22)

Any ideas what I could be doing wrong?

Thanks in advance!





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Funny-behavior-in-facet-query-on-large-dataset-tp4012584.html
Sent from the Solr - User mailing list archive at Nabble.com.
