Jean-Sebastien,

We have had similar issues. In our case, total query times varied between 100 ms and as much as 120 s (that's right, 120,000 ms). The times were so long that they caused timeouts upstream.
In our case, we have settled on the following hypothesis: the actual retrieval time (clock time, as we call it) requires that the matching documents be fetched from disk, decompressed (in our case, anyway), and processed. We found that we had numerous "very large" documents in our index - on the order of tens of MBs or more in some cases - and that performance for those queries was horrific.

As an example, we could execute a simple query - retrieving the first 10 documents out of a result set of thousands - and see a QTime of 100 ms and a clock time of 300 ms when sorted so that the smallest documents came first. If we reversed the sort order so that the largest documents came first, we saw the very large clock times I mentioned above.

Our conclusion, therefore, was that practical search times - as opposed to QTime - are a function of the size of the retrieved documents. We therefore settled on an approach that splits large logical documents into numerous underlying documents, thereby limiting their size. (Rough sketches of both the timing comparison and the splitting idea are at the end of this message, below your quoted mail.)

FWIW.

Scott

On Wed, Aug 14, 2013 at 10:09 AM, Jean-Sebastien Vachon <
jean-sebastien.vac...@wantedanalytics.com> wrote:

> Hi All,
>
> I am running some benchmarks to tune our Solr 4.3 cloud and noticed that
> while the reported QTime is quite satisfactory (100 ms or so), the elapsed
> time is quite large (around 5 seconds). The collection contains 12.8M
> documents and the index size on disk is about 35 GB. I have only one shard
> and 4 replicas (we intend to have 5 shards but wanted to see how Solr would
> perform with only one shard so that we could benefit from all Solr
> functions).
>
> I checked for huge GCs but found none. I also checked whether we had
> intensive IO and we don't. All five nodes have 48 GB of RAM, of which 4 GB
> is allocated to Tomcat 7 and Solr. The caches have a hit ratio over 80%.
> ZooKeeper is running on the same boxes (5 instances, one per node) but
> there does not seem to be much activity going on.
>
> This is a sample query:
>
> http://10.0.5.211:8201/solr/Current/select?fq=position_first_seen_date_id:[3484 TO 3516]&q=(title:java OR semi_clean_title:java OR ad_description:java)&rows=10&start=0&fl=job_id,position_id,super_alias_id,advertiser,super_alias,credited_source_id,position_first_seen_date_id,position_last_seen_date_id,position_posted_date_id,position_refreshed_date_id,position_job_type_id,position_function_id,position_green_code,title_id,semi_clean_title_id,clean_title_id,position_empl_count,place_id,state_id,county_id,msa_id,country_id,position_id,position_job_type_mva,ad_activity_status_id,position_score,ad_score,position_salary,position_salary_range_id,position_salary_source,position_naics_6_code,position_education_level_id,is_staffing,is_bulk,is_anonymous,is_third_party,is_dirty,ref_num,tags,lat,long,position_duns_number,url,advertiser_id,title,semi_clean_title,ad_description,position_description,ad_bls_salary,position_bls_salary,covering_source_id,content_model_id,position_soc_2011_8_code&group.field=position_id&group=true&group.ngroups=false&group.main=true&sort=position_first_seen_date_id desc,score desc
>
> Any idea what could cause this?
>

--
Scott Lundgren
Director of Engineering
Carbon Black, Inc.
(210) 204-0483 | scott.lundg...@carbonblack.com
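P.S. Here is a minimal SolrJ sketch of how one could compare QTime against client-side clock time. The base URL and query terms are just the ones from your mail; the doc_size_bytes sort field is hypothetical - sort on whatever correlates with stored document size in your schema.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class QTimeVsClockTime {
    public static void main(String[] args) throws SolrServerException {
        // Hit one replica directly; base URL taken from the sample query above.
        HttpSolrServer solr = new HttpSolrServer("http://10.0.5.211:8201/solr/Current");

        SolrQuery q = new SolrQuery("title:java OR semi_clean_title:java OR ad_description:java");
        q.setRows(10);
        // doc_size_bytes is a hypothetical field holding the stored size of each document.
        // Run once ascending (smallest docs first) and once descending to see the gap grow.
        q.setSort("doc_size_bytes", SolrQuery.ORDER.asc);

        long start = System.nanoTime();
        QueryResponse rsp = solr.query(q);
        long clockMs = (System.nanoTime() - start) / 1000000L;

        // QTime is the server-side search time only; clockMs also covers fetching,
        // decompressing and transferring the stored fields of the 10 returned documents.
        System.out.println("QTime=" + rsp.getQTime() + " ms, clock=" + clockMs + " ms");

        solr.shutdown();
    }
}

If the gap between the two numbers tracks the size of the documents being returned, that points at stored-field retrieval rather than the search itself.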
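And a rough sketch of the splitting idea at indexing time, assuming a schema with a unique "id", a "logical_id" to tie the chunks back together, and one large "content" field (all hypothetical names; the chunk size is something you would tune). At query time you group on logical_id, much like your group.field=position_id.

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.common.SolrInputDocument;

public class LogicalDocumentSplitter {
    // Hypothetical cap; tune it so no single stored document gets "very large".
    private static final int MAX_CHARS = 100000;

    /** Splits one large logical document into several small Solr documents. */
    public static List<SolrInputDocument> split(String logicalId, String bigText) {
        List<SolrInputDocument> parts = new ArrayList<SolrInputDocument>();
        int part = 0;
        for (int off = 0; off < bigText.length(); off += MAX_CHARS) {
            String chunk = bigText.substring(off, Math.min(off + MAX_CHARS, bigText.length()));
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", logicalId + "_" + part);   // unique key per chunk
            doc.addField("logical_id", logicalId);        // lets us group the chunks back together
            doc.addField("content", chunk);
            parts.add(doc);
            part++;
        }
        return parts;
    }
}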