On Mon, 2015-11-02 at 17:27 +0530, Modassar Ather wrote:
> The query q=network se* is quick enough in our system too. It takes
> around 3-4 seconds for around 8 million records.
>
> The problem is with the same query as phrase. q="network se*".

I misunderstood your query then. I tried replicating it with
q="der se*"

http://rosalind:52300/solr/collection1/select?q=%22der+se*%22&wt=json&indent=true&facet=false&group=true&group.field=domain

gets expanded to

"parsedquery": "(+DisjunctionMaxQuery((content_text:\"kan svane\" | author:kan svane* | text:\"kan svane\" | title:\"kan svane\" | url:kan svane* | description:\"kan svane\")) ())/no_coord"

The result was 1,043,258,271 hits in 15,211 ms.

Interestingly enough, a search for q="kan svane*" resulted in 711 hits
in 12,470 ms. Maybe because 'kan' alone matches 1 billion+ documents.
On that note, q=se* resulted in -951812427 hits in 194,276 ms.

Now this is interesting. The negative number seems to be caused by
grouping, but I finally got the response time up in the minutes. Still
no memory problems though. Hits without grouping were 3,343,154,869.

For comparison, q=http resulted in -1527418054 hits in 87,464 ms.
Without grouping the hit count was 7,062,516,538. Twice the hits of
'se*' in half the time.

> I changed my SolrCloud setup from 12 shard to 8 shard and given each
> shard 30 GB of RAM on the same machine with same index size
> (re-indexed) but could not see the significant improvement for the
> query given.

Strange. I would have expected the extra free memory for disk space to
help performance.

> Also can you please share your experiences with respect to RAM, GC,
> solr cache setup etc as it seems by your comment that the SolrCloud
> environment you have is kind of similar to the one I work on?
>

There is a short write up at
https://sbdevel.wordpress.com/net-archive-search/

- Toke Eskildsen, State and University Library, Denmark
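
Note on the negative hit counts: both grouped numbers are exactly what the ungrouped counts become when truncated to a signed 32-bit integer. The sketch below only checks that arithmetic. The helper name wrap_int32 is made up for illustration, and whether the grouping code path really keeps the count in a 32-bit int is an assumption, not something confirmed in this thread.

```python
def wrap_int32(n: int) -> int:
    """Interpret n modulo 2**32 as a signed 32-bit two's-complement value."""
    n &= 0xFFFFFFFF  # keep only the low 32 bits
    return n - (1 << 32) if n >= (1 << 31) else n


# Ungrouped hit counts quoted in the message, paired with their queries.
ungrouped = {"se*": 3_343_154_869, "http": 7_062_516_538}

for query, hits in ungrouped.items():
    # Prints the wrapped value to compare against the grouped hit count.
    print(f"q={query}: ungrouped={hits} wrapped={wrap_int32(hits)}")
```

The wrapped values come out as -951812427 for se* and -1527418054 for http, the same as the grouped hit counts reported above, which is consistent with an overflow rather than a counting error.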