Improving the speed of Solr query over 16 million tweets

naryad Wed, 19 Dec 2012 20:50:36 -0800

I use Solr (SolrCloud) to index and search my tweets. There are about 16
million tweets, and the index size is approximately 3 GB. The tweets are
indexed in real time as they arrive, so that real-time search is possible.
Currently I use the lowercase field type for my tweet body field. A search
with a single term takes around 7 seconds, and the time increases linearly
with each additional search term. The maximum RAM allocated to the Solr
process is 3 GB. A sample Solr search query looks like this:


*tweet_body:*big* AND tweet_body:*data* AND tweet_tag:big_data*
Any suggestions on improving the search speed? Currently I run only 1
shard, which contains the entire tweet collection. I am not sure whether
redeclaring the field as text_en and reindexing everything is the only
option. My guess is that the query is currently scanning all the documents.
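
To make my suspicion concrete, here is a toy Python sketch (not Solr code).
It assumes the lowercase field type keeps the whole tweet body as one
lowercased term, as the type of that name in the Solr example schema does,
and that a tokenized type such as text_en stores one term per word. The
sample tweets and all function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample tweets, not taken from my index.
tweets = {
    1: "Big Data is everywhere",
    2: "big plans for the weekend",
    3: "Processing BIGDATA with Solr",
}

def build_index(docs, tokenize):
    """Map each indexed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

# Assumed "lowercase"-style field: the whole lowercased body is one term.
keyword_index = build_index(tweets, lambda text: [text.lower()])

# Assumed tokenized field: split on whitespace and lowercase each word.
token_index = build_index(tweets, lambda text: text.lower().split())

def wildcard_search(index, substring):
    """Emulate a *substring* query: every term in the dictionary is checked."""
    hits = set()
    for term, doc_ids in index.items():
        if substring in term:
            hits |= doc_ids
    return hits

def term_search(index, term):
    """Emulate a plain term query: one dictionary lookup."""
    return index.get(term, set())

print(wildcard_search(keyword_index, "big"))  # visits one term per tweet
print(term_search(token_index, "big"))        # single lookup
```

In this sketch the wildcard search has to visit as many terms as there are
distinct tweet bodies, while the term search is a single lookup. The two
are not equivalent, though: the wildcard version also matches "BIGDATA",
which the term lookup for "big" does not.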



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Improving-the-speed-of-Solr-query-over-16-million-tweets-tp4028222.html
Sent from the Solr - User mailing list archive at Nabble.com.
