Hi,

Thanks for your suggestions. I can answer a few of your questions right now; I'll answer the rest after some time. After a restart, it handles around 150k to 200k queries before it goes down again. A typical query returns around 20 fields. Memory utilization peaks only after some time.

Regards,
Suryansh

On Tuesday, July 23, 2013, Jack Krupansky wrote:

> There was also a bug in the lazy loading of multivalued fields at one
> point recently in Solr 4.2
>
> https://issues.apache.org/jira/browse/SOLR-4589
> "4.x + enableLazyFieldLoading + large multivalued fields + varying fl =
> pathological CPU load & response time"
>
> Do you use multivalued fields very heavily?
>
> I'm still not ready to suggest that 1,000 fields is an okay thing to do,
> but there are still plenty of nuances in Solr performance that could
> explain the difficulties, before we even get to the 1,000 field issue
> itself.
>
> The real bottom line is that as you increase field count, there are lots
> of other aspects of Solr memory and performance degradation that increase
> as well. Some of those factors can be dealt with simply with more memory,
> more and faster CPU cores, or even more sharding, or other tuning, but not
> necessarily all of them.
>
> I think that I am already on the record on other threads as suggesting
> that "a couple hundred" is about the limit for field count for a "slam
> dunk" use of Solr. That doesn't mean you can't go above a couple hundred
> fields, just that you are in uncharted territory and may need to take
> extraordinary measures to get everything working satisfactorily. There's no
> magic hard limit, just a general sense that smaller numbers of fields are
> like "a walk in a park", while higher numbers of fields are like "chopping
> through a jungle." We each have our own threshold for... "adventure."
>
> We need answers to the previous questions we raised before we can analyze
> this a lot further.
>
> Oh, and make sure there is enough OS system memory available for caching
> of the index pages. Sometimes, it is little things like this that can crush
> Solr performance.
>
> Unfortunately, Solr is not a packaged "solution" that automatically and
> magically auto-configures everything to "work just right". Instead, it is a
> powerful toolkit that lets you do amazing things, but you the
> developer/architect need to supply amazing intelligence, wisdom, foresight,
> and insight to get it (and its hardware and software environment) to do
> those amazing things.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Alexandre Rafalovitch
> Sent: Tuesday, July 23, 2013 9:54 AM
> To: solr-user@lucene.apache.org
> Subject: Re: how number of indexed fields effect performance
>
> Do you need all of the fields loaded every time and are they stored? Maybe
> there is a document with gigantic content that you don't actually need but
> it gets deserialized anyway. Try lazy loading
> setting: enableLazyFieldLoading in solrconfig.xml
>
> Regards,
> Alex.
>
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all at
> once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
>
>
> On Tue, Jul 23, 2013 at 12:36 AM, Jack Krupansky <j...@basetechnology.com> wrote:
>
>> After restarting Solr and doing a couple of queries to warm the caches,
>> are queries already slow/failing, or does it take some time and a number
>> of queries before failures start occurring?
>>
>> One possibility is that you just need a lot more memory for caches for
>> this amount of data. So, maybe the failures are caused by heavy garbage
>> collections. So, after restarting Solr, check how much Java heap is
>> available, then do some warming queries, then check the Java heap
>> available again.
>>
>> Add the debugQuery=true parameter to your queries and look at the timings
>> to see what phases of query processing are taking the most time.
>> Also check whether the reported QTime seems to match actual wall clock
>> time; sometimes formatting of the results and network transfer time can
>> dwarf actual query time.
>>
>> How many fields are you returning on a typical query?
>>
>>
>> -- Jack Krupansky
>>
>>
>> -----Original Message----- From: Suryansh Purwar
>> Sent: Monday, July 22, 2013 11:06 PM
>> To: solr-user@lucene.apache.org ; j...@basetechnology.com
>>
>> Subject: how number of indexed fields effect performance
>>
>> It was running fine initially when we just had around 100 fields
>> indexed. In this case as well it runs fine, but after some time broken
>> pipe exceptions start coming, which results in the shard going down.
>>
>> Regards,
>> Suryansh
>>
>>
>>
>> On Tuesday, July 23, 2013, Jack Krupansky wrote:
>>
>>> Was all of this running fine previously and only started running slow
>>> recently, or is this your first measurement?
>>>
>>> Are very simple queries (single keyword, no filters or facets or sorting
>>> or anything else, and returning only a few fields) working reasonably
>>> well?
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Suryansh Purwar
>>> Sent: Monday, July 22, 2013 4:07 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: how number of indexed fields effect performance
>>>
>>> Hi,
>>>
>>> We have a two shard solrcloud cluster with each shard allocated 3
>>> separate machines. We do complex queries involving a number of filter
>>> queries coupled with group queries and faceting. All of our machines are
>>> 64 bit with 32 gb ram. Our index size is around 10gb with around 8,00,000
>>> documents. We have around 1000 indexed fields per document. 6gb of memory
>>> is allocated to tomcat under which solr is running on each of the six
>>> machines. We have a zookeeper ensemble consisting of 3 zookeeper
>>> instances running on 3 of the six machines with 4gb memory allocated to
>>> each of the zookeeper instances.
>>> First Solr starts taking too much time, with "Broken pipe exception
>>> because of timeout from client side" coming again and again; then after
>>> some time a whole shard goes down, one machine at a time followed by the
>>> other machines. Is having 1000 fields indexed with each document
>>> resulting in this problem? If so, what would be the ideal number of
>>> indexed fields in such an environment?
>>>
>>> Regards,
>>> Suryansh
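
The two checks suggested earlier in the thread (add debugQuery=true and look at the timings, then compare the reported QTime with actual wall clock time) could be scripted roughly as below. This is only a sketch under assumptions: the function names are made up for illustration, and the JSON layout it reads (responseHeader.QTime and debug.timing.process with one entry per search component) is assumed from how Solr 4.x usually formats debug output, so verify it against a real response from your own cluster.

```python
import json
import time
import urllib.parse
import urllib.request


def timed_query(base_url, params):
    """Run one query with debugQuery=true; return (response dict, wall-clock ms).

    base_url is the select handler of your own core (hypothetical here).
    """
    query = dict(params, debugQuery="true", wt="json")
    url = base_url + "?" + urllib.parse.urlencode(query)
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    wall_ms = (time.perf_counter() - start) * 1000.0
    return body, wall_ms


def summarize_timings(body, wall_ms):
    """Compare QTime with wall-clock time and rank the process-phase components."""
    qtime = body["responseHeader"]["QTime"]
    # Assumed layout: debug.timing.process holds a total "time" plus one
    # {"time": ...} entry per search component (query, facet, ...).
    process = body.get("debug", {}).get("timing", {}).get("process", {})
    components = {
        name: entry["time"]
        for name, entry in process.items()
        if isinstance(entry, dict)  # skips the bare numeric phase total
    }
    slowest = sorted(components.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "qtime_ms": qtime,
        # Time spent outside query execution: response writing, network, client.
        "overhead_ms": wall_ms - qtime,
        "slowest": slowest,
    }
```

For example, `body, wall_ms = timed_query("http://localhost:8983/solr/collection1/select", {"q": "*:*"})` followed by `summarize_timings(body, wall_ms)`; the host and core name there are placeholders. A large overhead_ms relative to qtime_ms would point at result formatting or network transfer rather than the search itself, while a dominant facet or query entry points at that component.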