And the followup question would be: if some of these documents are legitimately this large (they really do have that much text), is there a good way to still allow that to be searchable without exploding our index? These would be "text_en" type fields.
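For instance, would something like solr.LimitTokenCountFilterFactory be the right tool here? A rough sketch of the kind of schema.xml change I have in mind (field and type names are invented, and this is untested on our analyzer chain):

```xml
<!-- Sketch only: cap how many tokens per field actually get indexed,
     so a 180MB document doesn't index every token. Names are illustrative. -->
<fieldType name="text_en_capped" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Only the first 10,000 tokens are indexed; the rest are dropped from the index. -->
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Alternatively (or additionally), copy only a bounded prefix of the raw text
     into the searchable field, leaving the full text stored elsewhere. -->
<copyField source="body" dest="body_search" maxChars="1000000"/>
```

Either way the full text could stay stored for retrieval while the index only carries a bounded prefix; I'd be curious whether people consider one approach safer than the other.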
On Mon, Jun 2, 2014 at 6:09 AM, Joe Gresock <jgres...@gmail.com> wrote:

> So, we're definitely running into some very large documents (180MB, for example). I haven't run the analysis on the other 2 shards yet, but this could definitely be our problem.
>
> Is there any conventional wisdom on a good "maximum size" for your indexed fields? Of course it will vary for each system, but assuming a heap of 10g, does anyone have past experience in limiting their field sizes?
>
> Our caches are set to 128.
>
> On Sun, Jun 1, 2014 at 8:32 AM, Joe Gresock <jgres...@gmail.com> wrote:
>
>> These are some good ideas. The "huge document" idea could add up, since I think the shard1 index is a little larger (32.5GB on disk instead of 31.9GB), so it is possible there are one or two really big ones that are getting loaded into memory there.
>>
>> Btw, I did find an article on Solr document routing (http://searchhub.org/2013/06/13/solr-cloud-document-routing/), so I don't think that our ID structure is a problem in itself. But I will follow up on the large document idea.
>>
>> I used this article (https://support.datastax.com/entries/38367716-Solr-Configuration-Best-Practices-and-Troubleshooting-Tips) to find the index heap and disk usage: http://localhost:8983/solr/admin/cores?action=STATUS&memory=true
>>
>> Though looking at the data index directory on disk basically said the same thing.
>>
>> I am pretty sure we're using the smart round-robining client, but I will double-check on Monday.
>>
>> We have been using collectd and Graphite to monitor our VMs, as well as jvisualvm, though we haven't tried SPM.
>>
>> Thanks for all the ideas, guys.
>>
>> On Sat, May 31, 2014 at 11:54 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
>>
>>> Hi Joe,
>>>
>>> How do you know all 3 shards are roughly the same size? Can you share what you run/see that shows you that?
>>>
>>> Are you sure queries are evenly distributed? Something like SPM (http://sematext.com/spm/) should give you insight into that.
>>>
>>> How big are your caches?
>>>
>>> Otis
>>> --
>>> Performance Monitoring * Log Analytics * Search Analytics
>>> Solr & Elasticsearch Support * http://sematext.com/
>>>
>>> On Sat, May 31, 2014 at 5:54 PM, Joe Gresock <jgres...@gmail.com> wrote:
>>>
>>>> Interesting thought about the routing. Our document ids are in 3 parts:
>>>>
>>>> <10-digit identifier>!<epoch timestamp>!<format>
>>>>
>>>> e.g., 5/12345678!130000025603!TEXT
>>>>
>>>> Each object has an identifier, and there may be multiple versions of the object, hence the timestamp. We like to be able to pull back all of the versions of an object at once, hence the routing scheme.
>>>>
>>>> The nature of the identifier is that a great many of them begin with a certain number. I'd be interested to know more about the hashing scheme used for the document routing. Perhaps the first character gives it more weight as to which shard it lands in?
>>>>
>>>> It seems strange that certain of the most highly-searched documents would happen to fall on this shard, but you may be onto something. We'll scrape through some non-distributed queries and see what we can find.
>>>>
>>>> On Sat, May 31, 2014 at 1:47 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>>
>>>>> This is very weird.
>>>>>
>>>>> Are you sure that all the Java versions are identical? And all the JVM parameters are the same? Grasping at straws here.
>>>>>
>>>>> More grasping at straws: I'm a little suspicious that you are using routing. You say that the indexes are about the same size, but is it possible that your routing is somehow loading the problem shard abnormally? By that I mean somehow the documents on that shard are different, or have a drastically higher number of hits than the other shards?
>>>>>
>>>>> You can fire queries at shards with &distrib=false and NOT have them go to other shards; perhaps if you can isolate the problem queries, that might shed some light on the problem.
>>>>>
>>>>> Best
>>>>> er...@baffled.com
>>>>>
>>>>> On Sat, May 31, 2014 at 8:33 AM, Joe Gresock <jgres...@gmail.com> wrote:
>>>>>
>>>>>> It has taken as little as 2 minutes to happen the last time we tried. It basically happens upon high query load (peak user hours during the day). When we reduce functionality by disabling most searches, it stabilizes. So it really is only on high query load. Our ingest rate is fairly low.
>>>>>>
>>>>>> It happens no matter how many nodes in the shard are up.
>>>>>>
>>>>>> Joe
>>>>>>
>>>>>> On Sat, May 31, 2014 at 11:04 AM, Jack Krupansky <j...@basetechnology.com> wrote:
>>>>>>
>>>>>>> When you restart, how long does it take to hit the problem? And how much query or update activity is happening in that time? Is there any other activity showing up in the log?
>>>>>>>
>>>>>>> If you bring up only a single node in that problematic shard, do you still see the problem?
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> -----Original Message----- From: Joe Gresock
>>>>>>> Sent: Saturday, May 31, 2014 9:34 AM
>>>>>>> To: solr-user@lucene.apache.org
>>>>>>> Subject: Uneven shard heap usage
>>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> I'm trying to figure out why one shard of an evenly-distributed 3-shard cluster would suddenly start running out of heap space, after 9+ months of stable performance. We're using the "!" delimiter in our ids to distribute the documents, and indeed the disk sizes of our shards are very similar (31-32GB on disk per replica).
>>>>>>>
>>>>>>> Our setup is:
>>>>>>> 9 VMs with 16GB RAM, 8 vcpus (with a 4:1 oversubscription ratio, so basically 2 physical CPUs), 24GB disk
>>>>>>> 3 shards, 3 replicas per shard (1 leader, 2 replicas, whatever). We reserve 10g heap for each Solr instance.
>>>>>>> Also 3 zookeeper VMs, which are very stable
>>>>>>>
>>>>>>> Since the troubles started, we've been monitoring all 9 with jvisualvm, and shards 2 and 3 keep a steady amount of heap space reserved, always having horizontal lines (with some minor gc). They're using 4-5GB heap, and when we force gc using jvisualvm, they drop to 1GB usage. Shard 1, however, quickly has a steep slope, and eventually has concurrent mode failures in the gc logs, requiring us to restart the instances when they can no longer do anything but gc.
>>>>>>>
>>>>>>> We've tried ruling out physical host problems by moving all 3 shard 1 replicas to different hosts that are underutilized, however we still get the same problem. We'll still be working on ruling out infrastructure issues, but I wanted to ask the questions here in case it makes sense:
>>>>>>>
>>>>>>> * Does it make sense that all the replicas on one shard of a cluster would have heap problems, when the other shard replicas do not, assuming a fairly even data distribution?
>>>>>>> * One thing we changed recently was to make all of our fields stored, instead of only half of them. This was to support atomic updates. Can stored fields, even though lazily loaded, cause problems like this?
>>>>>>>
>>>>>>> Thanks for any input,
>>>>>>> Joe
>>>>>>>
>>>>>>> --
>>>>>>> I know what it is to be in need, and I know what it is to have plenty. I have learned the secret of being content in any and every situation, whether well fed or hungry, whether living in plenty or in want. I can do all this through him who gives me strength. *-Philippians 4:12-13*
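P.S. Since we recently made all of our fields stored, I also plan to double-check that lazy field loading is actually enabled, and to look at how our documentCache interacts with these huge stored documents. As I understand it, the relevant bits of solrconfig.xml (in the <query> section) look something like this; the sizes shown are our current settings, not a recommendation:

```xml
<!-- Load stored fields lazily, so a hit doesn't materialize every stored
     field of a large document just to return a few of them. -->
<enableLazyFieldLoading>true</enableLazyFieldLoading>

<!-- The documentCache holds whole Lucene documents; with 180MB stored
     documents, even a small cache like ours could pin a lot of heap. -->
<documentCache class="solr.LRUCache" size="128" initialSize="128" autowarmCount="0"/>
```

If anyone knows whether the documentCache respects lazy loading or ends up holding the full stored document, that would help us reason about the heap growth on shard 1.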