I would be curious what the cause is. Samarth says that it worked for over a year (and supposedly docs were being added all the time). Did the index grow considerably in the last period? Perhaps he could attach VisualVM while it is in the 'black hole' state to see what is actually going on. I don't know if the instance is also used for searching, but if it's only indexing, maybe just shorter commit intervals would alleviate the problem.

To add context, our indexer is configured with a 16 GB heap on a machine with 64 GB of RAM, but it is a busy one, so sometimes there is no memory to spare for the OS cache. The index is 300 GB (of which 140 GB is stored values), and it is working just 'fine': 30 docs/s on average, but our docs are large (0.5 MB on average) and fetched from two databases, so the slowness is outside Solr. I didn't see big improvements with a bigger heap, but I don't remember exact numbers. This is Solr 4.
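
For a rough feel of the memory picture, here is a back-of-the-envelope sketch in Python using the numbers from our indexer. The function name `page_cache_budget` is made up for illustration. It makes two simplifying assumptions: that all RAM outside the JVM heap is available to the OS page cache (optimistic on a busy box like ours), and that stored values can be left out of the part of the index worth caching.

```python
def page_cache_budget(ram_gb, heap_gb, index_gb, stored_gb=0.0):
    """Estimate how much of the index the OS page cache could hold.

    Assumption: every GB of RAM not given to the JVM heap is free for
    the page cache. Assumption: stored values are excluded from the
    portion of the index we care about caching.
    """
    cache_gb = max(ram_gb - heap_gb, 0.0)      # RAM left over for the OS cache
    hot_index_gb = index_gb - stored_gb        # index size minus stored values
    if hot_index_gb <= 0:
        coverage = 1.0                         # nothing left to cache
    else:
        coverage = min(cache_gb / hot_index_gb, 1.0)
    return cache_gb, hot_index_gb, coverage


# Numbers from our indexer: 64 GB RAM, 16 GB heap, 300 GB index, 140 GB stored.
cache_gb, hot_gb, coverage = page_cache_budget(
    ram_gb=64, heap_gb=16, index_gb=300, stored_gb=140
)
print(f"cache: {cache_gb} GB, index without stored values: {hot_gb} GB, "
      f"coverage: {coverage:.0%}")
```

With these inputs that is at most 48 GB of cache against 160 GB of index, so roughly 30% coverage in the best case, and indexing still runs acceptably for us.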

Roman

On 8 Feb 2014 12:23, "Shawn Heisey" <s...@elyograg.org> wrote:

> On 2/8/2014 1:40 AM, samarth s wrote:
> > Yes it is amazon ec2 indeed.
> >
> > To expand on that,
> > This solr deployment was working fine, handling the same load, on a 34 GB
> > instance on ebs storage for quite some time. To reduce the time taken by a
> > commit, I shifted this to a 30 GB SSD instance. It performed better in
> > writes and commits for sure. But, since the last week I started facing this
> > problem of infinite back to back commits. Not being able to resolve this, I
> > have finally switched back to a 34 GB machine with ebs storage, and now the
> > commits are working fine, though slow.
>
> The extra 4GB of RAM is almost guaranteed to be the difference. If your
> index continues to grow, you'll probably be having problems very soon
> even with 34GB of RAM. If you could put it on a box with 128 to 256GB
> of RAM, you'd likely see your performance increase dramatically.
>
> Can you share your solrconfig.xml file? I may be able to confirm a
> couple of things I suspect, and depending on what's there, may be able
> to offer some ideas to help a little bit. It's best if you use a file
> sharing site like dropbox - the list doesn't deal with attachments very
> well. Sometimes they work, but most of the time they don't.
>
> I will reiterate my main point -- you really need a LOT more memory.
> Another option is to shard your index across multiple servers. This
> doesn't actually reduce the TOTAL memory requirement, but it is
> sometimes easier to get management to agree to buy more servers than it
> is to get them to agree to buy really large servers. It's a paradox
> that doesn't make any sense to me, but I've seen it over and over.
>
> Thanks,
> Shawn