Hi Shawn,

Thanks so much for your response. We are basically very write intensive, and write throughput is pretty essential to our product. Reads are sporadic and are actually performing really well.
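To give you a flavour of the write path, it looks roughly like the SolrJ sketch below. This is only an illustration: the URL, collection name, batch contents and commit policy are placeholders rather than our real setup (in production we write via SolrCloud), but the shape is the same as what we do.

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchWriteSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and collection name - in production we go through SolrCloud.
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // Build one batch of ~35 documents using a few of the fields from the schema paste.
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 35; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("Id", UUID.randomUUID().toString());
            doc.addField("UTCTime", "2013-05-02T12:00:00Z");
            doc.addField("text", "sample payload " + i);  // plus the other fields in real life
            batch.add(doc);
        }

        // One add call per batch; no explicit commit here - in this sketch the
        // commit policy is left to autoCommit settings in solrconfig.
        solr.add(batch);
        solr.shutdown();
    }
}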
We write on average (at the moment) 8-12 batches of 35 documents per minute, but we will really be looking to write more in the future, so we need to work out how to scale Solr and how to cope with more volume.

Schema (I have changed the names): http://pastebin.com/x1ry7ieW
Config: http://pastebin.com/pqjTCa7L

As you can see, we haven't played around much with caches and such. I am now load testing on 4.2.1 and will be re-indexing our data, so now is really the time to make any tweaks we can to get the throughput we want.

We query mostly on the latest documents added and use facets to populate drop-downs of distinct values; the selected value then gets added to the basic query of:

rows=20&df=text&fl=Id,EP,ExP,PC,UTCTime,CIp,Br,OS,LU&start=0&q=UTCTime:[2013-04-25T23:00:00Z+TO+2013-05-02T22:00:00Z]+AND+H:(https\:\/\/.com)&sort=UTCTime+desc

So we will add further fields onto the above; typically users add only one or two further restrictions. Facet queries will be the same as the above: we always restrict by the date and the customer reference (I have put a representative sketch in a P.S. at the bottom of this mail).

Hope this is enough information to be going on with. Again, thanks for your help.

Netty.

On 1 May 2013 17:31, Shawn Heisey <s...@elyograg.org> wrote:

> On 5/1/2013 8:42 AM, Annette Newton wrote:
>
>> It was a single delete with a date range query. We have 8 machines, each
>> with 35GB of memory, 10GB of which is allocated to the JVM. Garbage
>> collection has always been a problem for us, with the heap not clearing
>> on full garbage collection. I don't know what is being held in memory
>> and refuses to be collected.
>>
>> I have seen your Java heap configuration in previous posts and it's very
>> like ours, except that we are not currently using LargePages (I don't
>> know how much difference that has made to your memory usage).
>>
>> We have tried various configurations around Java, including the G1
>> collector (which was awful), but all settings seem to leave the old
>> generation at least 50% full, so it quickly fills up again.
>>
>> -Xms10240M -Xmx10240M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> -XX:+CMSParallelRemarkEnabled -XX:NewRatio=2 -XX:+CMSScavengeBeforeRemark
>> -XX:CMSWaitDuration=5000 -XX:+CMSClassUnloadingEnabled
>> -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly
>>
>> If I could only figure out what keeps the heap at its current level, I
>> feel we would be in a better place with Solr.
>>
>
> With a single delete request, it was probably the commit that was very
> slow and caused the problem, not the delete itself. This has been my
> experience with my large indexes.
>
> My attempts with the G1 collector were similarly awful. The idea seems
> sound on paper, but Oracle needs to do some work in making it better for
> large heaps. Because my GC tuning was not very disciplined, I do not know
> how much impact UseLargePages is having.
>
> Your overall RAM allocation should be good. If these machines aren't
> being used for other software, then you have 24-25GB of memory available
> for caching your index, which should be very good with 26GB of index on
> that machine.
>
> Looking over your message history, I see that you're using Amazon EC2.
> Solr performs much better on bare metal, although the EC2 instance you're
> using is probably very good.
>
> SolrCloud is optimized for machines that are on the same Ethernet LAN.
> Communication between EC2 VMs (especially if they are not located in
> nearby data centers) will have some latency and a potential for dropped
> packets.
> I'm going to proceed with the idea that EC2 and virtualization are not
> the problems here.
>
> I'm not really surprised to hear that, with an index of your size, so
> much of a 10GB heap is retained. There may be things that could reduce
> your memory usage, so could you share your solrconfig.xml and schema.xml
> with a paste site that does XML highlighting (pastie.org being a good
> example), and give us an idea of how often you update and commit? Feel
> free to search/replace sensitive information, as long as that work is
> consistent and you don't entirely remove it. Armed with that information,
> we can have a discussion about your needs and how to achieve them.
>
> Do you know how long cache autowarming is taking? The cache statistics
> should tell you how long it took on the last commit.
>
> Some examples of typical real-world queries would be helpful too.
> Examples should be relatively complex for your setup, but not worst-case.
> An example query for my setup that meets this requirement would probably
> be 4-10KB in size ... some of them are 20KB!
>
> Not really related - a question about one of your old messages that never
> seemed to get resolved: Are you still seeing a lot of CLOSE_WAIT
> connections in your TCP table? A later message from you mentioned 4.2.1,
> so I'm wondering specifically about that version.
>
> Thanks,
> Shawn
>

--
Annette Newton
Database Administrator
ServiceTick Ltd

T: +44 (0)1603 618326

Seebohm House, 2-4 Queen Street, Norwich, England NR2 4SQ
www.servicetick.com
www.sessioncam.com
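P.S. Since you asked for examples of typical real-world queries: the facet requests are essentially the base query above with facet parameters added. Here is a rough SolrJ equivalent; the customer reference field name ("CRef") and the facet fields picked here are illustrative rather than our exact ones.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetQuerySketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery();
        // Same restrictions as the base query: the date window plus the customer reference.
        q.setQuery("UTCTime:[2013-04-25T23:00:00Z TO 2013-05-02T22:00:00Z] AND CRef:abc123");
        q.setRows(0);                  // facet counts only, no documents needed
        q.setFacet(true);
        q.addFacetField("Br", "OS");   // fields behind the drop-downs (illustrative choice)
        q.setFacetMinCount(1);

        QueryResponse rsp = solr.query(q);
        System.out.println(rsp.getFacetField("Br").getValues());
        solr.shutdown();
    }
}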