Yeah, a separate collection by month or year is good and can really help in this case.

Bill Bell
Sent from mobile
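To make the per-month/per-year collection idea concrete, here is a minimal sketch using the SolrCloud Collections API (this assumes SolrCloud; the collection names, config name, and shard/replica counts are made-up examples, not values from this thread). The application queries an alias, so a weekly or monthly rebuild can go into a fresh collection and the alias is repointed once it is ready:

  # Create a collection for the current month (names and sizing are examples).
  curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=docs_2015_08&collection.configName=myconf&numShards=2&replicationFactor=2"

  # After indexing into it (e.g. a DIH full-import), repoint the alias the
  # application queries; CREATEALIAS on an existing alias name repoints it.
  curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=docs_current&collections=docs_2015_08"

  # Once nothing queries the old collection any more, it can be removed.
  curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=docs_2015_07"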
> On Aug 2, 2015, at 5:29 PM, Jay Potharaju <jspothar...@gmail.com> wrote:
>
> Shawn,
> Thanks for the feedback. I agree that increasing the timeout might
> alleviate the timeout issue. The main problem with increasing the
> timeout is the detrimental effect it would have on the user experience,
> so I can't increase it.
> I have looked at the queries that threw errors; the next time I try
> them, everything seems to work fine. I am not sure how to reproduce the
> error.
> My concern with increasing the memory to 32GB is what happens when the
> index size grows over the next few months.
> Another solution I have been thinking about is to rebuild the index
> (weekly), create a new collection, and use that. Are there any good
> references for doing that?
> Thanks
> Jay
>
>> On Sun, Aug 2, 2015 at 10:19 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>
>>> On 8/2/2015 8:29 AM, Jay Potharaju wrote:
>>> The document contains around 30 fields and has stored set to true for
>>> almost 15 of them, and these stored fields are queried and updated all
>>> the time. You will notice that deleted documents are almost 30% of the
>>> docs; that percentage has stayed there and has not come down.
>>> I did try optimize, but that was disruptive because it caused search
>>> errors.
>>> I have been playing with the merge factor to see whether it helps with
>>> deleted documents or not. It is currently set to 5.
>>>
>>> The server has 24 GB of memory, of which consumption is normally around
>>> 23 GB, and the JVM heap is set to 6 GB. I have noticed that the
>>> available memory on the server drops to 100 MB at times during the day.
>>> All the updates are run through DIH.
>>
>> Using all available memory is completely normal operation for ANY
>> operating system. If you hold up Windows as an example of one that
>> doesn't ... it lies to you about "available" memory. All modern
>> operating systems use any memory that is not explicitly allocated as
>> OS disk cache.
>>
>> The disk cache will instantly give up any of the memory it is using to
>> programs that request it. Linux doesn't try to hide the disk cache from
>> you, but older versions of Windows do. In the newer versions of Windows
>> that have the Resource Monitor, you can go there to see the actual
>> memory usage, including the cache.
>>
>>> Every day I see the following error at least once, and it results in
>>> search errors on the front end of the site:
>>>
>>> ERROR org.apache.solr.servlet.SolrDispatchFilter -
>>> null:org.eclipse.jetty.io.EofException
>>>
>>> From what I have read, these are mainly due to timeouts. My timeout is
>>> set to 30 seconds and I can't set it to a higher number. I was thinking
>>> that high memory usage might sometimes lead to bad performance/errors.
>>
>> Although this error can be caused by timeouts, it has a specific
>> meaning. It means that the client disconnected before Solr responded to
>> the request, so when Solr tried to respond (through Jetty), it found a
>> closed TCP connection.
>>
>> Client timeouts need to either be removed completely or set to a value
>> much longer than any request will take. Five minutes is a good starting
>> value.
>>
>> If your client timeout is set to 30 seconds and you are seeing
>> EofExceptions, that means your requests are taking longer than 30
>> seconds and you likely have some performance issues. It's also possible
>> that some of your client timeouts are set a lot shorter than 30 seconds.
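As a hedged illustration of that timeout point (not taken from the original exchange): whatever HTTP client the front end uses, its read/socket timeout toward Solr should be on the order of minutes rather than 30 seconds. With curl, for example, that would look like this (URL, collection name, and query are placeholders):

  # Allow up to 5 minutes before the client gives up, instead of 30 seconds.
  curl --max-time 300 "http://localhost:8983/solr/collection1/select?q=*:*&rows=10"

The same knob exists under different names in most clients (e.g. a socket/read timeout in SolrJ or in an HTTP library); the point is that the client should outlast the slowest legitimate query.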
>>
>>> My objective is to stop the errors; adding more memory to the server is
>>> not a good scaling strategy. That is why I was thinking maybe there is
>>> an issue with the way things are set up and it needs to be revisited.
>>
>> You're right that adding more memory to the servers is not a good
>> scaling strategy for the general case ... but in this situation, I think
>> it might be prudent. For your index and heap sizes, I would want the
>> company to pay for at least 32GB of RAM.
>>
>> Having said that ... I've seen Solr installs work well with a LOT less
>> memory than the ideal. I don't know that adding more memory is
>> necessary, unless your system (CPU, storage, and memory speeds) is
>> particularly slow. Based on your document count and index size, your
>> documents are quite small, so I think your memory size is probably good
>> -- if the CPU, memory bus, and storage are very fast. If one or more of
>> those subsystems isn't fast, then make up the difference with lots of
>> memory.
>>
>> Some light reading, where you will learn why I think 32GB is an ideal
>> memory size for your system:
>>
>> https://wiki.apache.org/solr/SolrPerformanceProblems
>>
>> It is possible that your 6GB heap is not quite big enough for good
>> performance, or that your GC is not well tuned. These topics are also
>> discussed on that wiki page. If you increase your heap size, the
>> likelihood of needing more memory in the system grows, because there
>> will be less memory available for the disk cache.
>>
>> Thanks,
>> Shawn
>
>
> --
> Thanks
> Jay Potharaju
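On the heap/GC side, a rough sketch (the values here are assumptions to experiment with, not recommendations from this thread): if Solr was started with the bin/solr script, the heap and GC flags are normally set in bin/solr.in.sh, for example:

  # bin/solr.in.sh -- example values only; measure query latency before and after.
  SOLR_JAVA_MEM="-Xms8g -Xmx8g"
  GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250"

Keep in mind that any heap increase comes straight out of the memory otherwise available to the OS disk cache, which is the trade-off Shawn describes above.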