What is the client for these requests? Do they all go in on the same socket or do they use separate sockets?
If they are a SolrJ program, and you say 'new CommonsHttpSolrServer' for each request, each request goes in on a new socket. This creates a new thread for each request, and the old threads don't die until the client socket times out.

I haven't looked at the file-fetching request handler. Check the number of threads in the server. Also look at the sockets with the Unix/Windows program 'netstat -an'.

It sounds like the load balancer does not have a 'retry request on another server in the pool' option. This is the core system architecture problem, if you want this level of uptime.

On 4/10/10, Blargy <zman...@hotmail.com> wrote:
>
> Lance,
>
> We have thousands of searches per minute so a minute of downtime is out
> of the question. If for whatever reason one of our solr slaves goes down I
> want to remove it ASAP from the load balancer's rotation, hence the 2 second
> check.
>
> Maybe I am doing something wrong but my HAProxy healthcheck is as
> follows:
> ...
> option httpchk GET /solr/items/admin/file?file=healthcheck.txt
> ...
> so basically I am requesting that file to determine if that particular slave
> is up or not. Is this the preferred way of doing this? I kind of like the
> "Enable/Disable" feature of this healthcheck file.
>
> You mentioned:
>
> "It should not run out of file descriptors from doing this. The code
> does a 'new File(healthcheck file name).exists()' and throws away the
> descriptor. This should not be a resource leak for file descriptors."
>
> yet if I run the following on the command line:
> # lsof -p xxxx
> Where xxxx is the pid of the solr, I get the following output:
>
> ...
> java 4408 root 220r REG 8,17 56085252 817639 /var/solr/home/items/data/index/_4y.tvx
> java 4408 root 221r REG 8,17 10499759 817645 /var/solr/home/items/data/index/_4y.tvd
> java 4408 root 222r REG 8,17 296791079 817647 /var/solr/home/items/data/index/_4y.tvf
> java 4408 root 223r REG 8,17 7010660 817648 /var/solr/home/items/data/index/_4y.nrm
> java 4408 root 224r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 225r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 226r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 227r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 228r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 229r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 230r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> java 4408 root 231r REG 8,17 0 817622 /var/solr/home/items/conf/healthcheck.txt
> ... and it keeps going ....
>
> and I've seen it as high as 3000. I've had to update my ulimit to 10000 to
> overcome this problem, however I feel this is really just a bandaid to a
> deeper problem.
>
> Am I doing something wrong (Solr or HAProxy) or is this a possible resource
> leak?
>
> Thanks for any input!
> --
> View this message in context:
> http://n3.nabble.com/Healthcheck-Too-many-open-files-tp710631p711141.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
Lance Norskog
goks...@gmail.com
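
P.S. To make the descriptor point concrete, here is a minimal Java sketch of the two ways code can touch a marker file. It is not taken from Solr's request handler, which I have not read; the class name HealthcheckProbe and both method names are made up for illustration, and a temp file stands in for healthcheck.txt. An existence check opens no stream, so it holds no descriptor. Code that does open the file for reading has to close the stream, otherwise the descriptor stays open until the stream object is garbage collected, and that is one possible way entries like the 'r' lines in the lsof output could pile up.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not Solr code.
final class HealthcheckProbe {

    /** Existence check only: no stream is opened, so no descriptor is held. */
    static boolean isEnabled(File marker) {
        return marker.exists();
    }

    /**
     * Reads the first byte of the marker file. try-with-resources closes the
     * stream, and with it the descriptor, even if read() throws.
     * Returns -1 for an empty file.
     */
    static int readFirstByte(File marker) throws IOException {
        try (InputStream in = new FileInputStream(marker)) {
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for the real healthcheck.txt (assumption).
        File marker = File.createTempFile("healthcheck", ".txt");
        try {
            System.out.println("enabled: " + isEnabled(marker));
            System.out.println("first byte: " + readFirstByte(marker));
        } finally {
            marker.delete();
        }
        System.out.println("enabled after delete: " + isEnabled(marker));
    }
}
```

If the handler behind /admin/file reads the file rather than just testing for it, the second pattern is the one to compare it against.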