All,
Again, thanks for the help. I found the real jrun logs (something I
speculated about initially). With CentOS/CF9 the jrun logs are not
under /opt/jrun4/; they are actually located at
/opt/coldfusion9/runtime/logs/. Needless to say, the jrun logs were
helpful. It appears that nearly all of the 503 responses are due to
incomplete file uploads.
I am a little surprised that these errors are now being kicked out by
jrun when in the past they made it to coldfusion. But because it seems
that Java is not handling the error condition, I believe the reason why
they no longer make it to coldfusion might correspond to when I changed
from the JVM installed with coldfusion to a newer JVM installed
separately. (I changed JVMs to get the latest security patches.)
Because the number of problems is less than 1% of the files that are
uploaded, and it appears that our resources are not maxed out when it
happens, I am assuming most of the blame for the problems is due to a
network error on the sender side. (Many of our end-users are spread out
across the country and use home internet connections of varying
reliability.) I suggested to the site owner that I rewrite our
old upload dialog (a plain form allowing up to 6 files to be uploaded at
once) as an ajax-based solution that uploads the files one
at a time. I am hopeful that by limiting uploads to a single file at a
time, there will be fewer network errors.
Of course there are still the few 503 errors that happen without
uploading data... There are not many, and at the moment none are in the
Jrun logs. I am hopeful that some of the changes I made in the last few
days (including incorporating some of Charlie's suggested request-tuning
changes) might have helped. But my pragmatism trumps optimism when it
comes to computers, so I convinced the owner to get a Fusion Reactor
subscription. If/when the situation happens again, I'll have a few more
tools to help figure out what happened.
FYI, here is the log entry that led me to the upload determination...
08/13 12:41:48 error unexpected end of part
java.io.IOException: unexpected end of part
at com.oreilly.servlet.multipart.PartInputStream.fill(PartInputStream.java:96)
at com.oreilly.servlet.multipart.PartInputStream.read(PartInputStream.java:179)
at com.oreilly.servlet.multipart.PartInputStream.read(PartInputStream.java:152)
at com.oreilly.servlet.multipart.FilePart.write(FilePart.java:257)
at com.oreilly.servlet.multipart.FilePart.writeTo(FilePart.java:215)
at coldfusion.filter.FormScope.fillForm(FormScope.java:253)
at coldfusion.filter.FusionContext.SymTab_initForRequest(FusionContext.java:408)
at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:33)
at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)
at coldfusion.filter.RequestThrottleFilter.invoke(RequestThrottleFilter.java:126)
at coldfusion.CfmServlet.service(CfmServlet.java:201)
at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:86)
at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)
at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:94)
at jrun.servlet.FilterChain.service(FilterChain.java:101)
at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106)
at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:286)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543)
at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:320)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:428)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:266)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
08/13 12:41:48 error (JRun Service: ProxyService
[jrun.servlet.jrpp.JRunProxyService@6988843a])
JRunPRoxyServer.invokeRunnable:
java.lang.IllegalStateException
at jrun.servlet.JRunResponse.getWriter(JRunResponse.java:205)
at jrun.servlet.JRunResponse.sendError(JRunResponse.java:597)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:328)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543)
at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:320)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:428)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:266)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
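For anyone who wants to tally these entries rather than eyeball them, here is a minimal Python sketch that counts "unexpected end of part" errors per hour. It assumes each entry starts with a header of the form "MM/DD HH:MM:SS level message", which is inferred only from the excerpt above; the function name count_upload_errors is made up for illustration and is not part of JRun or ColdFusion.

```python
import re
from collections import Counter

# Assumed entry header format, inferred from the log excerpt:
#   MM/DD HH:MM:SS <level> <message>
ENTRY_RE = re.compile(r"^(\d{2}/\d{2}) (\d{2}):\d{2}:\d{2} (\w+) (.*)$")


def count_upload_errors(lines, needle="unexpected end of part"):
    """Count error entries whose message contains `needle`, keyed by (date, hour)."""
    counts = Counter()
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match is None:
            # Stack-trace frames and exception lines have no timestamp header.
            continue
        date, hour, level, message = match.groups()
        if level == "error" and needle in message:
            counts[(date, hour)] += 1
    return counts
```

Passing an open file handle for the log as `lines` should work, since the function only iterates over lines. Comparing the hourly counts against the times of the 503 responses would be one way to check how many of them really line up with failed uploads.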
On 08/09/2013 02:38 PM, Frank Moorman wrote:
Thanks all for the insight...
And just as Charlie predicted, the event happened again without
tripping the alert.
One benefit, even though I was not actively watching what happened, I
did have the server monitor running. The event happened when the
server was only using 440MB, with 1.2GB free in the jvm allocation. So
it definitely is not a memory issue. (It also happened in between GC
cycles, so that isn't the issue either.)
As for the possibility of the CPU, I won't discount this, but I doubt
it would be from CF. We do use CFDocument/CFPDF which I know grab
resources, but normally those pages are during the morning, and it
actually happened twice last night at a time when I would not expect it.
I'll have to gather more information. I'm starting to think that the
cause may be outside CF. I'm going to try to look at all the system
logs and try to piece together exactly what was happening at the time
of the event.
Another question though... Fusion Reactor monitors the entire system,
not just CF right? (i.e. it can track running system processes, not
just what CF is doing) If this is true, this may be the next step if
my efforts are fruitless.
Thanks,
Frank
On 08/09/2013 12:55 AM, Charlie Arehart wrote:
Like you, I would think this is not memory related. I think that's
just a really old error message, from the days when even the then
Macromedia engineers could only throw up their hands and guess when
something was amiss.
I recently saw this error message happening for a client where we
found (since they were on IIS) that the jrun_iis6_wildcardxxxx.logs
(in [ColdFusion9]\runtime\lib\wsconfig\nn\LogFiles) had indications
of errors also. I realize you're on Apache, and you say you looked
at all the logs, but did you check out those logs in that wsconfig
dir and its subdirs? It's just a stab in the dark whether any log
messages there (around the same time) will be useful.
I would focus on something making the CF instance not responsive. I
know you said you raised the simult request threads from 10 to 40,
and it seemed fine at 10. But maybe you have new load, or a new
problem that makes requests hang.
As Ajas said, FR (or as you're using it, the CF Server Monitor) can
show you any running requests (the CFSM only shows them if you turn on
"start monitoring"). If you can be on when it happens you may be
surprised what you find. If all 10 (or now 40) are hung, even if only
for a while, that could lead to the error---not that CF's down, but
the connector thinks it can't be reached.
And as you noted in a later message, turning on the alerts will help
(in either CFSM, again where "start monitoring" must be enabled for
them to work, or in FR, or SeeFusion), as that will give you info
even when you can't be "watching" the monitors. Since you're using
the CFSM, and you say you configured the alerts, did you confirm that
you get the email they send? There's no test feature. What I do is
set the memory alert to below the current memory used, which should
trigger an alert within a few minutes. But then I turn that alert
off. I find it useless, since the JVM (since 1.5) can often let used
memory climb to the max before deciding to do a major GC, so you can
get those memory alerts when there's no real problem, if indeed a GC
at that point would have collected a lot of "not really used" memory.
But I do recommend that slow server alert in the CFSM, or the running
requests alert in FR. For almost everyone, if you have many requests
running at once, that's a "canary in the coal mine" indicating that
problems may be afoot. The question then is whether the alert shows
many slow requests. If it just shows many fast ones, then that is
just a sign of a lot of traffic, and if it's being handled fast, you
need to increase the number of "max simult requests", and the alert
level in whatever monitor you're using.
And be careful about setting the other values in "request tuning" so
low (web services, flash remoting, and remote cfcs). There's never a
harm in them being more than you need. But if they are less than you
need, that could be where a bottleneck happens. I know you say you
don't serve web services, but I've seen shops have their own cf pages
calling their own CFCs as web services. And if that request limit was
low, then that becomes a single threading bottleneck. Or maybe you DO
have code calling CFCs remotely (via ajax). Or about flash remoting,
the monitor (and FR) use those, and your own code may (even if
unexpectedly). Again, why constrict them? If you don't use them,
there's no harm in them being larger (like 5 or 10, each).
Finally, note that you could have cf requests using either cfthread
or reporting, and there are limits for each of those (configurable in
the admin). And though you are not using CF Standard, I'll say for
other readers that they could have all this sort of problem caused by
using some tag that is itself single-threaded in CF Standard, as are
many tags, including cfdocument, cfpdf, and more. That could cause a
"low traffic" site to still have hung requests.
Let us know if any of that helps, or not. But yes, if it remains and
you don't solve it, I am available for consulting, and with my
satisfaction guarantee, you don't have to pay for time you don't feel
is valuable.
/charlie
From: [email protected] [mailto:[email protected]] On Behalf Of Frank Moorman
Sent: Thursday, August 08, 2013 7:42 PM
To: [email protected]
Subject: [ACFUG Discuss] Out of Memory?!?
All,
I'm trying to determine the cause of a Jrun "out of memory" error. I
get the following in my logs:
[Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182] returning error page
for JRun too busy or out of memory
[Thu Aug 08 15:50:09 2013] [notice] jrApache[1787: 63699] returning error page
for JRun too busy or out of memory
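To see how often and when these notices occur, a small Python sketch like the following could pull their timestamps out of the Apache error log. It assumes each notice sits on a single line in the actual log file (the entries above are wrapped by the mail client) and that the timestamp has the format shown; the name busy_notice_times is hypothetical.

```python
import re
from datetime import datetime

# Assumes one notice per line in the real log, e.g.
# [Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182] returning error page for JRun too busy or out of memory
NOTICE_RE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\] \[notice\] jrApache\[[^\]]*\] "
    r"returning error page for JRun too busy or out of memory"
)


def busy_notice_times(lines):
    """Return a datetime for every 'too busy or out of memory' notice."""
    times = []
    for line in lines:
        match = NOTICE_RE.match(line.strip())
        if match:
            # Timestamp layout assumed from the excerpt: Thu Aug 08 14:40:14 2013
            stamp = datetime.strptime(match.group("stamp"), "%a %b %d %H:%M:%S %Y")
            times.append(stamp)
    return times
```

The resulting list can then be grouped by day or hour, or compared against the timestamps in other logs from around the same time.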
It doesn't happen often, (maybe once or occasionally twice a business
day) but as everyone understands, users aren't happy when it happens
to them.
This is a linux box, 64bit Centos 6, CF9 Enterprise, 64bit jvm
version 1.7. (The jvm was installed separately from CF for security
and coldfusion uses it.)
I doubt it is actually an out of memory condition (though I could be
wrong). The server has 6GB of physical memory and another 6GB of swap.
It rarely needs to use swap. (i.e. I have not observed it.)
The jvm is given significant memory to use as well. It is a 64bit jvm
with a 1GB min heap and a 3GB max heap. When I look through the server
monitor, it is normal to see 1 to 1.5GB allocated and between
100-750MB used. (I see a normal sawtooth pattern with the memory
usage, so it looks like what I would expect from the garbage
collection routine.) It does spike occasionally but I have never seen
it close to the 3GB max. (I've never even seen it hit 2GB used.)
The server is set for 40 template requests (I recently upped it from
10 to see if that was the problem and it still occurred with the same
frequency.)
Flash remoting is set to 2, webservice 1, CFC 1. (These remote
settings are only set for the monitor, as the server does not provide
any webservices outside the running application) Jrun is set to 50
requests, and 1000 queued. (Enough to cover the CF requests.)
I looked at Charlie's blog... I have checked the logs, and other
than the apache error log (above) I do not see anything. I've checked
the system /var/log/messages, I've checked all the CF logs (I also
archived everything yesterday, and the cf logs are practically empty
even after today's occurrence.) I did not find any jvm abort logs
that Charlie mentioned in his blog. (I checked in the CF directory
mentioned as well as the system logs and the actual JVM directory) I
also checked the Jrun log (in /opt/jrun4/logs/cfusion-event.log ) and
was surprised because the only entries were months ago. (Because of
the age of the log, I'm curious if I am looking at the right place
for it.)
Does anyone have any ideas on what might be happening? or something
else that I should check?
I have searched the web and found different ideas (even the rare "add
more memory")
Another mentions the requests being overloaded, but I honestly do not
believe that the 10 simultaneous template requests was low for the
traffic for this site. After quadrupling it, with the problem still
occurring, it is even less likely.
I've seen some mentioning client variable storage, but the server is
set to use cookies for that, not a database. While I do not use
client storage, I know there are items like the last time visited
etc, so I may just turn it off completely.
Another one I found interesting mentions a bug with MySQL drivers and
the "Maintain Connections" setting, and suggested unchecking this box.
I searched for this and found the bug mentioned; one site even
speculated it was still a problem with CF9, but I could not find any
details. Does anyone know of this issue? I've seen it mentioned, but
with no details other than that it's bad to have that checked. (The
page that mentioned it did say it ate memory.)
I'd love more ideas. I know this is not an easy or straightforward
error. I may try removing the client storage next, but other ideas
are welcome. (i.e. I'm not very convinced that the other things I
found on the web will be effective.)
Thanks,
Frank
-------------------------------------------------------------
To unsubscribe from this list, manage your profile @
http://www.acfug.org?fa=login.edituserform
For more info, see http://www.acfug.org/mailinglists
Archive @ http://www.mail-archive.com/discussion%40acfug.org/
List hosted by FusionLink <http://www.fusionlink.com>
-------------------------------------------------------------