DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=42198>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=42198

           Summary: Insufficient synchronization for CometEvent.close
           Product: Tomcat 6
           Version: 6.0.11
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: critical
          Priority: P2
         Component: Catalina
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


I am currently porting our eventing framework for web-based clients to Tomcat 6,
as we would like to make use of the new CometProcessor interface.

I found that concurrent access by the event processor thread and the response
provider thread appears to be insufficiently synchronized.

I built a small test webapp that tries to simulate the communication behaviour
of our framework and ran some tests with it (see attachment comettest.war).
To use the test webapp, simply deploy it to Tomcat's webapps folder,
start one or more browsers (IE and Firefox showed slightly different behaviour),
and request the index page of the webapp.

I first ran my tests with Tomcat 6.0.10,
then with http://svn.apache.org/repos/asf/tomcat/tc6.0.x/tags/TOMCAT_6_0_11,
revision 530531, and finally with the trunk, revision 531159.
For the APR connector I used tcnative-1.dll version 1.1.8.

Without enhanced synchronization, the XMLHttpRequests often hang after
reaching readyState 3 (often with Firefox, sometimes with IE).
Furthermore, CometEventImpl.close sometimes fails, and with the NIO connector I
saw, for example, this exception:

java.lang.NullPointerException
        at org.apache.coyote.http11.InternalNioOutputBuffer.writeToSocket(InternalNioOutputBuffer.java:436)
        at org.apache.coyote.http11.InternalNioOutputBuffer.flushBuffer(InternalNioOutputBuffer.java:761)
        at org.apache.coyote.http11.InternalNioOutputBuffer.endRequest(InternalNioOutputBuffer.java:398)
        at org.apache.coyote.http11.Http11NioProcessor.action(Http11NioProcessor.java:1087)
        at org.apache.coyote.Response.action(Response.java:183)
        at org.apache.coyote.Response.finish(Response.java:305)
        at org.apache.catalina.connector.OutputBuffer.close(OutputBuffer.java:276)
        at org.apache.catalina.connector.Response.finishResponse(Response.java:486)
        at org.apache.catalina.connector.CometEventImpl.close(CometEventImpl.java:85)
        at comettest.CometServlet.closeEvent(CometServlet.java:331)
        at comettest.CometServlet.access$2(CometServlet.java:317)
        at comettest.CometServlet$EventProvider.sendResponse(CometServlet.java:146)
        at comettest.CometServlet$EventProvider.run(CometServlet.java:95)
        at java.lang.Thread.run(Thread.java:595)


With the APR connector, the VM sometimes even crashed (see attached dump).


This is probably related to my observation that, if the concurrent close is
executed immediately after the CoyoteAdapter.service call, no END event is
signalled.
With the NIO connector I also saw that I did not get an END event for every
BEGIN event.

My guess was therefore that response objects might be recycled too early, so I
modified the classes org.apache.catalina.connector.CometEventImpl,
org.apache.catalina.connector.CoyoteAdapter, and
org.apache.catalina.connector.Request to add synchronization between the event
processor thread and the response provider thread via the CometEventImpl object
(see attachment patches-reich.jar).

This synchronization prevents the request and response from being recycled
before the close operation has completely finished, and it ensures that the
close operation is executed at most once.
- When a thread enters the close method, the state of the object changes from
OPEN to CLOSING. When the close has finished, the state goes to CLOSED.
- One sync point is in the CoyoteAdapter.service method, when it decides whether
the request shall be closed or put into the CometPoller.
  If CometEventImpl.close has not been called by then (the state is still OPEN),
any later invocation of close is allowed to recycle the request and response
objects on its own, until a new event is dispatched for the request.
- The second sync point is at event dispatching to the CoyoteAdapter.event
method.
  If the CometEventImpl state is still OPEN, the CoyoteAdapter takes over
responsibility for recycling again, until the end of event dispatching.
  If the CometEventImpl state is CLOSED, it makes no sense to send an END event
into the valve, because CometEventImpl.close has already recycled the request
and response objects.
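The state discipline described above can be sketched roughly as follows. This is an illustrative mock, not the actual patch: the class name CometCloseState, the EventState enum, and the methods beginClose/finishClose/awaitClosed are all invented here, chosen only to show the OPEN -> CLOSING -> CLOSED transitions, the at-most-once close guarantee, and how a recycling thread could wait for a concurrent close to finish.

```java
// Hypothetical sketch of the proposed close-state machine; all names are
// illustrative and do not appear in the real Tomcat code base.
public class CometCloseState {

    public enum EventState { OPEN, CLOSING, CLOSED }

    private EventState state = EventState.OPEN;

    /** Returns true only for the single thread that wins the close race. */
    public synchronized boolean beginClose() {
        if (state != EventState.OPEN) {
            return false; // close already in progress or already done
        }
        state = EventState.CLOSING;
        return true;
    }

    /** Marks the close as finished; request/response may now be recycled. */
    public synchronized void finishClose() {
        state = EventState.CLOSED;
        notifyAll(); // wake any thread waiting to recycle
    }

    /** Blocks a recycling thread until a concurrent close has completed. */
    public synchronized void awaitClosed() throws InterruptedException {
        while (state == EventState.CLOSING) {
            wait();
        }
    }

    public synchronized EventState getState() {
        return state;
    }
}
```

A second call to beginClose returns false, which is how "executed at most once" would be enforced in this sketch.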

I also found that asynchronous response providers must not call the close
method of the OutputStream directly.
They must only use the CometEvent.close method; otherwise there are still
problems due to unsynchronized access (I have not retested this with the trunk
yet).
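The rule above can be illustrated with a self-contained mock. MockEvent, ProviderCloseRule, and every member below are invented for this sketch and are not the Tomcat API; the point is only that the provider closes via the event, so the stream close happens under the event's own lock and recycling can be deferred until it returns.

```java
// Illustrative mock only: MockEvent stands in for CometEventImpl plus its
// response stream; none of these names exist in the real Tomcat code base.
public class ProviderCloseRule {

    public static class MockEvent {
        private final StringBuilder body = new StringBuilder();
        private boolean streamClosed = false;
        private boolean eventClosed = false;

        public synchronized void write(String data) {
            if (streamClosed) {
                throw new IllegalStateException("stream already closed");
            }
            body.append(data);
        }

        /** What a provider must NOT do: close the stream without coordination. */
        public synchronized void closeStreamDirectly() {
            streamClosed = true;
        }

        /** What a provider SHOULD do: the event closes its own stream under
         *  its own lock, so recycling waits until this method returns. */
        public synchronized void close() {
            streamClosed = true;
            eventClosed = true;
        }

        public synchronized boolean closedSafely() {
            return eventClosed;
        }
    }

    /** Correct provider pattern: write, flush conceptually, close via the event. */
    public static void sendFinalResponse(MockEvent event, String payload) {
        event.write(payload);
        event.close(); // NOT event.closeStreamDirectly()
    }
}
```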


An alternative solution would be to leave the job of recycling completely to
CoyoteAdapter.event, but then we must be absolutely sure that we get an
appropriate event from the underlying connector.

In my view, the far better alternative would be to leave all recycling to the
garbage collector, i.e. not to reuse any objects at all, in order to avoid the
problems that inevitably arise once we leave the world of synchronous request
processing; but surely this cannot be done in a day.

With my adaptation, my test application runs without problems for quite a while
with both connectors, as long as the response times for the poll requests are
short.
I still sometimes (though less often) got VM crashes with the APR connector at
the same point of execution in my response provider thread.
I am not sure whether my synchronization is still not good enough, or whether
these crashes happened when I pressed the reload button to reload the test page
in the browser.
(I saw in the repository created by 'ant download' that you are already working
on tcnative 1.1.10, so that problem might already be solved.)

As I ran my tests (server and browsers) on a WinXP/SP2 machine with a single
processor, I cannot tell whether my adaptation also works well on other
operating systems or on multiprocessor machines. Furthermore, I don't know
whether it really fits your concept of the CometEvent lifecycle model.
Therefore, I cannot claim that my adaptations are a patch.

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
