the problem on iSeries with the sprintf is very common since on this 64bits boxes integer and pointers are very different (32 bits int and 128 bits pointers).
The Apache 2 with modified mod_jk should be tested today so I could tell you if the patches solve the problem. Regards 2007/2/2, Eric Wertman <[EMAIL PROTECTED]>:
I did rebuild the sources as prefork, it seems to have stopped the problem. I did also make those 2 corrections you pointed out earlier in the thread. Eric Wertman wrote: > Connector: > > <Connector port="8010" protocol="AJP/1.3" enableLookups="false" > maxThreads="800" minThreads="100" bufferSize="8192" backlog="256" /> > > I've tried all sorts of permutations of this... None of the values > make much difference. My ab test is -n 1000 -c 200. > > --------------------------------------- > > workers.properties: > > # PlatformServer > worker.PlatformServer.type=ajp13 > worker.PlatformServer.host=localhost > worker.PlatformServer.port=8010 > worker.PlatformServer.connection_pool_timeout=60 > worker.PlatformServer.socket_timeout=60 > worker.PlatformServer.retries=5 > > I've been through a lot of different settings here too. I've tried > some almost insane values for maxThreads > --------------------------------------- > > httpd.conf (Using virtual servers, forgot to mention that before) > > <IfModule jk_module> > JkWorkersFile conf/workers.properties > JkShmFile logs/mod_jk.shm > JkShmSize 8192 > JkLogFile logs/mod_jk.log > JkLogLevel debug > </IfModule> > > --------------------------------------- > > mod_jk startup at debug level. I only included it because it shows > what mod_jk thinks of my connector. > > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (239): > creating worker PlatformServer > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (144): > about to create instance PlatformServer of ajp13 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (157): > about to validate and init PlatformServer > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (1971): worker PlatformServer contact is 'localhost:8010' > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2098): setting endpoint options: > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2101): keepalive: 0 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2105): timeout: 60 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2109): buffer size: 0 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2113): pool timeout: 60 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2117): connect timeout: 0 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2121): reply timeout: 0 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2125): prepost timeout: 0 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2129): recovery options: 0 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2133): retries: 5 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2137): max packet size: 8192 > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c > (2008): setting connection pool size to 25 with min 13 > > --------------------------------------- > > Can't reproduce at debug level. This is info (got it on the first try). > > [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c > (1504): Unknown AJP protocol code: 41 > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info] jk_ajp_common.c > (1842): (PlatformServer) request failed, because of server error > without recovery in send loop attempt=0 > [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c > (970): wrong message format 0x3837 from 127.0.0.1:8010 > [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c > (1566): (PlatformServer) Tomcat is down or refused connection. No > response has been sent to the client (yet) > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info] jk_ajp_common.c > (1877): (PlatformServer) receiving from tomcat failed, recoverable > operation attempt=0 > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info] jk_ajp_common.c > (1916): (PlatformServer) sending request to tomcat failed, > recoverable operation attempt=1 > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info] mod_jk.c (2142): > Service error=-5 for worker=PlatformServer > [Wed Jan 31 18:31:11 2007] [2539586:0000] [info] mod_jk.c (401): > Write without start, starting with defaults > [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c > (970): wrong message format 0x0400 from 127.0.0.1:8010 > [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c > (1566): (PlatformServer) Tomcat is down or refused connection. No > response has been sent to the client (yet) > [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c > (1504): Unknown AJP protocol code: 02 > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info] jk_ajp_common.c > (1877): (PlatformServer) receiving from tomcat failed, recoverable > operation attempt=0 > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info] jk_ajp_common.c > (1916): (PlatformServer) sending request to tomcat failed, > recoverable operation attempt=1 > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info] jk_ajp_common.c > (1842): (PlatformServer) request failed, because of server error > without recovery in send loop attempt=0 > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info] mod_jk.c (2142): > Service error=-5 for worker=PlatformServer > > --------------------------------------- > > Here's the Tomcat log: > > Jan 31, 2007 6:31:10 PM org.apache.jk.common.HandlerRequest invoke > INFO: Unknown message 0 > Jan 31, 2007 6:31:12 PM org.apache.jk.common.HandlerRequest invoke > INFO: Unknown message 0 > > --------------------------------------- > > > And here's the only thing I know about dbx: > > dbx ../bin/httpd core > Type 'help' for help. > warning: The core file is not a fullcore. Some info may not be available. > [using memory image in core] > reading symbolic information ... > > > Segmentation fault in sig_coredump at 0x10037590 > 0x10037590 (sig_coredump+0x84) 80410014 lwz r2,0x14(r1) > (dbx) where > sig_coredump() at 0x10037590 > malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d, > 0x7858c51, 0x17, 0x0) at 0xd03002c4 > malloc_common.malloc_common_53_36(??) at 0xd02fd8b8 > jk_pool_dyn_alloc() at 0xdc05a4c0 > jk_pool_alloc() at 0xdc05a5b4 > jk_b_set_buffer_size() at 0xdc06c0fc > ajp_service() at 0xdc064c28 > jk_handler() at 0xdc050e60 > ap_run_handler() at 0x10003a24 > ap_invoke_handler() at 0x10004810 > ap_process_request() at 0x1002fd88 > ap_process_http_connection() at 0x1004d0dc > ap_run_process_connection() at 0x10013dc0 > ap_process_connection() at 0x100144f0 > process_socket() at 0x1000b03c > worker_thread() at 0x1000a8e0 > dummy_worker() at 0xdbdb8384 > > --------------------------------------- > > I'll have a look and see if I can't get more information out of > tomcat.. there must be a way to increase the logging there. I am > using mpm=worker. If you still think it's worth the time I will > re-compile with prefork and take another shot. > > Thanks again for your help. > > Eric > > Rainer Jung wrote: >> Hi Eric, >> >> maybe your observation is in fact related to Henri's, since you also >> get core dumps (segmentation faults), which is quite unusual. See >> further comments inline. >> >> Eric Wertman wrote: >>> Sorry to jump in, I'm new here. I started watching this list >>> because of a problem I'm having with the mod_jk 1.2.20 as well. I'm >>> not getting core files, but I do have problems that I can't >>> reproduce at log level trace or debug. >> >> Apache logs segmentation faults. You might be able to produce a core >> (which would be very helpful) by setting the coredumpdirectory in >> apache and maybe tweaking your os config. One point that might help >> to produce cores, would be to not run apache as root. >> >>> I'm running it on AIX 5.3 (ml05). I compiled the apache 2.2.4 and >>> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM >>> 1.5 JRE. >>> >>> I compiled them all using the IBM cc_r compiler. I've actually >>> tried a number of different flags trying to resolve my problem, >>> assuming initially that I was doing something incorrectly. The >>> behavior persists, though, and my compilations have all been clean. >>> Apache without mod_jk doesn't give me any problems. >>> >>> Superficially it seems to work fine. Once I run tests with ab, it >>> starts to get a little ugly. I get a number of failed requests, and >>> these types of errors: >>> >>> >>> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (1504): Unknown AJP protocol code: 41 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (970): wrong message format 0x3837 from 127.0.0.1:8010 >> >> hose mean, taht something is fundamentally wrong in the answer >> mopd_jk receives from tomcat. It doesn't really look like AJP/1.3. Is >> your MPM worker or prefork? Could you try again with prefork, in case >> it is worker? >> >> Do you get error message on the tomcat side? >> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (1566): (PlatformServer) Tomcat is down or refused connection. No >>> response has been sent to the client (yet) >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1877): (PlatformServer) receiving from tomcat failed, recoverable >>> operation attempt=0 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1916): (PlatformServer) sending request to tomcat failed, >>> recoverable operation attempt=1 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1842): (PlatformServer) request failed, because of server error >>> without recovery in send loop attempt=0 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] mod_jk.c (2142): >>> Service error=-5 for worker=PlatformServer >> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] mod_jk.c (401): >>> Write without start, starting with defaults >> >> Ths one is very unusual, and it is the log statement, that is the >> only one, that could have produced Henri's core dump, although we >> still do not know why. It correlates to a protocol error. >> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (970): wrong message format 0x031a from 127.0.0.1:8010 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (1592): (PlatformServer) Tomcat is down or network problems. Part of >>> the response has already been sent to the cli >>> ent >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1877): (PlatformServer) receiving from tomcat failed, recoverable >>> operation attempt=1 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (1504): Unknown AJP protocol code: D8 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1842): (PlatformServer) request failed, because of server error >>> without recovery in send loop attempt=0 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] mod_jk.c (2142): >>> Service error=-5 for worker=PlatformServer >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1916): (PlatformServer) sending request to tomcat failed, >>> recoverable operation attempt=2 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (915): sendfull returned -14 with errno=14 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (948): (PlatformServer) can't receive the response message from >>> tomcat, network problems or tomcat is down (127.0. >>> 0.1:8010), err=-9 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (1566): (PlatformServer) Tomcat is down or refused connection. No >>> response has been sent to the client (yet) >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1877): (PlatformServer) receiving from tomcat failed, recoverable >>> operation attempt=0 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (1292): (PlatformServer) failed resending request body (812760852) >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1916): (PlatformServer) sending request to tomcat failed, >>> recoverable operation attempt=3 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1916): (PlatformServer) sending request to tomcat failed, >>> recoverable operation attempt=1 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (915): sendfull returned -9 with errno=9 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info] jk_ajp_common.c >>> (1201): (PlatformServer) error sending request. Will try another >>> pooled connection >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (915): sendfull returned -14 with errno=14 >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c >>> (948): (PlatformServer) can't receive the response message from >>> tomcat, network problems or tomcat is down (127.0. >>> >>> And these in the apache error_log: >>> >>> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache >>> initialised >>> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) >>> servers for SSL >>> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: >>> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d >>> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) >>> mod_jk/1.2.20 mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming >>> normal operations >>> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30 >>> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: >>> sysvsem (default: sysvsem) >>> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal >>> Segmentation fault (11) >>> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener >>> thread didn't exit >>> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener >>> thread didn't exit >>> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over >>> scoreboard slot from 1552520 (quiescing) >>> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal >>> Segmentation fault (11) >>> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal >>> Segmentation fault (11) >>> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal >>> Segmentation fault (11) >>> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal >>> Segmentation fault (11) >>> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal >>> Segmentation fault (11) >>> >>> >>> Again, I thought I would be able to track them down by changing the >>> log level to "trace", but the problem disappears at that log level. >>> I had assumed it was slowing the connection rate down enough to >>> mitigate the problem. Not sure if any of that helps.. if you can >>> help me or want more info let me know. >> >> With JkLogLevel debug (which logs more than info but less than trace) >> we would see, which packets you receive from tomcat. Maybe debug >> still shows the problem? >> >> Could you please post your config (Jk* from apache, workers.roperties >> and Connector from tomcat). >> >>> >>> Thanks! >>> >>> Eric >> >> Regards, >> >> Rainer >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]