https://bz.apache.org/bugzilla/show_bug.cgi?id=67927

            Bug ID: 67927
           Summary: TLSCertificateReloadListener triggers race condition
                    (?) in OpenSSL code which causes the JVM to die
           Product: Tomcat 9
           Version: 9.0.x
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Connectors
          Assignee: dev@tomcat.apache.org
          Reporter: micha...@apache.org
  Target Milestone: -----

This has been verified on Tomcat 9 from
bec7a51d7fc3fb913c755b258169d1816b77bea5, but I guess this happens on all
Tomcat versions.

* OS: Windows 10 and HP-UX
* Java 8 in both cases
* tomcat-native 1.2.39 and 1.2.38
* OpenSSL 3.0.11 and 1.1.1t

Reproducing on Windows is harder, on HP-UX it fails quite fast.

Reproducer:
* Take the connector configs from
https://bz.apache.org/bugzilla/show_bug.cgi?id=67666#c0.
* Configure listener: 
> <Listener 
> className="org.apache.catalina.security.TLSCertificateReloadListener" 
> checkPeriod="30" daysBefore="1" />
* Generate a soon to expire certificate or already expired which will trigger
the listener
* Start Tomcat and you should start to see the listener warning you
* Run:
> while true ; do for port in 20001 20002 20003 20004 30001 30002 ; do echo -n 
> "$port: "; echo Q | openssl s_client -connect localhost:$port -brief 2>&1 | 
> grep -e DONE -e errno  ; done; done
You should see repeating:
> ...
> 20001: DONE
> 20002: DONE
> 20003: DONE
> 20004: DONE
> 30001: DONE
> 30002: DONE
> ...
> 20004: DONE
> 30001: write:errno=232
> 30002: write:errno=232
> 20001: connect:errno=239
> 20002: connect:errno=239
> 20003: connect:errno=239
> 20004: connect:errno=239
> ...
* Run:
> tail -f logs/catalina.out
and see the JVM crash.

On Windows I see:
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000001ec4bf30f9a, 
> pid=5240, tid=0x0000000000005b40
> #
> # JRE version: OpenJDK Runtime Environment (Zulu 8.68.0.21-CA-win64) 
> (8.0_362-b09) (build 1.8.0_362-b09)
> # Java VM: OpenJDK 64-Bit Server VM (25.362-b09 mixed mode windows-amd64 
> compressed oops)
> # Problematic frame:
> # C  [tcnative-1.dll+0x80f9a]
> #
> ....
> 
> Stack: [0x0000009785200000,0x0000009785300000],  sp=0x00000097852febb0,  free 
> space=1018k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C  [tcnative-1.dll+0x80f9a]
> C  [tcnative-1.dll+0x2d580]
> C  [tcnative-1.dll+0x10256]
> C  [tcnative-1.dll+0x10c8f]
> C  0x000001ec2f0e9d8e
> 
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  org.apache.tomcat.jni.SSLSocket.attach(JJ)I+0
> j  
> org.apache.tomcat.util.net.AprEndpoint.setSocketOptions(Lorg/apache/tomcat/util/net/SocketWrapperBase;)Z+114
> j  org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run()V+32
> j  
> org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(Lorg/apache/tomcat/util/threads/ThreadPoolExecutor$Worker;)V+92
> j  org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run()V+5
> j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub

On HP-UX I see:
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (11) at pc=c97d00b0, pid=6253, tid=80
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0) (build 1.8.0.26-hp-ux-b1)
> # Java VM: Java HotSpot(TM) Server VM (25.26-b1 mixed mode hp-ux-ia64 )
> # Problematic frame:
> # C  [libcrypto.so.1.1+0x4ab727a0]  OPENSSL_sk_dup+0x4ac131b0
> #
> # Core dump written. Default location: /var/opt/tomcat90-eval/core or 
> core.6253 (max size 2097151 kB). To ensure a full core dump, try "ulimit -c 
> unlimited" before starting Java again
> #
> #  Please report this error to HPE customer support.
> #
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (018dd000):  JavaThread "https-openssl-apr-30002-exec-1" 
> daemon [_thread_in_native, id=80, lwp_id=7915536, stack(54b01000,54b41000)]
> 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 00000008
> ...
> Stack: [54b01000,54b41000],  sp=54b40a90,  free space=254k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C  [libcrypto.so.1.1+0x4ab727a0]  OPENSSL_sk_dup+0x4ac131b0
> 
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  org.apache.tomcat.jni.SSLSocket.attach(JJ)I+0
> j  
> org.apache.tomcat.util.net.AprEndpoint.setSocketOptions(Lorg/apache/tomcat/util/net/SocketWrapperBase;)Z+114
> j  org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run()V+32
> j  
> org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(Lorg/apache/tomcat/util/threads/ThreadPoolExecutor$Worker;)V+92
> j  org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run()V+5
> j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub

While I have no idea how to approach this on Windows, I have loaded the
coredump in GDB on HP-UX:
> # gdb /opt/java8/bin/java core
> HP gdb 6.8 for HP Itanium (32 or 64 bit) and target HP-UX 11iv2 and 11iv3.
> Copyright 1986 - 2011 Free Software Foundation, Inc.
> Hewlett-Packard Wildebeest 6.8 (based on GDB) is covered by the
> GNU General Public License. Type "show copying" to see the conditions to
> change it and/or distribute copies. Type "show warranty" for warranty/support.
> ..
> Core was generated by `java'.
> Program terminated with signal 6, Aborted.
> 
> #0  0x60000000c020ff10:0 in _lwp_killtype Method, field __vfp not found in 
> target
> type ConstantPool, field __vfp not found in target
> constant CollectedHeap::G1CollectedHeap not found in target
> +0x30 ()
>    from /usr/lib/hpux32/libpthread.so.1
> (gdb) where
> #0  0x60000000c020ff10:0 in _lwp_kill+0x30 ()
>    from /usr/lib/hpux32/libpthread.so.1
> #1  0x60000000c0174ec0:0 in pthread_kill+0x9f0 ()
>    from /usr/lib/hpux32/libpthread.so.1
> #2  0x60000000c0403740:0 in raise+0xe0 () from /usr/lib/hpux32/libc.so.1
> #3  0x60000000c0528690:0 in abort+0x170 () from /usr/lib/hpux32/libc.so.1
> #4  0x60000000cd743de0:0 in os::abort ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os/hp-ux/vm/os_hp-ux.cpp:2090
> #5  0x60000000cdd605f0:0 in VMError::report_and_die ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/utilities/vmError.cpp:1421
> #6  0x60000000cdd61210:0 in crash_handler ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os/hp-ux/vm/vmError_hp-ux.cpp:84
> #7  <signal handler called>
> #8  0x60000000cd78a200:0 in ParallelScavengeHeap::block_start ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/oops/oop.inline.hpp:440
> #9  0x60000000cd7233d0:0 in os::print_location ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/runtime/os.cpp:1222
> #10 0x60000000cd759f50:0 in os::print_register_info ()
> ---Type <return> to continue, or q <return> to quit---
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os_cpu/hp-ux_ia64/vm/os_hp-ux_ia64.cpp:1199
> #11 0x60000000cdd5dbb0:0 in VMError::report ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/utilities/vmError.cpp:676
> #12 0x60000000cdd60190:0 in VMError::report_and_die ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/utilities/vmError.cpp:1318
> #13 0x60000000cd75c170:0 in os::Hpux::JVM_handle_hpux_signal ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os_cpu/hp-ux_ia64/vm/os_hp-ux_ia64.cpp:1157
> #14 0x60000000cd72c600:0 in os::Hpux::signalHandler ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os/hp-ux/vm/os_hp-ux.cpp:4435
> #15 <signal handler called>
> #16 0x60000000c97d00b0:0 in OPENSSL_sk_dup+0x70 ()
>    from /usr/lib/hpux32/libcrypto.so.1.1
> #17 0x60000000c5901750:0 in SSL_new+0x290 () from 
> /usr/lib/hpux32/libssl.so.1.1
> #18 0x60000000c55908e0:0 in ssl_create () at src/sslnetwork.c:128
> #19 0x60000000c5591780:0 in Java_org_apache_tomcat_jni_SSLSocket_attach ()
>     at src/sslnetwork.c:603
> #20 0x200000007a800d60 in Java native_call_stub frame
> #21 0x200000007a815af0 in JNI frame: org.apache.tomcat.jni.SSLSocket::attach 
> (lo---Type <return> to continue, or q <return> to quit---
> ng, long) ->int
> #22 0x200000007a806920 in interpreted frame: 
> org.apache.tomcat.util.net.AprEndpoint::setSocketOptions 
> (org.apache.tomcat.util.net.SocketWrapperBase) ->boolean bci: 114
> #23 0x200000007a8066e0 in interpreted frame: 
> org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor::run () 
> ->void bci: 32
> #24 0x200000007a806d80 in interpreted frame: 
> org.apache.tomcat.util.threads.ThreadPoolExecutor::runWorker 
> (org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker) ->void bci: 92
> #25 0x200000007a806d40 in interpreted frame: 
> org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker::run () ->void bci: 5
> #26 0x200000007a806d80 in interpreted frame: 
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable::run () ->void 
> bci: 4
> #27 0x200000007a806d80 in interpreted frame: java.lang.Thread::run () ->void 
> bci: 11
> #28 0x200000007a800470 in Java entry frame
> #29 0x60000000ccd30d70:0 in JavaCalls::call_helper ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/runtime/javaCalls.cpp:415
> #30 0x60000000cd7500f0:0 in os::os_exception_wrapper ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os/hp-ux/vm/os_hp-ux.cpp:4367
> #31 0x60000000ccd356e0:0 in JavaCalls::call ()
> ---Type <return> to continue, or q <return> to quit---
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/runtime/javaCalls.cpp:318
> #32 0x60000000ccd39c90:0 in JavaCalls::call_virtual ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/runtime/javaCalls.cpp:215
> #33 0x60000000ccd3d760:0 in JavaCalls::call_virtual ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/runtime/javaCalls.cpp:221
> #34 0x60000000cceefa60:0 in thread_entry ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/prims/jvm.cpp:3182
> #35 0x60000000cdbe3af0:0 in JavaThread::run ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/share/vm/runtime/thread.cpp:3204
> #36 0x60000000cd72afa0:0 in _start ()
>     at 
> /eevee_home/home/jinteg/jenkins_tmp/java8/h8.0.26.rc1b1/hotspot/src/os/hp-ux/vm/os_hp-ux.cpp:1211
> #37 0x60000000c013b820:0 in __pthread_bound_body+0x1c0 ()
>    from /usr/lib/hpux32/libpthread.so.1

Line in question:
> if ((ssl = SSL_new(ctx->ctx)) == NULL) {

I bet that either "ctx->ctx" is in an invalid state or it is NULL. I am certain
that it can also be reproduced on macOS, *BSD and GNU/Linux.
Is it possible that the connector is not locked during the reload which opens a
small windows for a NULL pointer?

Let me know what type of information I can provide.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to