On Wed, 4 Sep 2019 07:50:29 +0200
Otto Moerbeek <[email protected]> wrote:
> The kernel is supposed to abort programs that have a stack pointer
> not pointing to a MAP_STACK flagged reagion. The repeating is indeed a
> bug.
>
> Pleaase post your test program on bugs. This need to be fixed to be
> able to do debug the boost problem further.
>
> -Otto
Theo de Raadt fixed the kernel last week. I made some progress with
boost this week, and now send a revised diff; but I now have a problem
with C++ exceptions through ontop_fcontext(), so today's diff is still
broken.
The fixed kernel aborted the program with SIGSEGV and left a core
dump. My diff for jump*.S had a wrong `mr %r3, %r5`, where the %r5
should be %r6. My mistake lost the future stack pointer in %r6.
After fixing the %r6, I made 2 other changes:
1. I fixed make_fcontext() in make*.S to work with jump_fcontext()
in jump*.S, by adding a trampoline to pass the transfer_t to
the context-function.
2. I tried to fix ontop_fcontext() in ontop*.S, so that it passes
the transfer_t to the ontop-function.
To build the jump and fibonacci examples, I copied
WRKSRC/libs/context/example to my home directory, then did
$ cat Makefile
CXX = eg++
DEBUG = -g
CPPFLAGS = -I/usr/local/include
LDFLAGS = -L/usr/local/lib -lboost_context-mt
all: fibonacci jump
$ make
Now ./jump works, but ./fibonacci fails with
$ ./fibonacci
0 1 1 2 3 5 8 13 21 34
main: done
terminate called after throwing an instance of
'boost::context::detail::forced_unwind'
Abort trap (core dumped)
There are 2 contexts: a main function that takes 10 numbers, and a
ctx::continuation that would generate the infinite sequence of
fibonacci numbers. As the main function returns, it calls the C++
destructor ~continuation. The destructor uses ontop_fcontext() to
throw the C++ exception forced_unwind in the context of the fibonacci
generator. This exception should unwind the stack until
context_entry() catches the forced_unwind; context_entry() is defined
in <boost/context/continuation_fcontext.hpp>.
I interpret the error "terminate called..." to mean that
context_entry() failed to catch the exception. My guess is that
the ontop_fcontext() fails to pass through C++ exceptions.
boost/context uses 3 machine-dependent functions:
- transfer_t jump_fcontext(fcontext_t to, void *vp) pauses the
current context and jumps to the other context. It returns
struct transfer_t {fcontext_t from, void *vp}.
- fcontext_t make_fcontext(void *sp, std::size_t size,
void (*fn)(transfer_t)) takes a stack of the given size, and
returns a new context. The first jump to the new context will
call the entry point fn({from, vp}).
- transfer_t ontop_fcontext(fcontext_t to, void *vp,
transfer_t (*fn)(transfer_t)) jumps to the other context, but
calls fn({from, vp}) in the other context.
{jump,make,ontop}_ppc32_sysv_elf_gas.S implements these functions
in PowerPC assembly. fcontext_t is a pointer to a special 244-byte
frame on top of the stack.
To recap the original problem: the System V ABI says to return the
8-byte transfer_t in registers %r3 and %r4. Linux deviates from the
ABI by passing a hidden parameter transfer_t *%r3; this points to an
8-byte return area. The existing code is for Linux and needs patches
to work on OpenBSD.
ontop_fcontext calls transfer_t fn(transfer_t). This call is
- fn(hidden transfer_t *%r3, transfer_t *%r4) on Linux
- fn(transfer_t *%r3) on OpenBSD
For Linux, the code removes the 244-byte frame from the stack and
makes a tail call to fn. The storage for *%r3 and *%r4 comes from the
hidden parameters to ontop_fcontext or jump_fcontext in both contexts.
Because of the tail call, ontop_fcontext() is no longer in the stack
when the fibonacci example throws force_unwind.
For OpenBSD, there are no hidden parameters, so we are missing 8 bytes
of storage for *%r3. I also don't want to keep the 244-byte frame on
the stack, because 244 isn't a multiple of 16, so the stack pointer
would be misaligned. I try to solve this by removing the 244-byte
frame and then allocating a 16-byte frame to store *%r3; but now
ontop_fcontext() keeps the 16-byte frame in the stack, and I don't
know how to throw force_unwind through this frame!
Also, the upstream code has 2 or 3 other bugs:
1. make_fcontext() misaligns the new stack. It tries to 16-align the
stack pointer $r1, but it points $r1 to the 244-byte frame, so
jump_fcontext() will remove the frame, then call the entry point
with a misaligned $r1. I haven't fixed this.
2. New contexts set %r13 to garbage. In the ABI, %r13 must point to
the executable's small data. I might not fix this, because
OpenBSD deviates from the ABI by never setting %r13. Compilers
seem to never use %r13, because they don't know whether the code
is for the executable or a shared library.
3. I'm not sure, but the call to _exit(0) in make_fcontext() might be
wrong on systems that use the secure PLT, including OpenBSD.
I might want to resize and rearrange the 244-byte frame to fit the ABI
(https://refspecs.linuxbase.org/elf/elfspec_ppc.pdf), but this would
require rewriting most of the code, and might not solve the
force_unwind problem. My next idea is to try using eg++ -S to teach
myself how to handle C++ exceptions in assembly.
The broken diff follows.
Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/boost/Makefile,v
retrieving revision 1.89
diff -u -p -r1.89 Makefile
--- Makefile 9 Aug 2019 11:25:29 -0000 1.89
+++ Makefile 12 Sep 2019 01:59:21 -0000
@@ -17,7 +17,7 @@ EXTRACT_SUFX= .tar.bz2
FIX_EXTRACT_PERMISSIONS = Yes
REVISION-main= 6
-REVISION-md= 1
+REVISION-md= 2
SO_VERSION= 9.0
BOOST_LIBS= boost_atomic-mt \
Index: patches/patch-libs_context_build_Jamfile_v2
===================================================================
RCS file: patches/patch-libs_context_build_Jamfile_v2
diff -N patches/patch-libs_context_build_Jamfile_v2
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_build_Jamfile_v2 12 Sep 2019 01:59:21 -0000
@@ -0,0 +1,17 @@
+$OpenBSD$
+
+ppc32_sysv_elf has 2 instances of "<toolset>clang".
+The second "clang" should be "gcc".
+
+Index: libs/context/build/Jamfile.v2
+--- libs/context/build/Jamfile.v2.orig
++++ libs/context/build/Jamfile.v2
+@@ -326,7 +326,7 @@ alias asm_sources
+ <address-model>32
+ <architecture>power
+ <binary-format>elf
+- <toolset>clang
++ <toolset>gcc
+ ;
+
+ alias asm_sources
Index: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S 12 Sep
2019 01:59:21 -0000
@@ -0,0 +1,66 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/jump_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type jump_fcontext,@function
+ jump_fcontext:
++ # Linux: jump_fcontext( hidden transfer_t * %r3, %r4, %r5)
++ # Other: transfer_t %r3:%r4 = jump_fcontext( %r3, %r4)
++
+ # reserve space on stack
+ subi %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ jump_fcontext:
+ stw %r29, 216(%r1) # save R29
+ stw %r30, 220(%r1) # save R30
+ stw %r31, 224(%r1) # save R31
++#ifdef __Linux__
+ stw %r3, 228(%r1) # save hidden
++#endif
+
+ # save CR
+ mfcr %r0
+@@ -135,8 +140,12 @@ jump_fcontext:
+ # store RSP (pointing to context-data) in R6
+ mr %r6, %r1
+
+- # restore RSP (pointing to context-data) from R4
++ # restore RSP (pointing to context-data) from R4/R3
++#ifdef __Linux__
+ mr %r1, %r4
++#else
++ mr %r1, %r3
++#endif
+
+ lfd %f14, 0(%r1) # restore F14
+ lfd %f15, 8(%r1) # restore F15
+@@ -178,7 +187,9 @@ jump_fcontext:
+ lwz %r29, 216(%r1) # restore R29
+ lwz %r30, 220(%r1) # restore R30
+ lwz %r31, 224(%r1) # restore R31
++#ifdef __Linux__
+ lwz %r3, 228(%r1) # restore hidden
++#endif
+
+ # restore CR
+ lwz %r0, 232(%r1)
+@@ -195,8 +206,13 @@ jump_fcontext:
+ addi %r1, %r1, 244
+
+ # return transfer_t
++#ifdef __Linux__
+ stw %r6, 0(%r3)
+ stw %r5, 4(%r3)
++#else
++ mr %r3, %r6
++ # %r4, %r4
++#endif
+
+ # jump to context
+ bctr
Index: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S 12 Sep
2019 01:59:21 -0000
@@ -0,0 +1,67 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/make_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+@@ -90,7 +90,13 @@ make_fcontext:
+ subi %r3, %r3, 336
+
+ # third arg of make_fcontext() == address of context-function
++#ifdef __Linux__
++ # save context-function as PC
+ stw %r5, 240(%r3)
++#else
++ # save context-function for trampoline
++ stw %r5, 252(%r3)
++#endif
+
+ # set back-chain to zero
+ li %r0, 0
+@@ -99,10 +105,12 @@ make_fcontext:
+ mffs %f0 # load FPSCR
+ stfd %f0, 144(%r3) # save FPSCR
+
++#ifdef __Linux__
+ # compute address of returned transfer_t
+ addi %r0, %r3, 252
+ mr %r4, %r0
+ stw %r4, 228(%r3)
++#endif
+
+ # load LR
+ mflr %r0
+@@ -111,6 +119,11 @@ make_fcontext:
+ 1:
+ # load LR into R4
+ mflr %r4
++#ifndef __Linux__
++ # compute abs address of trampoline; use as PC
++ addi %r7, %r4, trampoline - 1b
++ stw %r7, 240(%r3)
++#endif
+ # compute abs address of label finish
+ addi %r4, %r4, finish - 1b
+ # restore LR
+@@ -123,6 +136,19 @@ make_fcontext:
+ mtlr %r6
+
+ blr # return pointer to context-data
++
++#ifndef __Linux__
++trampoline:
++ # On systems other than Linux, jump_fcontext is returning the
++ # transfer_t in %r3:%r4, but we need to pass transfer_t * %r3 to
++ # our context-function.
++ lwz %r0, 8(%r1) # address of context-function
++ mtctr %r0
++ stw %r3, 8(%r1)
++ stw %r4, 12(%r1) # move transfer_t to stack
++ la %r3, 8(%r1) # address of transfer_t
++ bctr
++#endif
+
+ finish:
+ # save return address into R0
Index: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S 12 Sep
2019 01:59:21 -0000
@@ -0,0 +1,91 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type ontop_fcontext,@function
+ ontop_fcontext:
++ # Linux: ontop_fcontext( hidden transfer_t * %r3, %r4, %r5, %r6)
++ # Other: transfer_t %r3:%r4 = ontop_fcontext( %r3, %r4, %r5)
++
+ # reserve space on stack
+ subi %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ ontop_fcontext:
+ stw %r29, 216(%r1) # save R29
+ stw %r30, 220(%r1) # save R30
+ stw %r31, 224(%r1) # save R31
++#ifdef __Linux__
+ stw %r3, 228(%r1) # save hidden
++#endif
+
+ # save CR
+ mfcr %r0
+@@ -135,8 +140,12 @@ ontop_fcontext:
+ # store RSP (pointing to context-data) in R7
+ mr %r7, %r1
+
+- # restore RSP (pointing to context-data) from R4
++ # restore RSP (pointing to context-data) from R4/R3
++#ifdef __Linux__
+ mr %r1, %r4
++#else
++ mr %r1, %r3
++#endif
+
+ lfd %f14, 0(%r1) # restore F14
+ lfd %f15, 8(%r1) # restore F15
+@@ -178,20 +187,25 @@ ontop_fcontext:
+ lwz %r29, 216(%r1) # restore R29
+ lwz %r30, 220(%r1) # restore R30
+ lwz %r31, 224(%r1) # restore R31
++#ifdef __Linux__
+ lwz %r4, 228(%r1) # restore hidden
++#endif
+
+ # restore CR
+ lwz %r0, 232(%r1)
+ mtcr %r0
+ # restore LR
+ lwz %r0, 236(%r1)
++#ifdef __Linux__
+ mtlr %r0
++#endif
+ # ignore PC
+
+ # adjust stack
+ addi %r1, %r1, 244
+
+- # return transfer_t
++#ifdef __Linux__
++ # return transfer_t
+ stw %r7, 0(%r4)
+ stw %r5, 4(%r4)
+
+@@ -200,6 +214,21 @@ ontop_fcontext:
+
+ # jump to ontop-function
+ bctr
++#else
++ # On systems other than Linux, the caller didn't allocate memory
++ # for the transfer_t, so we must allocate it.
++ mtctr %r5 # ontop-function
++ stwu %r1, -16(%r1) # allocate stack frame
++ stw %r0, 20(%r1) # save LR in caller's LR save word
++ stw %r7, 8(%r1)
++ stw %r4, 12(%r1)
++ la %r3, 8(%r1) # address of transfer_t
++ bctrl # call ontop-function, return here
++ lwz %r0, 20(%r1)
++ mtlr %r0 # restore LR
++ addi %r1, %r1, 16 # free stack frame
++ blr # return to caller
++#endif
+ .size ontop_fcontext, .-ontop_fcontext
+
+ /* Mark that we don't need executable stack. */
--
George Koehler <[email protected]>