Re: frame unwinding patches

2017-04-26 Thread Ulf Hermann
> I dropped the arm32 frame pointer unwinder for now (maybe we need a less
> demanding testcase for that or, more awesome, add code to translate the
> exidx section for that).

Another problem is that QV4-generated code on a new frame pushes LR first and 
then FP. Code generated by gcc with "-marm -mapcs-frame -fno-omit-frame-pointer" 
pushes FP first and then LR. The libc raise() I have here miraculously does the 
same as QV4. Also, QV4 can alternatively use either r11 or r7 for FP, depending 
on whether we're in ARM or THUMB mode (which I cannot detect in the unwind hook). 
As that is written somewhere in AAPCS, I guess you can coax gcc to do the same 
thing (but just leaving out the "-marm" above simply leads to no frame pointers 
at all).
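
To illustrate why the push order matters to a frame pointer unwinder, here is
a minimal sketch. It assumes a descending stack of 32-bit words and that FP
points at the lower-addressed word of the saved pair; that placement is an
assumption made for the example, not something stated above. The enum, struct
and function names are made up for illustration and exist in neither QV4 nor
gcc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.  */
enum push_order { PUSH_LR_THEN_FP, PUSH_FP_THEN_LR };

struct frame_record { uint32_t saved_fp; uint32_t saved_lr; };

/* 'mem' models the stack as an array of 32-bit words; 'fp_index' is the
   index of the word FP points at.  ASSUMPTION: FP points at the
   lower-addressed word of the pair.  */
static struct frame_record
read_frame_record (const uint32_t *mem, unsigned fp_index,
                   enum push_order order)
{
  assert (mem != 0);
  struct frame_record rec;
  if (order == PUSH_LR_THEN_FP)
    {
      /* LR was pushed first, so on a descending stack it ends up at the
         higher address and the saved FP sits directly at FP.  */
      rec.saved_fp = mem[fp_index];
      rec.saved_lr = mem[fp_index + 1];
    }
  else
    {
      /* FP was pushed first: the two slots are swapped.  */
      rec.saved_lr = mem[fp_index];
      rec.saved_fp = mem[fp_index + 1];
    }
  return rec;
}
```

An unwinder that hardcodes one of the two layouts reads the return address
out of the saved-FP slot (and vice versa) for code using the other layout.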

Well, let's forget about this for now. I'll keep something that works with QV4 
in ARM mode and ignore everything else.

Ulf


Re: [PATCH 5/5] Add frame pointer unwinding for aarch64

2017-04-26 Thread Mark Wielaard
On Tue, 2017-04-25 at 15:38 +0200, Ulf Hermann wrote:
> > My question is about this "initial frame". In our testcase we don't have
> > this case since the backtrace starts in a function that has some CFI.
> > But I assume you have some tests that rely on this behavior.
> 
> Actually the test I provided does exercise this code. The initial
> __libc_do_syscall() frame does not have CFI. Only raise() has. You can
> check that by dropping the code for pc & 0x1.

Maybe I am using the wrong binaries (exec and core), but for me there is
no difference.

With or without commenting out the adjustments:

diff --git a/backends/aarch64_unwind.c b/backends/aarch64_unwind.c
index cac4ebd..36cd0e1 100644
--- a/backends/aarch64_unwind.c
+++ b/backends/aarch64_unwind.c
@@ -63,6 +63,7 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
 
   // The initial frame is special. We are expected to return lr directly in this case, and we'll
   // come back to the same frame again in the next round.
+/*
   if ((pc & 0x1) == 0)
 {
   newLr = lr;
@@ -70,6 +71,7 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
   newSp = sp;
 }
   else
+*/
 {
   if (!readfunc(fp + LR_OFFSET, &newLr, arg))
 newLr = 0;
@@ -80,7 +82,7 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
   newSp = fp + SP_OFFSET;
 }
 
-  newPc = newLr & (~0x1);
+  newPc = newLr /* & (~0x1) */;
   if (!setfunc(-1, 1, &newPc, arg))
 return false;
 
@@ -92,5 +94,5 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
   // If the fp is invalid, we might still have a valid lr.
   // But if the fp is valid, then the stack should be moving in the right direction.
   // Except, if this is the initial frame. Then the stack doesn't move.
-  return newPc != 0 && (fp == 0 || newSp > sp || (pc & 0x1) == 0);
+  return newPc != 0 && (fp == 0 || newSp > sp /* || (pc & 0x1) == 0 */);
 }

The testcase (run-backtrace-fp-core-aarch64.sh) PASSes and produces the
same output for:

LD_LIBRARY_PATH=backends:libelf:libdw src/stack -v --exec
backtrace.aarch64.fp.exec --core backtrace.aarch64.fp.core

PID 349 - core
TID 350:
#0  0x0040583c raise - /home/ulf/backtrace.aarch64.fp.exec
../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00401aac - 1 sigusr2 - /home/ulf/backtrace.aarch64.fp.exec
#2  0x00401ba8 - 1 stdarg - /home/ulf/backtrace.aarch64.fp.exec
#3  0x00401c04 - 1 backtracegen - /home/ulf/backtrace.aarch64.fp.exec
#4  0x00401c10 - 1 start - /home/ulf/backtrace.aarch64.fp.exec
#5  0x00402f44 - 1 start_thread - /home/ulf/backtrace.aarch64.fp.exec
/build/glibc-MsMi75/glibc-2.19/nptl/pthread_create.c:311
#6  0x0041dc70 - 1 __clone - /home/ulf/backtrace.aarch64.fp.exec
TID 349:
#0  0x00403fcc pthread_join - /home/ulf/backtrace.aarch64.fp.exec
/build/glibc-MsMi75/glibc-2.19/nptl/pthread_join.c:92
#1  0x00401810 - 1 main - /home/ulf/backtrace.aarch64.fp.exec
#2  0x00406544 - 1 __libc_start_main - /home/ulf/backtrace.aarch64.fp.exec
#3  0x00401918 - 1 $x - /home/ulf/backtrace.aarch64.fp.exec
src/stack: dwfl_thread_getframes tid 349 at 0x401917 in /home/ulf/backtrace.aarch64.fp.exec: address out of range

Since I cannot find the __libc_do_syscall I assume I am not using the
right executable & core? Could you double check them on the
mjw/fp-unwind branch?

> > The first question is how/why the (pc & 0x1) == 0 check works?
> > Why is that the correct check?
> > 
> > Secondly, if it really is the initial (or signal frame) we are after,
> > should we pass it in via the bool *signal_framep argument? Currently we don't
> 
> We have this piece of code in __libdwfl_frame_unwind, in frame_unwind.c:
> 
>   if (! state->initial_frame && ! state->signal_frame)
>   pc--;
> 
> AArch64 has a fixed instruction width of 32bit. So, normally the pc is
> aligned to 4 bytes. Except if we decrement it, then we are guaranteed
> to have an odd number, which we can then test to see if the frame in
> question is the initial or a signal frame.

Aha, OK. I forgot we explicitly decrement the pc for the frame before
doing the actual unwind. That makes sense.
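
The alignment trick can be sketched in isolation as follows. This is only an
illustration of the reasoning quoted above; adjust_pc and
looks_like_initial_or_signal_frame are made-up names, not elfutils functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the adjustment quoted from
   __libdwfl_frame_unwind: the pc is decremented for every frame that is
   neither the initial frame nor a signal frame.  */
static uint64_t
adjust_pc (uint64_t pc, bool initial_frame, bool signal_frame)
{
  /* AArch64 instructions are 4 bytes wide, so a real pc is 4-byte aligned.  */
  assert ((pc & 0x3) == 0);
  if (! initial_frame && ! signal_frame)
    pc--;
  return pc;
}

/* What the unwind hook can infer from the value it receives: an even pc
   was not decremented, so the frame is the initial or a signal frame.  */
static bool
looks_like_initial_or_signal_frame (uint64_t adjusted_pc)
{
  return (adjusted_pc & 0x1) == 0;
}
```

With the addresses from the backtrace above, 0x40583c stays even for the
initial frame, while 0x401aac becomes 0x401aab for the caller frame. The
check depends on the caller having done the decrement, which is why passing
the information explicitly would be more robust.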

> Of course it would be nicer to pass this information directly, but the
> signal_frame parameter is supposed to be an output parameter. After
> all we do the following after calling ebl_unwind():
> 
>   state->unwound->signal_frame = signal_frame;

Right, but that doesn't mean we couldn't also provide it as input if we
know that it is a signal or initial frame already. It just means that
unwinders would have to explicitly set it to false if they cannot
determine it for the unwound frame (which is the case for all of them
except the s390x unwinder). It would really be just a one-line change in
the call to and in the unwinder functions. This isn't a public API, so we
can change it to be smarter.
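
A rough sketch of what such an in/out parameter could look like. All names
here (sketch_frame, sketch_ebl_unwind, sketch_frame_unwind,
took_special_path) are hypothetical and only model the idea; this is not the
actual ebl_unwind() signature, and the register recovery is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down frame state.  */
struct sketch_frame { uint64_t pc; bool initial_frame; bool signal_frame; };

/* Hypothetical backend hook.  On entry *signal_framep says whether the
   current frame is an initial or signal frame; on return it describes the
   unwound frame.  took_special_path only exists so the example can be
   observed from outside.  */
static bool
sketch_ebl_unwind (uint64_t pc, bool *signal_framep, bool *took_special_path)
{
  assert (signal_framep != 0 && took_special_path != 0);
  (void) pc; /* No need to inspect pc & 0x1 anymore.  */
  *took_special_path = *signal_framep; /* Input: branch on it directly.  */
  *signal_framep = false; /* Output: cannot tell for the unwound frame.  */
  return true;
}

/* Hypothetical caller, modelled on the snippet quoted above.  */
static bool
sketch_frame_unwind (const struct sketch_frame *state,
                     struct sketch_frame *unwound, bool *took_special_path)
{
  bool signal_frame = state->initial_frame || state->signal_frame;
  if (! sketch_ebl_unwind (state->pc, &signal_frame, took_special_path))
    return false;
  unwound->pc = 0; /* Register recovery omitted in this sketch.  */
  unwound->initial_frame = false;
  unwound->signal_frame = signal_frame;
  return true;
}
```

The point of the design is that the hook resets the flag itself, so backends
that cannot determine it for the unwound frame report false rather than
echoing the input back.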

Cheers,

Mark


Re: [PATCH] Add missing entries to .gitignore

2017-04-26 Thread Mark Wielaard
On Fri, 2017-04-21 at 18:51 +0200, Ulf Hermann wrote:
> + * .gitignore: Add fillfile and peel_type tests.

Pushed to master.

Thanks,

Mark


Re: [PATCH] Include endian.h when handling BYTE_ORDER

2017-04-26 Thread Mark Wielaard
On Thu, 2017-04-20 at 17:07 +0200, Ulf Hermann wrote:
> BYTE_ORDER and friends are customarily defined in endian.h.

Right. Which we do in all other places where BYTE_ORDER is used.
Pushed to master.

Thanks,

Mark


Re: [PATCH] Add EXEEXT to gendis

2017-04-26 Thread Mark Wielaard
On Thu, 2017-04-20 at 17:02 +0200, Ulf Hermann wrote:
> Otherwise the build will fail on systems that actually need file
> extension for executables.

Thanks, pushed to master.
We seem to add it in other cases. Surprising this seems to be the only
case that got forgotten. I would have expected more issues.

Cheers,

Mark


Re: [PATCH 5/5] Add frame pointer unwinding for aarch64

2017-04-26 Thread Ulf Hermann
On 04/26/2017 04:33 PM, Mark Wielaard wrote:
> On Tue, 2017-04-25 at 15:38 +0200, Ulf Hermann wrote:
>>> My question is about this "initial frame". In our testcase we don't have
>>> this case since the backtrace starts in a function that has some CFI.
>>> But I assume you have some tests that rely on this behavior.
>>
>> Actually the test I provided does exercise this code. The initial
>> __libc_do_syscall() frame does not have CFI. Only raise() has. You can
>> check that by dropping the code for pc & 0x1.
> 
> Maybe I am using the wrong binaries (exec and core), but for me there is
> no difference.

In fact, with the new binaries there is no difference. I was confused, sorry.

However, if you strip .eh_frame and .eh_frame_hdr from the exe (thus triggering 
the fp unwinding on the first frame), you will see that it skips sigusr2. At 
the same time it invents another frame 0x403f40 on the main thread. Apparently 
pthread_join creates two stack frames. As it correctly unwinds the rest, the 
latter seemed harmless to me.

With .eh_frame and .eh_frame_hdr:

ulf@zebra:~/dev/build-elfutils/tests$ ./backtrace 
--core=backtrace.aarch64.fp.core -e backtrace.aarch64.fp.exec
0x400x4a3000/home/ulf/backtrace.aarch64.fp.exec
0x7fb6380x7fb6381000linux-vdso.so.1
TID 350:
# 0 0x40583c      raise
# 1 0x401aac - 1  sigusr2
# 2 0x401ba8 - 1  stdarg
# 3 0x401c04 - 1  backtracegen
# 4 0x401c10 - 1  start
# 5 0x402f44 - 1  start_thread
# 6 0x41dc70 - 1  __clone
TID 349:
# 0 0x403fcc      pthread_join
# 1 0x401810 - 1  main
# 2 0x406544 - 1  __libc_start_main
# 3 0x401918 - 1  $x
./backtrace: dwfl_thread_getframes: address out of range

Without .eh_frame and .eh_frame_hdr, code from PATCH V2:

ulf@zebra:~/dev/build-elfutils/tests$ ./backtrace 
--core=backtrace.aarch64.fp.core -e backtrace.aarch64.fp.stripped 
0x400x4a3000/home/ulf/backtrace.aarch64.fp.exec
0x7fb6380x7fb6381000linux-vdso.so.1
TID 350:
# 0 0x40583c      (null)
# 1 0x401aac - 1  (null)
# 2 0x401ba8 - 1  (null)
# 3 0x401c04 - 1  (null)
# 4 0x401c10 - 1  (null)
# 5 0x402f44 - 1  (null)
# 6 0x41dc70 - 1  (null)
./backtrace: dwfl_thread_getframes: address out of range
TID 349:
# 0 0x403fcc      (null)
# 1 0x403f40 - 1  (null)
# 2 0x401810 - 1  (null)
# 3 0x406544 - 1  (null)
# 4 0x401918 - 1  (null)
./backtrace: dwfl_thread_getframes: address out of range

Without .eh_frame and .eh_frame_hdr, without initial frame adjustment:

ulf@zebra:~/dev/build-elfutils/tests$ ./backtrace 
--core=backtrace.aarch64.fp.core -e backtrace.aarch64.fp.stripped 
0x400x4a3000/home/ulf/backtrace.aarch64.fp.exec
0x7fb6380x7fb6381000linux-vdso.so.1
TID 350:
# 0 0x40583c      (null)
# 1 0x401ba8 - 1  (null)
# 2 0x401c04 - 1  (null)
# 3 0x401c10 - 1  (null)
# 4 0x402f44 - 1  (null)
# 5 0x41dc70 - 1  (null)
./backtrace: dwfl_thread_getframes: address out of range
TID 349:
# 0 0x403fcc      (null)
# 1 0x401810 - 1  (null)
# 2 0x406544 - 1  (null)
# 3 0x401918 - 1  (null)
./backtrace: dwfl_thread_getframes: address out of range

You have to drop all the asserts from backtrace.c to actually test this:

diff --git a/tests/backtrace.c b/tests/backtrace.c
index 1ff6353..a910a77 100644
--- a/tests/backtrace.c
+++ b/tests/backtrace.c
@@ -71,14 +71,14 @@ static void
 callback_verify (pid_t tid, unsigned frameno, Dwarf_Addr pc,
 const char *symname, Dwfl *dwfl)
 {
-  static bool seen_main = false;
+//  static bool seen_main = false;
   if (symname && *symname == '.')
 symname++;
-  if (symname && strcmp (symname, "main") == 0)
-seen_main = true;
+//  if (symname && strcmp (symname, "main") == 0)
+//seen_main = true;
   if (pc == 0)
 {
-  assert (seen_main);
+//  assert (seen_main);
   return;
 }
   if (check_tid == 0)
@@ -103,11 +103,11 @@ callback_verify (pid_t tid, unsigned frameno, Dwarf_Addr pc,
   && (strcmp (symname, "__kernel_vsyscall") == 0
   || strcmp (symname, "__libc_do_syscall") == 0))
reduce_frameno = true;
-  else
-   assert (symname && strcmp (symname, "raise") == 0);
+//  else
+// assert (symname && strcmp (symname, "raise") == 0);
   break;
 case 1:
-  assert (symname != NULL && strcmp (symname, "sigusr2") == 0);
+//  assert (symname != NULL && strcmp (symname, "sigusr2") == 0);
   break;
 case 2: // x86_64 only
   /* __restore_rt - glibc maybe does not have to have this symbol.  */
@@ -125,11 +125,11 @@ callback_verify (pid_t tid, unsigned frameno, Dwarf_Addr pc,
}
   /* FALLTHRU */
 case 4:
-  assert (symname != NULL && strcmp (symname, "stdarg") == 0);
+//  assert (symname != NULL && strcmp (symname, "stdarg") == 0);
   break;
 case 5:
  

Re: [PATCH 5/5] Add frame pointer unwinding for aarch64

2017-04-26 Thread Mark Wielaard
On Wed, 2017-04-26 at 17:27 +0200, Ulf Hermann wrote:
> However, if you strip .eh_frame and .eh_frame_hdr from the exe (thus
> triggering the fp unwinding on the first frame), you will see that it
> skips sigusr2. At the same time it invents another frame 0x403f40 on
> the main thread. Apparently pthread_join creates two stack frames. As
> it correctly unwinds the rest, the latter seemed harmless to me.

I am a little concerned about testing against an exec where .eh_frame is
forcibly removed. Since that is an allocated section, you are messing up
the binary (which shows because the symbol table doesn't match anymore).

It seems nicer to do the checks instead with a hacked up
libdwfl/frame_unwind.c that simply doesn't handle cfi and so always uses
the frame pointer unwinder:

diff --git a/libdwfl/frame_unwind.c b/libdwfl/frame_unwind.c
index fb42c1a..6de64e5 100644
--- a/libdwfl/frame_unwind.c
+++ b/libdwfl/frame_unwind.c
@@ -539,6 +539,7 @@ new_unwound (Dwfl_Frame *state)
 static void
 handle_cfi (Dwfl_Frame *state, Dwarf_Addr pc, Dwarf_CFI *cfi, Dwarf_Addr bias)
 {
+  return;
   Dwarf_Frame *frame;
   if (INTUSE(dwarf_cfi_addrframe) (cfi, pc, &frame) != 0)
 {

You are right that in that case we lose/skip over sigusr2 from raise
and end up at stdarg directly if we remove the pc & 0x1 check.

But... that really is because we deliberately skip it.
Proper/simple link-register/frame unwinding should say:

-  newPc = newLr & (~0x1);
+  newPc = lr;

The newPc is the current link register, not the new one.
With that we get the backtrace as expected.
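
As a self-contained sketch of that simple scheme: the new pc is the current
link register, while the new lr and fp are read from the frame record that
fp points at. The toy memory model, the regs struct and unwind_step are
invented for this example, and the offset values (0, 8, 16) are assumptions
based on the usual AArch64 frame record layout rather than values quoted in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED offsets: saved fp at [fp], saved lr at [fp + 8].  */
#define FP_OFFSET 0
#define LR_OFFSET 8
#define SP_OFFSET 16

struct regs { uint64_t pc, lr, fp, sp; };

/* Toy memory: 'words' holds 64-bit values starting at address 'base'.  */
struct toy_mem { uint64_t base; const uint64_t *words; unsigned count; };

static bool
toy_read (const struct toy_mem *m, uint64_t addr, uint64_t *out)
{
  if (addr < m->base || (addr - m->base) % 8 != 0)
    return false;
  uint64_t idx = (addr - m->base) / 8;
  if (idx >= m->count)
    return false;
  *out = m->words[idx];
  return true;
}

/* One unwind step: no guessing about initial frames.  */
static bool
unwind_step (const struct toy_mem *m, const struct regs *cur,
             struct regs *next)
{
  assert (m != 0 && cur != 0 && next != 0);
  if (! toy_read (m, cur->fp + LR_OFFSET, &next->lr))
    next->lr = 0;
  if (! toy_read (m, cur->fp + FP_OFFSET, &next->fp))
    next->fp = 0;
  next->sp = cur->fp + SP_OFFSET;
  next->pc = cur->lr; /* The current lr, not the one just read.  */
  /* An invalid fp may still come with a valid lr; with a valid fp the
     stack has to move in the right direction.  */
  return next->pc != 0 && (cur->fp == 0 || next->sp > cur->sp);
}
```

Starting from pc 0x40583c with lr 0x401aac, one step yields pc 0x401aac and
picks up the next return address from the frame record, so no frame is
skipped.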

But... I now realize why you needed something like that in the case of
mixed CFI/no-framepointer/no-CFI/framepointer code. Like we have in our
testcase. In that case there is no good way to determine whether or not
there really were proper frame pointers and/or how the previous frame
was unwound. Our testcase is somewhat mean by using some
signal/no-return code, which is hard to properly unwind without
full frame pointers or full CFI. And with the simpler code that doesn't
try to guess whether or not to skip a frame you do end up with an extra
sigusr2 and/or main frame.

We could try to be clever and realize the link register and pc are the
same and then use the newLr instead as newPc. That however might just
mean that it is a recursive call to the same function.

So maybe the proper "fix" for that is to make our testcase a little less
strict and allow the occasional extra frame instead of trying to make
the frame pointer unwinder "extra smart".

Maybe something like the attached patch?

Cheers,

Mark
diff --git a/backends/aarch64_unwind.c b/backends/aarch64_unwind.c
index cac4ebd..18aaf9a 100644
--- a/backends/aarch64_unwind.c
+++ b/backends/aarch64_unwind.c
@@ -61,26 +61,15 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
 
   Dwarf_Word newPc, newLr, newFp, newSp;
 
-  // The initial frame is special. We are expected to return lr directly in this case, and we'll
-  // come back to the same frame again in the next round.
-  if ((pc & 0x1) == 0)
-{
-  newLr = lr;
-  newFp = fp;
-  newSp = sp;
-}
-  else
-{
-  if (!readfunc(fp + LR_OFFSET, &newLr, arg))
-newLr = 0;
-
-  if (!readfunc(fp + FP_OFFSET, &newFp, arg))
-newFp = 0;
-
-  newSp = fp + SP_OFFSET;
-}
-
-  newPc = newLr & (~0x1);
+  if (!readfunc(fp + LR_OFFSET, &newLr, arg))
+newLr = 0;
+
+  if (!readfunc(fp + FP_OFFSET, &newFp, arg))
+newFp = 0;
+
+  newSp = fp + SP_OFFSET;
+
+  newPc = lr;
   if (!setfunc(-1, 1, &newPc, arg))
 return false;
 
@@ -91,6 +80,5 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
 
   // If the fp is invalid, we might still have a valid lr.
   // But if the fp is valid, then the stack should be moving in the right direction.
-  // Except, if this is the initial frame. Then the stack doesn't move.
-  return newPc != 0 && (fp == 0 || newSp > sp || (pc & 0x1) == 0);
+  return newPc != 0 && (fp == 0 || newSp > sp);
 }
diff --git a/tests/backtrace-subr.sh b/tests/backtrace-subr.sh
index a303e32..9731c43 100644
--- a/tests/backtrace-subr.sh
+++ b/tests/backtrace-subr.sh
@@ -59,7 +59,7 @@ check_backtracegen()
 # Ignore it here as it is a bug of OS, not a bug of elfutils.
 check_err()
 {
-  if [ $(egrep -v <$1 'dwfl_thread_getframes: (No DWARF information found|no matching address range|address out of range)$' \
+  if [ $(egrep -v <$1 'dwfl_thread_getframes: (No DWARF information found|no matching address range|address out of range|Invalid register)$' \
  | wc -c) \
-eq 0 ]
   then
diff --git a/tests/backtrace.c b/tests/backtrace.c
index 1ff6353..21abe8a 100644
--- a/tests/backtrace.c
+++ b/tests/backtrace.c
@@ -90,6 +90,10 @@ callback_verify (pid_t tid, unsigned frameno, Dwarf_Addr pc,
   return;
 }
   Dwfl_Module *mod;
+  /* See case 4. Special case to help out simple frame pointer unwinders. */
+  static bool duplicate_s