Hello,

When I built blob with the arm-iwmmxt-linux-gnueabi toolchain
(GCC 4.1.1), I found that the SP value just before printf() calls
number() can be either 0 or 4 modulo 8. When SP is 0 modulo 8, printf()
works. When SP is 4 modulo 8, printf() fails: the strd instruction that
should store the long long argument on the stack before the call to
number() does not store it correctly.
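
For anyone who wants to reproduce this, below is a minimal sketch of
how the alignment could be checked from C. The names addr_mod8() and
current_sp() are hypothetical helpers I made up for illustration, not
part of blob or GCC. On non-ARM hosts the sketch falls back to GCC's
__builtin_frame_address(0), which is only an approximation of SP.

```c
#include <stdint.h>

/* Hypothetical debugging helper: returns the address modulo 8 (0..7). */
static unsigned addr_mod8(const void *p)
{
    return (unsigned)((uintptr_t)p & 7u);
}

/* Hypothetical helper: reads the current stack pointer on ARM.
 * Elsewhere it returns the frame address, which is only an
 * approximation and is here so the sketch still compiles. */
static const void *current_sp(void)
{
#if defined(__arm__)
    const void *sp;
    __asm__ volatile ("mov %0, sp" : "=r" (sp));
    return sp;
#else
    return __builtin_frame_address(0);
#endif
}
```

Printing addr_mod8(current_sp()) just before the call to number()
should show either 0 or 4, matching the two cases described above.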

According to the ABI for the ARM Architecture document at
http://www.arm.com/miscPDFs/13176.pdf, it seems GCC should handle this
in its implementation of va_arg. Below is an excerpt from that spec.
Does anybody know how to solve this issue, or how to make sure GCC
always generates an 8-byte-aligned SP?

2.3.2.3 Repair of va_start and va_arg
To avoid injecting a fault into their users' programs in execution
environments that do not correctly align SP, software development
tools should offer an option (Q-o-I) to repair the C library's
stdarg.h macros va_start and va_arg, as follows. (We assume va_start
expands to a call to the intrinsic function __va_start, and va_arg to
a call to __va_arg. It is already very difficult (or impossible) to
implement va_start and va_arg in a way that evaluates each argument
only once (as required by the C standard) without the assistance of at
least one intrinsic function).
__va_start should return a pointer value ap with bit[1] set if SP was
4 modulo 8 on entry to the containing function.

The function containing the call to __va_start has the variadic
parameter list allocated in the stack frame.
Because arguments are guaranteed to be 4-byte aligned (by C's argument
promotion rules and the AAPCS requirement that SP be 4-byte aligned at
all instants), bits[1:0] of ap are otherwise 0.
Coding the SP-misaligned case as 1 produces a __va_start compatible
with an ordinary (not repaired) __va_arg in conforming environments in
which SP is 0 modulo 8 at function entry.
If T is a data type requiring 8-byte alignment, __va_arg(ap, T) must
increment the pointer it calculates by 4 bytes (to skip a padding word
inserted at compile time) if:
(bit[1] of ap is 0 and bit[2] of ap is 1) or (bit[1] of ap is 1 and
bit[2] of ap is 0).

Whatever the sort of T, __va_arg(ap, T) must clear bit 1 of the
pointer it calculates before dereferencing it.
This implementation of __va_arg is compatible with an ordinary (not
repaired) __va_start in conforming environments in which SP is 0
modulo 8 at function entry and bit 1 of ap is always 0.
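
To make the rule in the excerpt concrete, here is a minimal sketch of
the pointer adjustment it describes for __va_arg. This is only my
reading of the quoted text, not GCC's or any library's actual
implementation. The function name repaired_arg_addr() and the
needs_8byte_align flag are hypothetical, and the sketch covers only
the address calculation, not advancing ap past the argument.

```c
#include <stdint.h>

/* Hypothetical sketch of the address a repaired __va_arg(ap, T) would
 * dereference, following the quoted excerpt:
 *   - bit[1] of ap is the "SP was 4 modulo 8" marker from __va_start;
 *   - for an 8-byte-aligned T, skip one 4-byte padding word when
 *     bit[1] and bit[2] of ap differ;
 *   - always clear bit[1] before using the pointer. */
static uintptr_t repaired_arg_addr(uintptr_t ap, int needs_8byte_align)
{
    uintptr_t bit1 = (ap >> 1) & 1u;
    uintptr_t bit2 = (ap >> 2) & 1u;

    if (needs_8byte_align && (bit1 ^ bit2))
        ap += 4;                 /* skip the padding word */

    return ap & ~(uintptr_t)2;   /* clear the marker bit */
}
```

With the marker bit clear this reduces to the ordinary rule (round ap
up to a multiple of 8 for an 8-byte-aligned T). With the marker bit
set the padding decision is inverted, which is what compensates for an
SP that was 4 modulo 8 on entry.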

--
best regards,
-Bridge
