On Mon, Jun 8, 2015 at 9:07 AM, lin zuojian <manjian2...@gmail.com> wrote:
> Hi,
> in arm.c
> static void
> arm_conditional_register_usage (void)
> ...
>   if (TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP)
>     {
>       /* VFPv3 registers are disabled when earlier VFP
>          versions are selected due to the definition of
>          LAST_VFP_REGNUM.  */
>       for (regno = FIRST_VFP_REGNUM;
>            regno <= LAST_VFP_REGNUM; ++ regno)
>         {
>           fixed_regs[regno] = 0;
>           call_used_regs[regno] = regno < FIRST_VFP_REGNUM + 16
>             || regno >= FIRST_VFP_REGNUM + 32;
>         }
>     }
>
> these lines will change the called used registers, when using
> compiler flags like: -mfpu=neon.
> That causes weird bugs. Consider the situation in Android ARM
> architecture: I have a shared object supposed to run in a neon cpu,
> and -mfpu=neon added. But the system is not compiled using this
> flag. So when calling the system's library, my code will risk using
> the clobbered d8-d16

No, you are misunderstanding this. Because of the way the S and D
registers are packed in the VFP register file, the register numbering
has to be read with respect to the "S" register file. FIRST_VFP_REGNUM + 16
is therefore the correct boundary check: S0-S15 (which map to D0-D7
and Q0-Q3) are call clobbered, while S16-S31 (D8-D15) are callee
saved, i.e. not call_used.

> The example will be:
> while (true) {
>   struct my_struct s = {0}; // my_struct is 8 bytes long.
>   call_system_library...
> }
> in this example. d8 is used to initialize s to zero. The assembly
> code like:
> push {d8} // because d8 is not call used.
> // the loop header
> vmov.i32 d8, #0
> // the loop body
> vstr d8, &s
> bl system_library
> b loop_body
>
> d8 is clobbered after branch link to system library, so the second
> loop will initialize s to random value, which causes crash.
>
> So I am forced to remove the -mfpu=neon for compatibility. My
> question is whether the gcc code show above confront to ARM
> standard. If so, why ARM make such a weird standard.

No, you do not need to remove the option for compatibility. The ABI
has been carefully designed to allow precisely this sort of usage,
mixing code built with -mfpu=neon and code built with -mfpu=vfpv3-d16.
The failure you describe indicates that something else is broken in
your system library, or that your system library function is not
obeying the ABI and is clobbering D8.

If you have an actual reproducible issue, please report it on
bugzilla, following the rules for reporting bugs by producing a
standalone testcase, as documented here:

https://gcc.gnu.org/bugs/

Thanks,
Ramana

> --
> Lin Zuojian
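
The boundary check discussed in this thread can be sketched as
standalone C. This is only an illustration, not GCC code: the base
index SKETCH_FIRST_VFP_REGNUM and the helper names are hypothetical,
and the sketch assumes the layout described above, where VFP register
numbers count S-sized slots and each D register covers two of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical base index, chosen only for this sketch. GCC's real
   FIRST_VFP_REGNUM is defined in the ARM backend headers. */
#define SKETCH_FIRST_VFP_REGNUM 0
#define SKETCH_VFP_SLOTS 64

/* Same predicate as the quoted loop in arm_conditional_register_usage:
   slots 0-15 and 32-63 are call clobbered, slots 16-31 are not. */
bool vfp_call_used(int regno)
{
    return regno < SKETCH_FIRST_VFP_REGNUM + 16
        || regno >= SKETCH_FIRST_VFP_REGNUM + 32;
}

/* Map an S-sized slot number to the D register that contains it
   (assumption of this sketch: two consecutive slots per D register). */
int d_register_of(int regno)
{
    assert(regno >= SKETCH_FIRST_VFP_REGNUM
           && regno < SKETCH_FIRST_VFP_REGNUM + SKETCH_VFP_SLOTS);
    return (regno - SKETCH_FIRST_VFP_REGNUM) / 2;
}

/* Print one line per D register with its save class. */
void print_vfp_save_classes(void)
{
    for (int regno = SKETCH_FIRST_VFP_REGNUM;
         regno < SKETCH_FIRST_VFP_REGNUM + SKETCH_VFP_SLOTS;
         regno += 2) {
        printf("d%d: %s\n", d_register_of(regno),
               vfp_call_used(regno) ? "call clobbered" : "callee saved");
    }
}
```

Under these assumptions, calling print_vfp_save_classes() from a main
function lists d8 through d15 as callee saved and every other D
register as call clobbered, which is why a correctly built callee
must preserve d8 regardless of the -mfpu setting of its caller.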