On 27/05/14 15:31, Maciej W. Rozycki wrote:
> On Tue, 13 Aug 2013, Kyrylo Tkachov wrote:
>
>>> On 08/09/13 11:01, Julian Brown wrote:
>>>> On Thu, 8 Aug 2013 15:44:17 +0100
>>>> Kyrylo Tkachov <kyrylo.tkac...@arm.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> The recently added gcc.target/arm/pr58041.c test exposed a bug in the
>>>>> backend. When compiling for NEON and with -mno-unaligned-access we
>>>>> end up generating the vld1.64 and vst1.64 instructions instead of
>>>>> doing the accesses one byte at a time like -mno-unaligned-access
>>>>> expects. This patch fixes that by enabling the NEON expander and
>>>>> insns that produce these instructions only when unaligned accesses
>>>>> are allowed.
>>>>>
>>>>> Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu.
>>>>>
>>>>> Ok for trunk and 4.8?
>>>>
>>>> I'm not sure if this is right, FWIW -- do the instructions in question
>>>> trap if the CPU is set to disallow unaligned accesses? I thought that
>>>> control bit only affected ARM core loads & stores, not NEON ones.
>>>
>>> Thinking again - the ARM-ARM says - the alignment check is for element
>>> size, so an alternative might be to use vld1.8 instead to allow for this
>>> at which point we might as well do something else with the test. I note
>>> that these patterns are not allowed for BYTES_BIG_ENDIAN so that might
>>> be a better alternative than completely disabling it.
>>
>> Looking at the section on unaligned accesses, it seems that the
>> ldrb/strb-class instructions are the only ones that are unaffected by the
>> SCTLR.A bit and do not produce alignment faults in any case.
>> The NEON load/store instructions, including vld1.8 can still cause an
>> alignment fault when SCTLR.A is set. So it seems we can only use the
>> byte-wise core memory instructions for unaligned data.
>
> This change however has regressed gcc.dg/vect/vect-72.c on the
> arm-linux-gnueabi target, -march=armv5te, in particular in 4.8.
>
> Beforehand the code fragment in question produced was:
>
> .L14:
>         sub     r1, r3, #16
>         add     r3, r3, #16
>         vld1.8  {q8}, [r1]
>         cmp     r3, r0
>         vst1.64 {d16-d17}, [r2:64]!
>         bne     .L14
>
> Afterwards it is:
>
> .L14:
>         vldr    d16, [r3, #-16]
>         vldr    d17, [r3, #-8]
>         add     r3, r3, #16
>         cmp     r3, r1
>         vst1.64 {d16-d17}, [r2:64]!
>         bne     .L14
>
> and the second VLDR instruction traps with SIGILL (the value in R3 is
> 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why and
> especially why only the second of the two (regrettably I've been unable to
> track down an instruction reference that'd be detailed enough to specify
> what exceptions VLDR can produce and under what conditions).
>
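
As a quick sanity check on the addresses in the quoted loop, here is a minimal sketch in Python. It assumes R3 holds 0x10b29 at both VLDRs, as reported above; the helper name vldr_address is purely illustrative and not taken from any tool.

```python
def vldr_address(base, offset):
    """Effective address of `vldr dN, [rM, #offset]`: base register plus immediate."""
    return base + offset


R3 = 0x10B29  # value of R3 reported in the quoted message (assumed for both loads)

first = vldr_address(R3, -16)   # vldr d16, [r3, #-16]
second = vldr_address(R3, -8)   # vldr d17, [r3, #-8]

for reg, addr in (("d16", first), ("d17", second)):
    # addr % 4 is the byte offset within a 32-bit word; 0 would mean word-aligned
    print(f"{reg}: {addr:#x}  addr % 4 = {addr % 4}")
```

Both loads land on the same byte offset within a word, so under that assumption the address arithmetic alone gives no reason for the second load to behave differently from the first.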

SIGILL means "undefined instruction". It does not mean misaligned
address (unless the kernel is messed up) -- that would be SIGBUS.

> Interestingly enough the trap does not happen when the program is
> single-stepped under GDB (via gdbserver), however it then aborts once this
> copy loop has completed as `ia' contains rubbish and fails the test.
>
> Is there a fix that needs backporting to 4.8 or is this an issue that was
> unknown so far?
>
> Hardware and Linux used:
>
> $ cat /proc/cpuinfo
> Processor       : ARMv7 Processor rev 2 (v7l)
> processor       : 0
> BogoMIPS        : 2013.49
>
> processor       : 1
> BogoMIPS        : 1963.08
>
> Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
> CPU implementer : 0x41
> CPU architecture: 7
> CPU variant     : 0x1
> CPU part        : 0xc09
> CPU revision    : 2
>
> Hardware        : OMAP4430 Panda Board
> Revision        : 0020
> Serial          : 0000000000000000
> $ uname -a
> Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC
> 2010 armv7l GNU/Linux
> $
>
> Maciej
>