On Tue, May 27, 2014 at 3:31 PM, Maciej W. Rozycki <ma...@codesourcery.com> wrote:
> On Tue, 13 Aug 2013, Kyrylo Tkachov wrote:
>
>> > On 08/09/13 11:01, Julian Brown wrote:
>> > > On Thu, 8 Aug 2013 15:44:17 +0100
>> > > Kyrylo Tkachov <kyrylo.tkac...@arm.com> wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> The recently added gcc.target/arm/pr58041.c test exposed a bug in the
>> > >> backend. When compiling for NEON and with -mno-unaligned-access we
>> > >> end up generating the vld1.64 and vst1.64 instructions instead of
>> > >> doing the accesses one byte at a time like -mno-unaligned-access
>> > >> expects. This patch fixes that by enabling the NEON expander and
>> > >> insns that produce these instructions only when unaligned accesses
>> > >> are allowed.
>> > >>
>> > >> Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu.
>> > >>
>> > >> Ok for trunk and 4.8?
>> > >
>> > > I'm not sure if this is right, FWIW -- do the instructions in question
>> > > trap if the CPU is set to disallow unaligned accesses? I thought that
>> > > control bit only affected ARM core loads & stores, not NEON ones.
>> >
>> > Thinking again - the ARM-ARM says - the alignment check is for element
>> > size, so an alternative might be to use vld1.8 instead to allow for this
>> > at which point we might as well do something else with the test. I note
>> > that these patterns are not allowed for BYTES_BIG_ENDIAN so that might
>> > be a better alternative than completely disabling it.
>>
>> Looking at the section on unaligned accesses, it seems that the
>> ldrb/strb-class instructions are the only ones that are unaffected by the
>> SCTLR.A bit and do not produce alignment faults in any case.
>> The NEON load/store instructions, including vld1.8 can still cause an
>> alignment fault when SCTLR.A is set. So it seems we can only use the
>> byte-wise core memory instructions for unaligned data.
>
> This change however has regressed gcc.dg/vect/vect-72.c on the
> arm-linux-gnueabi target, -march=armv5te, in particular in 4.8.

And what are all the configure flags you are using, in case someone has to
reproduce this issue?

>
> Beforehand the code fragment in question produced was:
>
> .L14:
>         sub     r1, r3, #16
>         add     r3, r3, #16
>         vld1.8  {q8}, [r1]

vld1 allows a misaligned load.

>         cmp     r3, r0
>         vst1.64 {d16-d17}, [r2:64]!
>         bne     .L14
>
> Afterwards it is:
>
> .L14:
>         vldr    d16, [r3, #-16]
>         vldr    d17, [r3, #-8]
>         add     r3, r3, #16
>         cmp     r3, r1
>         vst1.64 {d16-d17}, [r2:64]!
>         bne     .L14
>
> and the second VLDR instruction traps with SIGILL (the value in R3 is
> 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why and
> especially why only the second of the two (regrettably I've been unable to
> track down an instruction reference that'd be detailed enough to specify
> what exceptions VLDR can produce and under what conditions).

vldr will cause an unaligned access fault if the address is misaligned.
The question is why the address is misaligned in this case.

>
> Is there a fix that needs backporting to 4.8 or is this an issue that was
> unknown so far?

I haven't seen an issue with this so far.

Ramana

>
> Hardware and Linux used:
>
> $ cat /proc/cpuinfo
> Processor       : ARMv7 Processor rev 2 (v7l)
> processor       : 0
> BogoMIPS        : 2013.49
>
> processor       : 1
> BogoMIPS        : 1963.08
>
> Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
> CPU implementer : 0x41
> CPU architecture: 7
> CPU variant     : 0x1
> CPU part        : 0xc09
> CPU revision    : 2
>
> Hardware        : OMAP4430 Panda Board
> Revision        : 0020
> Serial          : 0000000000000000
> $ uname -a
> Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux
> $
>
> Maciej
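
For anyone trying to reproduce this, here is a minimal sketch in C of the alignment arithmetic behind the vldr comment above. It assumes that vldr needs a word-aligned (4-byte) effective address, and it takes the R3 value 0x10b29 and the offsets -16 and -8 from the report. The helper names `is_aligned` and `vldr_address_ok` are hypothetical and only illustrate the check; they are not part of GCC or any ARM header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of align.
   align must be a non-zero power of two. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: returns 1 if the effective address base + offset
   meets the word alignment that vldr is assumed to require here. */
static int vldr_address_ok(uintptr_t base, int offset)
{
    uintptr_t effective = base + (uintptr_t)(intptr_t)offset;
    return is_aligned(effective, 4);
}
```

With the reported base, `vldr_address_ok(0x10b29, -16)` and `vldr_address_ok(0x10b29, -8)` both return 0, because the effective addresses 0x10b19 and 0x10b21 are odd. So this arithmetic is consistent with a misaligned vldr, but it does not by itself explain why only the second load was seen to trap.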