On 27/05/14 15:31, Maciej W. Rozycki wrote:
> On Tue, 13 Aug 2013, Kyrylo Tkachov wrote:
> 
>>> On 08/09/13 11:01, Julian Brown wrote:
>>>> On Thu, 8 Aug 2013 15:44:17 +0100
>>>> Kyrylo Tkachov <kyrylo.tkac...@arm.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> The recently added gcc.target/arm/pr58041.c test exposed a bug in the
>>>>> backend. When compiling for NEON and with -mno-unaligned-access we
>>>>> end up generating the vld1.64 and vst1.64 instructions instead of
>>>>> doing the accesses one byte at a time like -mno-unaligned-access
>>>>> expects. This patch fixes that by enabling the NEON expander and
>>>>> insns that produce these instructions only when unaligned accesses
>>>>> are allowed.
>>>>>
>>>>> Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu.
>>>>>
>>>>> Ok for trunk and 4.8?
>>>>
>>>> I'm not sure if this is right, FWIW -- do the instructions in question
>>>> trap if the CPU is set to disallow unaligned accesses? I thought that
>>>> control bit only affected ARM core loads & stores, not NEON ones.
>>>
>>> Thinking again - the ARM-ARM says - the alignment check is for element
>>> size, so an alternative might be to use vld1.8 instead to allow for this
>>> at which point we might as well do something else with the test. I note
>>> that these patterns are not allowed for BYTES_BIG_ENDIAN so that might
>>> be a better alternative than completely disabling it.
>>
>> Looking at the section on unaligned accesses, it seems that the
>> ldrb/strb-class instructions are the only ones that are unaffected by the
>> SCTLR.A bit and do not produce alignment faults in any case.
>> The NEON load/store instructions, including vld1.8 can still cause an
>> alignment fault when SCTLR.A is set. So it seems we can only use the
>> byte-wise core memory instructions for unaligned data.
> 
>  This change however has regressed gcc.dg/vect/vect-72.c on the 
> arm-linux-gnueabi target, -march=armv5te, in particular in 4.8.
> 
>  Beforehand the code fragment in question produced was:
> 
> .L14:
>       sub     r1, r3, #16
>       add     r3, r3, #16
>       vld1.8  {q8}, [r1]
>       cmp     r3, r0
>       vst1.64 {d16-d17}, [r2:64]!
>       bne     .L14
> 
> Afterwards it is:
> 
> .L14:
>       vldr    d16, [r3, #-16]
>       vldr    d17, [r3, #-8]
>       add     r3, r3, #16
>       cmp     r3, r1
>       vst1.64 {d16-d17}, [r2:64]!
>       bne     .L14
> 
> and the second VLDR instruction traps with SIGILL (the value in R3 is 
> 0x10b29, odd as you'd expect, pointing into `ib').  I don't know why and 
> especially why only the second of the two (regrettably I've been unable to 
> track down an instruction reference that'd be detailed enough to specify 
> what exceptions VLDR can produce and under what conditions).
> 

SIGILL means "undefined instruction".  It does not mean misaligned
address (unless the kernel is messed up) -- that would be SIGBUS.
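
To confirm which signal the test really receives, something along these
lines could be linked into it. This is only a minimal sketch, assuming a
POSIX system; the names classify_fault, report_fault and
install_fault_reporter are made up for illustration and are not part of
GCC or its testsuite.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: map a fault signal and its si_code to a short
   description, separating "undefined instruction" from "misaligned
   address". */
const char *classify_fault(int signo, int code)
{
    if (signo == SIGILL)
        return code == ILL_ILLOPC ? "SIGILL: illegal opcode"
                                  : "SIGILL: other illegal-instruction fault";
    if (signo == SIGBUS)
        return code == BUS_ADRALN ? "SIGBUS: invalid address alignment"
                                  : "SIGBUS: other bus error";
    return "unrelated signal";
}

/* Handler: print the classification and exit.  Only write() and _exit()
   are used for output and termination, as both are async-signal-safe. */
static void report_fault(int signo, siginfo_t *info, void *context)
{
    const char *msg = classify_fault(signo, info->si_code);
    ssize_t n;

    (void)context;
    n = write(STDERR_FILENO, msg, strlen(msg));
    n = write(STDERR_FILENO, "\n", 1);
    (void)n;
    _exit(128 + signo);
}

/* Install the handler for SIGILL and SIGBUS; returns 0 on success. */
int install_fault_reporter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = report_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}
```

Calling install_fault_reporter() at the start of main() before the copy
loop runs would show whether the kernel reports an illegal opcode or an
alignment fault. Whether the si_code is filled in accurately depends on
the kernel, so treat the output as a hint rather than proof.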


>  Interestingly enough the trap does not happen when the program is 
> single-stepped under GDB (via gdbserver), however it then aborts once this 
> copy loop has completed as `ia' contains rubbish and fails the test.
> 
>  Is there a fix that needs backporting to 4.8 or is this an issue that was 
> unknown so far?
> 
>  Hardware and Linux used:
> 
> $ cat /proc/cpuinfo
> Processor     : ARMv7 Processor rev 2 (v7l)
> processor     : 0
> BogoMIPS      : 2013.49
> 
> processor     : 1
> BogoMIPS      : 1963.08
> 
> Features      : swp half thumb fastmult vfp edsp thumbee neon vfpv3
> CPU implementer       : 0x41
> CPU architecture: 7
> CPU variant   : 0x1
> CPU part      : 0xc09
> CPU revision  : 2
> 
> Hardware      : OMAP4430 Panda Board
> Revision      : 0020
> Serial                : 0000000000000000
> $ uname -a
> Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux
> $ 
> 
>   Maciej
> 

