Re: Unaligned NEON memory access on ARMv7 phones

Makoto Kato Thu, 05 Apr 2018 08:08:37 -0700

> To my surprise, clang 6.0 is willing to generate vld1.8 when no
> particular CPU model is specified:
> https://godbolt.org/g/i5PqcQ


The vld1.8 in this sample is valid, because the access only has to be
aligned to the element size, and with 8-bit elements any address is.

Also, although that compiler site targets the hardfp ABI by default, gcc 7
(with -march=armv7-a -mfpu=neon -mfloat-abi=hard -O3 -mno-unaligned-access
-mthumb) generates the following code:

00000000 <aligned>:
   0:   f920 0adf       vld1.64 {d0-d1}, [r0 :64]
   4:   4770            bx      lr
   6:   bf00            nop

00000008 <unaligned>:
   8:   b500            push    {lr}
   a:   b085            sub     sp, #20
   c:   4601            mov     r1, r0
   e:   2210            movs    r2, #16
  10:   4668            mov     r0, sp
  12:   f7ff fffe       bl      0 <memcpy>
  16:   f92d 0adf       vld1.64 {d0-d1}, [sp :64]
  1a:   b005            add     sp, #20
  1c:   f85d fb04       ldr.w   pc, [sp], #4

Although gcc doesn't optimize away the memcpy call with
-mno-unaligned-access, with -munaligned-access it uses ldr, stmia and
vld1.64, not vld1.8.
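
As a rough sketch of the kind of C source that leads to a memcpy-based
"unaligned" function like the one in the listing above (the code behind the
godbolt link is not reproduced here, so the type and function names below
are assumptions for illustration only):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte value standing in for a NEON q register. */
typedef struct {
    uint8_t bytes[16];
} block16;

/* Load 16 bytes from an address with no alignment guarantee.
 * memcpy tells the compiler that src may be unaligned; which
 * instructions it emits depends on the target and on flags such as
 * -munaligned-access / -mno-unaligned-access. */
static block16 load_unaligned16(const void *src)
{
    block16 out;
    assert(src != NULL);
    memcpy(&out, src, sizeof out);
    return out;
}
```

Calling load_unaligned16(buf + 1) on a byte buffer is well defined on every
target; only the generated instructions differ.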

Because ARM also supports big endian, the compiler cannot replace
*.16/*.32/*.64 with *.8 in all cases if the code has to work for both
endiannesses.
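
The endianness point can be illustrated in plain C, as an analogy rather
than actual NEON code. Reading four bytes as one 32-bit element and reading
them as four separate bytes give the same value only on a little-endian
target. The function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read 4 bytes as one 32-bit element in the CPU's native byte order. */
static uint32_t load_u32_native(const uint8_t *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Read 4 separate bytes and combine them, lowest address first. */
static uint32_t load_u32_bytewise(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on a little-endian target, 0 on a big-endian one. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian build the two loaders agree for every input; on a
big-endian build they generally do not, so one cannot be swapped for the
other without changing the result.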


> Is unaligned NEON allowed on any ARMv7 CPU without trapping after all
> even if unaligned ALU loads/stores might not be?

If an unaligned access is made with an alignment identifier, it will trap.
Whether an access counts as unaligned also depends on the element size.
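
For code that wants to use an alignment identifier such as :64 or :128 only
when that is safe, the check is on the address itself. A minimal sketch with
hypothetical helper names, where :64 corresponds to align = 8 and :128 to
align = 16:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of align bytes.
 * align must be a nonzero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Number of bytes from p up to the next align-byte boundary
 * (0 if p is already aligned). align must be a nonzero power of two. */
static size_t bytes_to_alignment(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)(-(uintptr_t)p & (align - 1));
}
```

A loop that "moves to alignment" would handle bytes_to_alignment(p, 16)
bytes with byte-wise or unaligned operations first and use the aligned form
only for the rest.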


-- Makoto Kato


On Thu, Mar 29, 2018 at 8:38 PM, Henri Sivonen <hsivo...@hsivonen.fi> wrote:

> On Thu, Mar 29, 2018 at 4:09 AM, Makoto Kato <mk...@mozilla.com> wrote:
> > Since SCTLR isn't accessible from userland, there is no way to detect
> > unaligned access support without trapping.  Generally, an unaligned
> > access causes SIGBUS, so we might get data from the crash reporter.
> > The Android armv7-a ABI doesn't define that the hardware configuration
> > has to set the alignment bit of SCTLR, so unfortunately we should
> > consider both cases.
>
> To my surprise, clang 6.0 is willing to generate vld1.8 when no
> particular CPU model is specified:
> https://godbolt.org/g/i5PqcQ
>
> Is unaligned NEON allowed on any ARMv7 CPU without trapping after all
> even if unaligned ALU loads/stores might not be?
>
> > The ARM documentation for Cortex-A8 says [*1] that when the alignment
> > identifier is 64 (VLD2.16@64), it requires 2 cycles, but when the
> > alignment identifier is 128 (VLD2.16@128), it requires 1 cycle.  And on
> > Cortex-A9, an unaligned access requires additional cycles [*2].
> ...
> > [*1]
> > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344h/ch16s06s07.html
> > [*2]
> > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344h/ch16s06s07.html
>
> Thank you. Was [*2] meant to be a different URL?
>
> On Wed, Mar 28, 2018 at 6:36 PM, Gregory Szorc <g...@mozilla.com> wrote:
> > Is
> > http://fastcompression.blogspot.fr/2015/08/accessing-unaligned-memory.html
> > and/or the comments for MEM_FORCE_MEMORY_ACCESS at
> > https://github.com/facebook/zstd/blob/dev/lib/common/mem.h useful?
>
> Thanks, but unfortunately these don't address my issue. These are
> about getting GCC to perform an unaligned load efficiently when the
> programmer has already decided to use an unaligned load.
>
> I'm trying to figure out whether it's worthwhile to spend cycles to
> move pointers to alignment if possible or whether it makes sense to
> just use unaligned operations unconditionally. (Also, GCC doesn't
> matter in my case, since I'm planning Rust code.)
>
> In non-ARMv7 cases my findings are that moving to alignment doesn't
> look empirically worthwhile on aarch64 (tested RPi3 and ThunderX,
> which both have in-order cores; should test an out-of-order core, but
> documentation supports the empirical results) or on Haswell
> (documentation indicates that the key is Nehalem or newer). On Core 2
> Duo, moving to alignment is worthwhile.
>
> --
> Henri Sivonen
> hsivo...@hsivonen.fi
> https://hsivonen.fi/
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>