https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65459

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org
           Severity|normal                      |enhancement

--- Comment #4 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to ktkachov from comment #2)
> FWIU ARMv6 and later supports unaligned accesses.
> I guess the performance of unaligned accesses differs between cores.
> The documentation for SLOW_UNALIGNED_ACCESS says:
> "Define this macro to be the value 1 if memory accesses described by the
> @var{mode} and @var{alignment} parameters have a cost many times greater
> than aligned accesses, for example if they are emulated in a trap
> handler."
> 
> It seems to me that on some modern cores even if unaligned access is more
> expensive than normal it wouldn't be 'many times greater' so we should
> definitely investigate setting this in a more intelligent way.

It is designed this way for a reason. It's not a bug.

On the ARM architecture only the ldr and vld1.8 instructions can handle
unaligned accesses (not aligned to element size). Most of the other
instructions would trap on an unaligned access, e.g. vldr / vldm / ldrd / ldm.
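
As a source-level illustration (not part of the bug report), here is a minimal
C sketch of what "unaligned" means above. The helper names load_u32_unaligned
and is_aligned are hypothetical. Going through memcpy leaves the choice of load
instruction to the compiler, so it can pick a sequence that is legal for
whatever alignment it can prove.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may not be
   4-byte aligned.  memcpy makes no alignment promise to the compiler, so it
   must choose an instruction sequence that is safe for an unaligned source. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: nonzero when p is a multiple of align bytes.
   align must be a power of two. */
static int is_aligned(const void *p, uintptr_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

Dereferencing a misaligned uint32_t pointer directly is undefined behaviour in
C, which is why the sketch copies the bytes instead.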

So you could get some so-called benefit by turning off alignment checking, and
then have the world start trapping left, right and centre because you've allowed
a free-for-all without any alignment checks. You might then ask what happens if
SLOW_UNALIGNED_ACCESS stayed true for FP and vector-style operations and we
disabled it only for SImode values.

The counter-argument to that is:

1. It is possible (though very unlikely) for SImode values to be in FP
registers when someone needs to do fp conversions. A type-punned access can
then end up as a vldr instruction, and an unaligned vldr will trap. Getting the
phase ordering right in the compiler to prevent something like this will be
interesting.

2. If we set SLOW_UNALIGNED_ACCESS for SImode values to 0, the mid-end would
be free to emit

(set (reg) (mem)) style instructions for unaligned SImode accesses.

The existing ldm / ldrd generation would then pick those accesses up and start
throwing up alignment faults that need to be trapped in the kernel. So by lying
to the compiler that these aren't slow, we never make them fast! If you
wanted to make this change, you would also need to prevent ldm / stm and
ldrd / strd instructions from appearing in the code stream for unaligned
accesses, and I'm not sure we can achieve that purely from the compiler's
memory alignment information, i.e. you need to validate that MEM_ALIGN
information is sane and doesn't cause additional alignment faults in larger
programs.
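
To make the two points above easier to follow, here is a toy C model of the
per-mode idea. It is not GCC source: the enum toy_mode and both toy_*
functions are invented for illustration, and the model only distinguishes
"known word-aligned" from "not". It sketches why relaxing the SImode answer
means the ldm / ldrd generation has to consult the known alignment itself.

```c
#include <assert.h>

/* Toy stand-ins for machine modes; not GCC's real machine_mode. */
enum toy_mode { TOY_SI, TOY_SF, TOY_DF, TOY_V4SI };

/* Toy version of the proposal: an under-aligned access is reported as slow
   for FP and vector modes but not for SImode.  align_bits is the known
   alignment of the memory reference, in bits. */
static int toy_slow_unaligned_access(enum toy_mode mode, unsigned align_bits)
{
    assert(align_bits >= 8);
    if (align_bits >= 32)
        return 0;               /* word-aligned in this simplified model */
    return mode != TOY_SI;
}

/* In this model, adjacent word loads may only be merged into a single
   ldrd / ldm when the reference is known to be word-aligned. */
static int toy_can_merge_word_loads(unsigned align_bits)
{
    assert(align_bits >= 8);
    return align_bits >= 32;
}
```

With the relaxed SImode answer, a byte-aligned SImode access is no longer
flagged as slow, so the only thing that keeps it out of ldrd / ldm in this
model is the separate alignment check. That check is only as good as the
alignment information it is given.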
