On 16/08/15 20:01, Mike Stump wrote:
On Jun 15, 2015, at 7:30 AM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
On 29/05/15 11:15, Kyrill Tkachov wrote:
On 29/05/15 10:08, Kyrill Tkachov wrote:
Hi Mike,

On 28/05/15 22:15, Mike Stump wrote:
So, the ARM memcpy code for aligned data isn’t as good as it could be.

void *memcpy(void *dest, const void *src, unsigned int n);

void foo(char *dst, int i) {
     memcpy (dst, &i, sizeof (i));
}

generates horrible code, but if we are willing to notice that the src or the 
destination is aligned, we can do much better:

$ ./cc1 -fschedule-fusion -fdump-tree-all-all -da -march=armv7ve 
-mcpu=cortex-m4 -fomit-frame-pointer -quiet -O2 /tmp/t.c -o t.s
$ cat t.s
[ … ]
foo:
        @ args = 0, pretend = 0, frame = 4
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #4
        str     r1, [r0]        @ unaligned
        add     sp, sp, #4
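
For reference, a minimal sketch in C of the situation in the example: the source of the copy (the int argument) is known to be int-aligned, while dst is a plain char pointer and may sit at any byte offset, so the copy has to behave the same whatever the destination alignment is. The helper names store_int and roundtrip_int are hypothetical and appear neither in the patch nor in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as foo() in the message: copy an int into a char buffer
   whose alignment is unknown to the compiler. */
static void store_int(char *dst, int i)
{
    memcpy(dst, &i, sizeof(i));
}

/* Hypothetical helper: store v at a byte offset inside a scratch buffer
   and read it back, so the destination can be deliberately misaligned. */
static int roundtrip_int(int v, size_t offset)
{
    char buf[2 * sizeof(int)];
    int out;

    assert(offset <= sizeof(int));   /* keep the copy inside buf */
    store_int(buf + offset, v);
    memcpy(&out, buf + offset, sizeof(out));
    return out;
}
```

Calling roundtrip_int with offsets 0 through 3 returns the stored value each time; whether the compiler emits a single word store for store_int, as in the listing above, depends on the target and tuning options.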
I think there's something to do with cpu tuning here as well.
That being said, I do think this is a good idea.
I'll give it a test.
The patch passes bootstrap and testing ok and I've seen it
improve codegen in a few places in SPEC.
I've added a testcase all marked up.

Mike, I'll commit the attached patch in 24 hours unless somebody objects.
Was this ever applied?

Sorry, slipped through the cracks.
Committed with r226935.

Thanks,
Kyrill


