On 16/08/15 20:01, Mike Stump wrote:
On Jun 15, 2015, at 7:30 AM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
On 29/05/15 11:15, Kyrill Tkachov wrote:
On 29/05/15 10:08, Kyrill Tkachov wrote:
Hi Mike,
On 28/05/15 22:15, Mike Stump wrote:
So, the ARM memcpy code for aligned data isn’t as good as it can be.
void *memcpy(void *dest, const void *src, unsigned int n);
void foo(char *dst, int i) {
memcpy (dst, &i, sizeof (i));
}
generates horrible code, but if we are willing to notice that the src or the
destination is aligned, we can do much better:
$ ./cc1 -fschedule-fusion -fdump-tree-all-all -da -march=armv7ve
-mcpu=cortex-m4 -fomit-frame-pointer -quiet -O2 /tmp/t.c -o t.s
$ cat t.s
[ … ]
foo:
@ args = 0, pretend = 0, frame = 4
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #4
str r1, [r0] @ unaligned
add sp, sp, #4
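For reference, a minimal sketch of what the testcase above has to do, whatever instructions the compiler picks: the source (&i) is a naturally aligned int, while dst may point anywhere, so the copy must still be correct for a destination with no particular alignment. The helper roundtrip_through_misaligned is a hypothetical name added here for illustration; it is not part of the patch or its testcase.

```c
#include <string.h>

/* Same shape as the testcase in the thread: the source is an aligned int,
   the destination is a char pointer with unknown alignment.  */
void foo (char *dst, int i)
{
  memcpy (dst, &i, sizeof (i));
}

/* Hypothetical helper (not from the thread): copy VALUE into a buffer at a
   one-byte offset, so the destination cannot be assumed int-aligned, then
   read it back with memcpy and return what was stored.  */
int roundtrip_through_misaligned (int value)
{
  char buf[sizeof (int) + 1];
  int out = 0;

  foo (buf + 1, value);
  memcpy (&out, buf + 1, sizeof (out));
  return out;
}
```

Calling roundtrip_through_misaligned with any int should return the same value, regardless of whether the store is emitted as a single word store or as separate byte stores.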
I think there's something to do with cpu tuning here as well.
That being said, I do think this is a good idea.
I'll give it a test.
The patch passes bootstrap and testing ok and I've seen it
improve codegen in a few places in SPEC.
I've added a testcase all marked up.
Mike, I'll commit the attached patch in 24 hours unless somebody objects.
Was this ever applied?
Sorry, slipped through the cracks.
Committed with r226935.
Thanks,
Kyrill