On 12/06/13 07:17, Richard Biener wrote:
On Fri, Dec 6, 2013 at 2:52 PM, Konstantin Vladimirov
wrote:
Hi,
Richard, I tried to add LSHIFT_EXPR case to tree-scalar-evolution.c
and now it yields code like (x86 again):
.L5:
movzbl 4(%esi,%eax,4), %edx
movb %dl, 4(%ebx,%eax,4)
addl $1, %eax
cmpl %
On Fri, Dec 6, 2013 at 2:52 PM, Konstantin Vladimirov
wrote:
> Hi,
>
> Richard, I tried to add LSHIFT_EXPR case to tree-scalar-evolution.c
> and now it yields code like (x86 again):
>
> .L5:
> movzbl 4(%esi,%eax,4), %edx
> movb %dl, 4(%ebx,%eax,4)
> addl $1, %eax
> cmpl %ecx, %eax
> jne .L5
>
> So
Hi,
Richard, I tried to add LSHIFT_EXPR case to tree-scalar-evolution.c
and now it yields code like (x86 again):
.L5:
        movzbl  4(%esi,%eax,4), %edx
        movb    %dl, 4(%ebx,%eax,4)
        addl    $1, %eax
        cmpl    %ecx, %eax
        jne     .L5
So the excessive lea is gone. That is great, thank you so much. But I
wonder what else can I
On Fri, 6 Dec 2013, Konstantin Vladimirov wrote:
Consider code:
int foo(char *t, char *v, int w)
{
int i;
for (i = 1; i != w; ++i)
{
int x = i << 2;
A side note, but something too few people seem to be aware of: writing
i<<2 can pessimize code compared to i*4 (and it is never faster). That
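
For readers comparing the two spellings mentioned here, a minimal sketch follows. It assumes a non-negative `i` that is small enough not to overflow, in which case `i << 2` and `i * 4` denote the same value; the function names `copy_shift` and `copy_mul` are hypothetical and not from the thread. One difference that is certain at the language level: in C99 and later, left-shifting a negative signed value is undefined, whereas multiplying it by 4 is defined as long as the result fits.

```c
/* Hypothetical helpers, not from the thread: the same copy loop
 * written with a shift and with a multiply. */

/* Index computed with a shift, as in the original example. */
static void copy_shift(const char *t, char *v, int w)
{
    for (int i = 1; i != w; ++i) {
        int x = i << 2;          /* assumes i >= 0 and no overflow */
        v[x + 4] = t[x + 4];
    }
}

/* Index computed with a multiply; same result under the same assumptions. */
static void copy_mul(const char *t, char *v, int w)
{
    for (int i = 1; i != w; ++i) {
        int x = i * 4;
        v[x + 4] = t[x + 4];
    }
}
```

Whether a given compiler produces different code for the two forms depends on the version and target, so comparing the assembly output is the only reliable check.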
On Fri, Dec 6, 2013 at 2:25 AM, Richard Biener
wrote:
> On Fri, Dec 6, 2013 at 11:19 AM, Konstantin Vladimirov
> wrote:
>> Hi,
>>
>> nothing changes if everything is unsigned and we are guaranteed to not
>> raise UB on overflow:
>>
>> unsigned foo(unsigned char *t, unsigned char *v, unsigned w)
>
On Fri, Dec 6, 2013 at 11:19 AM, Konstantin Vladimirov
wrote:
> Hi,
>
> nothing changes if everything is unsigned and we are guaranteed to not
> raise UB on overflow:
>
> unsigned foo(unsigned char *t, unsigned char *v, unsigned w)
> {
> unsigned i;
>
> for (i = 1; i != w; ++i)
> {
> unsigned x =
Hi,
nothing changes if everything is unsigned and we are guaranteed to not
raise UB on overflow:
unsigned foo(unsigned char *t, unsigned char *v, unsigned w)
{
  unsigned i;

  for (i = 1; i != w; ++i)
  {
    unsigned x = i << 2;
    v[x + 4] = t[x + 4];
  }

  return 0;
}
yields:
.L5:
leal 0(,%eax,4), %edx
add
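
As a point of comparison, the kind of strength reduction discussed in this thread can be written by hand. This is only a sketch under stated assumptions: it is not what GCC emits, the names `foo_shift` and `foo_reduced` are hypothetical, and it assumes `w >= 1` and that `(i << 2) + 4` never wraps around in `unsigned` arithmetic. That wraparound is exactly what a compiler has to rule out before doing this rewrite itself.

```c
#include <stddef.h>

/* The unsigned loop from the message above, kept for comparison. */
static unsigned foo_shift(const unsigned char *t, unsigned char *v, unsigned w)
{
    for (unsigned i = 1; i != w; ++i) {
        unsigned x = i << 2;
        v[x + 4] = t[x + 4];
    }
    return 0;
}

/* Hand-reduced form: the shift and the "+ 4" are replaced by an
 * offset that starts at 8 and advances by 4 each iteration.
 * Assumes the offset never wraps (small w). */
static unsigned foo_reduced(const unsigned char *t, unsigned char *v, unsigned w)
{
    size_t off = 8;              /* (1 << 2) + 4 for the first iteration */
    for (unsigned i = 1; i != w; ++i) {
        v[off] = t[off];
        off += 4;                /* next value of (i << 2) + 4 */
    }
    return 0;
}
```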
On Fri, Dec 6, 2013 at 9:30 AM, Konstantin Vladimirov
wrote:
> Hi,
>
> Consider code:
>
> int foo(char *t, char *v, int w)
> {
> int i;
>
> for (i = 1; i != w; ++i)
> {
> int x = i << 2;
> v[x + 4] = t[x + 4];
> }
>
> return 0;
> }
>
> Compile it to x86 (I used both gcc 4.7.2 and gcc 4.8.1) with o
On Fri, Dec 06, 2013 at 12:30:54PM +0400, Konstantin Vladimirov wrote:
> Consider code:
>
> int foo(char *t, char *v, int w)
> {
> int i;
>
> for (i = 1; i != w; ++i)
> {
> int x = i << 2;
> v[x + 4] = t[x + 4];
> }
>
> return 0;
> }
This is either job of ivopts pass, dunno why it doesn't consi
Hi,
The x86 example was only for ease of reproduction. I am pretty
sure this is an architecture-independent issue. For example, on ARM:
.L2:
        mov     ip, r3, asl #2
        add     ip, ip, #4
        add     r3, r3, #1
        ldrb    r4, [r0, ip]    @ zero_extendqisi2
        cmp     r3, r2
        strb    r4, [r1, ip]
        bne     .L2
May be improved to:
.L2:
add r3, r3,
On 06/12/13 09:30, Konstantin Vladimirov wrote:
> Hi,
>
> Consider code:
>
> int foo(char *t, char *v, int w)
> {
> int i;
>
> for (i = 1; i != w; ++i)
> {
> int x = i << 2;
> v[x + 4] = t[x + 4];
> }
>
> return 0;
> }
>
> Compile it to x86 (I used both gcc 4.7.2 and gcc 4.8.1) with options:
>