Redundant limit check for switch

2005-08-22 Thread Piotr Fusik
The following code:

#include 

void Switch4(int x) {
switch (x & 7) {
case 0: printf("0\n"); break;
case 1: printf("1\n"); break;
case 2: printf("2\n"); break;
case 3: printf("3\n"); break;
case 4: printf("4\n"); break;
case 5: printf("5\n"); break;
case 6: printf("6\n"); break;
case 7: printf("7\n"); break;
}
}

void Switch256(int x) {
switch ((unsigned char) x) {
case 0: printf("0\n"); break;
case 1: printf("1\n"); break;
case 2: printf("2\n"); break;
// ... (all 256 cases)
}
}

when compiled with:
gcc -S -O3 fullswitch.c

produces the following:

...
.globl _Switch4
.def_Switch4;   .scl2;  .type   32; .endef
_Switch4:
pushl   %ebp
movl%esp, %ebp
movl8(%ebp), %eax
andl$7, %eax
cmpl$7, %eax
ja  L12
jmp *L11(,%eax,4)
...
.globl _Switch256
.def_Switch256; .scl2;  .type   32; .endef
_Switch256:
pushl   %ebp
movl%esp, %ebp
movzbl  8(%ebp), %eax
cmpl$255, %eax
ja  L273
jmp *L272(,%eax,4)


cmpl+ja are redundant in both cases.
Do you think it is possible for gcc to optimize them away?

An example of a real program with all 256 cases for unsigned char is Atari800. 
We use table-driven goto * there for better performance.


Re: Redundant limit check for switch

2005-08-27 Thread Piotr Fusik
I know nothing about GCC internals, but it appears that it knows
which bits are used in expressions:

unsigned char foo(int x) {
 return (x + 1) & 0x0f & 0x0c & 0x3ff;
}

 .file "test.c"
 .section .text
 .p2align 4,,15
.globl _foo
_foo:
 pushl %ebp
 movl %esp, %ebp
 movb 8(%ebp), %al
 popl %ebp
 incl %eax
 andl $12, %eax
 ret
 .ident "GCC: (GNU) 4.0.0"

Why can't this information be used to optimize comparisons?
Does it work only for consecutive ands?
Or is it just an early constant folding:
 return (x + 1) & (0x0f & 0x0c & 0x3ff & 0xff);
?

Piotr

- Original Message - 
From: "Paolo Bonzini" <[EMAIL PROTECTED]>
To: "GCC Development" ; <[EMAIL PROTECTED]>; "Diego Novillo" 
<[EMAIL PROTECTED]>; "Giovanni Bajo" <[EMAIL PROTECTED]>
Sent: Monday, August 22, 2005 2:24 PM
Subject: Re: Redundant limit check for switch


> >>void Switch4(int x) {
> >>switch (x & 7) {
> >>
> >>}
> >>}
>  >>
> >>.globl _Switch4
> >>.def _Switch4; .scl 2; .type 32; .endef
> >>_Switch4:
> >>pushl %ebp
> >>movl %esp, %ebp
> >>movl 8(%ebp), %eax
> >>andl $7, %eax
> >>cmpl $7, %eax
> >>ja L12
> >>jmp *L11(,%eax,4)
> > 
> > 
> >>cmpl+ja are redundant in both cases.
> >>Do you think it is possible for gcc to optimize them away?
> > 
> > I believe VRP could be taught about inferring ranges from bit_and_expr and
> > similar operations. Right?
> 
> Yes, but the range check is not emitted until trees are expanded to RTL. 
>   combine does a lot of simplifications, but unfortunately not this one. 
>   It is also quite hard to teach combine to *remove* jumps, though it 
> has some ability to turn conditional jumps into unconditional.
> 
> The attached patch would at least cause simplify-rtx.c to realize that 
> (gtu (reg:SI 61) (const_int 7)) is false, but not cause any code 
> generation improvement.
> 
> Paolo
>