Re: reverse conditionnal jump

2012-01-06 Thread Zdenek Dvorak
Hi,

> I'm still developing a new private target backend (gcc 4.5.2) and I noticed
> something strange in the assembler generated for a conditional jump.
> 
> 
> The C source being compiled is:
> 
> int funct (int c) {
> int a;
> a = 7;
> if (c < 0)
>   a = 4;
> return a;
> }
> 
> 
> The generated assembler is:
> 
> [...]
>   mov 7,a
>   cmp 0,c  #set the CC status
>   jmpif LT .L2 #conditional jump using CC status
> .L1
>   ret a #return to caller
> .L2
>   mov 4,a
>   jmp .L1 #unconditional jump

This could actually be intentional.  GCC tries to lay out the
code so that forward conditional jumps are usually not taken,
and it has a heuristic that numbers are usually non-negative.
I am just guessing, though; there is really no way to tell what
is happening without debugging the issue.  To check whether
this theory could be true, you can try changing the condition
to "if (c > 0)", in which case the code should be generated
the way you prefer.
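
Another way to probe the same theory, as a rough sketch only: GCC's
__builtin_expect lets the source state which outcome of a condition is
expected, which normally feeds the same branch prediction that drives
block layout.  The function name funct_hinted and the claim that the
negative case is the common one are assumptions made for illustration;
whether a private backend's final layout follows the hint would still
need to be checked in the generated assembler.

```c
/* Same logic as the quoted test case, returning int.
   The hint says "c < 0 is usually true", which is an assumption
   made for this sketch, not something known about the real input.  */
int
funct_hinted (int c)
{
  int a = 7;
  if (__builtin_expect (c < 0, 1))
    a = 4;
  return a;
}
```

The hint changes only the predicted probability of the branch, not the
result: the function still returns 4 for negative c and 7 otherwise.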

Zdenek


RE: Does neon_vset_lane expand wrong code when BYTES_BIG_ENDIAN?

2012-01-06 Thread Joseph S. Myers
On Thu, 5 Jan 2012, Xinyu Qi wrote:

> > No, see  where I
> > explain this at greater length.
> 
> I see.
> Would you mind taking a look at the test case neon-vset_lanes8.c under
> gcc.target/arm/?  If BYTES_BIG_ENDIAN, would x be {1,2,3,4,16,6,7,8}
> after vset_lane_s8 (16,x,3) is called?  If so, it is obviously not
> equal to y and the test fails.

See the thread starting at 
 for discussion 
of that test.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: I think that may be a bug...

2012-01-06 Thread Ian Lance Taylor
two2the8th_power_is...@yahoo.co.jp writes:

> I am making an OS for a PC/AT compatible machine with gcc on Ubuntu.
> The code (in a .c file) that switches tasks in the handler for the
> PIT (timer) interrupt is...
> int tr=(a selector(segment number));
> farjmp(0,tr);
>
> the function prototype is
> farjmp(int eip,int cs);
>
> farjmp() is defined in another source file whose extension is .S.
> the definition is... ('_' means tab.)
> farjmp:
> _jmp far [esp+4]
> _ret
>
> The compile option for both is 'gcc -pipe -ffreestanding -fno-common 
> -fno-builtin -fomit-frame-pointer -c -masm=intel'
>
> But that didn't work well.
> I compared the machine code produced by the above with that produced by nasm.
> The results are...
> gcc said  FF A4 24 0A FF 00 00
> nasm said FF 6C 24 04
>
> The machine code made by nasm worked as I want.
>
> Is that a bug, or my mistake?

This message should really be on the gcc-help mailing list, not the gcc
mailing list.  The gcc mailing list is for the development of gcc
itself.  Please take any followups to gcc-help.  Thanks.

In any case, as far as I can tell, this is not a gcc question at all.
You seem to be asking about how the instruction
jmp far [%esp+4]
should be assembled.  That is a question about the assembler, which is
not a part of gcc.

That said, I agree that the GNU assembler appears to misassemble this
instruction in Intel syntax.  I can get the right results in AT&T syntax
by using
ljmp *4(%esp)

Please file a bug report for the GNU assembler at
http://sourceware.org/bugzilla/ .  Thanks.
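
To see how the two byte sequences quoted above differ, it can help to
split the byte after the FF opcode (the ModRM byte) into its fields.
The following is a minimal sketch; the names struct modrm and
decode_modrm are made up for this illustration and are not part of any
assembler.

```c
/* The three fields of an x86 ModRM byte.  */
struct modrm
{
  unsigned mod; /* bits 7-6: addressing mode and displacement size */
  unsigned reg; /* bits 5-3: for opcode FF, selects the operation */
  unsigned rm;  /* bits 2-0: with mod != 3, the value 4 means a SIB byte follows */
};

/* Split one ModRM byte into its mod, reg and rm fields.  */
struct modrm
decode_modrm (unsigned char byte)
{
  struct modrm m;
  m.mod = (byte >> 6) & 3;
  m.reg = (byte >> 3) & 7;
  m.rm = byte & 7;
  return m;
}
```

For the nasm bytes, decode_modrm (0x6C) gives mod 1, reg 5, rm 4: FF /5
is the far indirect jump, with a SIB byte (24, base esp) and an 8-bit
displacement of 04.  For the bytes from gcc, decode_modrm (0xA4) gives
mod 2, reg 4, rm 4: FF /4 is the near indirect jump, with the same SIB
byte and a 32-bit displacement of 0A FF 00 00.  So the two outputs
differ in both the kind of jump and the displacement.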

Ian


[pph] Merge from trunk rev 182919

2012-01-06 Thread Diego Novillo
No new surprises with this merge.

Tested on x86_64.


Diego.


IRA issue with shuffle copies...

2012-01-06 Thread Peter Bergner
Hi Vlad,

While debugging a slightly modified version of the test case in PR16458:

  int
  foo (unsigned int a, unsigned int b)
  {
    if (a == b) return 1;
    if (a > b)  return 2;
    if (a < b)  return 3;
    if (a != b) return 4;
    return 0;
  }

I noticed a couple of ugly code gen warts which I tracked back to IRA.
Namely, compiling the above with -O2 -m32 on powerpc64-linux, I'm seeing:

li 9,3
mr 3,9
blr
and:
li 9,1
mr 3,9
blr

If we look at the rtl just before IRA, we have the following:

BB2:
  (set (reg/v:SI 122 [ a ]) (reg:SI 3 3 [ a ]))
    REG_DEAD (reg:SI 3 3 [ a ])
  (set (reg/v:SI 123 [ b ]) (reg:SI 4 4 [ b ]))
    REG_DEAD (reg:SI 4 4 [ b ])
  (set (reg:CC 124) (compare:CC (reg/v:SI 122 [ a ]) (reg/v:SI 123 [ b ])))
  (if_then_else (eq (reg:CC 124) (const_int 0 [0]))
    goto BB6;

BB3:
  (set (reg:CCUNS 125) (compare:CCUNS (reg/v:SI 122 [ a ]) (reg/v:SI 123 [ b ])))
    REG_DEAD (reg/v:SI 123 [ b ])
    REG_DEAD (reg/v:SI 122 [ a ])
  (set (reg:SI 120 [ D.1379 ]) (const_int 2 [0x2]))
  (if_then_else (gtu (reg:CC 124) (const_int 0 [0]))
    goto BB8;

BB4:
  (if_then_else (geu (reg:CC 124) (const_int 0 [0]))
    goto BB7;

BB5:
  (set (reg:SI 120 [ D.1379 ]) (const_int 3 [0x3]))
  goto BB8;

BB6:
  (set (reg:SI 120 [ D.1379 ]) (const_int 1 [0x1]))
  goto BB8;

BB7:
  (set (reg:SI 120 [ D.1379 ]) (const_int 4 [0x4]))

BB8:
  (set (reg/i:SI 3 3) (reg:SI 120 [ D.1379 ])) REG_DEAD (reg:SI 120 [ D.1379 ])
  (use (reg/i:SI 3 3))
  return

When we start coloring the allocnos, we get the following:

Pass 1 for finding pseudo/allocno costs

r125: preferred CR_REGS, ...
r124: preferred CR_REGS, ...
r123: preferred GENERAL_REGS, ...
r122: preferred GENERAL_REGS, ...
r120: preferred GENERAL_REGS, ...

...

  Popping a3(r122,l0)  -- assign reg 3
  Popping a2(r123,l0)  -- assign reg 4
  Popping a0(r120,l0)  -- assign reg 9
  Popping a4(r124,l0)  -- assign reg 75
  Popping a1(r125,l0)  -- assign reg 3
Assigning 75 to a1r125

This looks a little startling, since we initially assign r125 to r3,
even though its preferred class is CR_REGS, before improve_allocation()
saves us and reassigns r125 to r75 (a real CR reg).  The reason r125
ends up initially in r3 is that we detect a "shuffle" copy during the
set of r125, because r122 (and r123) dies in the insn in which r125 is
defined.  This ends up preferencing the costs for r125, such that it
wants r3.  This in turn, via ALLOCNO_UPDATED_HARD_REG_COSTS(), increases
the cost of assigning r120 to r3, such that r120 ends up with r9 instead,
when we really want it to get r3.

Your comments about the "shuffle" copies seem to imply that they're being
used to try and help insns with two-operand constraints, but in the case
above, they're over-preferencing things.  As an experiment, I disabled all
shuffle copies, and the code gen for the test case above is much improved.

Do we really need or want to create shuffle copies for insns that do not
have a two-operand constraint?  If not, do you know how we can test for that?
If you think we do need that for insns without a two-operand constraint, can
we at least disable creating shuffle copies for allocnos that have different
preferred classes, since they're probably not going to be assigned the
same hard reg?  Something like:

Index: ira-conflicts.c
===================================================================
--- ira-conflicts.c (revision 182936)
+++ ira-conflicts.c (working copy)
@@ -397,6 +397,11 @@ process_regs_for_copy (rtx reg1, rtx reg
   enum machine_mode mode;
   ira_copy_t cp;
 
+  if (!constraint_p
+      && reg_preferred_class (REGNO (reg1))
+         != reg_preferred_class (REGNO (reg2)))
+    return false;
+
   gcc_assert (REG_SUBREG_P (reg1) && REG_SUBREG_P (reg2));
   only_regs_p = REG_P (reg1) && REG_P (reg2);
   reg1 = go_through_subreg (reg1, &offset1);
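
To make the intent of that early-out concrete, here is a toy model
outside of IRA.  Everything in it (toy_reg_class, toy_keep_copy) is a
hypothetical stand-in made up for illustration, not IRA code; it only
mirrors the condition in the patch above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two register classes in the dump above.  */
enum toy_reg_class { TOY_GENERAL_REGS, TOY_CR_REGS };

/* Toy version of the proposed test.  A copy that comes from an operand
   constraint (constraint_p true) is always kept.  A shuffle copy is kept
   only when both pseudos prefer the same register class.  */
bool
toy_keep_copy (bool constraint_p, enum toy_reg_class c1,
               enum toy_reg_class c2)
{
  if (!constraint_p && c1 != c2)
    return false;
  return true;
}
```

In the test case, r125 prefers CR_REGS while r122 and r123 prefer
GENERAL_REGS, so toy_keep_copy (false, TOY_CR_REGS, TOY_GENERAL_REGS)
rejects the shuffle copy, which is the case the patch is meant to
filter out.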


Your thoughts?


Peter