RE: postreload problem using reload_inm and SECONDARY_RELOAD macros

2011-09-15 Thread BELBACHIR Selim
Further hints:

The immediate that gcc wants to move into R_REGS is (const_int 8), as shown
in the error message below:

arithmetic.c:197: error: insn does not satisfy its constraints:
(insn 1505 903 1506 0 (set (reg:SI 25 $R9)
        (const_int 8 [0x8])) 0 {movsi_internal} (nil)
    (nil))
arithmetic.c:197: internal compiler error: in reload_cse_simplify_operands, at postreload.c:391


So I added some debug tracing to the function 'my_secondary_input_reload_class'
to understand why (const_int 8) is not reloaded into C_REGS.

'my_secondary_input_reload_class' is called several times, but never with the
rtl expression (const_int 8).

I also checked the 'my_preferred_reload_class' function, and it never sees the
expression (const_int 8) either.

Selim



Are there plans to maintain the CFG throughout the compilation?

2011-09-15 Thread Peter Garbett
I would like to analyse code using the GCC CFG. However, some steps
invalidate it, notably delay slot scheduling.

Are there plans to move toward "how it ought to work" as outlined in:

http://gcc.gnu.org/wiki/basic_block_graph



Peter.


CCmode size

2011-09-15 Thread Paulo J. Matos

Hi,

genmodes.c has the following comment:


  /* Again, nothing more need be said.  For historical reasons,
     the size of a CC mode is four units.  */
  validate_mode (m, UNSET, UNSET, UNSET, UNSET, UNSET);

  m->bytesize = 4;


Now, this is probably ok for _most_ archs but for my arch where a word 
== byte == 16 bits, this causes a lot of pain. Is there a way (macro?) 
to change this to m->bytesize = 1; in the backend without hardcoding it 
into genmodes.c?


Cheers,

--
PMatos



RE: CCmode size

2011-09-15 Thread Paul_Koning
>genmodes.c has the following comment:
>
>
>   /* Again, nothing more need be said.  For historical reasons,
>      the size of a CC mode is four units.  */
>   validate_mode (m, UNSET, UNSET, UNSET, UNSET, UNSET);
>
>   m->bytesize = 4;
>
>
>Now, this is probably ok for _most_ archs but for my arch where a word == byte 
>== 16 bits, this causes a lot of pain. Is there a way (macro?) to change this 
>to m->bytesize = 1; in the backend without hardcoding it into genmodes.c?

It would seem that making it equal to word size (whatever that is on the 
platform) or size of the int type would be a way to make this better.  Would 
that have any bad consequences for other platforms?

paul


RE: CCmode size

2011-09-15 Thread Joern Rennecke

Quoting paul_kon...@dell.com:

> It would seem that making it equal to word size (whatever that is on
> the platform) or size of the int type would be a way to make this
> better.  Would that have any bad consequences for other platforms?


For MXP (16-bit word addressed, with 128 bit vector registers) I had this:

2008-04-25  Jörn Rennecke

        * genmodes.c (vector_class):
        (complete_mode): Allow bytesize to have been set for MODE_CC.


 case MODE_CC:
   /* Again, nothing more need be said.  For historical reasons,
-     the size of a CC mode is four units.  */
-  validate_mode (m, UNSET, UNSET, UNSET, UNSET, UNSET);
+     the size of a CC mode defaults to four units.  */
+  if (m->bytesize != blank_mode.bytesize)
+    validate_mode (m, UNSET, SET, UNSET, UNSET, UNSET);
+  else
+    {
+      validate_mode (m, UNSET, UNSET, UNSET, UNSET, UNSET);
+      m->bytesize = 4;
+    }

-  m->bytesize = 4;
   m->ncomponents = 1;
   m->component = 0;
   break;

Then there was also:

ChangeLog.ARC:  (SIZED_CC_MODE): New macro.
genmodes.c:#define SIZED_CC_MODE(N, Y) (CC_MODE (N)->bytesize = (Y))

Of course, MXP needed a few more patches to support MODE_VECTOR_PARTIAL_INT
and MODE_VECTOR_CC modes.

In mxp-modes.def, I had then:

VECTOR_MODES (INT, 4); /* V4QI V2HI */
VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
PARTIAL_INT_MODE (SI);  /* Needed to make V2PSI / V4PSI.  */
VECTOR_MODE (PARTIAL_INT, PSI, 2); /* V2PSI, flags for DImode arithmetic. */
VECTOR_MODE (PARTIAL_INT, PSI, 4); /* V4PSI, flags for V2DImode arithmetic.  */

VECTOR_MODES (FLOAT, 8); /* V2SF */
VECTOR_MODES (FLOAT, 16); /* V4SF V2DF */
#define CC_MODES(N) SIZED_CC_MODE (N, 2); \
  VECTOR_MODE (CC, N, 2); VECTOR_MODE (CC, N, 4); VECTOR_MODE (CC, N, 8)
CC_MODES (CCI); /* Ordinary integer flags.  */
CC_MODES (CCZN); /* Only zero / negative flag relevant.  */
CC_MODES (CCZ); /* Only zero flag relevant.  */
VECTOR_MODE (CC, CC, 2); /* V2CCmode - flag clobber for DI arithmetic.  */
VECTOR_MODE (CC, CC, 4); /* V4CCmode - flag clobber for V2DI arithmetic.  */
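
The defaulting rule introduced by the genmodes.c patch above can be shown
in isolation.  The following is only a minimal sketch, not genmodes.c code:
`struct mode_sketch`, `BYTESIZE_UNSET` and `complete_cc_mode_sketch` are
hypothetical names, and the real code compares against blank_mode.bytesize
and calls validate_mode.  It illustrates that a CC mode keeps a size set by
the back end and only falls back to four units when nothing was set.

```c
#include <assert.h>

/* Hypothetical stand-in for the "blank" (unset) bytesize value;
   the real genmodes.c compares against blank_mode.bytesize.  */
#define BYTESIZE_UNSET (-1)

/* Hypothetical, much-reduced mode record, for illustration only.  */
struct mode_sketch
{
  const char *name;
  int bytesize;      /* BYTESIZE_UNSET until a size is chosen.  */
  int ncomponents;
};

/* Keep a size that the back end set explicitly; otherwise fall
   back to the historical default of four units.  */
static void
complete_cc_mode_sketch (struct mode_sketch *m)
{
  assert (m != 0);
  if (m->bytesize == BYTESIZE_UNSET)
    m->bytesize = 4;
  m->ncomponents = 1;
}
```

With this rule, a mode declared through something like SIZED_CC_MODE (CCI, 2)
would end up with a size of 2, while a plain CC_MODE would still get 4.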


IRA misses register range overlap

2011-09-15 Thread Peter Bigot
In the msp430 back end, hard registers 4 through 15 are HImode, with
adjacent register sequences used for SImode and DImode.  In preparation for
a library call, I'm emitting RTL that assigns values directly to reg:SI 4.

Despite that, in gcc 4.5.x IRA chooses reg:HI 4 as the destination
for a pseudo-register for a preceding assignment, and does nothing to
preserve the value over the span where the register is part of an SI value.
The subsequence:

  (insn 2 4 3 2 (set (reg/v:HI 38 [ x ])
  (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
   (expr_list:REG_DEAD (reg:HI 15 r15 [ x ])
  (nil)))

  (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

  (note 6 3 10 2 NOTE_INSN_DELETED)

  (insn 10 6 11 2 (set (reg:SI 8 r8)
  (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2] <var_decl 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
   (nil))

  (insn 11 10 12 2 (set (reg:SI 4 r4)
  (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
   (nil))

with:

  insn=2, live_throughout: 1, dead_or_set: 15, 38
  insn=10, live_throughout: 1, 38, dead_or_set: 8, 9
  insn=11, live_throughout: 1, 8, 9, 38, dead_or_set: 4, 5
  insn=12, live_throughout: 1, 38, dead_or_set: 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15

becomes:

  (insn 2 4 3 2 (set (reg/v:HI 4 r4 [orig:38 x ] [38])
  (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
   (nil))

  (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

  (note 6 3 10 2 NOTE_INSN_DELETED)

  (insn 10 6 11 2 (set (reg:SI 8 r8)
  (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2] <var_decl 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
   (nil))

  (insn 11 10 12 2 (set (reg:SI 4 r4)
  (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
   (nil))

and the subsequent reference to reg:HI 4 (formerly reg/v:HI 38) has value
33614 instead of the user's parameter.
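
To make the expected conflict concrete, here is a minimal sketch of the
overlap test that the allocation above appears to violate.  It is not GCC
code: `struct hard_reg_range` and `hard_reg_ranges_overlap` are made-up
names, and it assumes (as the dead_or_set lists above suggest) that an
SImode value occupies two consecutive HImode hard registers.

```c
#include <assert.h>
#include <stdbool.h>

/* A value living in the consecutive hard registers
   [regno, regno + nregs).  Hypothetical type, for illustration.  */
struct hard_reg_range
{
  unsigned int regno;
  unsigned int nregs;
};

/* Return true if the two register ranges share at least one
   hard register.  */
static bool
hard_reg_ranges_overlap (struct hard_reg_range a, struct hard_reg_range b)
{
  assert (a.nregs > 0 && b.nregs > 0);
  return a.regno < b.regno + b.nregs && b.regno < a.regno + a.nregs;
}
```

Under that assumption, the HImode pseudo placed in r4 is the range {4, 1}
and the SImode store of insn 11 is the range {4, 2}, so the two overlap and
pseudo 38 should not have been given r4 (or r5) while it is live across
insn 11.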

Could somebody suggest where I should look to understand why this is
happening and how it should be fixed?

Thanks.

Peter


Re: IRA misses register range overlap

2011-09-15 Thread Vladimir Makarov

On 09/15/2011 11:16 AM, Peter Bigot wrote:

> Could somebody suggest where I should look to understand why this is
> happening and how it should be fixed?

The best way is to file a bug at http://gcc.gnu.org/bugzilla/.  You
should submit the test (the smaller the test, the better) and how to
reproduce it: how to build gcc (configure options) and how to call the
built gcc to reproduce the results.

I think you could look at the IRA dump file and check that the allocno for
p38 conflicts with hard regs 4 *and* 5.  If it does not, the problem is in
the conflict calculation.  Otherwise, it might be in IRA hard register
assignment or in reload (the worst case).

But with only the info you provided, it is very hard to say what is wrong.



Re: [google] Merge trunk into google/integration

2011-09-15 Thread Ollie Wild
LGTM

Ollie

On Wed, Sep 14, 2011 at 3:29 PM, Diego Novillo  wrote:
>
> This merge brings google/integration up to rev 178783.  I also
> merged rev 178833 to get the testsuite validation script I
> committed to trunk yesterday.
>
> Simon, Ollie, I expect our internal builder to fail until I
> incorporate validate_failures.py into it.  It's a catch-22, but
> it is easier to keep the local changes to the builder than the
> whole merge.
>
> I have reverted all the xfail/skip markers we used to have.  I
> moved the ones that still fail to the new xfail manifest file in
> contrib/testsuite-management (we'll likely need manifests for
> other platforms as well).
>
> Tested on x86_64.  Committed to google/integration.
>
>
> 2011-09-14   Diego Novillo  
>
>        Mainline merge rev 178783.
>        Cherry pick mainline rev 178833.
>
> 2011-09-14   Diego Novillo  
>
> contrib/ChangeLog.google-integration
>
>        * testsuite-management/x86_64-unknown-linux-gnu.xfail: New.
>
> gcc/testsuite/ChangeLog.google-integration
>
>        * g++.dg/tree-prof/partition2.C: Revert to mainline variant.
>        * g++.dg/tree-ssa/pr41186.C: Likewise.
>        * gcc.dg/cproj-fails-with-broken-glibc.c: Likewise.
>        * gcc.dg/guality/sra-1.c: Likewise.
>        * gcc.dg/guality/vla-1.c: Likewise.
>        * gcc.dg/guality/vla-2.c: Likewise.
>        * gcc.dg/inline_3.c: Likewise.
>        * gcc.dg/inline_4.c: Likewise.
>        * gcc.dg/tree-ssa/vrp47.c: Likewise.
>        * gcc.dg/uninit-B.c: Likewise.
>        * gcc.dg/uninit-pr19430.c: Likewise.
>        * gcc.dg/unroll_2.c: Likewise.
>        * gcc.dg/unroll_3.c: Likewise.
>        * gcc.dg/unroll_4.c: Likewise.
>        * gcc.target/i386/pr27827.c: Likewise.
>        * gcc.target/i386/sse4_1-blendps-2.c: Likewise.
>        * gcc.target/i386/sse4_1-blendps.c: Likewise.
>
> libmudflap/ChangeLog.google-integration
>
>        * testsuite/libmudflap.c++/pass55-frag.cxx: Revert to
>        mainline variant.
>
> libstdc++-v3/ChangeLog.google-integration:
>
>        * testsuite/23_containers/vector/requirements/dr438/assign_neg.cc:
>        Revert to mainline variant.
>        * testsuite/23_containers/vector/requirements/dr438/constructor_1_neg.cc:
>        Likewise.
>        * testsuite/23_containers/vector/requirements/dr438/constructor_2_neg.cc:
>        Likewise.
>        * testsuite/23_containers/vector/requirements/dr438/insert_neg.cc:
>        Likewise.
>
> diff --git a/svnclient/contrib/testsuite-management/x86_64-unknown-linux-gnu.xfail b/svnclient/contrib/testsuite-management/x86_64-unknown-linux-gnu.xfail
> new file mode 100644
> index 000..b3e86a5
> --- /dev/null
> +++ b/svnclient/contrib/testsuite-management/x86_64-unknown-linux-gnu.xfail
> @@ -0,0 +1,59 @@
> +# These tests fail in trunk in all configurations.
> +FAIL: 23_containers/vector/requirements/dr438/assign_neg.cc (test for errors, line 1222)
> +FAIL: 23_containers/vector/requirements/dr438/assign_neg.cc (test for excess errors)
> +FAIL: 23_containers/vector/requirements/dr438/constructor_1_neg.cc (test for excess errors)
> +FAIL: 23_containers/vector/requirements/dr438/constructor_1_neg.cc (test for errors, line 1152)
> +FAIL: 23_containers/vector/requirements/dr438/constructor_2_neg.cc (test for excess errors)
> +FAIL: 23_containers/vector/requirements/dr438/constructor_2_neg.cc (test for errors, line 1152)
> +FAIL: 23_containers/vector/requirements/dr438/insert_neg.cc (test for excess errors)
> +FAIL: 23_containers/vector/requirements/dr438/insert_neg.cc (test for errors, line 1263)
> +FAIL: gcc.dg/cproj-fails-with-broken-glibc.c execution test
> +XPASS: gcc.dg/inline_3.c (test for excess errors)
> +XPASS: gcc.dg/inline_4.c (test for excess errors)
> +FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times dom1 "x[^ ]* & y" 1
> +XPASS: gcc.dg/uninit-B.c uninit i warning (test for warnings, line 12)
> +XPASS: gcc.dg/uninit-pr19430.c uninitialized (test for warnings, line 41)
> +XPASS: gcc.dg/uninit-pr19430.c (test for warnings, line 32)
> +XPASS: gcc.dg/unroll_2.c (test for excess errors)
> +XPASS: gcc.dg/unroll_3.c (test for excess errors)
> +XPASS: gcc.dg/unroll_4.c (test for excess errors)
> +FAIL: libmudflap.c++/pass55-frag.cxx ( -O) execution test
> +
> +# The following tests are failing with gold.  The LTO plugin is not resolving
> +# names properly.  Only builds configured to use gold will show these.
> +UNRESOLVED: gcc.c-torture/execute/20010209-1.c execution,  -O2 -flto -flto-partition=none
> +UNRESOLVED: gcc.c-torture/execute/20010209-1.c execution,  -O2 -flto
> +FAIL: gcc.c-torture/execute/20010209-1.c compilation,  -O2 -flto  (internal compiler error)
> +FAIL: gcc.c-torture/execute/20010209-1.c compilation,  -O2 -flto -flto-partition=none  (internal compiler error)
> +
> +# These tests fail in trunk when compiled with -m32.
> +FAIL: boehm-gc.c/thread_leak_test.c -O2 (test for excess errors)
> +FAIL: gcc.target/i386/pr27827.c scan-assembler fmul[ \t]*%st
> +FAIL: gfort

Re: should sync builtins be full optimization barriers?

2011-09-15 Thread Richard Henderson
> > I'd say they should be optimization barriers too (and at the tree level
> > they I think work that way, being represented as function calls), so if
> > they don't act as memory barriers in RTL, the *.md patterns should be
> > fixed.  The only exception should be IMHO the __SYNC_MEM_RELAXED
> > variants - if the CPU can reorder memory accesses across them at will,
> > why shouldn't the compiler be able to do the same as well?
> 
> Agreed, so we have a bug in all released versions of GCC. :(

I wouldn't go that far.  They *used* to be compiler barriers,
but clearly something broke at some point without anyone noticing.
We don't know how many versions are affected until we debug it.
For all we know it broke in 4.5 and 4.4 is fine.
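
For reference, the kind of code that depends on these builtins acting as
compiler barriers looks like the following minimal sketch of a test-and-set
lock.  The names `sketch_lock_t`, `sketch_trylock`, `sketch_unlock` and
`sketch_locked_increment` are hypothetical and not taken from libdispatch
or from any bug report; only the `__sync_*` calls are the documented GCC
builtins.

```c
#include <assert.h>

/* Hypothetical lock word: 0 = free, 1 = held.  */
typedef int sketch_lock_t;

/* Try to take the lock once; returns 1 on success, 0 if it was
   already held.  __sync_lock_test_and_set is documented as an
   acquire barrier.  */
static int
sketch_trylock (sketch_lock_t *lock)
{
  assert (lock != 0);
  return __sync_lock_test_and_set (lock, 1) == 0;
}

/* Release the lock.  __sync_lock_release is documented as a
   release barrier.  */
static void
sketch_unlock (sketch_lock_t *lock)
{
  assert (lock != 0);
  __sync_lock_release (lock);
}

/* Bump a counter while holding the lock.  The access to *counter
   must not be moved outside the lock/unlock pair by the compiler,
   which is the property under discussion in this thread.  */
static int
sketch_locked_increment (sketch_lock_t *lock, int *counter)
{
  if (!sketch_trylock (lock))
    return 0;
  ++*counter;
  sketch_unlock (lock);
  return 1;
}
```

If the builtins stop being optimization barriers at the RTL level, the
increment could in principle be scheduled before the lock is taken or after
it is released, even though each builtin itself still expands correctly.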

There's no reference to a GCC bug report about this in the thread.
Did the folks over at the libdispatch project never think to file one?
Or does a bug report exist and my search skills are weak?


r~


Re: should sync builtins be full optimization barriers?

2011-09-15 Thread Paolo Bonzini

On 09/15/2011 06:19 PM, Richard Henderson wrote:
> I wouldn't go that far.  They *used* to be compiler barriers,
> but clearly something broke at some point without anyone noticing.
> We don't know how many versions are affected until we debug it.
> For all we know it broke in 4.5 and 4.4 is fine.

4.4 is not necessarily fine; it may also be that an unrelated 4.5 change
exposed a latent bug.

But indeed Richard Sandiford mentioned offlist that perhaps the
ALIAS_SET_MEMORY_BARRIER machinery broke.  Fixing the bug in 4.5/4.6/4.7
will definitely shed more light.

> There's no reference to a GCC bug report about this in the thread.
> Did the folks over at the libdispatch project never think to file one?

I asked them to attach a preprocessed testcase somewhere, but they
haven't done so yet. :(


Paolo


[google] Merged gcc-4_6-branch -> google/gcc-4_6

2011-09-15 Thread Diego Novillo
This merge adds the testsuite validation script to
google/gcc-4_6.  Merged up to rev 178854.

Validated on x86_64.


Diego.


Re: IRA misses register range overlap

2011-09-15 Thread Peter Bigot
On Thu, Sep 15, 2011 at 10:34 AM, Vladimir Makarov  wrote:
> The best way is to file a bug to http://gcc.gnu.org/bugzilla/.  You should
> submit the test (the smaller the test, the better) and how to reproduce it:
> how to build gcc (configure options) and how to call the built gcc to
> reproduce results.

Unfortunately, the former msp430 maintainers never pushed the back-end
upstream, so filing a bug on a target that isn't part of gcc is
unlikely to get much attention.  It's also pretty specific to the
machine description, so I doubt it could be reproduced on another
target.

I was hoping for more of a "yes, that happens if you don't [missed
back-end requirement here]", or even a "no, that shouldn't be
happening".

It looks almost like the fact that I'm generating RTL that references
the hard registers directly is ignored by IRA for conflict resolution,
which seems to only occur among the registers that it's responsible
for assigning.  I'll look again through the docs to see if there's
some hints that I'm missing a step.

If anybody else has further suggestions or insights I'd appreciate
them.  Thanks.

Peter

> I think you could look at ira dump file and check that allocno for p38
> conflicting with hard reg 4 *and* 5.  If it is not, the problem is in
> conflict calculation.  Otherwise, it might be IRA hard register assignment
> or in reload (the worst case).
>
> But having only info you provided it is very hard to say what is wrong.
>
>


Re: Are there plans to maintain the CFG throughout the compilation?

2011-09-15 Thread Peter Garbett
Thanks Ian.

Any idea what the size of the problem would be, perhaps starting with
backends that don't choose to vandalise things themselves?


Re: IRA misses register range overlap

2011-09-15 Thread Vladimir Makarov

On 09/15/2011 03:06 PM, Peter Bigot wrote:

> Unfortunately, the former msp430 maintainers never pushed the back-end
> upstream, so filing a bug on a target that isn't part of gcc is
> unlikely to get much attention.  It's also pretty specific to the
> machine description, so I doubt it could be reproduced on another
> target.
>
> I was hoping for more of a "yes, that happens if you don't [missed
> back-end requirement here]", or even a "no, that shouldn't be
> happening".

It should not be happening.  It is a bug.  It should be fixed in RA (IRA
or reload).  IRA/reload works for many targets where the same situation
occurs.  So it is hard to say what is wrong without more info.

That said, RA is directed by many machine-dependent macros, and one macro
might return a wrong value (e.g. the number of registers needed to hold a
value of a mode).  But that is less probable.
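
As an illustration of the kind of macro meant here: a target's
HARD_REGNO_NREGS definition commonly rounds the size of the mode up to a
whole number of registers.  The sketch below is not taken from the msp430
port; `regs_needed_sketch` is a hypothetical helper, and the 2-byte register
width is an assumption based on the registers being described as HImode.

```c
#include <assert.h>

/* Number of consecutive hard registers needed to hold a value of
   MODE_SIZE bytes when each register holds UNITS_PER_WORD bytes.
   Hypothetical helper mirroring the usual round-up formula.  */
static unsigned int
regs_needed_sketch (unsigned int mode_size, unsigned int units_per_word)
{
  assert (units_per_word > 0);
  return (mode_size + units_per_word - 1) / units_per_word;
}
```

With 2-byte registers this gives 1 register for HImode (2 bytes), 2 for
SImode (4 bytes) and 4 for DImode (8 bytes).  If the port's macro returned 1
for SImode, the allocator would have no reason to treat r5 as part of the
value stored by insn 11.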




Re: IRA misses register range overlap

2011-09-15 Thread Peter Bigot
On Thu, Sep 15, 2011 at 4:09 PM, Vladimir Makarov  wrote:
> It should not be happening.  It is a bug.  It should be fixed in RA (IRA or
> reload).  IRA/reload works for many targets where the same situation occurs.
>  So it is hard to say what is wrong without more info.

Based on what you've said I've provided source and the before/after
IRA dump files in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50427.
I'll continue to dig into this; suggestions welcome.

Peter

> Although RA is directed by many machine-dependent macros and one macro might
> return a wrong value (e.g.  number registers needed to hold value of a
> mode).  But it is less probable.


gcc-4.5-20110915 is now available

2011-09-15 Thread gccadmin
Snapshot gcc-4.5-20110915 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110915/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch 
revision 178897

You'll find:

 gcc-4.5-20110915.tar.bz2 Complete GCC

  MD5=92277bf6896948d5ede50ad1210aa9c8
  SHA1=baf856ececfd00d192d330f1d1f56687f4a928cf

Diffs from 4.5-20110908 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


RE: A case that PRE optimization hurts performance

2011-09-15 Thread Jiangning Liu
Hi Richard,

I slightly changed the test case, as shown below:

int f(char *t) {
    int s=0;

    while (*t && s != 1) {
        switch (s) {
        case 0:   /* path 1 */
            s = 2;
            break;
        case 2:   /* path 2 */
            s = 3; /* changed */
            break;
        default:  /* path 3 */
            if (*t == '-')
                s = 2;
            break;
        }
        t++;
    }

    return s;
}

"-O2" is still worse than "-O2 -fno-tree-pre". 

"-O2 -fno-tree-pre" result is 

f:
        pushl   %ebp
        xorl    %eax, %eax
        movl    %esp, %ebp
        movl    8(%ebp), %edx
        movzbl  (%edx), %ecx
        jmp     .L14
        .p2align 4,,7
        .p2align 3
.L5:
        movl    $2, %eax
.L7:
        addl    $1, %edx
        cmpl    $1, %eax
        movzbl  (%edx), %ecx
        je      .L3
.L14:
        testb   %cl, %cl
        je      .L3
        testl   %eax, %eax
        je      .L5
        cmpl    $2, %eax
        .p2align 4,,5
        je      .L17
        cmpb    $45, %cl
        .p2align 4,,5
        je      .L5
        addl    $1, %edx
        cmpl    $1, %eax
        movzbl  (%edx), %ecx
        jne     .L14
        .p2align 4,,7
        .p2align 3
.L3:
        popl    %ebp
        .p2align 4,,2
        ret
        .p2align 4,,7
        .p2align 3
.L17:
        movb    $3, %al
        .p2align 4,,3
        jmp     .L7

While "-O2" result is 

f:
        pushl   %ebp
        xorl    %eax, %eax
        movl    %esp, %ebp
        movl    8(%ebp), %edx
        pushl   %ebx
        movzbl  (%edx), %ecx
        jmp     .L14
        .p2align 4,,7
        .p2align 3
.L5:
        movl    $1, %ebx
        movl    $2, %eax
.L7:
        addl    $1, %edx
        testb   %bl, %bl
        movzbl  (%edx), %ecx
        je      .L3
.L14:
        testb   %cl, %cl
        je      .L3
        testl   %eax, %eax
        je      .L5
        cmpl    $2, %eax
        .p2align 4,,5
        je      .L16
        cmpb    $45, %cl
        .p2align 4,,5
        je      .L5
        cmpl    $1, %eax
        setne   %bl
        addl    $1, %edx
        testb   %bl, %bl
        movzbl  (%edx), %ecx
        jne     .L14
        .p2align 4,,7
        .p2align 3
.L3:
        popl    %ebx
        popl    %ebp
        ret
        .p2align 4,,7
        .p2align 3
.L16:
        movl    $1, %ebx
        movb    $3, %al
        jmp     .L7

You may notice that register ebx is introduced and that some extra
instructions around it are generated as well, i.e.

        setne   %bl
        testb   %bl, %bl

I agree with you that, in theory, PRE does the right thing to minimize the
computation cost at the gimple level. However, the cost of converting the
comparison result to a bool value is not taken into account, so the
generated binary code actually gets worse. For this case, as summarized
below, "With PRE" needs more instructions than "Without PRE" to implement
the same functionality on all three paths:

* Without PRE,

Path1:
        movl    $2, %eax
        cmpl    $1, %eax
        je      .L3

Path2:
        movb    $3, %al
        cmpl    $1, %eax
        je      .L3

Path3:
        cmpl    $1, %eax
        jne     .L14

* With PRE,

Path1:
        movl    $1, %ebx
        movl    $2, %eax
        testb   %bl, %bl
        je      .L3

Path2:
        movl    $1, %ebx
        movb    $3, %al
        testb   %bl, %bl
        je      .L3

Path3:
        cmpl    $1, %eax
        setne   %bl
        testb   %bl, %bl
        jne     .L14
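
At the source level, the effect on the three paths is roughly equivalent to
the sketch below.  `f_with_flag` is a hypothetical hand-written rendering,
not compiler output, and `keep_going` is a made-up name for the temporary
that ends up in %bl: the loop test (s != 1) is carried in a separate
variable that is set to a constant on paths 1 and 2 and recomputed on
path 3.  `f` is the test case from this message, with only a const
qualifier added so that both versions can be called on string literals.

```c
#include <assert.h>

/* The modified test case (const added to the parameter).  */
static int
f (const char *t)
{
  int s = 0;

  assert (t != 0);
  while (*t && s != 1)
    {
      switch (s)
        {
        case 0:                   /* path 1 */
          s = 2;
          break;
        case 2:                   /* path 2 */
          s = 3;
          break;
        default:                  /* path 3 */
          if (*t == '-')
            s = 2;
          break;
        }
      t++;
    }
  return s;
}

/* Same function with the loop test kept in an explicit flag.  */
static int
f_with_flag (const char *t)
{
  int s = 0;
  int keep_going = 1;             /* Value of (s != 1) for s == 0.  */

  assert (t != 0);
  while (*t && keep_going)
    {
      switch (s)
        {
        case 0:                   /* path 1 */
          s = 2;
          keep_going = 1;         /* 2 != 1, a known constant.  */
          break;
        case 2:                   /* path 2 */
          s = 3;
          keep_going = 1;         /* 3 != 1, a known constant.  */
          break;
        default:                  /* path 3 */
          if (*t == '-')
            s = 2;
          keep_going = (s != 1);  /* Comparison turned into a bool.  */
          break;
        }
      t++;
    }
  return s;
}
```

Both functions return the same value for any input string; the second form
just makes visible the extra boolean that has to be materialized and then
tested again at the loop head.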

Do you have any more thoughts?

Thanks,
-Jiangning

> -----Original Message-----
> From: Richard Guenther [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, August 02, 2011 5:23 PM
> To: Jiangning Liu
> Cc: gcc@gcc.gnu.org
> Subject: Re: A case that PRE optimization hurts performance
> 
> On Tue, Aug 2, 2011 at 4:37 AM, Jiangning Liu 
> wrote:
> > Hi,
> >
> > For the following simple test case, PRE optimization hoists computation
> > (s!=1) into the default branch of the switch statement, and finally causes
> > very poor code generation. This problem occurs in both X86 and ARM, and I
> > believe it is also a problem for other targets.
> >
> > int f(char *t) {
> >    int s=0;
> >
> >    while (*t && s != 1) {
> >        switch (s) {
> >        case 0:
> >            s = 2;
> >            break;
> >        case 2:
> >            s = 1;
> >            break;
> >        default:
> >            if (*t == '-')
> >                s = 1;
> >            break;
> >        }
> >        t++;
> >    }
> >
> >    return s;
> > }
> >
> > Taking X86 as an example, with option "-O2" you may find 52 instructions
> > generated like below,
> >
> >  :
> >   0:   55                      push   %ebp
> >   1:   31 c0                   xor    %eax,%eax
> >   3:   89 e5                   mov    %esp,%ebp
> >   5:   57                      push   %edi
> >   6:   56                      push   %esi
> >   7:   53                      push   %ebx
> >   8:   8b 55 08                mov    0x8(%ebp),%edx