Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Richard Guenther
On Sun, Mar 9, 2008 at 1:29 AM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:
> Hi,
>
>  I just noticed an error in a part of the code that I converted, that
>  looks this way:
>
>  switch (gimple_assign_subcode (stmt))
>  {
>   case SSA_NAME:
> handle_ssa_name ();
> break;
>
>   case PLUS_EXPR:
> handle_plus ();
> break;
>
>   default:
> something ();
>  }
>
>  The problem of course is that for GIMPLE_SINGLE_RHS, we do not maintain
>  the invariant that
>
>  gimple_assign_subcode (stmt) == TREE_CODE (gimple_assign_rhs1 (stmt)),
>
>  so gimple_assign_subcode typically will not be SSA_NAME, but VAR_DECL.
>  Enforcing this invariant might be hard and probably asking for more
>  trouble than it is worth.  However, perhaps it would make sense
>  to use some special tree code to indicate GIMPLE_SINGLE_RHS, in order
>  to avoid confusion?

What is GIMPLE_SINGLE_RHS after all?

Thanks,
Richard.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Diego Novillo
On Sat, Mar 8, 2008 at 19:29, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:

>  The problem of course is that for GIMPLE_SINGLE_RHS, we do not maintain
>  the invariant that
>
>  gimple_assign_subcode (stmt) == TREE_CODE (gimple_assign_rhs1 (stmt)),
>
>  so gimple_assign_subcode typically will not be SSA_NAME, but VAR_DECL.
>  Enforcing this invariant might be hard and probably asking for more
>  trouble than it is worth.  However, perhaps it would make sense
>  to use some special tree code to indicate GIMPLE_SINGLE_RHS, in order
>  to avoid confusion?

Hmm, yeah.  I remember talking about this with Bill a few days ago.
He had a similar idea.

At the time I was a bit reluctant to add a new tree code just for
this, but now you've run into it and I was also looking at some code
that would run into the same problem.  Now I agree that adding a new
tree code will be the cleanest way.  We will be removing several tree
codes when tuple merges, so the net result in tree.def will be a
reduction of tree codes.

In this case, the renamer was at fault.  When it replaced the USE
operand with an SSA_NAME it failed to change the subcode.

The other problem with not using a special code for GIMPLE_SINGLE_RHS
is confusion

a_3 = b_5

is (if we fixed the renamer bug)

GIMPLE_ASSIGN , SSA_NAME >

So, the subcode seems to duplicate the code on the RHS.  That looks odd.

So, what about adding a GIMPLE_COPY code?  The code would have 0
operands and used only for its numeric value.


Diego.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Diego Novillo
On Sun, Mar 9, 2008 at 08:15, Richard Guenther
<[EMAIL PROTECTED]> wrote:

>  What is GIMPLE_SINGLE_RHS after all?

Represents a "copy" operation, an operand with no operator (e.g., a = 3, b = c)

'3' and 'c' are "single" operands.  There is no operator involved in
the assignment.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Richard Guenther
On Sun, Mar 9, 2008 at 2:17 PM, Diego Novillo <[EMAIL PROTECTED]> wrote:
> On Sun, Mar 9, 2008 at 08:15, Richard Guenther
>  <[EMAIL PROTECTED]> wrote:
>
>  >  What is GIMPLE_SINGLE_RHS after all?
>
>  Represents a "copy" operation, an operand with no operator (e.g., a = 3, b = 
> c)
>
>  '3' and 'c' are "single" operands.  There is no operator involved in
>  the assignment.

So as opposed to a unary operation which would look exactly the same apart
from having a subcode?  So what does gimple_assign_subcode () return
for the GIMPLE_SINGLE_RHS case?  Some random garbage?

Seems I need to look at the tuples branch closer.  Is there some overview
documentation available that explains all the above?

Thanks,
Richard.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Diego Novillo

On 3/9/08 9:24 AM, Richard Guenther wrote:


So as opposed to a unary operation which would look exactly the same apart
from having a subcode?  So what does gimple_assign_subcode () return
for the GIMPLE_SINGLE_RHS case?  Some random garbage?


No.  A unary operation is different.  For instance, NEG_EXPR:

a = -b

The operator is NEG_EXPR, with the single operand VAR_DECL.

When we deal with a unary operator T, the RHS will not have T, it will 
have TREE_OPERAND (T, 0). However, for a "single" or "copy" operand T 
(say a VAR_DECL or CONST_INT or SSA_NAME), the RHS will contain T itself.




Seems I need to look at the tuples branch closer.  Is there some overview
documentation available that explains all the above?


Yes, Aldy and I are working on the tutorial for the summit.  It's still 
very crude (I'm not sure if the different RHS classes are discussed 
yet). The function you want to look at is get_gimple_rhs_class()


http://docs.google.com/Doc?id=dgfkmttj_107hcr98sg3



Diego.


Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Jan Hubicka
Hi,
based on the discussion, this is change I would like to do to the
passmanager.  I am sending the header change only first, because the
actual change will need updating all PM datastructure initializers and
compensate testsuite and documentation for the removal of RTL dump
letters so I would rather do that just once.  Does this seem OK?

The patch include the read/write methods that will be just placeholders
on mainline.  Naturally I can remove them for time being at least as
long as we think the RTL_PASS/TREE_PASS macros are good idea.  I can
quite easilly see that those are stepping back from plan not making
passmanager poluted by ugly macros, but on the other hand since the PM
is now doing RTL/IPA/tree passes it needs at least a little of
flexibility to be able update API of one without affecting others.

Alternative would be some simple inheriatance scheme: have the common
fields in one structure and derrived tree/ipa/RTL pass structures.
It complicates passes.c somewhat but I can definitly do that.

Jan

Index: tree-pass.h
===
*** tree-pass.h (revision 133036)
--- tree-pass.h (working copy)
*** extern const char *dump_file_name;
*** 88,93 
--- 88,97 
  /* Return the dump_file_info for the given phase.  */
  extern struct dump_file_info *get_dump_file_info (enum tree_dump_index);
  
+ /* Forward declare so we don't need to bring in cgraph and varpool include.  
*/
+ struct cgraph_node;
+ struct varpool_node;
+ 
  /* Describe one pass.  */
  struct tree_opt_pass
  {
*** struct tree_opt_pass
*** 98,108 
--- 102,129 
   the function returns true.  */
bool (*gate) (void);
  
+   /* IPA passes can analyze function body and variable initializers using this
+  hook and produce summary.  */
+   void (*function_generate_summary) (struct cgraph_node *);
+   void (*variable_generate_summary) (struct varpool_node *);
+ 
+   /* These hooks will be used to serialize IPA summaries on disk.  For a 
moment
+  they are just placeholders.  */
+   void (*function_write_summary) (struct cgraph_node *);
+   void (*variable_write_summary) (struct varpool_node *);
+   void (*function_read_summary) (struct cgraph_node *);
+   void (*variable_read_summary) (struct varpool_node *);
+ 
/* This is the code to run.  If null, then there should be sub-passes
   otherwise this pass does nothing.  The return value contains
   TODOs to execute in addition to those in TODO_flags_finish.   */
unsigned int (*execute) (void);
  
+   /* Results of interprocedural propagation of an IPA pass is applied to
+  function body via this hook.  */
+   void (*function_transform) (struct cgraph_node *);
+   void (*variable_transform) (struct varpool_node *);
+ 
/* A list of sub-passes to run, dependent on gate predicate.  */
struct tree_opt_pass *sub;
  
*** struct tree_opt_pass
*** 124,134 
/* Flags indicating common sets things to do before and after.  */
unsigned int todo_flags_start;
unsigned int todo_flags_finish;
- 
-   /* Letter for RTL dumps.  */
-   char letter;
  };
  
  /* Define a tree dump switch.  */
  struct dump_file_info
  {
--- 145,166 
/* Flags indicating common sets things to do before and after.  */
unsigned int todo_flags_start;
unsigned int todo_flags_finish;
  };
  
+ /* RTL and tree passes are using just some of the hooks.  The macros makes
+it easier to add more hooks for different passes in future.  */
+ #define RTL_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL
+ #define TREE_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL
+ /* IPA passes not using the generate_summary/execute/transform scheme and thus
+not scalable for whole program optimization can use just execute hook.  */
+ #define SIMPLE_IPA_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL
+ 
+ #define IPA_PASS_GENERATE_SUMMARY(fun,var) fun,var,
+ #define IPA_PASS_WRITE(fun,var) fun,var,
+ #define IPA_PASS_READ(fun,var) fun,var,
+ #define IPA_PASS_EXECUTE(execute) execute
+ #define IPA_PASS_TRANSFORM(fun,var) fun,var,
+ 
  /* Define a tree dump switch.  */
  struct dump_file_info
  {
*** struct dump_file_info
*** 138,144 
int flags;/* user flags */
int state;/* state of play */
int num;  /* dump file number */
-   int letter;   /* enabling letter for RTL dumps */
  };
  
  /* Pass properties.  */
--- 170,175 


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi,

> On Sun, Mar 9, 2008 at 2:17 PM, Diego Novillo <[EMAIL PROTECTED]> wrote:
> > On Sun, Mar 9, 2008 at 08:15, Richard Guenther
> >  <[EMAIL PROTECTED]> wrote:
> >
> >  >  What is GIMPLE_SINGLE_RHS after all?
> >
> >  Represents a "copy" operation, an operand with no operator (e.g., a = 3, b 
> > = c)
> >
> >  '3' and 'c' are "single" operands.  There is no operator involved in
> >  the assignment.
> 
> So as opposed to a unary operation which would look exactly the same apart
> from having a subcode?  So what does gimple_assign_subcode () return
> for the GIMPLE_SINGLE_RHS case?  Some random garbage?

just now, it will return the TREE_CODE of the rhs of the assignment
(e.g. VAR_DECL and INTEGER_CST in the cases mentioned above).
The problem we discuss in this thread is that actually the code
is the TREE_CODE of rhs in the moment that the statement was built;
so after ssa form creation, the gimple_assign_subcode of b_1 = c_2
will still be VAR_DECL; and after constant propagation of c_2 = 4,
the gimple_assign_subcode of b_1 = 4 will still be VAR_DECL.
While this is essentially harmless (only the fact that the code
is in GIMPLE_SINGLE_RHS class is important for the semantics),
it seems likely to cause some confusion,

Zdenek


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi,

> So, what about adding a GIMPLE_COPY code?  The code would have 0
> operands and used only for its numeric value.

another possibility would be to make GIMPLE_COPY an unary operator, and
get rid of the SINGLE_RHS case altogether (of course, unlike any other
unary operator, it would not require its operand to be gimple_val, so
special handling might still be necessary at some places. but it might
be less cumbersome),

Zdenek


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Richard Guenther
On Sun, Mar 9, 2008 at 3:46 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:
> Hi,
>
>
>  > So, what about adding a GIMPLE_COPY code?  The code would have 0
>  > operands and used only for its numeric value.
>
>  another possibility would be to make GIMPLE_COPY an unary operator, and
>  get rid of the SINGLE_RHS case altogether (of course, unlike any other
>  unary operator, it would not require its operand to be gimple_val, so
>  special handling might still be necessary at some places. but it might
>  be less cumbersome),

I like that.  Maybe I even suggested this some time ago after some
beer in some IRC
session.  (I don't see how unary should be fundamentally different from single).

Richard.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Diego Novillo

On 3/9/08 10:46 AM, Zdenek Dvorak wrote:

Hi,


So, what about adding a GIMPLE_COPY code?  The code would have 0
operands and used only for its numeric value.


another possibility would be to make GIMPLE_COPY an unary operator, and
get rid of the SINGLE_RHS case altogether (of course, unlike any other
unary operator, it would not require its operand to be gimple_val, so
special handling might still be necessary at some places. but it might
be less cumbersome),


Remember that GIMPLE_COPY *never* needs to be built as a tree node. 
When gimplifying something MODIFY_EXPR , CONST_INT> we do 
not need to create a GIMPLE_COPY tree node.


We merely need to set the subcode for GIMPLE_ASSIGN to be GIMPLE_COPY:

GIMPLE_ASSIGN , CONST_INT>

So, making it a one operand operator is not really necessary.


Diego.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Diego Novillo

On 3/9/08 11:31 AM, Richard Guenther wrote:

On Sun, Mar 9, 2008 at 3:46 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:

Hi,


 > So, what about adding a GIMPLE_COPY code?  The code would have 0
 > operands and used only for its numeric value.

 another possibility would be to make GIMPLE_COPY an unary operator, and
 get rid of the SINGLE_RHS case altogether (of course, unlike any other
 unary operator, it would not require its operand to be gimple_val, so
 special handling might still be necessary at some places. but it might
 be less cumbersome),


I like that.  Maybe I even suggested this some time ago after some
beer in some IRC
session.  (I don't see how unary should be fundamentally different from single).


The major difference is that GIMPLE_COPY never exists as a GENERIC tree 
node.  Unary operators do exist in GENERIC.


When converting to GIMPLE, a unary operator will be flattened by putting 
its tree code in gimple_subcode() and its TREE_OPERAND (t, 0) in the RHS.


Instead, a single operand will be put in its entirety in the RHS.  The 
subcode in that case becomes GIMPLE_COPY.  No need to build a one 
operand operator in order to flatten it immediately.



Diego.


Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Kenneth Zadeck
Jan Hubicka wrote:

This looks mostly fine to me.  note that i added you to pr35094 since
this patch will resolve that issue. 

I guess that one of the questions that i would have is why not have
there be a base structure for the core passmanager fields, and then a
union that contains a one of the tree, rtl or ipa specific fields. 

My motivation on this that the passes for ipa/tree/rtl really do behave
differently.  Especially the ipa passes.   Right now you are doing this
with only macros and just throwing all of the info into one big
structure.   This is not a problem from a space point of vied because
there are not enough passes to worry about, but it seems that the world
would actually be simpler if the structure knew if it was a ipa, tree or
rtl pass rather than just looking for the nulls.  

Kenny

> Hi,
> based on the discussion, this is change I would like to do to the
> passmanager.  I am sending the header change only first, because the
> actual change will need updating all PM datastructure initializers and
> compensate testsuite and documentation for the removal of RTL dump
> letters so I would rather do that just once.  Does this seem OK?
>
> The patch include the read/write methods that will be just placeholders
> on mainline.  Naturally I can remove them for time being at least as
> long as we think the RTL_PASS/TREE_PASS macros are good idea.  I can
> quite easilly see that those are stepping back from plan not making
> passmanager poluted by ugly macros, but on the other hand since the PM
> is now doing RTL/IPA/tree passes it needs at least a little of
> flexibility to be able update API of one without affecting others.
>
> Alternative would be some simple inheriatance scheme: have the common
> fields in one structure and derrived tree/ipa/RTL pass structures.
> It complicates passes.c somewhat but I can definitly do that.
>
> Jan
>
> Index: tree-pass.h
> ===
> *** tree-pass.h   (revision 133036)
> --- tree-pass.h   (working copy)
> *** extern const char *dump_file_name;
> *** 88,93 
> --- 88,97 
>   /* Return the dump_file_info for the given phase.  */
>   extern struct dump_file_info *get_dump_file_info (enum tree_dump_index);
>   
> + /* Forward declare so we don't need to bring in cgraph and varpool include. 
>  */
> + struct cgraph_node;
> + struct varpool_node;
> + 
>   /* Describe one pass.  */
>   struct tree_opt_pass
>   {
> *** struct tree_opt_pass
> *** 98,108 
> --- 102,129 
>the function returns true.  */
> bool (*gate) (void);
>   
> +   /* IPA passes can analyze function body and variable initializers using 
> this
> +  hook and produce summary.  */
> +   void (*function_generate_summary) (struct cgraph_node *);
> +   void (*variable_generate_summary) (struct varpool_node *);
> + 
> +   /* These hooks will be used to serialize IPA summaries on disk.  For a 
> moment
> +  they are just placeholders.  */
> +   void (*function_write_summary) (struct cgraph_node *);
> +   void (*variable_write_summary) (struct varpool_node *);
> +   void (*function_read_summary) (struct cgraph_node *);
> +   void (*variable_read_summary) (struct varpool_node *);
> + 
> /* This is the code to run.  If null, then there should be sub-passes
>otherwise this pass does nothing.  The return value contains
>TODOs to execute in addition to those in TODO_flags_finish.   */
> unsigned int (*execute) (void);
>   
> +   /* Results of interprocedural propagation of an IPA pass is applied to
> +  function body via this hook.  */
> +   void (*function_transform) (struct cgraph_node *);
> +   void (*variable_transform) (struct varpool_node *);
> + 
> /* A list of sub-passes to run, dependent on gate predicate.  */
> struct tree_opt_pass *sub;
>   
> *** struct tree_opt_pass
> *** 124,134 
> /* Flags indicating common sets things to do before and after.  */
> unsigned int todo_flags_start;
> unsigned int todo_flags_finish;
> - 
> -   /* Letter for RTL dumps.  */
> -   char letter;
>   };
>   
>   /* Define a tree dump switch.  */
>   struct dump_file_info
>   {
> --- 145,166 
> /* Flags indicating common sets things to do before and after.  */
> unsigned int todo_flags_start;
> unsigned int todo_flags_finish;
>   };
>   
> + /* RTL and tree passes are using just some of the hooks.  The macros makes
> +it easier to add more hooks for different passes in future.  */
> + #define RTL_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL
> + #define TREE_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL
> + /* IPA passes not using the generate_summary/execute/transform scheme and 
> thus
> +not scalable for whole program optimization can use just execute hook.  
> */
> + #define SIMPLE_IPA_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL
> + 
> + #define IPA_PASS_GENERATE_SUMMARY(

Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi,

> >>So, what about adding a GIMPLE_COPY code?  The code would have 0
> >>operands and used only for its numeric value.
> >
> >another possibility would be to make GIMPLE_COPY an unary operator, and
> >get rid of the SINGLE_RHS case altogether (of course, unlike any other
> >unary operator, it would not require its operand to be gimple_val, so
> >special handling might still be necessary at some places. but it might
> >be less cumbersome),
> 
> Remember that GIMPLE_COPY *never* needs to be built as a tree node. 
> When gimplifying something MODIFY_EXPR , CONST_INT> we do 
> not need to create a GIMPLE_COPY tree node.
> 
> We merely need to set the subcode for GIMPLE_ASSIGN to be GIMPLE_COPY:
> 
> GIMPLE_ASSIGN , CONST_INT>
> 
> So, making it a one operand operator is not really necessary.

however, it would make things simpler.  Now, we need to distiguish
three cases -- SINGLE, UNARY and BINARY; if we pretended that
GIMPLE_COPY is an unary operator, this would be reduced just
to UNARY and BINARY.  Of course, GIMPLE_COPY would never be used
in a tree expression.

Zdenek


Possible gcc-4.3 regression wrt bootstrapping the toolchain

2008-03-09 Thread Jonas Meyer
Hello
I tried building a crosstoolchain with gcc-4.3. The way it worked with
earlier versions doesn't work with 4.3 anymore because make all-gcc
doesn't build libgcc at all. It doesn't fail, it just doesn't build
it.
If this is intended behaviour I'd be glad for anyhints how to
bootstrap a toolchain now.
Thanks, Jonas


Re: Possible gcc-4.3 regression wrt bootstrapping the toolchain

2008-03-09 Thread Andrew Pinski
On Sun, Mar 9, 2008 at 12:28 PM, Jonas Meyer <[EMAIL PROTECTED]> wrote:
> Hello
>  I tried building a crosstoolchain with gcc-4.3. The way it worked with
>  earlier versions doesn't work with 4.3 anymore because make all-gcc
>  doesn't build libgcc at all. It doesn't fail, it just doesn't build
>  it.
>  If this is intended behaviour I'd be glad for anyhints how to
>  bootstrap a toolchain now.

make all .

-- Pinski


Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Alexandre Oliva
On Mar  8, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote:

> On Sat, 8 Mar 2008, Alexandre Oliva wrote:
>> > the object referenced is of integral type

>> This would break the use of SRA to extract sub-objects of non-integral
>> type.  IIRC Ada does such things.

> No frontend generates BIT_FIELD_REF.  But yes, SRA does with
> structures as base object.

I meant Ada can have non-integral members that are do not occupy an
integral number of bytes.

> We don't yet enforce this, MEM_REF will enforce BIT_FIELD_REF
> operates on registers only.

Do you have any plans to recover the performance loss for machines
that have bit-field instructions that operate directly in memory?
Especially for writes, I don't see how this is going to work.


>> > the result type is of the same type as the operand zero type
>> 
>> Err, this doesn't make much sense to me.  Consider:

> Right.  By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now
> as specified above.
 
This doesn't make the change that went in 

> The least conversion is removed.  You can look at the MEM_REF branch
> and see that the load from memory is done with a MEM_REF expression
> from the register result the bits are extracted with BIT_FIELD_REF.

Extracting (or replacing) bits from registers of integral type with
BIT_FIELD_REF is not very useful.  You're better off expanding the
BIT_FIELD_REF into mask and shift operations to get better
optimization.

It is for memory access that BIT_FIELD_REF makes the most sense,
because you can't create temporaries or duplicates for memory refs as
easily, especially when memory is volatile.

This new definition of BIT_FIELD_REF you came up with appears to be
redundant and useless to me.  All the interesting uses are ruled out,
and only uninteresting (and harmful) uses remain.  Might as well just
get rid of it, and later on realize it is needed for the interesting
cases and re-introduce it with the previous semantics.

Unless MEM_REF can be used to refer to arbitrary bit ranges.  In this
case, BIT_FIELD_REF is completely obviated.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Richard Guenther
On Sun, 9 Mar 2008, Alexandre Oliva wrote:

> On Mar  8, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote:
> 
> > On Sat, 8 Mar 2008, Alexandre Oliva wrote:
> >> > the object referenced is of integral type
> 
> >> This would break the use of SRA to extract sub-objects of non-integral
> >> type.  IIRC Ada does such things.
> 
> > No frontend generates BIT_FIELD_REF.  But yes, SRA does with
> > structures as base object.
> 
> I meant Ada can have non-integral members that are do not occupy an
> integral number of bytes.

After layouting the type?

> > We don't yet enforce this, MEM_REF will enforce BIT_FIELD_REF
> > operates on registers only.
> 
> Do you have any plans to recover the performance loss for machines
> that have bit-field instructions that operate directly in memory?
> Especially for writes, I don't see how this is going to work.

Do you have an example target in mind?

> >> > the result type is of the same type as the operand zero type
> >> 
> >> Err, this doesn't make much sense to me.  Consider:
> 
> > Right.  By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now
> > as specified above.
>  
> This doesn't make the change that went in 

?

> > The least conversion is removed.  You can look at the MEM_REF branch
> > and see that the load from memory is done with a MEM_REF expression
> > from the register result the bits are extracted with BIT_FIELD_REF.
> 
> Extracting (or replacing) bits from registers of integral type with
> BIT_FIELD_REF is not very useful.  You're better off expanding the
> BIT_FIELD_REF into mask and shift operations to get better
> optimization.

You only generate lots of IL this way.

> It is for memory access that BIT_FIELD_REF makes the most sense,
> because you can't create temporaries or duplicates for memory refs as
> easily, especially when memory is volatile.

Volatile bit-field ref stores won't work on most targets anyway,
so this is a non-issue.  Why do you think you cannot create temporaries
for memory refs?  You did the same with SRA.

> This new definition of BIT_FIELD_REF you came up with appears to be
> redundant and useless to me.  All the interesting uses are ruled out,
> and only uninteresting (and harmful) uses remain.  Might as well just
> get rid of it, and later on realize it is needed for the interesting
> cases and re-introduce it with the previous semantics.

See above, it is mainly to ease combining with BIT_FIELD_EXPR.

> Unless MEM_REF can be used to refer to arbitrary bit ranges.  In this
> case, BIT_FIELD_REF is completely obviated.

MEM_REF can only access byte aligned and sized storage.

Richard.


Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Eric Botcazou
> I meant Ada can have non-integral members that are do not occupy an
> integral number of bytes.

Right, but the middle-end shouldn't need to touch them globally, the front-end 
is supposed to break up the accesses.

-- 
Eric Botcazou


Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Jan Hubicka
> Jan Hubicka wrote:
> 
> This looks mostly fine to me.  note that i added you to pr35094 since
> this patch will resolve that issue. 
> 
> I guess that one of the questions that i would have is why not have
> there be a base structure for the core passmanager fields, and then a
> union that contains a one of the tree, rtl or ipa specific fields. 

Unions are impossible to initialize by static initializers as we use
them now.  One can have the common part as contained structure and do
some "poor mans inheritance" in C instead.  That is have "pass"
structure containing the basic stuff and then "tree_pass", "rtl_pass"
and "ipa_pass" with the specific bits added contianing "pass" structure
as first field.  The initializers then would have extra brackets around
and the generic bits of passmanager can use pointers to "pass" that
would be casted to "tree_pass"/"rtl_pass" and such. It would also imply
that the "execute" hook of IPA pass would appear earlier than the
"generate_summary"/"read"/"write" that are logically executed before it,
but it is not big deal.

I can do that way too.  For start the tree_pass/rtl_pass probably would
contain nothing, as I think I've killed last different field (the
letter bit)
> 
> My motivation on this that the passes for ipa/tree/rtl really do behave
> differently.  Especially the ipa passes.   Right now you are doing this
> with only macros and just throwing all of the info into one big
> structure.   This is not a problem from a space point of vied because
> there are not enough passes to worry about, but it seems that the world
> would actually be simpler if the structure knew if it was a ipa, tree or
> rtl pass rather than just looking for the nulls.  

Some way to differntitate the pass types would be good.  At the moment
we can do that based on properties structure and the context, but it is
a bit weak.
I can definitly add an enum specifying the type (RTL/TREE/SMALL_IPA/IPA_PASS)

Honza
> 
> Kenny
> 
> > Hi,
> > based on the discussion, this is change I would like to do to the
> > passmanager.  I am sending the header change only first, because the
> > actual change will need updating all PM datastructure initializers and
> > compensate testsuite and documentation for the removal of RTL dump
> > letters so I would rather do that just once.  Does this seem OK?
> >
> > The patch include the read/write methods that will be just placeholders
> > on mainline.  Naturally I can remove them for time being at least as
> > long as we think the RTL_PASS/TREE_PASS macros are good idea.  I can
> > quite easilly see that those are stepping back from plan not making
> > passmanager poluted by ugly macros, but on the other hand since the PM
> > is now doing RTL/IPA/tree passes it needs at least a little of
> > flexibility to be able update API of one without affecting others.
> >
> > Alternative would be some simple inheriatance scheme: have the common
> > fields in one structure and derrived tree/ipa/RTL pass structures.
> > It complicates passes.c somewhat but I can definitly do that.
> >
> > Jan
> >
> > Index: tree-pass.h
> > ===
> > *** tree-pass.h (revision 133036)
> > --- tree-pass.h (working copy)
> > *** extern const char *dump_file_name;
> > *** 88,93 
> > --- 88,97 
> >   /* Return the dump_file_info for the given phase.  */
> >   extern struct dump_file_info *get_dump_file_info (enum tree_dump_index);
> >   
> > + /* Forward declare so we don't need to bring in cgraph and varpool 
> > include.  */
> > + struct cgraph_node;
> > + struct varpool_node;
> > + 
> >   /* Describe one pass.  */
> >   struct tree_opt_pass
> >   {
> > *** struct tree_opt_pass
> > *** 98,108 
> > --- 102,129 
> >the function returns true.  */
> > bool (*gate) (void);
> >   
> > +   /* IPA passes can analyze function body and variable initializers using 
> > this
> > +  hook and produce summary.  */
> > +   void (*function_generate_summary) (struct cgraph_node *);
> > +   void (*variable_generate_summary) (struct varpool_node *);
> > + 
> > +   /* These hooks will be used to serialize IPA summaries on disk.  For a 
> > moment
> > +  they are just placeholders.  */
> > +   void (*function_write_summary) (struct cgraph_node *);
> > +   void (*variable_write_summary) (struct varpool_node *);
> > +   void (*function_read_summary) (struct cgraph_node *);
> > +   void (*variable_read_summary) (struct varpool_node *);
> > + 
> > /* This is the code to run.  If null, then there should be sub-passes
> >otherwise this pass does nothing.  The return value contains
> >TODOs to execute in addition to those in TODO_flags_finish.   */
> > unsigned int (*execute) (void);
> >   
> > +   /* Results of interprocedural propagation of an IPA pass is applied to
> > +  function body via this hook.  */
> > +   void (*function_transform) (struct cgraph_node *);
> > +   voi

Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Alexandre Oliva
On Mar  9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote:

> On Sun, 9 Mar 2008, Alexandre Oliva wrote:
>> Do you have any plans to recover the performance loss for machines
>> that have bit-field instructions that operate directly in memory?
>> Especially for writes, I don't see how this is going to work.

> Do you have an example target in mind?

AM33/2.0 and H8SX come to mind, although it's been a while since I
dealt with the memory bit-field operations of these two ports to have
the details handy.

>> > Right.  By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now
>> > as specified above.
>> 
>> This doesn't make the change that went in 

> ?

... desirable.

>> > The least conversion is removed.  You can look at the MEM_REF branch
>> > and see that the load from memory is done with a MEM_REF expression
>> > from the register result the bits are extracted with BIT_FIELD_REF.
>> 
>> Extracting (or replacing) bits from registers of integral type with
>> BIT_FIELD_REF is not very useful.  You're better off expanding the
>> BIT_FIELD_REF into mask and shift operations to get better
>> optimization.

> You only generate lots of IL this way.

You make for much better optimization, at least with current
handling of BIT_FIELD_REF.

OTOH, the much constrained version of BIT_FIELD_REF you propose will
indeed only generate lots of IL for the most common case of
BIT_FIELD_REF.

>> It is for memory access that BIT_FIELD_REF makes the most sense,
>> because you can't create temporaries or duplicates for memory refs as
>> easily, especially when memory is volatile.

> Volatile bit-field ref stores won't work on most targets anyway,
> so this is a non-issue.

What do you mean, won't work?  You get a read followed by a write,
even if for a wider unit of the object, which AFAIK is legitimate
implementation-defined behavior for volatile bit-fields.

> Why do you think you cannot create temporaries for memory refs?  You
> did the same with SRA.

You can.  It's just not as easy.  You see the amount of IL for
replacing stuff.  And then, SRA does not really create temporaries for
MEMs that remain as the primary copy of an object, which is the case
I'm really talking about.  The temporaries created by SRA mostly
replace the corresponding MEMs.


Besides, in SRA, for example, we'll often create a temporary that
holds multiple bit-fields that didn't make sense to scalarize on their
own.  If I'd worked harder on it, they wouldn't even have to be
contiguous, as long as they were in the same word.  BIT_FIELD_REF
couldn't represent this, though.

>> This new definition of BIT_FIELD_REF you came up with appears to be
>> redundant and useless to me.  All the interesting uses are ruled out,
>> and only uninteresting (and harmful) uses remain.  Might as well just
>> get rid of it, and later on realize it is needed for the interesting
>> cases and re-introduce it with the previous semantics.

> See above, it is mainly to ease combining with BIT_FIELD_EXPR.

BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers.
For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and
it may even generate better code without as much effort.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Diego Novillo

On 3/9/08 3:24 PM, Zdenek Dvorak wrote:


however, it would make things simpler.  Now, we need to distiguish
three cases -- SINGLE, UNARY and BINARY; if we pretended that
GIMPLE_COPY is an unary operator, this would be reduced just
to UNARY and BINARY.  Of course, GIMPLE_COPY would never be used
in a tree expression.


That's not the problem, the problem is in functions like 
extract_ops_from_tree.  Suppose that get_gimple_rhs_class returns 
GIMPLE_UNARY_RHS for SSA_NAME, we would then proceed to retrieve 
TREE_OPERAND (SSA_NAME, 0) for operand 1, which is wrong.


So, at some point we have to distinguish trees that are unary operators, 
which we want to flatten by extracting their operand, and trees that are 
to be used whole (like *_REF, SSA_NAME, *_DECL, CONST_*, etc). 
Particularly when optimizers use things like 
gimple_assign_set_rhs_from_tree.


Having said that, I think it may be possible to make this distinction 
internally in the low-level GIMPLE manipulators without exposing this to 
the optimizers.


I need to introduce GIMPLE_TERNARY_RHS (for ASSERT_EXPR) and 
GIMPLE_QUATERNARY_RHS (for COND_EXPR), so we will have at least four 
classes to deal with in the future.  I could add GIMPLE_COPY then, 
unless you want to work on it now.



Diego.



Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Alexandre Oliva
On Mar  9, 2008, Eric Botcazou <[EMAIL PROTECTED]> wrote:

>> I meant Ada can have non-integral members that are do not occupy an
>> integral number of bytes.

> Right, but the middle-end shouldn't need to touch them globally, the
> front-end is supposed to break up the accesses.

My point is that the front-end could just generate BIT_FIELD_REF,
without having to extract integral-typed words from the struct-typed
object to then be able to apply BIT_FIELD_REF on them, and then go
back to put the modified value back in place.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Richard Guenther
On Sun, 9 Mar 2008, Alexandre Oliva wrote:

> On Mar  9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote:
> 
> > On Sun, 9 Mar 2008, Alexandre Oliva wrote:
> >> Do you have any plans to recover the performance loss for machines
> >> that have bit-field instructions that operate directly in memory?
> >> Especially for writes, I don't see how this is going to work.
> 
> > Do you have an example target in mind?
> 
> AM33/2.0 and H8SX come to mind, although it's been a while since I
> dealt with the memory bit-field operations of these two ports to have
> the details handy.

Ok, I would expect it a size benefit at most.

> >> > Right.  By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now
> >> > as specified above.
> >> 
> >> This doesn't make the change that went in 
> 
> > ?
> 
> ... desirable.

To recap, what went in is a restriction to constant args 2 and 3 and
a result TYPE_PRECISION (in case of intergal type) or MODE_PRECISION
(in other cases) matching to the bit-field size in argument 2.

The latter change made BIT_FIELD_REF_UNSIGNED redundant and (finally)
allows for constant folding of BIT_FIELD_REF in fold_ternary.

> >> > The least conversion is removed.  You can look at the MEM_REF branch
> >> > and see that the load from memory is done with a MEM_REF expression
> >> > from the register result the bits are extracted with BIT_FIELD_REF.
> >> 
> >> Extracting (or replacing) bits from registers of integral type with
> >> BIT_FIELD_REF is not very useful.  You're better off expanding the
> >> BIT_FIELD_REF into mask and shift operations to get better
> >> optimization.
> 
> > You only generate lots of IL this way.
> 
> You make for much better optimization, at least with current
> handling of BIT_FIELD_REF.
>
> OTOH, the much constrained version of BIT_FIELD_REF you propose will
> indeed only generate lots of IL for the most common case of
> BIT_FIELD_REF.

It generates much less IL than what the SRA lowering currently does.

> >> It is for memory access that BIT_FIELD_REF makes the most sense,
> >> because you can't create temporaries or duplicates for memory refs as
> >> easily, especially when memory is volatile.
> 
> > Volatile bit-field ref stores won't work on most targets anyway,
> > so this is a non-issue.
> 
> What do you mean, won't work?  You get a read followed by a write,
> even if for a wider unit of the object, which AFAIK is legitimate
> implementation-defined behavior for volatile bit-fields.

Oh, ok.  But this is no different in lowered form, or maybe I don't
understand your concern.  (volatile is specified as doing as many
accesses as written, which is where (IMHO) read + write for a
bit-field store is a mismatch, thus the "doesn't work")

> > Why do you think you cannot create temporaries for memory refs?  You
> > did the same with SRA.
> 
> You can.  It's just not as easy.  You see the amount of IL for
> replacing stuff.  And then, SRA does not really create temporaries for
> MEMs that remain as the primary copy of an object, which is the case
> I'm really talking about.  The temporaries created by SRA mostly
> replace the corresponding MEMs.
>
> 
> Besides, in SRA, for example, we'll often create a temporary that
> holds multiple bit-fields that didn't make sense to scalarize on their
> own.

This all works fine with the lowering MEM_REF does.  With less IL
than SRA creates (because of BIT_FIELD_REF).

>  If I'd worked harder on it, they wouldn't even have to be
> contiguous, as long as they were in the same word.  BIT_FIELD_REF
> couldn't represent this, though.

Represent yes, obviously you'd still need something like SRA to
to the re-ordering / packing.

> >> This new definition of BIT_FIELD_REF you came up with appears to be
> >> redundant and useless to me.  All the interesting uses are ruled out,
> >> and only uninteresting (and harmful) uses remain.  Might as well just
> >> get rid of it, and later on realize it is needed for the interesting
> >> cases and re-introduce it with the previous semantics.
> 
> > See above, it is mainly to ease combining with BIT_FIELD_EXPR.
> 
> BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers.
> For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and
> it may even generate better code without as much effort.

BIT_FIELD_REF doesn't play well with our scalar optimizers.  You
can't work on registers, as partial writes break SSA form, so you
need to keep a temporary in memory which is bad.  BIT_FIELD_EXPR
makes the new value created available, so you can keep SSA form
and work on registers. So,

  BIT_FIELD_REF  = foo_2;

(bogus of course) would instead be written as

  reg_2 = BIT_FIELD_EXPR ;

Thus, BIT_FIELD_EXPR is necessary to move bitfields from memory
to registers (unless you want to expose all the shifting and masking
directly).

Richard.


Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi,

> On 3/9/08 3:24 PM, Zdenek Dvorak wrote:
> 
> >however, it would make things simpler.  Now, we need to distiguish
> >three cases -- SINGLE, UNARY and BINARY; if we pretended that
> >GIMPLE_COPY is an unary operator, this would be reduced just
> >to UNARY and BINARY.  Of course, GIMPLE_COPY would never be used
> >in a tree expression.
> 
> That's not the problem, the problem is in functions like 
> extract_ops_from_tree.

extract_ops_from_tree would return GIMPLE_COPY as subcode and
the whole expression as op1, where's the problem?

> I need to introduce GIMPLE_TERNARY_RHS (for ASSERT_EXPR) and 
> GIMPLE_QUATERNARY_RHS (for COND_EXPR),

How are you going to represent the COND_EXPRs (i.e., where are you
going to put their comparison operator)?

> so we will have at least four 
> classes to deal with in the future.  I could add GIMPLE_COPY then, 
> unless you want to work on it now.

I think it can wait till then,

Zdenek



Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Alexandre Oliva
On Mar  9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote:

> On Sun, 9 Mar 2008, Alexandre Oliva wrote:
>> AM33/2.0 and H8SX come to mind, although it's been a while since I
>> dealt with the memory bit-field operations of these two ports to have
>> the details handy.

> Ok, I would expect it a size benefit at most.

I wouldn't think so.  Turning a single hardware-optimized instruction
into a series of load, shift, mask, combine, store is unlikely to
benefit just size.  Especially considering that these optimized
instructions were introduced in revisions of the processor.

And then, I've failed to see a compelling reason to justify such a
change, that would have the effect of disabling this optimization,
even if it was "just" a size optimization.

>> >> > Right.  By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now
>> >> > as specified above.
>> >> 
>> >> This doesn't make the change that went in 

>> > ?

>> ... desirable.

> To recap, what went in is a restriction to constant args 2 and 3 and
> a result TYPE_PRECISION (in case of intergal type) or MODE_PRECISION
> (in other cases) matching to the bit-field size in argument 2.

> The latter change made BIT_FIELD_REF_UNSIGNED redundant and (finally)
> allows for constant folding of BIT_FIELD_REF in fold_ternary.

Good, then you didn't put in the requirement that the first argument
must be a gimple reg, and that its type must be the same as that of
the bit-field.  Those are the changes I object to, but you said they
had gone in.  I couldn't find the patch that did it, but I took your
word for it.  I'm glad it was just a misunderstanding.

As for removing BIT_FIELD_REF_UNSIGNED, I don't have much of a problem
with it.  Using the unsignedness of the BIT_FIELD_REF type works
AFAICT.

However, requiring a perfect match between the result type and the
extracted bit-width is troublesome, because we don't have
language-independent machinery to generate types with arbitrary widths
or even arbitrary precisions.


>> You make for much better optimization, at least with current
>> handling of BIT_FIELD_REF.

>> OTOH, the much constrained version of BIT_FIELD_REF you propose will
>> indeed only generate lots of IL for the most common case of
>> BIT_FIELD_REF.

> It generates much less IL than what the SRA lowering currently does.

The reason SRA generates more IL is *precisely* to get better
optimization.  The back-ends don't handle BIT_FIELD_REFs very well.
There's a lot of code all over the compiler to simplify masking and
shifting that, when you use BIT_FIELD_REF, doesn't (and can't)
trigger.  BIT_FIELD_REF hides masks and shifts, such that the compiler
can't simplify them without lowering them in the first place.  And the
lowering is profitable.

Unless you plan on extending the infrastructure that simplifies masks
and shifts such that it deals with BIT_FIELD_REFs and lowers them, at
least when operand 0 is a gimple reg, I'd strongly suggest you to
leave the lowering in BIT_FIELD_REFs in SRA, or perhaps even do that
as part of gimplification.

> This all works fine with the lowering MEM_REF does.  With less IL
> than SRA creates (because of BIT_FIELD_REF).

As long as it doesn't pessimize code (which is what the more IL was
about), I'm game.

>> BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers.
>> For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and
>> it may even generate better code without as much effort.

> BIT_FIELD_REF doesn't play well with our scalar optimizers.

Exactly.  That's why I suggest BIT_FIELD_REFs only for addressable
objects.  SRA will scalarize them if it can; if it can't, scalar
optimizers won't have much to do about them anyway.

> You can't work on registers, as partial writes break SSA form, so
> you need to keep a temporary in memory which is bad.

And then, since this is only useful for gimple registers, and you get
better code if you open-code the bit-field operations for them, I
stand by my claim that it's pointless, and BIT_FIELD_REFs lowered into
open-coded shifting and masking during gimplification would get us
better code.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Constrain valid arguments to BIT_FIELD_REF

2008-03-09 Thread Robert Dewar

Alexandre Oliva wrote:

On Mar  9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote:


On Sun, 9 Mar 2008, Alexandre Oliva wrote:

AM33/2.0 and H8SX come to mind, although it's been a while since I
dealt with the memory bit-field operations of these two ports to have
the details handy.



Ok, I would expect it a size benefit at most.


I wouldn't think so.  Turning a single hardware-optimized instruction
into a series of load, shift, mask, combine, store is unlikely to
benefit just size.  Especially considering that these optimized
instructions were introduced in revisions of the processor.

And then, I've failed to see a compelling reason to justify such a
change, that would have the effect of disabling this optimization,
even if it was "just" a size optimization.


Anyway, size and performance are always related because of icache
effects.