Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On Sun, Mar 9, 2008 at 1:29 AM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > Hi, > > I just noticed an error in a part of the code that I converted, that > looks this way: > > switch (gimple_assign_subcode (stmt)) > { > case SSA_NAME: > handle_ssa_name (); > break; > > case PLUS_EXPR: > handle_plus (); > break; > > default: > something (); > } > > The problem of course is that for GIMPLE_SINGLE_RHS, we do not maintain > the invariant that > > gimple_assign_subcode (stmt) == TREE_CODE (gimple_assign_rhs1 (stmt)), > > so gimple_assign_subcode typically will not be SSA_NAME, but VAR_DECL. > Enforcing this invariant might be hard and probably asking for more > trouble than it is worth. However, perhaps it would make sense > to use some special tree code to indicate GIMPLE_SINGLE_RHS, in order > to avoid confusion? What is GIMPLE_SINGLE_RHS after all? Thanks, Richard.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On Sat, Mar 8, 2008 at 19:29, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > The problem of course is that for GIMPLE_SINGLE_RHS, we do not maintain > the invariant that > > gimple_assign_subcode (stmt) == TREE_CODE (gimple_assign_rhs1 (stmt)), > > so gimple_assign_subcode typically will not be SSA_NAME, but VAR_DECL. > Enforcing this invariant might be hard and probably asking for more > trouble than it is worth. However, perhaps it would make sense > to use some special tree code to indicate GIMPLE_SINGLE_RHS, in order > to avoid confusion? Hmm, yeah. I remember talking about this with Bill a few days ago. He had a similar idea. At the time I was a bit reluctant to add a new tree code just for this, but now you've run into it and I was also looking at some code that would run into the same problem. Now I agree that adding a new tree code will be the cleanest way. We will be removing several tree codes when tuple merges, so the net result in tree.def will be a reduction of tree codes. In this case, the renamer was at fault. When it replaced the USE operand with an SSA_NAME it failed to change the subcode. The other problem with not using a special code for GIMPLE_SINGLE_RHS is confusion a_3 = b_5 is (if we fixed the renamer bug) GIMPLE_ASSIGN , SSA_NAME > So, the subcode seems to duplicate the code on the RHS. That looks odd. So, what about adding a GIMPLE_COPY code? The code would have 0 operands and used only for its numeric value. Diego.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On Sun, Mar 9, 2008 at 08:15, Richard Guenther <[EMAIL PROTECTED]> wrote: > What is GIMPLE_SINGLE_RHS after all? Represents a "copy" operation, an operand with no operator (e.g., a = 3, b = c) '3' and 'c' are "single" operands. There is no operator involved in the assignment.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On Sun, Mar 9, 2008 at 2:17 PM, Diego Novillo <[EMAIL PROTECTED]> wrote: > On Sun, Mar 9, 2008 at 08:15, Richard Guenther > <[EMAIL PROTECTED]> wrote: > > > What is GIMPLE_SINGLE_RHS after all? > > Represents a "copy" operation, an operand with no operator (e.g., a = 3, b = > c) > > '3' and 'c' are "single" operands. There is no operator involved in > the assignment. So as opposed to a unary operation which would look exactly the same apart from having a subcode? So what does gimple_assign_subcode () return for the GIMPLE_SINGLE_RHS case? Some random garbage? Seems I need to look at the tuples branch closer. Is there some overview documentation available that explains all the above? Thanks, Richard.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On 3/9/08 9:24 AM, Richard Guenther wrote: So as opposed to a unary operation which would look exactly the same apart from having a subcode? So what does gimple_assign_subcode () return for the GIMPLE_SINGLE_RHS case? Some random garbage? No. A unary operation is different. For instance, NEG_EXPR: a = -b The operator is NEG_EXPR, with the single operand VAR_DECL. When we deal with a unary operator T, the RHS will not have T, it will have TREE_OPERAND (T, 0). However, for a "single" or "copy" operand T (say a VAR_DECL or CONST_INT or SSA_NAME), the RHS will contain T itself. Seems I need to look at the tuples branch closer. Is there some overview documentation available that explains all the above? Yes, Aldy and I are working on the tutorial for the summit. It's still very crude (I'm not sure if the different RHS classes are discussed yet). The function you want to look at is get_gimple_rhs_class() http://docs.google.com/Doc?id=dgfkmttj_107hcr98sg3 Diego.
Re: API for callgraph and IPA passes for whole program optimization
Hi, based on the discussion, this is change I would like to do to the passmanager. I am sending the header change only first, because the actual change will need updating all PM datastructure initializers and compensate testsuite and documentation for the removal of RTL dump letters so I would rather do that just once. Does this seem OK? The patch include the read/write methods that will be just placeholders on mainline. Naturally I can remove them for time being at least as long as we think the RTL_PASS/TREE_PASS macros are good idea. I can quite easilly see that those are stepping back from plan not making passmanager poluted by ugly macros, but on the other hand since the PM is now doing RTL/IPA/tree passes it needs at least a little of flexibility to be able update API of one without affecting others. Alternative would be some simple inheriatance scheme: have the common fields in one structure and derrived tree/ipa/RTL pass structures. It complicates passes.c somewhat but I can definitly do that. Jan Index: tree-pass.h === *** tree-pass.h (revision 133036) --- tree-pass.h (working copy) *** extern const char *dump_file_name; *** 88,93 --- 88,97 /* Return the dump_file_info for the given phase. */ extern struct dump_file_info *get_dump_file_info (enum tree_dump_index); + /* Forward declare so we don't need to bring in cgraph and varpool include. */ + struct cgraph_node; + struct varpool_node; + /* Describe one pass. */ struct tree_opt_pass { *** struct tree_opt_pass *** 98,108 --- 102,129 the function returns true. */ bool (*gate) (void); + /* IPA passes can analyze function body and variable initializers using this + hook and produce summary. */ + void (*function_generate_summary) (struct cgraph_node *); + void (*variable_generate_summary) (struct varpool_node *); + + /* These hooks will be used to serialize IPA summaries on disk. For a moment + they are just placeholders. */ + void (*function_write_summary) (struct cgraph_node *); + void (*variable_write_summary) (struct varpool_node *); + void (*function_read_summary) (struct cgraph_node *); + void (*variable_read_summary) (struct varpool_node *); + /* This is the code to run. If null, then there should be sub-passes otherwise this pass does nothing. The return value contains TODOs to execute in addition to those in TODO_flags_finish. */ unsigned int (*execute) (void); + /* Results of interprocedural propagation of an IPA pass is applied to + function body via this hook. */ + void (*function_transform) (struct cgraph_node *); + void (*variable_transform) (struct varpool_node *); + /* A list of sub-passes to run, dependent on gate predicate. */ struct tree_opt_pass *sub; *** struct tree_opt_pass *** 124,134 /* Flags indicating common sets things to do before and after. */ unsigned int todo_flags_start; unsigned int todo_flags_finish; - - /* Letter for RTL dumps. */ - char letter; }; /* Define a tree dump switch. */ struct dump_file_info { --- 145,166 /* Flags indicating common sets things to do before and after. */ unsigned int todo_flags_start; unsigned int todo_flags_finish; }; + /* RTL and tree passes are using just some of the hooks. The macros makes +it easier to add more hooks for different passes in future. */ + #define RTL_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL + #define TREE_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL + /* IPA passes not using the generate_summary/execute/transform scheme and thus +not scalable for whole program optimization can use just execute hook. */ + #define SIMPLE_IPA_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL + + #define IPA_PASS_GENERATE_SUMMARY(fun,var) fun,var, + #define IPA_PASS_WRITE(fun,var) fun,var, + #define IPA_PASS_READ(fun,var) fun,var, + #define IPA_PASS_EXECUTE(execute) execute + #define IPA_PASS_TRANSFORM(fun,var) fun,var, + /* Define a tree dump switch. */ struct dump_file_info { *** struct dump_file_info *** 138,144 int flags;/* user flags */ int state;/* state of play */ int num; /* dump file number */ - int letter; /* enabling letter for RTL dumps */ }; /* Pass properties. */ --- 170,175
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
Hi, > On Sun, Mar 9, 2008 at 2:17 PM, Diego Novillo <[EMAIL PROTECTED]> wrote: > > On Sun, Mar 9, 2008 at 08:15, Richard Guenther > > <[EMAIL PROTECTED]> wrote: > > > > > What is GIMPLE_SINGLE_RHS after all? > > > > Represents a "copy" operation, an operand with no operator (e.g., a = 3, b > > = c) > > > > '3' and 'c' are "single" operands. There is no operator involved in > > the assignment. > > So as opposed to a unary operation which would look exactly the same apart > from having a subcode? So what does gimple_assign_subcode () return > for the GIMPLE_SINGLE_RHS case? Some random garbage? just now, it will return the TREE_CODE of the rhs of the assignment (e.g. VAR_DECL and INTEGER_CST in the cases mentioned above). The problem we discuss in this thread is that actually the code is the TREE_CODE of rhs in the moment that the statement was built; so after ssa form creation, the gimple_assign_subcode of b_1 = c_2 will still be VAR_DECL; and after constant propagation of c_2 = 4, the gimple_assign_subcode of b_1 = 4 will still be VAR_DECL. While this is essentially harmless (only the fact that the code is in GIMPLE_SINGLE_RHS class is important for the semantics), it seems likely to cause some confusion, Zdenek
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
Hi, > So, what about adding a GIMPLE_COPY code? The code would have 0 > operands and used only for its numeric value. another possibility would be to make GIMPLE_COPY an unary operator, and get rid of the SINGLE_RHS case altogether (of course, unlike any other unary operator, it would not require its operand to be gimple_val, so special handling might still be necessary at some places. but it might be less cumbersome), Zdenek
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On Sun, Mar 9, 2008 at 3:46 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > Hi, > > > > So, what about adding a GIMPLE_COPY code? The code would have 0 > > operands and used only for its numeric value. > > another possibility would be to make GIMPLE_COPY an unary operator, and > get rid of the SINGLE_RHS case altogether (of course, unlike any other > unary operator, it would not require its operand to be gimple_val, so > special handling might still be necessary at some places. but it might > be less cumbersome), I like that. Maybe I even suggested this some time ago after some beer in some IRC session. (I don't see how unary should be fundamentally different from single). Richard.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On 3/9/08 10:46 AM, Zdenek Dvorak wrote: Hi, So, what about adding a GIMPLE_COPY code? The code would have 0 operands and used only for its numeric value. another possibility would be to make GIMPLE_COPY an unary operator, and get rid of the SINGLE_RHS case altogether (of course, unlike any other unary operator, it would not require its operand to be gimple_val, so special handling might still be necessary at some places. but it might be less cumbersome), Remember that GIMPLE_COPY *never* needs to be built as a tree node. When gimplifying something MODIFY_EXPR , CONST_INT> we do not need to create a GIMPLE_COPY tree node. We merely need to set the subcode for GIMPLE_ASSIGN to be GIMPLE_COPY: GIMPLE_ASSIGN , CONST_INT> So, making it a one operand operator is not really necessary. Diego.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On 3/9/08 11:31 AM, Richard Guenther wrote: On Sun, Mar 9, 2008 at 3:46 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: Hi, > So, what about adding a GIMPLE_COPY code? The code would have 0 > operands and used only for its numeric value. another possibility would be to make GIMPLE_COPY an unary operator, and get rid of the SINGLE_RHS case altogether (of course, unlike any other unary operator, it would not require its operand to be gimple_val, so special handling might still be necessary at some places. but it might be less cumbersome), I like that. Maybe I even suggested this some time ago after some beer in some IRC session. (I don't see how unary should be fundamentally different from single). The major difference is that GIMPLE_COPY never exists as a GENERIC tree node. Unary operators do exist in GENERIC. When converting to GIMPLE, a unary operator will be flattened by putting its tree code in gimple_subcode() and its TREE_OPERAND (t, 0) in the RHS. Instead, a single operand will be put in its entirety in the RHS. The subcode in that case becomes GIMPLE_COPY. No need to build a one operand operator in order to flatten it immediately. Diego.
Re: API for callgraph and IPA passes for whole program optimization
Jan Hubicka wrote: This looks mostly fine to me. note that i added you to pr35094 since this patch will resolve that issue. I guess that one of the questions that i would have is why not have there be a base structure for the core passmanager fields, and then a union that contains a one of the tree, rtl or ipa specific fields. My motivation on this that the passes for ipa/tree/rtl really do behave differently. Especially the ipa passes. Right now you are doing this with only macros and just throwing all of the info into one big structure. This is not a problem from a space point of vied because there are not enough passes to worry about, but it seems that the world would actually be simpler if the structure knew if it was a ipa, tree or rtl pass rather than just looking for the nulls. Kenny > Hi, > based on the discussion, this is change I would like to do to the > passmanager. I am sending the header change only first, because the > actual change will need updating all PM datastructure initializers and > compensate testsuite and documentation for the removal of RTL dump > letters so I would rather do that just once. Does this seem OK? > > The patch include the read/write methods that will be just placeholders > on mainline. Naturally I can remove them for time being at least as > long as we think the RTL_PASS/TREE_PASS macros are good idea. I can > quite easilly see that those are stepping back from plan not making > passmanager poluted by ugly macros, but on the other hand since the PM > is now doing RTL/IPA/tree passes it needs at least a little of > flexibility to be able update API of one without affecting others. > > Alternative would be some simple inheriatance scheme: have the common > fields in one structure and derrived tree/ipa/RTL pass structures. > It complicates passes.c somewhat but I can definitly do that. > > Jan > > Index: tree-pass.h > === > *** tree-pass.h (revision 133036) > --- tree-pass.h (working copy) > *** extern const char *dump_file_name; > *** 88,93 > --- 88,97 > /* Return the dump_file_info for the given phase. */ > extern struct dump_file_info *get_dump_file_info (enum tree_dump_index); > > + /* Forward declare so we don't need to bring in cgraph and varpool include. > */ > + struct cgraph_node; > + struct varpool_node; > + > /* Describe one pass. */ > struct tree_opt_pass > { > *** struct tree_opt_pass > *** 98,108 > --- 102,129 >the function returns true. */ > bool (*gate) (void); > > + /* IPA passes can analyze function body and variable initializers using > this > + hook and produce summary. */ > + void (*function_generate_summary) (struct cgraph_node *); > + void (*variable_generate_summary) (struct varpool_node *); > + > + /* These hooks will be used to serialize IPA summaries on disk. For a > moment > + they are just placeholders. */ > + void (*function_write_summary) (struct cgraph_node *); > + void (*variable_write_summary) (struct varpool_node *); > + void (*function_read_summary) (struct cgraph_node *); > + void (*variable_read_summary) (struct varpool_node *); > + > /* This is the code to run. If null, then there should be sub-passes >otherwise this pass does nothing. The return value contains >TODOs to execute in addition to those in TODO_flags_finish. */ > unsigned int (*execute) (void); > > + /* Results of interprocedural propagation of an IPA pass is applied to > + function body via this hook. */ > + void (*function_transform) (struct cgraph_node *); > + void (*variable_transform) (struct varpool_node *); > + > /* A list of sub-passes to run, dependent on gate predicate. */ > struct tree_opt_pass *sub; > > *** struct tree_opt_pass > *** 124,134 > /* Flags indicating common sets things to do before and after. */ > unsigned int todo_flags_start; > unsigned int todo_flags_finish; > - > - /* Letter for RTL dumps. */ > - char letter; > }; > > /* Define a tree dump switch. */ > struct dump_file_info > { > --- 145,166 > /* Flags indicating common sets things to do before and after. */ > unsigned int todo_flags_start; > unsigned int todo_flags_finish; > }; > > + /* RTL and tree passes are using just some of the hooks. The macros makes > +it easier to add more hooks for different passes in future. */ > + #define RTL_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL > + #define TREE_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL > + /* IPA passes not using the generate_summary/execute/transform scheme and > thus > +not scalable for whole program optimization can use just execute hook. > */ > + #define SIMPLE_IPA_PASS(execute) NULL, NULL, NULL, NULL, execute, NULL, NULL > + > + #define IPA_PASS_GENERATE_SUMMARY(
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
Hi, > >>So, what about adding a GIMPLE_COPY code? The code would have 0 > >>operands and used only for its numeric value. > > > >another possibility would be to make GIMPLE_COPY an unary operator, and > >get rid of the SINGLE_RHS case altogether (of course, unlike any other > >unary operator, it would not require its operand to be gimple_val, so > >special handling might still be necessary at some places. but it might > >be less cumbersome), > > Remember that GIMPLE_COPY *never* needs to be built as a tree node. > When gimplifying something MODIFY_EXPR , CONST_INT> we do > not need to create a GIMPLE_COPY tree node. > > We merely need to set the subcode for GIMPLE_ASSIGN to be GIMPLE_COPY: > > GIMPLE_ASSIGN , CONST_INT> > > So, making it a one operand operator is not really necessary. however, it would make things simpler. Now, we need to distiguish three cases -- SINGLE, UNARY and BINARY; if we pretended that GIMPLE_COPY is an unary operator, this would be reduced just to UNARY and BINARY. Of course, GIMPLE_COPY would never be used in a tree expression. Zdenek
Possible gcc-4.3 regression wrt bootstrapping the toolchain
Hello I tried building a crosstoolchain with gcc-4.3. The way it worked with earlier versions doesn't work with 4.3 anymore because make all-gcc doesn't build libgcc at all. It doesn't fail, it just doesn't build it. If this is intended behaviour I'd be glad for anyhints how to bootstrap a toolchain now. Thanks, Jonas
Re: Possible gcc-4.3 regression wrt bootstrapping the toolchain
On Sun, Mar 9, 2008 at 12:28 PM, Jonas Meyer <[EMAIL PROTECTED]> wrote: > Hello > I tried building a crosstoolchain with gcc-4.3. The way it worked with > earlier versions doesn't work with 4.3 anymore because make all-gcc > doesn't build libgcc at all. It doesn't fail, it just doesn't build > it. > If this is intended behaviour I'd be glad for anyhints how to > bootstrap a toolchain now. make all . -- Pinski
Re: Constrain valid arguments to BIT_FIELD_REF
On Mar 8, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: > On Sat, 8 Mar 2008, Alexandre Oliva wrote: >> > the object referenced is of integral type >> This would break the use of SRA to extract sub-objects of non-integral >> type. IIRC Ada does such things. > No frontend generates BIT_FIELD_REF. But yes, SRA does with > structures as base object. I meant Ada can have non-integral members that are do not occupy an integral number of bytes. > We don't yet enforce this, MEM_REF will enforce BIT_FIELD_REF > operates on registers only. Do you have any plans to recover the performance loss for machines that have bit-field instructions that operate directly in memory? Especially for writes, I don't see how this is going to work. >> > the result type is of the same type as the operand zero type >> >> Err, this doesn't make much sense to me. Consider: > Right. By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now > as specified above. This doesn't make the change that went in > The least conversion is removed. You can look at the MEM_REF branch > and see that the load from memory is done with a MEM_REF expression > from the register result the bits are extracted with BIT_FIELD_REF. Extracting (or replacing) bits from registers of integral type with BIT_FIELD_REF is not very useful. You're better off expanding the BIT_FIELD_REF into mask and shift operations to get better optimization. It is for memory access that BIT_FIELD_REF makes the most sense, because you can't create temporaries or duplicates for memory refs as easily, especially when memory is volatile. This new definition of BIT_FIELD_REF you came up with appears to be redundant and useless to me. All the interesting uses are ruled out, and only uninteresting (and harmful) uses remain. Might as well just get rid of it, and later on realize it is needed for the interesting cases and re-introduce it with the previous semantics. Unless MEM_REF can be used to refer to arbitrary bit ranges. In this case, BIT_FIELD_REF is completely obviated. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Constrain valid arguments to BIT_FIELD_REF
On Sun, 9 Mar 2008, Alexandre Oliva wrote: > On Mar 8, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: > > > On Sat, 8 Mar 2008, Alexandre Oliva wrote: > >> > the object referenced is of integral type > > >> This would break the use of SRA to extract sub-objects of non-integral > >> type. IIRC Ada does such things. > > > No frontend generates BIT_FIELD_REF. But yes, SRA does with > > structures as base object. > > I meant Ada can have non-integral members that are do not occupy an > integral number of bytes. After layouting the type? > > We don't yet enforce this, MEM_REF will enforce BIT_FIELD_REF > > operates on registers only. > > Do you have any plans to recover the performance loss for machines > that have bit-field instructions that operate directly in memory? > Especially for writes, I don't see how this is going to work. Do you have an example target in mind? > >> > the result type is of the same type as the operand zero type > >> > >> Err, this doesn't make much sense to me. Consider: > > > Right. By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now > > as specified above. > > This doesn't make the change that went in ? > > The least conversion is removed. You can look at the MEM_REF branch > > and see that the load from memory is done with a MEM_REF expression > > from the register result the bits are extracted with BIT_FIELD_REF. > > Extracting (or replacing) bits from registers of integral type with > BIT_FIELD_REF is not very useful. You're better off expanding the > BIT_FIELD_REF into mask and shift operations to get better > optimization. You only generate lots of IL this way. > It is for memory access that BIT_FIELD_REF makes the most sense, > because you can't create temporaries or duplicates for memory refs as > easily, especially when memory is volatile. Volatile bit-field ref stores won't work on most targets anyway, so this is a non-issue. Why do you think you cannot create temporaries for memory refs? You did the same with SRA. > This new definition of BIT_FIELD_REF you came up with appears to be > redundant and useless to me. All the interesting uses are ruled out, > and only uninteresting (and harmful) uses remain. Might as well just > get rid of it, and later on realize it is needed for the interesting > cases and re-introduce it with the previous semantics. See above, it is mainly to ease combining with BIT_FIELD_EXPR. > Unless MEM_REF can be used to refer to arbitrary bit ranges. In this > case, BIT_FIELD_REF is completely obviated. MEM_REF can only access byte aligned and sized storage. Richard.
Re: Constrain valid arguments to BIT_FIELD_REF
> I meant Ada can have non-integral members that are do not occupy an > integral number of bytes. Right, but the middle-end shouldn't need to touch them globally, the front-end is supposed to break up the accesses. -- Eric Botcazou
Re: API for callgraph and IPA passes for whole program optimization
> Jan Hubicka wrote: > > This looks mostly fine to me. note that i added you to pr35094 since > this patch will resolve that issue. > > I guess that one of the questions that i would have is why not have > there be a base structure for the core passmanager fields, and then a > union that contains a one of the tree, rtl or ipa specific fields. Unions are impossible to initialize by static initializers as we use them now. One can have the common part as contained structure and do some "poor mans inheritance" in C instead. That is have "pass" structure containing the basic stuff and then "tree_pass", "rtl_pass" and "ipa_pass" with the specific bits added contianing "pass" structure as first field. The initializers then would have extra brackets around and the generic bits of passmanager can use pointers to "pass" that would be casted to "tree_pass"/"rtl_pass" and such. It would also imply that the "execute" hook of IPA pass would appear earlier than the "generate_summary"/"read"/"write" that are logically executed before it, but it is not big deal. I can do that way too. For start the tree_pass/rtl_pass probably would contain nothing, as I think I've killed last different field (the letter bit) > > My motivation on this that the passes for ipa/tree/rtl really do behave > differently. Especially the ipa passes. Right now you are doing this > with only macros and just throwing all of the info into one big > structure. This is not a problem from a space point of vied because > there are not enough passes to worry about, but it seems that the world > would actually be simpler if the structure knew if it was a ipa, tree or > rtl pass rather than just looking for the nulls. Some way to differntitate the pass types would be good. At the moment we can do that based on properties structure and the context, but it is a bit weak. I can definitly add an enum specifying the type (RTL/TREE/SMALL_IPA/IPA_PASS) Honza > > Kenny > > > Hi, > > based on the discussion, this is change I would like to do to the > > passmanager. I am sending the header change only first, because the > > actual change will need updating all PM datastructure initializers and > > compensate testsuite and documentation for the removal of RTL dump > > letters so I would rather do that just once. Does this seem OK? > > > > The patch include the read/write methods that will be just placeholders > > on mainline. Naturally I can remove them for time being at least as > > long as we think the RTL_PASS/TREE_PASS macros are good idea. I can > > quite easilly see that those are stepping back from plan not making > > passmanager poluted by ugly macros, but on the other hand since the PM > > is now doing RTL/IPA/tree passes it needs at least a little of > > flexibility to be able update API of one without affecting others. > > > > Alternative would be some simple inheriatance scheme: have the common > > fields in one structure and derrived tree/ipa/RTL pass structures. > > It complicates passes.c somewhat but I can definitly do that. > > > > Jan > > > > Index: tree-pass.h > > === > > *** tree-pass.h (revision 133036) > > --- tree-pass.h (working copy) > > *** extern const char *dump_file_name; > > *** 88,93 > > --- 88,97 > > /* Return the dump_file_info for the given phase. */ > > extern struct dump_file_info *get_dump_file_info (enum tree_dump_index); > > > > + /* Forward declare so we don't need to bring in cgraph and varpool > > include. */ > > + struct cgraph_node; > > + struct varpool_node; > > + > > /* Describe one pass. */ > > struct tree_opt_pass > > { > > *** struct tree_opt_pass > > *** 98,108 > > --- 102,129 > >the function returns true. */ > > bool (*gate) (void); > > > > + /* IPA passes can analyze function body and variable initializers using > > this > > + hook and produce summary. */ > > + void (*function_generate_summary) (struct cgraph_node *); > > + void (*variable_generate_summary) (struct varpool_node *); > > + > > + /* These hooks will be used to serialize IPA summaries on disk. For a > > moment > > + they are just placeholders. */ > > + void (*function_write_summary) (struct cgraph_node *); > > + void (*variable_write_summary) (struct varpool_node *); > > + void (*function_read_summary) (struct cgraph_node *); > > + void (*variable_read_summary) (struct varpool_node *); > > + > > /* This is the code to run. If null, then there should be sub-passes > >otherwise this pass does nothing. The return value contains > >TODOs to execute in addition to those in TODO_flags_finish. */ > > unsigned int (*execute) (void); > > > > + /* Results of interprocedural propagation of an IPA pass is applied to > > + function body via this hook. */ > > + void (*function_transform) (struct cgraph_node *); > > + voi
Re: Constrain valid arguments to BIT_FIELD_REF
On Mar 9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: > On Sun, 9 Mar 2008, Alexandre Oliva wrote: >> Do you have any plans to recover the performance loss for machines >> that have bit-field instructions that operate directly in memory? >> Especially for writes, I don't see how this is going to work. > Do you have an example target in mind? AM33/2.0 and H8SX come to mind, although it's been a while since I dealt with the memory bit-field operations of these two ports to have the details handy. >> > Right. By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now >> > as specified above. >> >> This doesn't make the change that went in > ? ... desirable. >> > The least conversion is removed. You can look at the MEM_REF branch >> > and see that the load from memory is done with a MEM_REF expression >> > from the register result the bits are extracted with BIT_FIELD_REF. >> >> Extracting (or replacing) bits from registers of integral type with >> BIT_FIELD_REF is not very useful. You're better off expanding the >> BIT_FIELD_REF into mask and shift operations to get better >> optimization. > You only generate lots of IL this way. You make for much better optimization, at least with current handling of BIT_FIELD_REF. OTOH, the much constrained version of BIT_FIELD_REF you propose will indeed only generate lots of IL for the most common case of BIT_FIELD_REF. >> It is for memory access that BIT_FIELD_REF makes the most sense, >> because you can't create temporaries or duplicates for memory refs as >> easily, especially when memory is volatile. > Volatile bit-field ref stores won't work on most targets anyway, > so this is a non-issue. What do you mean, won't work? You get a read followed by a write, even if for a wider unit of the object, which AFAIK is legitimate implementation-defined behavior for volatile bit-fields. > Why do you think you cannot create temporaries for memory refs? You > did the same with SRA. You can. It's just not as easy. You see the amount of IL for replacing stuff. And then, SRA does not really create temporaries for MEMs that remain as the primary copy of an object, which is the case I'm really talking about. The temporaries created by SRA mostly replace the corresponding MEMs. Besides, in SRA, for example, we'll often create a temporary that holds multiple bit-fields that didn't make sense to scalarize on their own. If I'd worked harder on it, they wouldn't even have to be contiguous, as long as they were in the same word. BIT_FIELD_REF couldn't represent this, though. >> This new definition of BIT_FIELD_REF you came up with appears to be >> redundant and useless to me. All the interesting uses are ruled out, >> and only uninteresting (and harmful) uses remain. Might as well just >> get rid of it, and later on realize it is needed for the interesting >> cases and re-introduce it with the previous semantics. > See above, it is mainly to ease combining with BIT_FIELD_EXPR. BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers. For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and it may even generate better code without as much effort. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
On 3/9/08 3:24 PM, Zdenek Dvorak wrote: however, it would make things simpler. Now, we need to distiguish three cases -- SINGLE, UNARY and BINARY; if we pretended that GIMPLE_COPY is an unary operator, this would be reduced just to UNARY and BINARY. Of course, GIMPLE_COPY would never be used in a tree expression. That's not the problem, the problem is in functions like extract_ops_from_tree. Suppose that get_gimple_rhs_class returns GIMPLE_UNARY_RHS for SSA_NAME, we would then proceed to retrieve TREE_OPERAND (SSA_NAME, 0) for operand 1, which is wrong. So, at some point we have to distinguish trees that are unary operators, which we want to flatten by extracting their operand, and trees that are to be used whole (like *_REF, SSA_NAME, *_DECL, CONST_*, etc). Particularly when optimizers use things like gimple_assign_set_rhs_from_tree. Having said that, I think it may be possible to make this distinction internally in the low-level GIMPLE manipulators without exposing this to the optimizers. I need to introduce GIMPLE_TERNARY_RHS (for ASSERT_EXPR) and GIMPLE_QUATERNARY_RHS (for COND_EXPR), so we will have at least four classes to deal with in the future. I could add GIMPLE_COPY then, unless you want to work on it now. Diego.
Re: Constrain valid arguments to BIT_FIELD_REF
On Mar 9, 2008, Eric Botcazou <[EMAIL PROTECTED]> wrote: >> I meant Ada can have non-integral members that are do not occupy an >> integral number of bytes. > Right, but the middle-end shouldn't need to touch them globally, the > front-end is supposed to break up the accesses. My point is that the front-end could just generate BIT_FIELD_REF, without having to extract integral-typed words from the struct-typed object to then be able to apply BIT_FIELD_REF on them, and then go back to put the modified value back in place. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Constrain valid arguments to BIT_FIELD_REF
On Sun, 9 Mar 2008, Alexandre Oliva wrote: > On Mar 9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: > > > On Sun, 9 Mar 2008, Alexandre Oliva wrote: > >> Do you have any plans to recover the performance loss for machines > >> that have bit-field instructions that operate directly in memory? > >> Especially for writes, I don't see how this is going to work. > > > Do you have an example target in mind? > > AM33/2.0 and H8SX come to mind, although it's been a while since I > dealt with the memory bit-field operations of these two ports to have > the details handy. Ok, I would expect it a size benefit at most. > >> > Right. By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now > >> > as specified above. > >> > >> This doesn't make the change that went in > > > ? > > ... desirable. To recap, what went in is a restriction to constant args 2 and 3 and a result TYPE_PRECISION (in case of intergal type) or MODE_PRECISION (in other cases) matching to the bit-field size in argument 2. The latter change made BIT_FIELD_REF_UNSIGNED redundant and (finally) allows for constant folding of BIT_FIELD_REF in fold_ternary. > >> > The least conversion is removed. You can look at the MEM_REF branch > >> > and see that the load from memory is done with a MEM_REF expression > >> > from the register result the bits are extracted with BIT_FIELD_REF. > >> > >> Extracting (or replacing) bits from registers of integral type with > >> BIT_FIELD_REF is not very useful. You're better off expanding the > >> BIT_FIELD_REF into mask and shift operations to get better > >> optimization. > > > You only generate lots of IL this way. > > You make for much better optimization, at least with current > handling of BIT_FIELD_REF. > > OTOH, the much constrained version of BIT_FIELD_REF you propose will > indeed only generate lots of IL for the most common case of > BIT_FIELD_REF. It generates much less IL than what the SRA lowering currently does. > >> It is for memory access that BIT_FIELD_REF makes the most sense, > >> because you can't create temporaries or duplicates for memory refs as > >> easily, especially when memory is volatile. > > > Volatile bit-field ref stores won't work on most targets anyway, > > so this is a non-issue. > > What do you mean, won't work? You get a read followed by a write, > even if for a wider unit of the object, which AFAIK is legitimate > implementation-defined behavior for volatile bit-fields. Oh, ok. But this is no different in lowered form, or maybe I don't understand your concern. (volatile is specified as doing as many accesses as written, which is where (IMHO) read + write for a bit-field store is a mismatch, thus the "doesn't work") > > Why do you think you cannot create temporaries for memory refs? You > > did the same with SRA. > > You can. It's just not as easy. You see the amount of IL for > replacing stuff. And then, SRA does not really create temporaries for > MEMs that remain as the primary copy of an object, which is the case > I'm really talking about. The temporaries created by SRA mostly > replace the corresponding MEMs. > > > Besides, in SRA, for example, we'll often create a temporary that > holds multiple bit-fields that didn't make sense to scalarize on their > own. This all works fine with the lowering MEM_REF does. With less IL than SRA creates (because of BIT_FIELD_REF). > If I'd worked harder on it, they wouldn't even have to be > contiguous, as long as they were in the same word. BIT_FIELD_REF > couldn't represent this, though. Represent yes, obviously you'd still need something like SRA to to the re-ordering / packing. > >> This new definition of BIT_FIELD_REF you came up with appears to be > >> redundant and useless to me. All the interesting uses are ruled out, > >> and only uninteresting (and harmful) uses remain. Might as well just > >> get rid of it, and later on realize it is needed for the interesting > >> cases and re-introduce it with the previous semantics. > > > See above, it is mainly to ease combining with BIT_FIELD_EXPR. > > BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers. > For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and > it may even generate better code without as much effort. BIT_FIELD_REF doesn't play well with our scalar optimizers. You can't work on registers, as partial writes break SSA form, so you need to keep a temporary in memory which is bad. BIT_FIELD_EXPR makes the new value created available, so you can keep SSA form and work on registers. So, BIT_FIELD_REF = foo_2; (bogus of course) would instead be written as reg_2 = BIT_FIELD_EXPR ; Thus, BIT_FIELD_EXPR is necessary to move bitfields from memory to registers (unless you want to expose all the shifting and masking directly). Richard.
Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS
Hi, > On 3/9/08 3:24 PM, Zdenek Dvorak wrote: > > >however, it would make things simpler. Now, we need to distiguish > >three cases -- SINGLE, UNARY and BINARY; if we pretended that > >GIMPLE_COPY is an unary operator, this would be reduced just > >to UNARY and BINARY. Of course, GIMPLE_COPY would never be used > >in a tree expression. > > That's not the problem, the problem is in functions like > extract_ops_from_tree. extract_ops_from_tree would return GIMPLE_COPY as subcode and the whole expression as op1, where's the problem? > I need to introduce GIMPLE_TERNARY_RHS (for ASSERT_EXPR) and > GIMPLE_QUATERNARY_RHS (for COND_EXPR), How are you going to represent the COND_EXPRs (i.e., where are you going to put their comparison operator)? > so we will have at least four > classes to deal with in the future. I could add GIMPLE_COPY then, > unless you want to work on it now. I think it can wait till then, Zdenek
Re: Constrain valid arguments to BIT_FIELD_REF
On Mar 9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: > On Sun, 9 Mar 2008, Alexandre Oliva wrote: >> AM33/2.0 and H8SX come to mind, although it's been a while since I >> dealt with the memory bit-field operations of these two ports to have >> the details handy. > Ok, I would expect it a size benefit at most. I wouldn't think so. Turning a single hardware-optimized instruction into a series of load, shift, mask, combine, store is unlikely to benefit just size. Especially considering that these optimized instructions were introduced in revisions of the processor. And then, I've failed to see a compelling reason to justify such a change, that would have the effect of disabling this optimization, even if it was "just" a size optimization. >> >> > Right. By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now >> >> > as specified above. >> >> >> >> This doesn't make the change that went in >> > ? >> ... desirable. > To recap, what went in is a restriction to constant args 2 and 3 and > a result TYPE_PRECISION (in case of intergal type) or MODE_PRECISION > (in other cases) matching to the bit-field size in argument 2. > The latter change made BIT_FIELD_REF_UNSIGNED redundant and (finally) > allows for constant folding of BIT_FIELD_REF in fold_ternary. Good, then you didn't put in the requirement that the first argument must be a gimple reg, and that its type must be the same as that of the bit-field. Those are the changes I object to, but you said they had gone in. I couldn't find the patch that did it, but I took your word for it. I'm glad it was just a misunderstanding. As for removing BIT_FIELD_REF_UNSIGNED, I don't have much of a problem with it. Using the unsignedness of the BIT_FIELD_REF type works AFAICT. However, requiring a perfect match between the result type and the extracted bit-width is troublesome, because we don't have language-independent machinery to generate types with arbitrary widths or even arbitrary precisions. >> You make for much better optimization, at least with current >> handling of BIT_FIELD_REF. >> OTOH, the much constrained version of BIT_FIELD_REF you propose will >> indeed only generate lots of IL for the most common case of >> BIT_FIELD_REF. > It generates much less IL than what the SRA lowering currently does. The reason SRA generates more IL is *precisely* to get better optimization. The back-ends don't handle BIT_FIELD_REFs very well. There's a lot of code all over the compiler to simplify masking and shifting that, when you use BIT_FIELD_REF, doesn't (and can't) trigger. BIT_FIELD_REF hides masks and shifts, such that the compiler can't simplify them without lowering them in the first place. And the lowering is profitable. Unless you plan on extending the infrastructure that simplifies masks and shifts such that it deals with BIT_FIELD_REFs and lowers them, at least when operand 0 is a gimple reg, I'd strongly suggest you to leave the lowering in BIT_FIELD_REFs in SRA, or perhaps even do that as part of gimplification. > This all works fine with the lowering MEM_REF does. With less IL > than SRA creates (because of BIT_FIELD_REF). As long as it doesn't pessimize code (which is what the more IL was about), I'm game. >> BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers. >> For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and >> it may even generate better code without as much effort. > BIT_FIELD_REF doesn't play well with our scalar optimizers. Exactly. That's why I suggest BIT_FIELD_REFs only for addressable objects. SRA will scalarize them if it can; if it can't, scalar optimizers won't have much to do about them anyway. > You can't work on registers, as partial writes break SSA form, so > you need to keep a temporary in memory which is bad. And then, since this is only useful for gimple registers, and you get better code if you open-code the bit-field operations for them, I stand by my claim that it's pointless, and BIT_FIELD_REFs lowered into open-coded shifting and masking during gimplification would get us better code. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Constrain valid arguments to BIT_FIELD_REF
Alexandre Oliva wrote: On Mar 9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: On Sun, 9 Mar 2008, Alexandre Oliva wrote: AM33/2.0 and H8SX come to mind, although it's been a while since I dealt with the memory bit-field operations of these two ports to have the details handy. Ok, I would expect it a size benefit at most. I wouldn't think so. Turning a single hardware-optimized instruction into a series of load, shift, mask, combine, store is unlikely to benefit just size. Especially considering that these optimized instructions were introduced in revisions of the processor. And then, I've failed to see a compelling reason to justify such a change, that would have the effect of disabling this optimization, even if it was "just" a size optimization. Anyway, size and performance are always related because of icache effects.