Re: [RFC] Add new flag to specify output constraint in match.pd
On Wed, 2 Sep 2020, Richard Biener via Gcc wrote:
On Mon, Aug 24, 2020 at 8:20 AM Feng Xue OS via Gcc wrote:

There is a match-folding issue derived from pr94234. A piece of code like:

  int foo (int n)
  {
    int t1 = 8 * n;
    int t2 = 8 * (n - 1);

    return t1 - t2;
  }

It can be perfectly caught by the rule "(A * C) +- (B * C) -> (A +- B) * C"
and folded to the constant "8". But this folding will fail if both t1 and t2
have multiple uses, as in the following code.

  int foo (int n)
  {
    int t1 = 8 * n;
    int t2 = 8 * (n - 1);

    use_fn (t1, t2);
    return t1 - t2;
  }

Given an expression with non-single-use operands, folding it will introduce
duplicated computation in most situations, and is deemed to be unprofitable.
But it is always beneficial if the final result is a constant or an existing
SSA value.

And the rule is:

  (simplify
   (plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
   (if ((!ANY_INTEGRAL_TYPE_P (type)
         || TYPE_OVERFLOW_WRAPS (type)
         || (INTEGRAL_TYPE_P (type)
             && tree_expr_nonzero_p (@0)
             && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION (type)))))
        /* If @1 +- @2 is constant require a hard single-use on either
           original operand (but not on both). */
        && (single_use (@3) || single_use (@4)))   <- controls whether to match or not
    (mult (plusminus @1 @2) @0)))

The current matcher only provides a way to check something before folding,
but no mechanism to affect the decision after folding. If it had one, for the
above case we could let the fold go ahead when we find the result is a
constant.

:s already has a counter-measure where it still folds if the output is at
most one operation. So this transformation has a counter-counter-measure of
checking single_use explicitly. And now we want a counter^3-measure...

Counter-measures are a key factor in matching cost. ":s" seems to be somewhat
coarse-grained, and here we do need more control over it.

But ideally, we could decouple these counter-measures from the definitions of
match rules, and let the gimple matcher make a more reasonable match-or-not
decision based on these counters. Anyway, that is another story.

Like the way we describe input operands using flags, we could also add a new
flag to specify this kind of constraint on the output, namely that we expect
it to be a simple gimple value.

The proposed syntax is

  (opcode:v{ condition } )

The char "v" stands for gimple value; if a more descriptive char exists, it
is preferred. "condition", enclosed by { }, is an optional C-syntax condition
expression. If present, only when "condition" is met will the matcher check
whether the folding result is a gimple value, using
gimple_simplified_result_is_gimple_val ().

Since there is no SSA concept in GENERIC, this is only for GIMPLE-match, not
GENERIC-match.

With this syntax, the rule is changed to

  #Form 1:
  (simplify
   (plusminus (mult:cs@3 @0 @1) (mult:cs@4 @0 @2))
   (if ((!ANY_INTEGRAL_TYPE_P (type)
         || TYPE_OVERFLOW_WRAPS (type)
         || (INTEGRAL_TYPE_P (type)
             && tree_expr_nonzero_p (@0)
             && expr_not_equal_to (@0, wi::minus_one (TYPE_PRECISION (type)))))
    (if (!single_use (@3) && !single_use (@4))
     (mult:v (plusminus @1 @2) @0)
     (mult (plusminus @1 @2) @0))))

That seems to match what you can do with '!' now (that's very recent).

It's also what :s does, but a slight bit more "local". When any operand is
marked :s and it has more than a single use, we only allow simplifications
that do not require insertion of extra stmts. So basically the above pattern
doesn't behave any differently than if you omit your :v. Only if you'd place
:v on an inner expression would there be a difference. Correlating the inner
expression we'd not want to insert new expressions for with a specific :s (or
multiple ones) would be a more natural extension of what :s provides.

Thus, for the above case (Form 1), you do not need :v at all and :s works.

Let's consider that multiplication is expensive. We have code like 5*X-3*X,
which can be simplified to 2*X. However, if both 5*X and 3*X have other uses,
that would increase the number of multiplications. :s would not block a
simplification to 2*X, which is a single stmt. So the existing transformation
has extra explicit checks for single_use. And those extra checks block the
transformation even for 5*X-4*X -> X, which does not increase the number of
multiplications. Which is where '!' (or :v here) comes in. Or we could decide
that the extra multiplication is not that bad if it saves an addition,
simplifies the expression, possibly gains more insn parallelism, etc., in
which case we could just drop the existing hard single_use check...

--
Marc Glisse
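As a concrete illustration of that trade-off (a minimal sketch, not code from
the thread; use_fn here only creates the extra uses):

  /* Illustrative only: the cost difference the discussion above is about.  */
  void use_fn (int, int);

  int f1 (int x)
  {
    int t1 = 5 * x, t2 = 3 * x;
    use_fn (t1, t2);
    return t1 - t2;   /* folding to 2*x would add a third multiplication */
  }

  int f2 (int x)
  {
    int t1 = 5 * x, t2 = 4 * x;
    use_fn (t1, t2);
    return t1 - t2;   /* folding to x adds no multiplication, yet the
                         existing hard single_use check still blocks it */
  }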
Re: [RFC] Add new flag to specify output constraint in match.pd
> [...]
>
> Thus, for the above case (Form 1), you do not need :v at all and :s works.

Between ":s" and ":v", there is a subtle difference. ":s" only ensures that
the interior transform does not insert any new stmts, but this is not true
for the final one.

Code snippet generated for (A * C) +- (B * C) -> (A +- B) * C:

  gimple_seq *lseq = seq;
  if (lseq
      && (!single_use (captures[0]) || !single_use (captures[3])))
    lseq = NULL;
  if (__builtin_expect (!dbg_cnt (match), 0))
    goto next_after_fail621;
  if (__builtin_expect (dump_file && (dump_flags & TDF_FOLDING), 0))
    fprintf (dump_file, "Applying patte
Re: [RFC] Add new flag to specify output constraint in match.pd
On Wed, Sep 2, 2020 at 9:27 AM Marc Glisse wrote:
>
> [...]
>
> Let's consider that multiplication is expensive. We have code like
> 5*X-3*X, which can be simplified to 2*X. However, if both 5*X and 3*X have
> other uses, that would increase the number of multiplications. :s would
> not block a simplification to 2*X, which is a single stmt. So
Re: [RFC] Add new flag to specify output constraint in match.pd
On Wed, Sep 2, 2020 at 9:35 AM Feng Xue OS wrote:
>
> [...]
>
> Between ":s" and ":v", there is a subtle difference. ":s" only ensures
> that the interior transform does not insert any new stmts, but this is
> not true for the final one.
>
> Code snippet generated for (A * C) +- (B * C) -> (A +- B) * C:
>
>   gimple_seq *lseq = s
Re: [RFC] Add new flag to specify output constraint in match.pd
On Wed, 2 Sep 2020, Richard Biener via Gcc wrote:

> > Or we could decide that the extra multiplication is not that bad if it
> > saves an addition, simplifies the expression, possibly gains more insn
> > parallelism, etc., in which case we could just drop the existing hard
> > single_use check...
>
> ;)
>
> it's all going to be heuristics because the decision is made locally.
> If you consider the other two uses might be 5*X + 3*X then converting
> both to 2*X and 8*X should be a win because we save one addition and
> one subtraction. But that would require folding everything tentatively
> and then applying some global costing ...

Arguably simplification on GIMPLE should fold towards what is 1) declared
"canonical form" 2) cheaper in the C abstract machine ;) and let
machine-specific passes "expand" the result as needed towards higher ILP or
whatever the concrete machine prefers :)

Alexander
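A minimal C sketch of the 5*X + 3*X scenario Richard describes (illustrative
only, not from the thread):

  /* Folding both expressions together is a net win here.  */
  int g (int x, int *sum)
  {
    int t1 = 5 * x, t2 = 3 * x;
    *sum = t1 + t2;   /* would fold to 8 * x */
    return t1 - t2;   /* would fold to 2 * x */
  }
  /* Before folding: 2 mults, 1 add, 1 sub.  After: still 2 mults.  */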
Question about exporting computing alias sets
Hello,

I am trying to find out all pointers which alias a pointer and place them in
a set.

I am using `ptr_derefs_may_alias_p` to find out if two pointers may point to
the same memory location. I think this yields conservative results (i.e.,
when it cannot be proven that two pointers do not alias, it will give the
result "true"). This is what I need.

To collect all pointers which alias pointer "p", this means that I have to
collect all pointer variables and all pointer expressions in the program and
then call `ptr_derefs_may_alias_p` O(n) times to find the set of pointers
which alias pointer "p". If I have to do this for all pointers in the
program, that would mean O(n^2).

This also implies that in order for my sets to be correct I need to collect
all pointer variables and all pointer expressions. I think that a better idea
would be to have a list of all bitmap solutions and, for every variable whose
solution bitmap has position j set, assign that variable to alias set j. In
pseudocode:

  compute_alias_sets (bitmap set)
  {
    // create array of sets of same length as varmap
    // call it alias_map
    for (i = 0; i < varmap.length (); i++)
      {
        bitmap set = get_varinfo (i)->solution;
        EXECUTE_IF_SET_IN_BITMAP (set, 0, j, bi)
          {
            varinfo_t v = get_varinfo (j);
            alias_map[j].insert (v);
          }
      }
  }

I think this is a better implementation. I don't need to worry about
collecting all pointer expressions, and I think varmap also contains
declarations. (I think IPA-PTA has already "collected" all pointer
expressions/declarations in varmap and created constraint variables for
them.) *Would there be an issue with exporting varmap and delaying its
deletion to this pass I'm working on?* This pass is intended to be run
immediately after IPA-PTA. After this, I think I would still need to do some
refining on the "alias_map" since (I think) there are more constraint
variables than gimple variables in the partition, but the information I look
for should be able to be derived from here.

Thanks!
Re: Question about exporting computing alias sets
On Wed, Sep 2, 2020 at 10:04 AM Erick Ochoa wrote:
>
> [...]
>
> I think this is a better implementation. I don't need to worry about
> collecting all pointer expressions, and I think varmap also contains
> declarations. (I think IPA-PTA has already "collected" all pointer
> expressions/declarations in varmap and created constraint variables for
> them.) *Would there be an issue with exporting varmap and delaying its
> deletion to this pass I'm working on?* This pass is intended to be run
> immediately after IPA-PTA. After this, I think I would still need to do
> some refining on the "alias_map" since (I think) there are more
> constraint variables than gimple variables in the partition, but the
> information I look for should be able to be derived from here.

I think you should add the code to compute your "alias sets" to IPA-PTA if
struct-reorg is enabled and export those like we "export" the IPA escaped
set (ipa_escaped_pt) to later passes.

I agree that you want to work on the points-to solutions rather than start
from pointers and expressions in GIMPLE.

Richard.

> Thanks!
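A rough sketch of that suggestion follows. It is assumed to live inside
gcc/tree-ssa-structalias.c (where varmap, get_varinfo and the bitmap macros
are visible); ipa_alias_map and compute_ipa_alias_sets are hypothetical
names, not existing GCC symbols, and how the result is handed to the later
pass is left open.

  /* For each constraint variable J, the set of constraint variables whose
     points-to solution contains J, i.e. the pointers that may reach J.  */
  static vec<bitmap> ipa_alias_map;

  static void
  compute_ipa_alias_sets (void)
  {
    bitmap_iterator bi;
    unsigned i, j;

    ipa_alias_map.safe_grow_cleared (varmap.length ());
    for (i = 0; i < varmap.length (); i++)
      {
        bitmap sol = get_varinfo (i)->solution;
        if (!sol)
          continue;
        EXECUTE_IF_SET_IN_BITMAP (sol, 0, j, bi)
          {
            if (!ipa_alias_map[j])
              ipa_alias_map[j] = BITMAP_ALLOC (NULL);
            /* Constraint variable I may point to J; record I in J's set.  */
            bitmap_set_bit (ipa_alias_map[j], i);
          }
      }
  }

The later pass would then map its pointer SSA names/decls back to constraint
variable ids to query these bitmaps, and the refinement Erick mentions
(constraint variables vs. GIMPLE variables) would happen there.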
Re: Future debug options: -f* or -g*?
Hi David,

On Mon, 2020-08-31 at 20:10 -0700, David Blaikie wrote:
> Hey Mark - saw a little of/bits about your presentation at LPC 2020 GNU
> Tools Track (& your thread on the gdb list about debug_names). Wondering
> if you (or anyone else you know who's contributing to debug info in GCC)
> have some thoughts on this flag naming issue. It'd be great to get some
> alignment between GCC and Clang here, so as we both add new flags going
> forward, at least they're categorically consistent for users, even if we
> aren't necessarily implementing the exact same flags/flag names (though
> in the -gsplit-dwarf case, it'd be good for any new semantics/name to
> match exactly).

To be honest I don't have any deep insights here. Sadly it is just a bit of
a mess. Not helped by the fact that some debugging formats, like DWARF, have
a number in their name (dwarf2) that might not actually reflect the version
of that format (except for it being not-dwarf1). That is why I was
suggesting to use -fdwarf(32|64) instead of -gdwarf(32|64) for 32-bit vs
64-bit offsets in dwarf(2|3|4|5), because there are already too many
-gdwarf[number] options.

The idea of having any of the -fs not imply -g and having any of the -gs
imply -g seems a good one.

But in general I think these suboptions are all a bit nonsense though. They
are great for tinkering and playing with corner cases of the debugging
format. But I don't think we are helping real developers with them. To make
any of them useful they should be in one of the standard options that people
actually use. Nobody has time (or the specialized knowledge) to really care
about all these suboptions. Once people feel they need to use any of these
suboptions we have already lost, or the user has such specialized knowledge
that they will know about specific changes in these options or between
compilers.

So just like we have -Wall or -Wextra as groups of warning options people
are likely to use, and we have -O1, -O2 and -O3 (I personally believe -flto
should just be in -O3 instead of being a separate option), we should simply
focus on making -g1, -g2 and -g3 have good defaults. Where -g1 is the
minimum needed to do some profiling/tracing/debugging without needing to map
all addresses back to source completely but can leave out full type trees
for example, -g2 is the default for being able to do a useful interactive
debugging session, and -g3 adds some (expensive) extras like .debug_macros
and maybe .debug_names, etc.

With respect to -gsplit-dwarf I think it was (in hindsight) a mistake to put
things in a separate (.dwo) file. It is a bit confusing to users and
complicates build systems immensely. I am hoping to implement
-gsplit-dwarf=single using SHF_EXCLUDED for gcc and then make that the
default.

Cheers,

Mark
Avoiding truncate/sign-extend of POImode on ppc target
PR96791 is happening because DSE is trying to truncate a POImode reg down to
DImode. The POImode is created by a structure copy that gets inline expanded
using lxvp/stxvp, which we have defined using POImode. DSE recognizes that a
following load overlaps with the stxvp and can be satisfied by a truncation
of the POImode reg used for the stxvp.

The intention of using POImode was to avoid having to fully define OImode,
and it seems like defining these truncations would be the first step along
that path. If I haven't misrepresented it, I believe that is Segher's
position on this.

The function extract_low_bits() in expmed.c does some querying of the target
hook modes_tieable_p, but not in any way that allows us to tell it not to
allow truncation of POImode to a smaller mode. As a result we run into the
ICE because the patterns are not provided to do this.

So, I put it to the community: Is there an existing way to do this? Or do we
need another target hook of some kind to check this sort of thing?

Thanks,
   Aaron

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
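For reference, a minimal sketch (not the actual PR96791 test case) of the
kind of code being described, assuming the 32-byte structure copy is
inline-expanded with lxvp/stxvp at -mcpu=power10:

  /* Illustrative only.  */
  struct s
  {
    double d[4];       /* 32 bytes, one lxvp/stxvp pair */
  };

  double
  copy_then_read (struct s *dst, const struct s *src)
  {
    *dst = *src;       /* may become lxvp/stxvp using a POImode register */
    return dst->d[0];  /* overlapping load that DSE tries to satisfy by
                          truncating that POImode register */
  }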
Re: Avoiding truncate/sign-extend of POImode on ppc target
Meant to CC a few people, oops.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain

> On Sep 2, 2020, at 9:22 AM, Aaron Sawdey via Gcc wrote:
>
> [...]
Types are confused in inlining
I'm not accusing inlining of having problems, but I really need to understand
what's going on in this situation so I can fix my optimization.

The error given is:

  main.c: In function ‘main’:
  main.c:5:1: error: non-trivial conversion in ‘ssa_name’
      5 | main(void)
        | ^
  struct type_t *
  unsigned long
  _101 = dedangled_97;
  during GIMPLE pass: fixup_cfg
  etc.
  etc.

I put a conditional breakpoint in gdb where both _101 and dedangled_97 were
created and, lo and behold, they were both set to "unsigned long". Does
anybody have a clue as to how "_101" got changed from "unsigned long" to
"struct type_t *"? Note, the latter is a meaningful type in my program; I'm
trying to replace all instances of the former as part of a structure
reorganization optimization. I should mention that this GIMPLE stmt is the
one that moves the value computed in an inlined function into the body of
code where the inlining took place.

Thanks,

Gary Oblock
Re: Future debug options: -f* or -g*?
On Wed, Sep 2, 2020 at 4:12 AM Mark Wielaard wrote:
> Hi David,
>
> On Mon, 2020-08-31 at 20:10 -0700, David Blaikie wrote:
> > Hey Mark - saw a little of/bits about your presentation at LPC 2020 GNU
> > Tools Track (& your thread on the gdb list about debug_names). Wondering
> > if you (or anyone else you know who's contributing to debug info in GCC)
> > have some thoughts on this flag naming issue. It'd be great to get some
> > alignment between GCC and Clang here, so as we both add new flags going
> > forward, at least they're categorically consistent for users, even if we
> > aren't necessarily implementing the exact same flags/flag names (though
> > in the -gsplit-dwarf case, it'd be good for any new semantics/name to
> > match exactly).
>
> To be honest I don't have any deep insights here. Sadly it is just a
> bit of a mess.

Happen to know any other interested parties (other GCC DWARF contributors who
might have an opinion on flag naming) who might want to weigh in/help us
converge here?

> Not helped by the fact that some debugging formats, like DWARF, have a
> number in their name (dwarf2) that might not actually reflect the version
> of that format (except for it being not-dwarf1). That is why I was
> suggesting to use -fdwarf(32|64) instead of -gdwarf(32|64) for 32-bit vs
> 64-bit offsets in dwarf(2|3|4|5), because there are already too many
> -gdwarf[number] options.
>
> The idea of having any of the -fs not imply -g and having any of the -gs
> imply -g seems a good one.

Cool cool.

> But in general I think these suboptions are all a bit nonsense though.
> They are great for tinkering and playing with corner cases of the
> debugging format. But I don't think we are helping real developers with
> them. To make any of them useful they should be in one of the standard
> options that people actually use. Nobody has time (or the specialized
> knowledge) to really care about all these suboptions. Once people feel
> they need to use any of these suboptions we have already lost, or the user
> has such specialized knowledge that they will know about specific changes
> in these options or between compilers.

Perhaps - I think there are different use cases. A big one in LLVM we call
-fstandalone-debug: Are you building everything with debug info, or only one
or two files/can the compiler assume other files will be built with debug
info? (this affects basically the same things as -femit-class-debug-always).
Split DWARF for users with distributed build systems. Several options can
weigh object size against linked executable size - such as using base address
selection entries in range and location lists (reducing relocations).

Obviously I'm biased here - I live in this stuff & have users/use cases that
are... not the norm.

> So just like we have -Wall or -Wextra as groups of warning options people
> are likely to use, and we have -O1, -O2 and -O3 (I personally believe
> -flto should just be in -O3 instead of being a separate option),

I imagine quite a few users build with optimization for particular CPUs -
though, again, partly my bias of working in a homogeneous/controlled
environment in data centers, rather than trying to build a binary to send to
random users in the world. But I'd guess games and things that want to eke
out performance probably want to use modern CPU features if available.

> we should simply focus on making -g1, -g2 and -g3 have good defaults.
> Where -g1 is the minimum needed to do some profiling/tracing/debugging
> without needing to map all addresses back to source completely but can
> leave out full type trees for example, -g2 is the default for being able
> to do a useful interactive debugging session, and -g3 adds some
> (expensive) extras like .debug_macros and maybe .debug_names, etc.

Maybe. I'd guess a fair few users (especially in their continuous integration
systems, etc.) might have some different tradeoffs in terms of compile/link
time and debugging/debugger startup time (indexes are a particularly potent
example of this - pay some link time for some debugger startup improvement -
but if I don't use a debugger very often, I may not want to pay that link
time cost all the time).

> With respect to -gsplit-dwarf I think it was (in hindsight) a mistake to
> put things in a separate (.dwo) file. It is a bit confusing to users and
> complicates build systems immensely. I am hoping to implement
> -gsplit-dwarf=single using SHF_EXCLUDED for gcc and then make that the
> default.

Yeah, we've got support for that model in Clang - but to be honest, one of
the major motivations for implementing Split DWARF was Google's internal use
(big distributed build system) & the split mode isn't a mistake in this
context - not having to ship all the DWARF bytes to the machine doing the
link is a significant savings (even if we ship all those bytes to another
machine to make a dwp) and smaller files to pull down when a user wants to
debug fr
git hooks update
I've updated the gcc-reposurgeon-8 test repository
(git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-8.git) to use the same
copy of the git hooks as used by binutils-gdb, glibc and other repositories
on sourceware.

All the features from local hook changes are now handled, with this new
version, through new configuration variables (in project.config on
refs/meta/config) and three new scripts referenced in project.config:
commit_checker, commit_email_formatter and update_hook in ~gccadmin/hooks-bin.

This version also has various upstream improvements. In particular, the
email-new-commits-only variable means that no commit emails should be sent
for commits added to refs under refs/heads/devel/, refs/users/ or
refs/vendors/ that were already in the repository (so with the new hooks,
merges to and rebases of such branches should not generate floods of commit
emails).

Please test whether the hooks are working as expected, with the expected
checks properly implemented, in gcc-reposurgeon-8. If they're working OK, we
can switch over in the production repository by changing the
/git/gcc.git/hooks symlink to point to
/sourceware/projects/src-home/git-hooks/hooks instead of
/home/gccadmin/git-hooks/hooks, and uncommenting the allow-non-fast-forward
settings in project.config in the production repository.

Once we've switched over, we'll be using a copy of the hooks that is shared
by other projects, rather than a GCC-specific copy, and so should avoid
making local changes there.

--
Joseph S. Myers
jos...@codesourcery.com
Re: Types are confused in inlining
On Wed, Sep 2, 2020 at 10:19 PM Gary Oblock via Gcc wrote:
>
> I'm not accusing inlining of having problems, but I really need to
> understand what's going on in this situation so I can fix my optimization.
>
> The error given is:
>
>   main.c: In function ‘main’:
>   main.c:5:1: error: non-trivial conversion in ‘ssa_name’
>       5 | main(void)
>         | ^
>   struct type_t *
>   unsigned long
>   _101 = dedangled_97;
>   during GIMPLE pass: fixup_cfg
>
> I put a conditional breakpoint in gdb where both _101 and dedangled_97
> were created and, lo and behold, they were both set to "unsigned long".
> Does anybody have a clue as to how "_101" got changed from "unsigned long"
> to "struct type_t *"? Note, the latter is a meaningful type in my program;
> I'm trying to replace all instances of the former as part of a structure
> reorganization optimization. I should mention that this GIMPLE stmt is the
> one that moves the value computed in an inlined function into the body of
> code where the inlining took place.

This is absolutely not enough information to guess at the issue ;)

I suggest you break at the return stmt of make_ssa_name_fn looking for
t->base.u.version == 101 to see where and with which type _101 is created,
and from there watch *&t->typed.type in case something adjusts the type.

> Thanks,
>
> Gary Oblock
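A possible gdb session following that suggestion (illustrative; NNN stands
for the line number of make_ssa_name_fn's return statement in
gcc/tree-ssanames.c, and cc1 must be run under gdb with the failing input):

  # Break at the return statement of make_ssa_name_fn, but only when the
  # SSA name being created has version 101.
  (gdb) break tree-ssanames.c:NNN if t->base.u.version == 101
  (gdb) run
  # Once it triggers: inspect the type _101 was created with, then watch
  # that field so gdb stops if anything adjusts it later.
  (gdb) print t->typed.type
  (gdb) watch *&t->typed.type
  (gdb) continue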