[Patch, testsuite] Add missing -gdwarf-2 flag in debug/dwarf2 testcase
Hi,

global-used-types.c in gcc/testsuite/gcc.dg/debug/dwarf2 only specifies
-g in dg-options. For a target that is not configured to generate
dwarf-2 by default, the test fails looking for specific DWARF strings in
the generated assembly.

The patch below changes dg-options to -gdwarf-2. Can someone apply if it
is ok?

Regards
Senthil

2013-03-27  Senthil Kumar Selvaraj

	* gcc.dg/debug/dwarf2/global-used-types.c: Specify -gdwarf-2
	in dg-options.

diff --git gcc/testsuite/gcc.dg/debug/dwarf2/global-used-types.c gcc/testsuite/gcc.dg/debug/dwarf2/global-used-types.c
index 54fa58a..03c6ede 100644
--- gcc/testsuite/gcc.dg/debug/dwarf2/global-used-types.c
+++ gcc/testsuite/gcc.dg/debug/dwarf2/global-used-types.c
@@ -1,6 +1,6 @@
 /* Contributed by Dodji Seketeli
-   { dg-options "-g -dA -fno-merge-debug-strings" }
+   { dg-options "-gdwarf-2 -dA -fno-merge-debug-strings" }
    { dg-do compile }
    { dg-final { scan-assembler-times "DIE \\(0x\[^\n\]*\\) DW_TAG_enumeration_type" 1 } }
    { dg-final { scan-assembler-times "DIE \\(0x\[^\n\]*\\) DW_TAG_enumerator" 2 } }
Re: Widening multiplication limitations
On Tue, Mar 26, 2013 at 6:30 PM, Frederic Riss wrote:
> I was playing with adding support for the various modes of widening
> multiplies on my backend, and hit some restrictions in the expansion
> code that I couldn't explain to myself. These restrictions only impact
> the signed by unsigned version.
>
> The first limitation was in the detection of widening multiplies when
> one of the operands is a big constant of opposite signedness to the
> other. It might very well be that nobody cared to add support for
> that. I used the following simple patch to overcome it:
>
> @@ -2059,16 +2059,30 @@ is_widening_mult_p (gimple stmt,
>
>    if (*type1_out == NULL)
>      {
> -      if (*type2_out == NULL || !int_fits_type_p (*rhs1_out, *type2_out))
> -        return false;
> -      *type1_out = *type2_out;
> +      if (*type2_out == NULL)
> +        return false;
> +      if (!int_fits_type_p (*rhs1_out, *type2_out)) {
> +        tree other_type = signed_or_unsigned_type_for (!TYPE_UNSIGNED (*type2_out),
> +                                                       *type2_out);
> +        if (!int_fits_type_p (*rhs1_out, other_type))
> +          return false;
> +        *type1_out = other_type;
> +      } else {
> +        *type1_out = *type2_out;
> +      }
>      }
>
>    if (*type2_out == NULL)
>      {
> -      if (!int_fits_type_p (*rhs2_out, *type1_out))
> -        return false;
> -      *type2_out = *type1_out;
> +      if (!int_fits_type_p (*rhs2_out, *type1_out)) {
> +        tree other_type = signed_or_unsigned_type_for (!TYPE_UNSIGNED (*type1_out),
> +                                                       *type1_out);
> +        if (!int_fits_type_p (*rhs2_out, other_type))
> +          return false;
> +        *type2_out = other_type;
> +      } else {
> +        *type2_out = *type1_out;
> +      }
>      }
>
> Is that extension of the logic correct?

I didn't look at the details, but certainly this place is correct for
extending the range of supported widening operations.

> After having made that modification, and thus having the middle end
> generate widening multiplies of this kind, I hit the second limitation,
> in expr.c:expand_expr_real_2:
>
>       /* First, check if we have a multiplication of one signed and one
>          unsigned operand.  */
>       if (TREE_CODE (treeop1) != INTEGER_CST
>           && (TYPE_UNSIGNED (TREE_TYPE (treeop0))
>               != TYPE_UNSIGNED (TREE_TYPE (treeop1))))
>
> Here, the code trying to expand a signed by unsigned widening multiply
> explicitly checks that the operand isn't a constant. Why is that? I
> removed that condition to try to find the failing cases, but the few
> million random multiplies that I threw at it didn't fail in any
> visible way.

Not sure, the limitation does not make sense to me.  Probably the
code assumes that it would have been easy to convert the constant
to the same signedness as treeop0.  Simply removing the check
seems correct to me.

> One difficulty I found was that widening multiplies are expressed as e.g.
>
>   (mult (zero_extend (operand 1)) (zero_extend (operand 2)))
>
> and that simplify_rtx will ICE when trying to simplify a zero_extend
> of a VOIDmode const_int. It forced me to carefully add different
> patterns to handle the immediate versions of the operations. But that
> doesn't seem like a good reason to limit the code expansion...
>
> Can anyone explain this condition?

The zero_extend needs to be simplified for constant operand 2 - the
expansion code probably doesn't do this appropriately.

Richard.

> Many thanks,
> Fred
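For concreteness, a source-level sketch of the case under discussion (an
illustrative example, not taken from the thread): a widening multiply
whose constant operand has the opposite signedness of the variable
operand.

    /* Illustrative only: 3000000000 does not fit a 32-bit signed int
       but does fit its unsigned counterpart, so with the patch above
       is_widening_mult_p can still classify this as a widening
       multiply of one signed and one unsigned operand.  */
    long long
    widen_mixed (int a)
    {
      return (long long) a * 3000000000u;
    }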
Re: expmed.c cost calculation limited to word size
On Tue, Mar 26, 2013 at 6:55 PM, Frederic Riss wrote:
> While working on having divisions by constants optimized by my GCC
> target, I realized that whatever *muldi3_highpart my backend provides,
> it would never be used, because of the bounds checks that expmed.c does
> on the cost arrays. For example:
>
>        choose_multiplier (abs_d, size, size - 1,
>                           &mlr, &post_shift, &lgup);
>        ml = (unsigned HOST_WIDE_INT) INTVAL (mlr);
>        if (ml < (unsigned HOST_WIDE_INT) 1 << (size - 1))
>          {
>            rtx t1, t2, t3;
>
> =>         if (post_shift >= BITS_PER_WORD
> =>             || size - 1 >= BITS_PER_WORD)
>              goto fail1;
>
>            extra_cost = (shift_cost[speed][compute_mode][post_shift]
>                          + shift_cost[speed][compute_mode][size - 1]
>                          + add_cost[speed][compute_mode]);
>
> According to the commit log where these checks were added, they only
> serve to avoid overflowing the cost arrays below. Even though a backend
> is fully capable of DImode shifts and multiplies, they won't be
> considered because of this check. The cost arrays are filled up to
> MAX_BITS_PER_WORD, so as a temporary workaround I have defined
> MAX_BITS_PER_WORD to 64, and I have softened the checks to fail only
> above MAX_BITS_PER_WORD. This allows my 32-bit backend to specify that
> it wants these optimizations to take place for 64-bit arithmetic.
>
> What do people think about this approach? Does it make sense?

Another approach would be to simply use the cost of a BITS_PER_WORD
shift for bigger shifts.  Adjusting MAX_BITS_PER_WORD sounds like a hack
to me.

Note that on trunk I see the cost arrays are now inline functions, so things
may have changed for the better already.

Richard.

> Many thanks,
> Fred
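For illustration, a minimal sketch of the fallback Richard suggests. The
helper name is hypothetical and the array shapes mirror the expmed.c
code quoted above; this is not the actual patch:

    /* Instead of "goto fail1" when a shift amount exceeds the range
       covered by the cost tables, charge the cost of the widest
       tabulated shift - a word-sized shift is a reasonable proxy for
       anything larger.  */
    static int
    clamped_shift_cost (int speed, int mode, int amount)
    {
      int idx = amount < BITS_PER_WORD ? amount : BITS_PER_WORD - 1;
      return shift_cost[speed][mode][idx];
    }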
Re: Problem in understanding points-to analysis
On Wed, Mar 27, 2013 at 3:13 AM, Nikhil Patil wrote:
> Hello everyone,
>
> I am trying to understand the points-to analysis ("pta") ipa pass, but
> I am not able to match the information generated by the pass with that
> in the structure "SSA_NAME_PTR_INFO".
>
> For the code segment
>
>   int var1, var2, var3, var4, *ptr1, *ptr2, **ptr3;
>
>   if (var1 == 10) {
>     ptr1 = &var1;
>     ptr2 = &var2;
>   }
>   else {
>     ptr1 = &var3;
>     ptr2 = &var4;
>   }
>
>   if (var2 == 3) {
>     ptr3 = &ptr1;
>   }
>   else {
>     ptr3 = &ptr2;
>   }
>
>   printf("\n %d %d \n", *ptr1, **ptr3);
>
> the points-to information in the dump_file of the "pta" pass is:
>
>   ptr1.2_6 = { var1 var3 }
>   ptr1 = { var1 var3 } same as ptr1.2_6
>
> But accessing the structure "SSA_NAME_PTR_INFO" (using the API
> dump_points_to_info_for(..)) in a pass AFTER "pta" shows
>
>   ptr1.2_6, points-to vars: { var1 var3 }
>   ptr1, points-to anything
>
> Why is 'ptr1' here not pointing to '{ var1 var3 }', as found by "pta"?
>
> Can someone please help me understand this behaviour?

Without a compilable testcase I can't explain, but obviously 'ptr1' is
not an SSA name, and points-to information is only preserved for SSA
names, not for any other variables that are part of the solving process.

Richard.

> --
> Thanks,
> Nikhil Patil.
Re: expmed.c cost calculation limited to word size
On 27 March 2013 10:10, Richard Biener wrote:
> On Tue, Mar 26, 2013 at 6:55 PM, Frederic Riss wrote:
>> The cost arrays are filled up to
>> MAX_BITS_PER_WORD, so as a temporary workaround I have defined
>> MAX_BITS_PER_WORD to 64, and I have softened the checks to fail only
>> above MAX_BITS_PER_WORD. This allows my 32-bit backend to specify that
>> it wants these optimizations to take place for 64-bit arithmetic.
>>
>> What do people think about this approach? Does it make sense?
>
> Another approach would be to simply use the cost of a BITS_PER_WORD
> shift for bigger shifts.  Adjusting MAX_BITS_PER_WORD sounds like a hack
> to me.

That's what I had done at first. It also works for me, but I thought it
was a bit too artificial compared to using a computed value (even though
I agree that a 33-bit DImode shift is more than likely to have the same
cost as a 30-bit one).

> Note that on trunk I see the cost arrays are now inline functions, so things
> may have changed for the better already.

I'll have a look at the new code.

Thanks for the comments!
Fred
Re: Widening multiplication limitations
On 27 March 2013 10:05, Richard Biener wrote:
> On Tue, Mar 26, 2013 at 6:30 PM, Frederic Riss wrote:
>> Here, the code trying to expand a signed by unsigned widening multiply
>> explicitly checks that the operand isn't a constant. Why is that? I
>> removed that condition to try to find the failing cases, but the few
>> million random multiplies that I threw at it didn't fail in any
>> visible way.
>
> Not sure, the limitation does not make sense to me.  Probably the
> code assumes that it would have been easy to convert the constant
> to the same signedness as treeop0.  Simply removing the check
> seems correct to me.

Thanks for the confirmation. I'll try to see if these modifications pass
regstrap on a primary target, and then maybe submit a patch.

Fred
Re: cond_exec no-ops in RTL optimisations
> While working with some splitters I noticed that the RTL optimisation
> passes do not optimise away a no-op wrapped in a cond_exec.
>
> So, for example, if my splitter generates something like:
>
>   (cond_exec (lt:SI (reg:CC CC_REGNUM) (const_int 0))
>              (set (match_dup 1)
>                   (match_dup 2)))
>
> and operands 1 and 2 are the same register (say r0), this persists
> through all the optimisation passes and results on ARM in a redundant
>
>   movlt r0, r0
>
> I noticed that if I generate an unconditional SET it gets optimised
> away in the cases when it's a no-op.
>
> I can work around this by introducing a peephole2 that lifts the SET
> out of the cond_exec, like so:
>
>   (define_peephole2
>     [(cond_exec (match_operator 0 "comparison_operator"
>                   [(reg:CC CC_REGNUM) (const_int 0)])
>                 (set (match_operand:SI 1 "register_operand" "")
>                      (match_dup 1)))]
>     ""
>     [(set (match_dup 1) (match_dup 1))])
>
> and the optimisers will catch it and remove it, but this seems like a
> hack. What if it were a redundant ADD (with 0) or an AND (and r0, r0,
> r0)? It doesn't seem right to add peepholes for each of those cases.
>
> Is that something the RTL optimisers should be able to remove?

Maybe, but it is hardly doable to recognize every RTL variant of a no-op,
so I'd suggest fixing the pass that generates it instead.

--
Eric Botcazou
_Alignas attribute and HOST_BITS_PER_INT
Hi,

I was looking at why gcc.dg/c1x-align-3.c (test for errors, line 15) is
failing for the AVR target, and I found that the test expects _Alignas
with -__INT_MAX__ - 1 to fail with a "too large" error.

I looked at the code responsible for generating the error (c-common.c,
check_user_alignment) and found that it checks whether the number of
bits in the user-provided alignment is more than HOST_BITS_PER_INT -
BITS_PER_UNIT_LOG. For AVR, the integer size is 16 bits, and therefore
__INT_MAX__ is 2^15 - 1. HOST_BITS_PER_INT, however, is 32 bits, and
hence the error is not triggered.

Is it right to check against HOST_BITS_PER_INT, when the alignment
attribute only applies to the target? If the check is indeed correct,
should the test be modified to handle targets whose __INT_MAX__ is less
than 2^HOST_BITS_PER_INT - 1?

Regards
Senthil
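For reference, a sketch of the construct in question (the exact
testsuite line may differ; this is an assumption for illustration):

    /* On AVR, -__INT_MAX__ - 1 is -32768, whose magnitude fits easily
       in a 32-bit host int, so the HOST_BITS_PER_INT based "too large"
       check never fires.  */
    _Alignas (-__INT_MAX__ - 1) char x;  /* expected: "too large" or
                                            "not a power of 2" error */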
Identifying Compiler Options to Minimise Energy Consumption for Embedded Platforms
Hi,

I posted a while ago about a project between Embecosm and the University
of Bristol. In this project we explored the effect that a wide range of
optimisations had on the execution time and energy consumption of a
variety of benchmarks. The project has been written up, and the results
and conclusions are due to be published soon (a preprint is available on
arXiv: http://arxiv.org/abs/1303.6485).

A few of the conclusions (see the paper for more detail):

 - Execution time and energy consumption are correlated.
 - The amount of correlation differs depending on the complexity of
   the pipeline.
 - The structure of the benchmark has a large impact on which
   optimisations have an effect.
 - The architecture of the processor has some impact on effective
   optimisations.

I plan to present this work at the GCC Cauldron. Embecosm will also be
talking about future research that follows on from this.

Thanks,
James

--

http://arxiv.org/abs/1303.6485
"Identifying Compiler Options to Minimise Energy Consumption for
Embedded Platforms"

ABSTRACT

This paper presents an innovative technique to explore the effect on
energy consumption of an extensive number of the optimisations a
compiler can perform. We evaluate a set of ten carefully selected
benchmarks for five different embedded platforms. A fractional factorial
design is used to systematically explore the large optimisation space
(2^82 possible combinations), whilst still accurately determining the
effects of optimisations and optimisation combinations. Hardware power
measurements on each platform are taken to ensure all architectural
effects on the energy consumption are captured. In the majority of
cases, execution time and energy consumption are highly correlated.
However, predicting the effect a particular optimisation may have is
non-trivial due to its interactions with other optimisations. This
validates long-standing community beliefs, but for the first time
provides concrete evidence of the effect and its magnitude. A further
conclusion of this study is that the structure of the benchmark has a
larger effect than the hardware architecture on whether an optimisation
will be effective, and that no single optimisation is universally
beneficial for execution time or energy consumption.
Re: _Alignas attribute and HOST_BITS_PER_INT
On Wed, 27 Mar 2013, Senthil Kumar Selvaraj wrote:

> Hi,
>
> I was looking at why gcc.dg/c1x-align-3.c (test for errors, line 15) is
> failing for the AVR target, and I found that the test expects _Alignas
> with -__INT_MAX__ - 1 to fail with a "too large" error.

It expects either an error about being too large, or an error about not
being a power of 2.

> Is it right to check against HOST_BITS_PER_INT, when the alignment

A check against HOST_BITS_PER_INT would be because of code inside GCC
that uses the host "int" to store alignment values. Ideally there
wouldn't be such code - ideally any alignment up to and including the
size of the whole target address space could be used. (For example,
alignments could always be represented internally as a base-2 log.) But
given the existence of such code, such a check is needed.

However, a size that is not a power of 2 (such as this one, minus a
power of 2) should still be detected and get an error that this test
accepts, whether or not that size is also too large for the host int.
So look at why you don't get the "requested alignment is not a power
of 2" error for this code with a negative alignment.

--
Joseph S. Myers
jos...@codesourcery.com
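For illustration, the power-of-2 test Joseph refers to is typically
written as below - a sketch, not the actual check_user_alignment code.
A negative request such as -__INT_MAX__ - 1 fails it regardless of the
host int limit:

    /* Power-of-2 check: positive, with exactly one bit set.  A
       negative alignment fails the first test outright.  */
    static int
    alignment_is_power_of_2 (int align)
    {
      return align > 0 && (align & (align - 1)) == 0;
    }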
Re: [Patch, testsuite] Add missing -gdwarf-2 flag in debug/dwarf2 testcase
On Mar 27, 2013, at 1:02 AM, Senthil Kumar Selvaraj wrote:

> global-used-types.c in gcc/testsuite/gcc.dg/debug/dwarf2 only specifies
> -g in dg-options. For a target that is not configured to generate
> dwarf-2 by default, the test fails looking for specific DWARF strings in
> the generated assembly.
>
> The patch below changes dg-options to -gdwarf-2. Can someone
> apply if it is ok?

Ok.

[ that clears the way for application. ]
Re: Debugging C++ Function Calls
> "Lawrence" == Lawrence Crowl writes: Lawrence> Are the symbol searches specific to the scope context, or does it Lawrence> search all globally defined symbols? I am not totally certain in this case, but in gdb many searches are global, so that "print something" works even if "something" is not locally visible. Lawrence> There is a weakness in the patch, int that the following is legal. [...] Thanks. Tom
gcc build on FC18 and automake 1.11
The latest Fedora Core 18 comes with automake 1.12.1 and perl 5.16.2. I
installed and tried to use automake 1.11.1 for one of the GCC libraries,
but got a warning from aclocal:

  main::scan_file() called too early to check prototype at
  /usr/local/bin/aclocal line 617.

The latest 1.11.6 has the same issue, but automake 1.12.1 fixed that
warning by having subroutine prototypes early in the code.

What would be the right way to go about this? Just put up with the
warning?

Nenad
Typo in GCC 4.8 release page
This page: http://gcc.gnu.org/gcc-4.8/, under "release history", says
GCC 4.8 was released on March 22, 2012. This should be 2013, not 2012.
Re: gcc build on FC18 and automake 1.11
On Wed, Mar 27, 2013 at 2:04 PM, Nenad Vukicevic wrote:
> The latest Fedora Core 18 comes with automake 1.12.1 and perl 5.16.2. I
> installed and tried to use automake 1.11.1 for one of the GCC libraries,
> but got a warning from aclocal:
>
>   main::scan_file() called too early to check prototype at
>   /usr/local/bin/aclocal line 617.
>
> The latest 1.11.6 has the same issue, but automake 1.12.1 fixed that
> warning by having subroutine prototypes early in the code.
>
> What would be the right way to go about this? Just put up with the
> warning?

You could install autoconf 2.64, which is the version used to build the
configure files in the GCC tree. Or you could move the GCC tree forward
to newer versions of these tools.

Ian