Re: Question on documentation about RTL PRE in gccint
Bin. Cheng wrote:
> Quoting from GCCINT, section "9.5 RTL passes":
> "When optimizing for size, GCSE is done using Morel-Renvoise Partial
> Redundancy Elimination, with the exception that it does not try to
> move invariants out of loops—that is left to the loop optimization
> pass. If MR PRE GCSE is done, code hoisting (aka unification) is also
> done, as well as load motion."
>
> While the pass gate function is as below:
> static bool
> gate_rtl_pre (void)
> {
>   return optimize > 0 && flag_gcse
>     && !cfun->calls_setjmp
>     && optimize_function_for_speed_p (cfun)
>     && dbg_cnt (pre);
> }
>
> It seems the PRE pass is disabled when not optimizing for speed.
> Doesn't this conflict with the documentation, which says
> Morel-Renvoise PRE will be used when optimizing for size.

The documentation should say "When *not* optimizing for size...".

But this piece of documentation seems to be in need of some TLC anyway:
* hoisting is not enabled (or used to be not enabled, not sure what
  it's like now) when not optimizing for size. hoisting is enabled with
  -Os, PRE is disabled.
* MR PRE was replaced with edge-based lazy code motion even before GCC 3.0
* loop code motion is now done on GIMPLE way before RTL PRE
* ... (probably half a dozen more issues) ...

If you file a PR, I'll update the documentation for the old gcse.c
passes (HOIST, PRE, CPROP).

Ciao!
Steven
Re: Question on updating DF info during code hoisting
Bin.Cheng wrote:
> It is possible to have register pressure decreased when hoisting an
> expression up in flow graph because of shrunk live range of input
> register operands.
> To accurately simulating the change of register pressure, I have to
> check the change of live range of input operands during hoisting. For
> example, to hoist "x+y" through a basic block B up in flow graph, I
> have to:
> 1. Check whether x/y is in DF_LR_OUT(B), it yes, the live range won't be
> shrunk.

in df_get_live_out(B). That uses DF_LIVE instead of DF_LR if DF_LIVE
is available. That's the more optimistic liveness definition where
something that's used but not defined is not considered live.

> 2. Check whether x/y is referred by any other insns in B, if yes, the
> live range won't be shrunk. Basically I have two methods to do this:
> a) Iterate over insns in B reversely, checking whether x/y is
> referred by the insn.
> b) Iterate over all references of x/y(using
> DF_REG_USE_CHAIN(REGNO(x/y))), checking whether the reference is made
> in B.
>
> Method A) is simple, but I guess it would be expensive and I am
> thinking about using method B).

Method B should be fine for pseudo-registers, they tend to have few
uses i.e. short chains.

> The problem is code hoisting itself create/modify/delete insns in
> basic block, I have to update DF cache info each time an expression is
> hoisted thus the info can be used to check the change of live range
> when hoisting other expressions later.
>
> Though not familiar with DF in GCC, I think I can use
> df_insn_delete/df_insn_rescan to update DF caches for newly
> modified/deleted instructions.

If you delete an insn, most of this happens automatically, unless you
use the DF_DEFER_INSN_RESCAN or DF_NO_INSN_RESCAN flags. But these are
not used in gcse.c so the DF caches are updated on-the-fly. No need to
use df_insn_delete/df_insn_rescan manually unless you change an insn
in-place without going through recog.c (validate_change and friends).

> What I am not sure are:
> 1. Basically I only need to update the reference information(i.e.,
> DF_REG_USE_CHAIN(REGNO(x/y))) for each input operand and don't need to
> update global DF_LR_IN/OUT information(because this will be done
> manually when trying to hoist an expression). Could this be done in
> current DF infrastructure?

This should already be happening. But you should update the stuff you
get back from df_get_live_{in,out}, not DF_LR_{IN,OUT}.

> 2. I did not find any DF interface to calculate reference information
> for newly created insn, so how could I do this?

Should also already be happening on-the-fly.

Ciao!
Steven
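The "method B" idea discussed above (walk the use chain of a register and see whether any use lies in block B) can be illustrated with a small stand-alone model. This is only a sketch: struct toy_use and toy_reg_used_in_bb_p are hypothetical names invented here, not GCC's df_ref or any DF interface, and the model ignores everything DF does beyond a singly linked list of uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one use of a register; this is NOT GCC's df_ref.  */
struct toy_use
{
  int bb_index;              /* basic block that contains the use */
  int insn_uid;              /* uid of the insn making the use */
  struct toy_use *next_use;  /* next use of the same register, or NULL */
};

/* Return true if the register whose use chain starts at CHAIN is used
   in block BB_INDEX by some insn other than SKIP_UID (the insn being
   hoisted).  The cost is proportional to the chain length, which is
   why short chains make this approach cheap.  */
static bool
toy_reg_used_in_bb_p (const struct toy_use *chain, int bb_index,
                      int skip_uid)
{
  assert (bb_index >= 0);
  for (const struct toy_use *u = chain; u != NULL; u = u->next_use)
    if (u->bb_index == bb_index && u->insn_uid != skip_uid)
      return true;
  return false;
}
```

In this model, a result of false for both x and y (together with neither being live out of B) is the case where hoisting "x+y" through B could shrink their live ranges.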
Re: cse_process_notes_1 issue ?
>> In the following RTL, the hardware (reg:HI r2), whose natural mode is
>> HImode, is set to 0, but when analysing the REG_EQUAL notes of the MULT
>> insn during CSE pass, the (reg:SI r2) is computed to be equivalent to 0,
>> which is wrong (the target is big endian).
>>
>> (insn 51 9 52 3 (set (reg:HI 2 r2)
>>         (const_int 0 [0])) gcc.c-torture/execute/pr27364.c:5 18 {*movhi1}
>>      (expr_list:REG_DEAD (reg:HI 31)
>>         (expr_list:REG_EQUAL (const_int 0 [0])
>>             (nil
>>
>> (insn 52 51 12 3 (set (reg:HI 3 r3 [orig:2+2 ] [2])
>>         (reg/v:HI 20 [ number_of_digits_to_use ]))
>> gcc.c-torture/execute/pr27364.c:5 18 {*movhi1}
>>      (expr_list:REG_DEAD (reg/v:HI 20 [ number_of_digits_to_use ])
>>         (nil)))
>>
>> (insn 12 52 13 3 (set (reg:SI 0 r0)
>>         (const_int 3321928 [0x32b048]))
>> gcc.c-torture/execute/pr27364.c:5 19 {movsi}
>>      (nil))
>>
>> (insn 13 12 16 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mult:SI (reg:SI 2 r2)
>>                     (reg:SI 0 r0)))
>>             (clobber (reg:SI 2 r2))
>>         ]) gcc.c-torture/execute/pr27364.c:5 54 {*mulsi3_call}
>>      (expr_list:REG_EQUAL (mult:SI (reg:SI 2 r2)
>>             (const_int 3321928 [0x32b048]))
>>         (expr_list:REG_DEAD (reg:HI 3 r3)
>>             (expr_list:REG_UNUSED (reg:SI 2 r2)
>>                 (nil)
>>
>>
>> I think a mode size check is missing when processing REG code in
>> cse_process_notes_1. Adding such a check prevents the CSE pass from
>> elimintating the MULT instruction.
>
> It looks like such a check is indeed missing in cse_process_notes_1 (and
> probably equiv_constant as well). There is one in insert_regs with a comment
> explaining the issue with hard registers.
>

OK. I will file a bug and propose a patch ASAP.

>> But then this MULT insn is simplified during the combine pass:
>>
>> Trying 12 -> 13:
>> ...
>> Successfully matched this instruction:
>> (set (reg:SI 0 r0)
>>     (const_int 0 [0]))
>> deferring deletion of insn with uid = 12.
>> deferring deletion of insn with uid = 52.
>> modifying insn i313 r0:SI=0
>> deferring rescan insn with uid = 13.
>>
>>
>> So double middle-end bug or do I miss something?
>
> Probably a similar issue. I guess the code expects to have subregs of
> pseudos here and isn't prepared for (arithmetic) operations on
> double-word hard regs.
>

I will try to track this one down too.

Thank you for your reply.

Aurélien
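The kind of check being discussed for the CSE side can be sketched in isolation. The model below is an assumption-laden illustration, not the patch that was later proposed and not GCC code: toy_mode, toy_reg_equiv and toy_equiv_constant are invented names, and the rule used here (reuse a recorded constant only when the reference has the same size as the mode the register was set in) is a deliberately conservative stand-in for whatever the real check does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy machine modes; the enumerator value is the size in bytes.  */
enum toy_mode { TOY_QI = 1, TOY_HI = 2, TOY_SI = 4 };

/* What a toy CSE remembers about one hard register.  */
struct toy_reg_equiv
{
  bool valid;              /* is a constant recorded at all? */
  enum toy_mode set_mode;  /* mode in which the register was set */
  long value;              /* the recorded constant */
};

/* Store the recorded constant in *OUT and return true only when the
   register is referenced in the same-sized mode it was set in.  A wider
   reference such as (reg:SI r2) after a set of (reg:HI r2) also covers
   bits the recorded set never wrote, so it is rejected.  */
static bool
toy_equiv_constant (const struct toy_reg_equiv *eq, enum toy_mode use_mode,
                    long *out)
{
  assert (eq != NULL && out != NULL);
  if (!eq->valid || (int) use_mode != (int) eq->set_mode)
    return false;
  *out = eq->value;
  return true;
}
```

With r2 recorded as HImode 0, a lookup in TOY_SI fails in this model, which corresponds to keeping the MULT instead of folding it to 0.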
AIX trunk build fail #3
In stage 3, libatomic's configure fails. The config.log file is here:
https://gist.github.com/3931504

I've recreated the conftest.c and ran the same command. The output is fine
and executes with a 0 status.

The clue (that I can't figure out) is cc1 is a 32 bit program but it tried to
load the 64 bit version of libstdc++. I can't figure out why it tried to do
that and I can't recreate it.

I also added the output of dump -H /usr/work/build/gcc.git/./gcc/cc1 to the
gist.

Any suggestions?

Thank you,
Perry Smith
Re: AIX trunk build fail #3
On 10/22/2012 03:49 PM, Perry Smith wrote:
> In stage 3, libatomic's configure fails. The config.log file is here:
> https://gist.github.com/3931504
>
> I've recreated the conftest.c and ran the same command. The output is fine
> and executes with a 0 status.
>
> The clue (that I can't figure out) is cc1 is a 32 bit program but it tried to
> load the 64 bit version of libstdc++.
> I can't figure out why it tried to do that and I can't recreate it.

This one is (similar to) http://gcc.gnu.org/PR52623

/haubi/
Help: ICE in variable_post_merge_new_vals()
Hi Guys,

  The RX port is not currently building in the mainline sources because
  of the following ICE:

libgcc/unwind-dw2-fde.c: In function 'add_fdes':
libgcc/unwind-dw2-fde.c:721:1: internal compiler error: in variable_post_merge_new_vals, at var-tracking.c:4303
 }
 ^
0x86f0bed variable_post_merge_new_vals
        /work/sources/gcc/current/gcc/var-tracking.c:4301
0x885c774 htab_traverse_noresize
        /work/sources/gcc/current/libiberty/hashtab.c:784
0x86f4a1c dataflow_post_merge_adjust
        /work/sources/gcc/current/gcc/var-tracking.c:4413
0x86f4a1c vt_find_locations
        /work/sources/gcc/current/gcc/var-tracking.c:6821
0x86fd34a variable_tracking_main_1
        /work/sources/gcc/current/gcc/var-tracking.c:10034
0x86fd34a variable_tracking_main()
        /work/sources/gcc/current/gcc/var-tracking.c:10080

  I am not familiar with this code, so I would be very grateful if
  someone could explain what this error really means and maybe point me
  towards a solution.

Cheers
  Nick
Re: Question on documentation about RTL PRE in gccint
On 10/22/2012 12:59 AM, Bin.Cheng wrote:
> Hi,
> Quoting from GCCINT, section "9.5 RTL passes":
> "When optimizing for size, GCSE is done using Morel-Renvoise Partial
> Redundancy Elimination, with the exception that it does not try to
> move invariants out of loops—that is left to the loop optimization
> pass. If MR PRE GCSE is done, code hoisting (aka unification) is also
> done, as well as load motion."
>
> While the pass gate function is as below:
> static bool
> gate_rtl_pre (void)
> {
>   return optimize > 0 && flag_gcse
>     && !cfun->calls_setjmp
>     && optimize_function_for_speed_p (cfun)
>     && dbg_cnt (pre);
> }
>
> It seems the PRE pass is disabled when not optimizing for speed.
> Doesn't this conflict with the documentation, which says
> Morel-Renvoise PRE will be used when optimizing for size.

I suspect both the code and the documentation need updating. We
actually use an LCM based PRE rather than MR PRE. As you note, the
gating function disables PRE completely when optimizing for size.

In the past we had a completely different implementation of gcse when
optimizing for size (classic GCSE), as PRE will tend to increase code
size to optimize expression computation. It's not immediately clear
from scanning the code what, if any, gcse we perform when optimizing
for size.

jeff
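To make the observation about the gate concrete, here is a stand-alone model of the same conjunction. It is a sketch only: struct toy_gate_inputs and toy_gate_rtl_pre are invented for illustration, the fields merely mirror the values the quoted gate_rtl_pre reads, and the dbg_cnt (pre) term is left out on the assumption that it is a debugging counter that normally does not block the pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy snapshot of the values the quoted gate reads; not GCC's types.  */
struct toy_gate_inputs
{
  int optimize;             /* the -O level */
  bool flag_gcse;           /* whether -fgcse is in effect */
  bool calls_setjmp;        /* stands in for cfun->calls_setjmp */
  bool optimize_for_speed;  /* stands in for
                               optimize_function_for_speed_p (cfun) */
};

/* Same conjunction as the quoted gate, minus dbg_cnt (pre).  */
static bool
toy_gate_rtl_pre (const struct toy_gate_inputs *in)
{
  assert (in != NULL);
  return in->optimize > 0
         && in->flag_gcse
         && !in->calls_setjmp
         && in->optimize_for_speed;
}
```

Whenever optimize_for_speed is false the model returns false regardless of the other inputs, which is the conflict with the quoted documentation text.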
Re: AIX trunk build fail #3
On Mon, Oct 22, 2012 at 9:49 AM, Perry Smith wrote:
> In stage 3, libatomic's configure fails. The config.log file is here:
> https://gist.github.com/3931504
>
> I've recreated the conftest.c and ran the same command. The output is fine
> and executes with a 0 status.
>
> The clue (that I can't figure out) is cc1 is a 32 bit program but it tried to
> load the 64 bit version of libstdc++. I can't figure out why it tried to do
> that and I can't recreate it.
>
> I also added the output of dump -H /usr/work/build/gcc.git/./gcc/cc1 to the
> gist.
>
> Any suggestions?

I do not know why you specifically are experiencing this problem. We
need to provide a version of libstdc++ for cc1 to use before the
multilib directory.

Do you have a copy of libstdc++.a installed in the same directory as
gmp, mpfr, mpc? Maybe that occurs earlier in the search path.
Unfortunately all of the libraries have the same name and AIX tries
the first library with the name it finds, regardless of 32/64 mode.

Thanks, David
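The first-match behaviour described above can be pictured with a toy search routine. This is purely illustrative and makes no claim about how the AIX loader is implemented: struct toy_lib, toy_first_by_name and the directory names used with it are hypothetical, and the only property modelled is "the first entry with the right name wins, whatever its 32/64 mode".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One candidate library in search-path order (hypothetical model).  */
struct toy_lib
{
  const char *dir;   /* directory on the search path */
  const char *name;  /* file name, e.g. "libstdc++.a" */
  int bits;          /* 32 or 64 */
};

/* Return the first entry whose name matches NAME, ignoring its mode,
   or NULL if there is none.  A 32 bit program can therefore be handed
   a 64 bit library when that one sits earlier in the path.  */
static const struct toy_lib *
toy_first_by_name (const struct toy_lib *libs, size_t n, const char *name)
{
  assert (libs != NULL && name != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (libs[i].name, name) == 0)
      return &libs[i];
  return NULL;
}
```

In such a model the fix is a matter of ordering: the directory holding the library of the right mode has to come first on the path.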
Re: cse_process_notes_1 issue ?
>>> In the following RTL, the hardware (reg:HI r2), whose natural mode is
>>> HImode, is set to 0, but when analysing the REG_EQUAL notes of the MULT
>>> insn during CSE pass, the (reg:SI r2) is computed to be equivalent to 0,
>>> which is wrong (the target is big endian).
>>>
>>> (insn 51 9 52 3 (set (reg:HI 2 r2)
>>>         (const_int 0 [0])) gcc.c-torture/execute/pr27364.c:5 18 {*movhi1}
>>>      (expr_list:REG_DEAD (reg:HI 31)
>>>         (expr_list:REG_EQUAL (const_int 0 [0])
>>>             (nil
>>>
>>> (insn 52 51 12 3 (set (reg:HI 3 r3 [orig:2+2 ] [2])
>>>         (reg/v:HI 20 [ number_of_digits_to_use ]))
>>> gcc.c-torture/execute/pr27364.c:5 18 {*movhi1}
>>>      (expr_list:REG_DEAD (reg/v:HI 20 [ number_of_digits_to_use ])
>>>         (nil)))
>>>
>>> (insn 12 52 13 3 (set (reg:SI 0 r0)
>>>         (const_int 3321928 [0x32b048]))
>>> gcc.c-torture/execute/pr27364.c:5 19 {movsi}
>>>      (nil))
>>>
>>> (insn 13 12 16 3 (parallel [
>>>             (set (reg:SI 0 r0)
>>>                 (mult:SI (reg:SI 2 r2)
>>>                     (reg:SI 0 r0)))
>>>             (clobber (reg:SI 2 r2))
>>>         ]) gcc.c-torture/execute/pr27364.c:5 54 {*mulsi3_call}
>>>      (expr_list:REG_EQUAL (mult:SI (reg:SI 2 r2)
>>>             (const_int 3321928 [0x32b048]))
>>>         (expr_list:REG_DEAD (reg:HI 3 r3)
>>>             (expr_list:REG_UNUSED (reg:SI 2 r2)
>>>                 (nil)
>>>
>>>
>>> I think a mode size check is missing when processing REG code in
>>> cse_process_notes_1. Adding such a check prevents the CSE pass from
>>> elimintating the MULT instruction.
>>
>> It looks like such a check is indeed missing in cse_process_notes_1 (and
>> probably equiv_constant as well). There is one in insert_regs with a
>> comment explaining the issue with hard registers.
>>
>
> OK. I will file a bug and propose a patch ASAP.

Bug + patch for CSE issue:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55024

>>> But then this MULT insn is simplified during the combine pass:
>>>
>>> Trying 12 -> 13:
>>> ...
>>> Successfully matched this instruction:
>>> (set (reg:SI 0 r0)
>>>     (const_int 0 [0]))
>>> deferring deletion of insn with uid = 12.
>>> deferring deletion of insn with uid = 52.
>>> modifying insn i313 r0:SI=0
>>> deferring rescan insn with uid = 13.
>>>
>>>
>>> So double middle-end bug or do I miss something?
>>
>> Probably a similar issue. I guess the code expects to have subregs of
>> pseudos here and isn't prepared for (arithmetic) operations on
>> double-word hard regs.
>>
>
> I will try to track this one down too.

Bug + patch for combine issue:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55025

Aurélien
i386 --with-abi={x32|32|64} extending multiarch ....
I'm using 4.7.2 and am working on extending our distro to support
x86/x86_64/x32/arm.

I've yanked the H.Lu patch that adds --with-abi support from trunk and am
extending it to allow a default 32-bit ABI. We have nicknamed this the LOTR
compiler [One compiler to compile them all] [for the i386 at least].

Without the support for --with-abi=32 I would not be able to cheat and ship
the x86_64 compiler as the default on i686, with the ability to cross-compile
to 64/x32 built in. It can also be run on 64-bit as a universal Intel
compiler.

I find this appealing as a "seed/bootstrap" compiler, and it will be in our
repository. If there is any interest, I'm happy to supply the patch once it
has been tested and we are happy with it.

Greg
Re: AIX trunk build fail #3
On Oct 22, 2012, at 8:55 AM, Michael Haubenwallner wrote:
> On 10/22/2012 03:49 PM, Perry Smith wrote:
>> In stage 3, libatomic's configure fails. The config.log file is here:
>> https://gist.github.com/3931504
>>
>> I've recreated the conftest.c and ran the same command. The output is fine
>> and executes with a 0 status.
>>
>> The clue (that I can't figure out) is cc1 is a 32 bit program but it tried
>> to load the 64 bit version of libstdc++.
>> I can't figure out why it tried to do that and I can't recreate it.
>
> This one is (similar to) http://gcc.gnu.org/PR52623

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52623#c4

I bet that is my problem. The first element of LD_LIBRARY_PATH is the same
path in which the loader decided to look for libstdc++.

I will find the APAR(s) that introduced this later and post it.

(I'm speculating here...) The dance that needs to happen is that it cannot be
in the real environment, or the loader will see it and gcc or other
executables will start to fail. But it needs to be used when the internal
libpath for the executable is created.

I am on 6.1 and the comment says 7.1, but usually features are dropped to
both, e.g. a 6.1 TL will come out at the same time as a 7.1 TL with a lot of
the same features.

I'm a bit under the weather right now. I will dig into this eventually but it
might take a while.

Note: my previous work was done on 6.1 TL05 SP07 so... I moved machines. That
is probably why I'm hitting this now whereas I did not hit it before.

Thank you to all
Perry
Re: AIX trunk build fail #3
On Oct 22, 2012, at 7:58 PM, Perry Smith wrote:
>
> On Oct 22, 2012, at 8:55 AM, Michael Haubenwallner wrote:
>
>> On 10/22/2012 03:49 PM, Perry Smith wrote:
>>> In stage 3, libatomic's configure fails. The config.log file is here:
>>> https://gist.github.com/3931504
>>>
>>> I've recreated the conftest.c and ran the same command. The output is fine
>>> and executes with a 0 status.
>>>
>>> The clue (that I can't figure out) is cc1 is a 32 bit program but it tried
>>> to load the 64 bit version of libstdc++.
>>> I can't figure out why it tried to do that and I can't recreate it.
>>
>> This one is (similar to) http://gcc.gnu.org/PR52623
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52623#c4
>
> I bet that is my problem. LD_LIBRARY_PATH has the first element is the same
> path that the loader decided to find libstdc++.
>
> I will find the APAR(s) that introduced this later and post it.

So far, I'm not finding where this was introduced. I need to play with it
and see if the loader actually uses it or perhaps the build process sets
LIBPATH equal to LD_LIBRARY_PATH too early in the processing.
Re: Question on documentation about RTL PRE in gccint
On Mon, Oct 22, 2012 at 6:14 PM, Steven Bosscher wrote:
> Bin. Cheng wrote:
>> Quoting from GCCINT, section "9.5 RTL passes":
>> "When optimizing for size, GCSE is done using Morel-Renvoise Partial
>> Redundancy Elimination, with the exception that it does not try to
>> move invariants out of loops—that is left to the loop optimization
>> pass. If MR PRE GCSE is done, code hoisting (aka unification) is also
>> done, as well as load motion."
>>
>> While the pass gate function is as below:
>> static bool
>> gate_rtl_pre (void)
>> {
>>   return optimize > 0 && flag_gcse
>>     && !cfun->calls_setjmp
>>     && optimize_function_for_speed_p (cfun)
>>     && dbg_cnt (pre);
>> }
>>
>> It seems the PRE pass is disabled when not optimizing for speed.
>> Doesn't this conflict with the documentation, which says
>> Morel-Renvoise PRE will be used when optimizing for size.
>
> The documentation should say "When *not* optimizing for size..." .
>
> But this piece of documentation seems to be in need of some TLC anyway:
> * hoisting is not enabled (or used to be not enabled, not sure what
> it's like now) when not optimizing for size. hoisting is enabled with
> -Os, PRE is disabled.
> * MR PRE was replaced with edge-based lazy code motion even before GCC 3.0
> * loop code motion is now done on GIMPLE way before RTL PRE
> * ... (probably half a dozen more issues) ...
>
> If you file a PR, I'll update the documentation for the old gcse.c
> passes (HOIST, PRE, CPROP).
>

Hi Steven,
I filed PR55031 for tracking this issue.

Thanks.

--
Best Regards.
Re: Question on updating DF info during code hoisting
On Mon, Oct 22, 2012 at 6:25 PM, Steven Bosscher wrote:
> Bin.Cheng wrote:
>> It is possible to have register pressure decreased when hoisting an
>> expression up in flow graph because of shrunk live range of input
>> register operands.
>> To accurately simulating the change of register pressure, I have to
>> check the change of live range of input operands during hoisting. For
>> example, to hoist "x+y" through a basic block B up in flow graph, I
>> have to:
>> 1. Check whether x/y is in DF_LR_OUT(B), it yes, the live range won't be
>> shrunk.
>
> in df_get_live_out(B). That uses DF_LIVE instead of DF_LR if DF_LIVE
> is available. That's the more optimistic liveness definition where
> something that's used but not defined is not considered live.
>
>
>> 2. Check whether x/y is referred by any other insns in B, if yes, the
>> live range won't be shrunk. Basically I have two methods to do this:
>> a) Iterate over insns in B reversely, checking whether x/y is
>> referred by the insn.
>> b) Iterate over all references of x/y(using
>> DF_REG_USE_CHAIN(REGNO(x/y))), checking whether the reference is made
>> in B.
>>
>> Method A) is simple, but I guess it would be expensive and I am
>> thinking about using method B).
>
> Method B should be fine for pseudo-registers, they tend to have few
> uses i.e. short chains.
>
>
>> The problem is code hoisting itself create/modify/delete insns in
>> basic block, I have to update DF cache info each time an expression is
>> hoisted thus the info can be used to check the change of live range
>> when hoisting other expressions later.
>>
>> Though not familiar with DF in GCC, I think I can use
>> df_insn_delete/df_insn_rescan to update DF caches for newly
>> modified/deleted instructions.
>
> If you delete an insn, most of this happens automatically, unless you
> use the DF_DEFER_INSN_RESCAN or DF_NO_INSN_RESCAN flags. But these are
> not used in gcse.c so the DF caches are updated on-the-fly. No need to
> use df_insn_delete/df_insn_rescan manually unless you change an insn
> in-place without going through recog.c (validate_change and friends).
>
>
>> What I am not sure are:
>> 1. Basically I only need to update the reference information(i.e.,
>> DF_REG_USE_CHAIN(REGNO(x/y))) for each input operand and don't need to
>> update global DF_LR_IN/OUT information(because this will be done
>> manually when trying to hoist an expression). Could this be done in
>> current DF infrastructure?
>
> This should already be happening. But you should update the stuff you
> get back from df_get_live_{in,out}, not DF_LR_{IN,OUT}.
>
>
>> 2. I did not find any DF interface to calculate reference information
>> for newly created insn, so how could I do this?
>
> Should also already be happening on-the-fly.
>

Thanks for your explanation. Now I understand how df_insn_info is
updated when deleting/modifying/creating insns.

One more question is when and how the IN/OUT information is updated. GCC
calls df_set_bb_dirty when handling insns, but I did not find any spot
in GCSE that updates that information. Is it done in CFG_CLEANUP?

I would like to study the DF infrastructure later. Could you share some
background knowledge on this, for example the theory/algorithms that GCC
refers to? Though there is a good comment at the beginning of df-core.c,
it doesn't mention any background/references.

Thanks.

--
Best Regards.
VN_INFO might be NULL in PRE
Hi,

PRE is based on the result of value numbering (run_scc_vn). At the end,
it calls free_scc_vn. But before free_scc_vn, it might call
cleanup_tree_cfg ():

  if (do_eh_cleanup || do_ab_cleanup)
    cleanup_tree_cfg ();

cleanup_tree_cfg might call make_ssa_name, which might reuse some "name"
from the FREE_SSANAMES list. If the VN_INFO of the "name" is NULL,
free_scc_vn will hit a "Segmentation fault". PR 54902
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54902) shows a real case.
The attached log shows the gdb backtrace for the creation of a new ssa
name.

Here is function VN_INFO:

vn_ssa_aux_t
VN_INFO (tree name)
{
  vn_ssa_aux_t res = VEC_index (vn_ssa_aux_t, vn_ssa_aux_table,
                                SSA_NAME_VERSION (name));
  gcc_checking_assert (res);
  return res;
}

Can we make sure "res" is not NULL for the new ssa name?

In trunk, Richard's "Make gsi_remove return whether EH cleanup is
required" patches in r186159 and r186164 make "do_eh_cleanup" false. So
cleanup_tree_cfg is not called in PRE, and no new ssa name will be
created.

Does Richard's patch fix the root cause?

Thanks!
-Zhenqiang

Attachment: bt.log
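The failure mode described above (a side table indexed by SSA version that has no entry for a name created after the table was filled) can be shown with a small stand-alone model. Everything here is hypothetical: toy_vn_info, toy_vn_table and the two helpers are invented for illustration and are not the tree-ssa-sccvn.c data structures. Tolerating empty slots, as the sketch does, is only one possible way to avoid the crash and is not necessarily what a real fix should do.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy per-name value-numbering record.  */
struct toy_vn_info { int value_id; };

/* Toy side table indexed by SSA version; a NULL slot means no info was
   recorded for that version.  */
struct toy_vn_table
{
  struct toy_vn_info **slots;
  size_t length;
};

/* Checked lookup: return NULL both for versions past the end of the
   table and for versions whose slot was never filled.  */
static struct toy_vn_info *
toy_vn_info_or_null (const struct toy_vn_table *t, size_t version)
{
  assert (t != NULL);
  if (version >= t->length)
    return NULL;
  return t->slots[version];
}

/* Release every recorded entry, skipping empty slots instead of
   dereferencing them.  Returns the number of entries released.  */
static size_t
toy_free_vn_table (struct toy_vn_table *t)
{
  size_t released = 0;
  assert (t != NULL);
  for (size_t i = 0; i < t->length; i++)
    if (t->slots[i] != NULL)
      {
        free (t->slots[i]);
        t->slots[i] = NULL;
        released++;
      }
  free (t->slots);
  t->slots = NULL;
  t->length = 0;
  return released;
}
```

A name created late corresponds to a version whose slot is empty or out of range; a cleanup routine that assumes every slot is populated would dereference NULL at exactly that point.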
Re: AIX trunk build fail #3
On 10/22/2012 06:03 PM, David Edelsohn wrote:
> On Mon, Oct 22, 2012 at 9:49 AM, Perry Smith wrote:
>> In stage 3, libatomic's configure fails. The config.log file is here:
>> https://gist.github.com/3931504
>>
>> I've recreated the conftest.c and ran the same command. The output is fine
>> and executes with a 0 status.
>>
>> The clue (that I can't figure out) is cc1 is a 32 bit program but it tried
>> to load the 64 bit version of libstdc++.
>> I can't figure out why it tried to do that and I can't recreate it.
>>
>> I also added the output of dump -H /usr/work/build/gcc.git/./gcc/cc1 to the
>> gist.
>>
>> Any suggestions?
>
> I do not know why you specifically are experiencing this problem. We
> need to provide a version of libstdc++ for cc1 to use before the
> multilib directory.

AFAICT, this does happen since the C++ switch in stage1, as libstdc++
was not needed at all during bootstrap before.

> Do you have a copy of libstdc++.a installed in the same directory as
> gmp, mpfr, mpc? Maybe that occurs earlier in the search path.
> Unfortunately all of the libraries have the same name and AIX tries
> the first library with the name it finds, regardless of 32/64 mode.

Also, I've wondered when AIX started to listen to LD_LIBRARY_PATH
at all (in addition to LIBPATH).

/haubi/