Alias-Improvements Branch ========================= The primary objective of the branch is to transition away from the use of the virtual operand use-def chains as data-flow representation for memory optimizations. Memory optimizers need to transition to the use of the alias-oracle which is a query based system. For the forseeable future they can still rely on the safety-net the virtual operand use-def chain provides.
I propose to merge the branch at the beginning of stage1 for GCC 4.5. Bugfixes put on the branch can be merged independently. Likewise it should be possible to merge the points-to changes and some of the call-clobber changes before the rest. Other changes though are difficult or impossible to separate without intermediate regressions. Overview of infrastructure changes done on the branch: - There is a single memory symbol (.MEM) per function used to build the virtual operand use-def chain. In trunk terms you can view this as equivalent to having a single memory partition containing all symbols. Due to the lack of precision computing and keeping this virtual operand use-def chain is trivial, all code supporting more complex operations has been removed. In particular it can be assumed that at most a single VUSE and VDEF is available per statement. Most code has been simplified according to that assumption. - This single-symbol virtual operand use-def chain safety-net is also available during early optimizations (in fact, always if the IL is in SSA form). This removes the need for special-casing memory statements during that compilation phase and enables to use common infrastructure during early (and IPA) optimization phases. It also allows to schedule regular memory optimization passes during early optimization. - The virtual operand use-def chain is kept up-to-date by the operand-scanner on the branch. There is no need to manually mark any symbols for renaming (instead this is now useless and costly, code auditing is still necessary here). As there cannot be magically more, less or different memory symbols required (there's only one) optimization passes usually know if they are removing a load or a store and so can act properly (or rely on the operand scanner which updates virtual SSA form manually in the simple cases). - The at most two virtual operands (a VUSE and a VDEF) are now real operands (there is a vuse and a vdef member in the gimple structure for memory statements). This means they can share the infrastructure present for regular operands and immediate uses. All infrastructure dealing with the old virtual operands has been removed. The operand structures are linked in the normal use and defs lists, the special lists for virtual uses and defs has been removed. This means that operand iterators have been simplified a lot. The possibility to iterate over only virtual operands has been removed (there is only at most a single one, directly accessible via gimple_v{use,def}_op). Most bulk changes in optimizers had to be done because of this simplification. - Global data that was not kept in sync and is either unused on the branch or very hard to keep up-to-date has been removed. This includes the bitmap of addressable variables and the escape type per variable and pointer. - Points-to information has been abstracted and is directly used for pointer information as well as call-clobber related information. All existing call-clobber related code has been removed. - Numerous fixes to the points-to analysis code have been done. - The alias-oracle present on the trunk has been extended to also use points-to alias information. - The FRE and PRE passes using the common SCCVN infrastructure have been improved by improving the memory statement walking infrastructure of the alias-oracle and its precision. - The tree DSE pass has been improved to compensate the loss of DSE done by the tree DCE pass on the trunk (but see below). The alias-oracle and its interface is contained in the tree-ssa-alias.[ch] files. Common with the trunk is the refs_may_alias_p type of query which has been accompanied by statement based clobber / use queries. Preliminary performance measurements show comparable performance for SPEC 2000, Polyhedron and our usual set of C++ benchmarks. SPEC 2006 measurements are on the way, all on x86_64. Daily testing can be monitored beyond http://gcc.opensuse.org/, look for "alias-improvements" branch annotation and compare with the other runs on the frescobaldi machine. Performance comparisons on other architectures are welcome. In general the trade-off with the new scheme is that walking statements via the virtual use-def links will be slower because possibly (a lot) more statements are visited. This is offsetted by leaner operand iterators, cheaper operand scanning and updating and less memory use for maintaining the virtual operands. And most important, a lot less code to maintain. Current shortcomings ("regressions") present on the branch, which I do not plan to work on before the merge. - Uninitialized warnings for memory which depend on the virtual operands will no longer work (but in very small testcases). FAIL: gcc.dg/uninit-B.c uninit i warning (test for warnings, line 12) FAIL: gcc.dg/uninit-pr19430.c (test for warnings, line 32) FAIL: gcc.dg/uninit-pr19430.c uninitialized (test for warnings, line 41) - A trick on the 4.4 branch shadowed the issue with the C/C++ frontend using alias set zero for char and how that relates to char members in structures (their alias sets have a zero member). FAIL: gcc.dg/tree-ssa/pr27799.c (test for excess errors) - DCE no longer removes as many dead stores as on trunk. FAIL: gcc.dg/tree-ssa/loop-36.c scan-tree-dump-not dce2 "c.array" FAIL: gcc.dg/tree-ssa/ldist-11.c scan-tree-dump-times ldist "distributed: split to 2 loops" 1 There are also some XPASSes. Likewise gazillions of copile-time and memory-hog related bugs should be fixed (and an unknown amount of new ones introduced). A bunch of optimization related PRs are fixed as well. There are a number of tasks left to do, either on the branch or (prefered) on trunk after merging the branch. - The pass-pipeline needs re-tuning. We re-compute points-to information too often and at non-optimal places (is no longer necessary for correctness to "re-compute" alias information). - Special "early" passes can get removed, merged or be made use of generally in the optimization pipeline. - The DCE / DSE situation needs to be resolved. On the trunk DCE does remove far more dead stores than DSE does. On the branch dead store removal from DCE is severely pessimized (because it still relies on the virtual operands). DSE is improved on the branch, but it runs too late and not often enough and is also very simple. The best solution is to remove the DSE pass and do all (and proper) dead store removal from within DCE, but this is quite an effort and as can be seen from benchmark numbers not a blocker for merging the branch. Comments / Performance measurements / code-reviews welcome. Thanks, Richard.