Help with detection of an invariant
On S/390 the test case gcc.dg/loop-9.c currently fails:

  void f (double *a)
  {
    int i;
    for (i = 0; i < 100; i++)
      a[i] = 18.4242;
  }

It seems to expect that moving 18.4242 to a register is moved out
of the loop, but on S/390 it isn't.  It turns out that
move_invariant_reg() is never called from move_invariants()
because the invariants vector is empty.  Now,
find_invariant_insn() checks the insn for invariants using
check_dependencies().

  (insn 29 28 30 3 (set (mem:DF (reg:DI 81 [ ivtmp.8 ]) [0 MEM[base: _15, offset: 0B]+0 S8 A64])
          (const_double:DF 1.842419990222931955941021442413330078125e+1 [0x0.9364c2f837b4ap+5])) .../loop-9.c:9 918 {*movdf_64}
       (nil))

check_dependencies() comes across reg 81 first, decides that is
not an invariant and returns false so that find_invariant_insn()
never even looks at the constant.

Actually, the constant should be moved (from the literal pool) to
a floating point register (and actually is in the assembly
output), and that move could be moved out of the loop (it's not).
Where should I look for the root cause?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
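For reference, the transformation the test appears to expect can be sketched at the source level rather than in RTL. This is only an illustration: f_hoisted and check_equivalence are hypothetical names that are not part of gcc.dg/loop-9.c, and whether the constant really ends up in a floating point register depends on the target and on the optimization level.

```c
#include <assert.h>

/* The loop from gcc.dg/loop-9.c as quoted in the message above.  */
void f (double *a)
{
  int i;
  for (i = 0; i < 100; i++)
    a[i] = 18.4242;
}

/* Hypothetical hand-hoisted variant: the constant is loaded once
   before the loop, which is what the loop-invariant motion pass is
   expected to achieve on the RTL level.  */
void f_hoisted (double *a)
{
  const double c = 18.4242;   /* loop-invariant value, set up once */
  int i;
  for (i = 0; i < 100; i++)
    a[i] = c;
}

/* Hypothetical helper: both variants must store identical values.  */
void check_equivalence (void)
{
  double x[100], y[100];
  int i;
  f (x);
  f_hoisted (y);
  for (i = 0; i < 100; i++)
    assert (x[i] == y[i]);
}
```

Both functions store the same 100 values; the only difference is where the constant is set up, which is the property the invariant motion is supposed to preserve.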
How to rewrite call targets (OpenACC bind clause)
Hi!

I have tried a few things, and got things somewhat working, but I'm not
satisfied with my results so far, so I'd like to ask for help.

OpenACC specifics are not relevant to my question, which I'm thus
formulating in a very generic way.  (But find an illustrative example at
the end of the email.)

If attached to a function declaration X (using a function attribute,
basically), the OpenACC bind clause specifies that when compiling for an
offloading target, all calls to function X should be diverted to function
Y, and the body of function X be discarded.  X remains the call target
when compiling for the host.  Y may be different per offloading target.
In the generic case, Y will be identified with an assembler name.  The
requirements mandate an implementation in the LTO front end (which is the
entry point for every offloading compiler), or later.

Is the LTO front end the right place to do this?  After
read_cgraph_and_symbols or somewhere else?

As we're not going to use it in the offloaded code (it's unreachable), my
first thought was: for all decls (X) that have a bind (Y) clause
attached, set the decl X's assembler name to Y's (using
symtab->change_decl_assembler_name -- or
gcc/varasm.c:set_user_assembler_name?).  That somewhat works, but Y will
then be compiled to X's name, and I saw problems if not only X's
declaration but also its definition were available, because we'd then get
two function definitions with X's (assembler) name, and I didn't manage
to discard only the original (unreachable) X definition while keeping its
decl alive (with assembler name Y), which is still used at all call
sites.  Maybe the wrong approach after all...

I'm able to look up cgraph_node::get_for_asmname([Y]), and I tried
experimenting with cgraph_node::create_alias and resolve_alias (in the
LTO front end) but that also hasn't been completely successful: this
worked if compiling with optimizations (Y even got inlined at the call
site of X, good!), but it didn't work with -O0.

I found the redirect_callee and redirect_call_stmt_to_callee functions of
cgraph_edge -- is that something I should be using?  (Still in the LTO
front end?)

Or, should I do this redirection after the LTO front end, in an early
pass (execute_oacc_device_lower?).  That is, for every
current_function_decl, locate all calls to all functions tagged with a
bind clause, and then rewrite the call sites to Y instead of X?

An illustrative example:

    #pragma acc routine
    int Y()
    {
      return 2;
    }

    #pragma acc routine bind(Y)
    int X()
    {
      return 1;
    }

    int main()
    {
      int ret;
    #pragma acc parallel copyout(ret)
      ret = X();
      return ret;
    }

If running with ACC_DEVICE_TYPE=host, this should return 1, and if
running with ACC_DEVICE_TYPE=not_host, it should return 2.

Grüße
 Thomas
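The behaviour expected from the example can be modelled in plain C, without OpenACC or any GCC internals. This is only a sketch of the semantics described above: resolve_call_target, run, self_check and the is_offload_target flag are hypothetical names, and a real implementation would redirect call edges inside the compiler instead of selecting a function pointer at run time.

```c
#include <assert.h>

/* Function that calls to X are diverted to on an offloading target.  */
int Y (void) { return 2; }

/* Original function; it remains the call target on the host.  */
int X (void) { return 1; }

typedef int (*call_target) (void);

/* Hypothetical model of the bind clause: on an offloading target a
   call to X resolves to Y, every other call is left untouched.  */
call_target resolve_call_target (call_target callee, int is_offload_target)
{
  if (is_offload_target && callee == X)
    return Y;
  return callee;
}

/* Models "ret = X ();" from the example for either kind of target.  */
int run (int is_offload_target)
{
  return resolve_call_target (X, is_offload_target) ();
}

/* Mirrors the two expectations stated for the example.  */
void self_check (void)
{
  assert (run (0) == 1);   /* host: X is called */
  assert (run (1) == 2);   /* offloading target: the call goes to Y */
}
```

The sketch keeps X's definition around in both cases, so it does not model discarding the body of X on the offloading target; that part of the question is specific to the symbol table handling in the compiler.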
Re: RFC: Intel386 psABI version 1.1 draft
On Tue, Nov 24, 2015 at 8:16 AM, H.J. Lu wrote:
> Hi,
>
> Here is the Intel386 psABI version 1.1 draft:
>
> https://github.com/hjl-tools/x86-psABI/wiki/intel386-psABI-20151120.pdf
>
> Main changes are
>
> 1. Add AVX-512 support.
> 2. Add linker optimization to combine GOTPLT and GOT slots.
> 3. Add R_386_GOT32X relocation and linker optimization.
> 4. Add FS/GS Base addresses to DWARF register number mapping.
> 5. Add Intel MPX support.
>
> MPX support has been checked into GCC 5.  Linker optimization has
> been added to ld in binutils 2.25 and gold in binutils 2.26.  Gold and ld in
> binutils 2.26 support the new relocations.  Ld in binutils 2.26 can optimize
> the new relocations.
>
> Any comments or feedback?
>

Here is the Intel386 psABI version 1.1:

https://github.com/hjl-tools/x86-psABI/wiki/intel386-psABI-1.1.pdf

-- 
H.J.
Re: Help with detection of an invariant
On Mon, Dec 7, 2015 at 6:44 AM, Dominik Vogt wrote:
> On S/390 the test case gcc.dg/loop-9.c currently fails:
>
>   void f (double *a)
>   {
>     int i;
>     for (i = 0; i < 100; i++)
>       a[i] = 18.4242;
>   }
>
> It seems to expect that moving 18.4242 to a register is moved out
> of the loop, but on S/390 it isn't.  It turns out that
> move_invariant_reg() is never called from move_invariants()
> because the invariants vector is empty.  Now,
> find_invariant_insn() checks the insn for invariants using
> check_dependencies().
>
>   (insn 29 28 30 3 (set (mem:DF (reg:DI 81 [ ivtmp.8 ]) [0 MEM[base: _15, offset: 0B]+0 S8 A64])
>           (const_double:DF 1.842419990222931955941021442413330078125e+1 [0x0.9364c2f837b4ap+5])) .../loop-9.c:9 918 {*movdf_64}
>        (nil))
>
> check_dependencies() comes across reg 81 first, decides that is
> not an invariant and returns false so that find_invariant_insn()
> never even looks at the constant.
>
> Actually, the constant should be moved (from the literal pool) to
> a floating point register (and actually is in the assembly
> output), and that move could be moved out of the loop (it's not).
> Where should I look for the root cause?

Hmm,
I want to say the predicates on movdf_64 are too loose, allowing the
above when they should not.  That is, movdf_64 should have pushed the
load of the fp constant into its own pseudo-register and used that to do
the storing.

Thanks,
Andrew

>
> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany
>
Re: Help with detection of an invariant
On Mon, Dec 07, 2015 at 11:48:10AM -0800, Andrew Pinski wrote:
> On Mon, Dec 7, 2015 at 6:44 AM, Dominik Vogt wrote:
> > On S/390 the test case gcc.dg/loop-9.c currently fails:
> >
> >   void f (double *a)
> >   {
> >     int i;
> >     for (i = 0; i < 100; i++)
> >       a[i] = 18.4242;
> >   }
> >
> > It seems to expect that moving 18.4242 to a register is moved out
> > of the loop, but on S/390 it isn't.  It turns out that
> > move_invariant_reg() is never called from move_invariants()
> > because the invariants vector is empty.  Now,
> > find_invariant_insn() checks the insn for invariants using
> > check_dependencies().
> >
> >   (insn 29 28 30 3 (set (mem:DF (reg:DI 81 [ ivtmp.8 ]) [0 MEM[base: _15, offset: 0B]+0 S8 A64])
> >           (const_double:DF 1.842419990222931955941021442413330078125e+1 [0x0.9364c2f837b4ap+5])) .../loop-9.c:9 918 {*movdf_64}
> >        (nil))
> >
> > check_dependencies() comes across reg 81 first, decides that is
> > not an invariant and returns false so that find_invariant_insn()
> > never even looks at the constant.
> >
> > Actually, the constant should be moved (from the literal pool) to
> > a floating point register (and actually is in the assembly
> > output), and that move could be moved out of the loop (it's not).
> > Where should I look for the root cause?
>
> Hmm,
> I want to say the predicates on movdf_64 are too loose, allowing the
> above when they should not.  That is, movdf_64 should have pushed the
> load of the fp constant into its own pseudo-register and used that to do
> the storing.

Thanks!  It turns out that, for historical reasons, S/390[x] moves such
constants to the literal pool only after the loop-invariants pass.
That is because on old CPUs it was more efficient to do a direct
memory-to-memory move with the MVC instruction (i.e. moving the
constant to a register early is harmful on old CPUs).

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany