Re: Adding official support into the main tree for SPARC Leon
Eric Botcazou wrote:
> So I'd suggest that Luís Vitório and/or Konrad do the required paperwork, and
> then start to post their patches on the gcc-patches@ list. I'll sponsor them
> for write access at that point.
>

Hello Eric Botcazou,

I want to ask once again for write access so that I can submit patches for
the sparc-leon architecture. The first patch is for the 'gcc' repository
while the second patch is for the 'binutils' repository. They are related,
so I think it makes sense to send them together. I don't have write access
to binutils either, so I thought you might be able to apply both.

Some background: Leon supports the umac/smac instructions. The Leon3-Ft and
Leon4 also support the SMP compare-and-swap (casa) v9 instruction.

The two appended patches do the following:

1. 0001-sparc-leon-Use-Aleon-assembler-switch-for-mcpu-leon-.patch
   Append "-Aleon" to the assembler
2. 0001-sparc-leon-add-leon-architecture-to-GAS.patch
   Define new "leon" processor type in GAS + enable for "leon" umac/smac
   and "casa".

It would help a lot if you could apply this for us once more.

-- Greetings Konrad

>From 2d799b053e78383a3029845aee858487225004e3 Mon Sep 17 00:00:00 2001
From: Konrad Eisele
Date: Fri, 21 Oct 2011 14:30:58 +0200
Subject: [PATCH 1/1] sparc leon: Use -Aleon assembler switch for -mcpu=leon arch

Use -Aleon to enable binutils sparc-leon architecture. The leon-arch binutils
GAS has umul/smul and casa enabled.
---
 gcc/config/sparc/sparc.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index e0db816..94e5887 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -314,7 +314,7 @@ extern enum cmodel sparc_cmodel;
 #if TARGET_CPU_DEFAULT == TARGET_CPU_leon
 #define CPP_CPU32_DEFAULT_SPEC "-D__leon__ -D__sparc_v8__"
-#define ASM_CPU32_DEFAULT_SPEC ""
+#define ASM_CPU32_DEFAULT_SPEC "-Aleon"
 #endif
 #endif
@@ -403,7 +403,7 @@ extern enum cmodel sparc_cmodel;
 /* Override in target specific files. */
 #define ASM_CPU_SPEC "\
-%{mcpu=sparclet:-Asparclet} %{mcpu=tsc701:-Asparclet} \
+%{mcpu=sparclet:-Asparclet} %{mcpu=leon:-Aleon} %{mcpu=tsc701:-Asparclet} \
 %{mcpu=sparclite:-Asparclite} \
 %{mcpu=sparclite86x:-Asparclite} \
 %{mcpu=f930:-Asparclite} %{mcpu=f934:-Asparclite} \
--
1.6.4.1

>From a0a9021cf3280ebf4df79cb6692366c55e507d25 Mon Sep 17 00:00:00 2001
From: Konrad Eisele
Date: Fri, 21 Oct 2011 13:32:42 +0200
Subject: [PATCH 1/1] sparc leon: add leon architecture to GAS

Add -Aleon architecture selection to GAS. -Aleon supports umul/smul and
[casa,casl].
---
 gas/config/tc-sparc.c |3 ++-
 gas/configure.tgt |6 +-
 include/opcode/sparc.h |1 +
 opcodes/sparc-opc.c| 16 +---
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/gas/config/tc-sparc.c b/gas/config/tc-sparc.c
index 77fda56..47f4386 100644
--- a/gas/config/tc-sparc.c
+++ b/gas/config/tc-sparc.c
@@ -221,7 +221,7 @@ static void output_insn (const struct sparc_opcode *, struct sparc_it *);
 for this use. That table is for opcodes only. This table is for opcodes
 and file formats. */
-enum sparc_arch_types {v6, v7, v8, sparclet, sparclite, sparc86x, v8plus,
+enum sparc_arch_types {v6, v7, v8, leon, sparclet, sparclite, sparc86x, v8plus,
 v8plusa, v9, v9a, v9b, v9_64};
 static struct sparc_arch {
@@ -246,6 +246,7 @@ static struct sparc_arch {
 { "sparcima", "v9b", v9, 0, 1, F_MUL32|F_DIV32|F_FSMULD|F_POPC|F_VIS|F_VIS2|F_FMAF|F_IMA },
 { "sparcvis3", "v9b", v9, 0, 1, F_MUL32|F_DIV32|F_FSMULD|F_POPC|F_VIS|F_VIS2|F_FMAF|F_VIS3|F_HPC },
 { "sparcvis3r", "v9b", v9, 0, 1, F_MUL32|F_DIV32|F_FSMULD|F_POPC|F_VIS|F_VIS2|F_FMAF|F_VIS3|F_HPC|F_RANDOM|F_TRANS|F_FJFMAU },
+ { "leon", "leon", leon, 32, 1, F_MUL32|F_DIV32|F_FSMULD },
 { "sparclet", "sparclet", sparclet, 32, 1, F_MUL32|F_DIV32|F_FSMULD },
 { "sparclite", "sparclite", sparclite, 32, 1, F_MUL32|F_DIV32|F_FSMULD },
 { "sparc86x", "sparclite", sparc86x, 32, 1, F_MUL32|F_DIV32|F_FSMULD },
diff --git a/gas/configure.tgt b/gas/configure.tgt
index a171a32..7b1f7e8 100644
--- a/gas/configure.tgt
+++ b/gas/configure.tgt
@@ -79,7 +79,11 @@ case ${cpu} in
 sparc86x*) cpu_type=sparc arch=sparc86x ;;
 sparclet*) cpu_type=sparc arch=sparclet ;;
 sparclite*) cpu_type=sparc arch=sparclite ;;
- sparc*) cpu_type=sparc arch=sparclite ;; # ??? See tc-sparc.c.
+ sparc*)
+case ${vendor} in
+leon*) cpu_type=sparc arch=leon ;;
+*) cpu_type=sparc arch=sparclite ;; # ??? See tc-sparc.c.
+esac ;;
 v850*) cpu_type=v850 ;;
 x86_64*) cpu_type=i386 arch=x86_64;;
 xtensa*) cpu_type=xtensa arch=xtensa ;;
diff --git a/include/opcode/sparc.h b/include/opcode/sparc.h
index 7ae3641..2283a93 100644
--- a/include/opcode/sparc.h
+++ b/include/opcode/sparc.h
@@ -42,6 +42,7 @@ enum sparc_opcode_arch_val
 SPARC_OPCODE_ARCH_V6 = 0,
 SPARC_OPCODE_ARCH_V7,
 SPARC_OPCODE_AR
Re: Expanding instructions with condition codes inter-deps
On 21/10/11 22:41, Richard Henderson wrote:
> On 10/21/2011 10:15 AM, Paulo J. Matos wrote:
>> So I have implemented the nadd and addc as:
>>
>> (define_insn "negqi2"
>>   [(set (match_operand:QI 0 "register_operand" "=c")
>>         (neg:QI (match_operand:QI 1 "register_operand" "0")))
>>    (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0)))
>>    (clobber (reg:CC RCC))]
>>   ""
>> {
>>   operands[2] = const0_rtx;
>>   return "nadd\\t%0,%2";
>> })
>
> There are lots of parts of the compiler that don't optimize well when an
> insn has more than one output. For the normal insn, just clobber the
> flags; don't include a second SET.

But this case is not a normal insn per se. I did this to negqi2 because I
need GCC to know that this instruction explicitly changes RCC and that the
following instruction will use the carry flag (addc). The reason I say it
is not a normal insn is that it often comes in a pair negqi2 /
addc_internal, like for example addqi3 / addc_internal or subqi3 /
subc_internal.

>> (define_insn "addc_internal"
>>   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>>         (plus:QI
>>           (plus:QI (ltu:QI (reg:CC RCC) (const_int 0))
>>                    (match_operand:QI 1 "nonimmediate_operand" "%0"))
>>           (match_operand:QI 2 "general_operand" "cwmi")))
>>    (use (reg:CC_C RCC))
>>    (clobber (reg:CC RCC))]
>>   ""
>>   "addc\\t%0,%f2")
>
> You don't need the USE, because you mention RCC inside the LTU.
>
>> (define_insn "*addc_internal_flags"
>
> Likewise.

Got it, thanks.

>> A couple of things to note:
>> * negqi (which generates the nadd x, y equivalent to -x + y) has a set
>>   RCC in C mode followed by a clobber. The set in C mode doesn't show up
>>   in the _flags variant which is used only for the compare-elim since it
>>   doesn't really matter and it already contains a set RCC anyway.
>
> Surely the NADD insn is simply a normal subtract (with reversed
> operands). You shouldn't *need* to implement NEG at all, as the
> middle-end will let NEG expand via MINUS. Just so you know...

But it is not exactly the same thing in this arch because:
subqi3 generates a sub , == = -

To represent negqi2 of register R with a nadd I just do:
nadd R,#0

To represent it using a sub I require more moves:
ld R1, #0
sub R1, @R ; @R is memory mapped R
ld R, @R1

>> * is this enough for GCC to understand that anything that clobbers RCC
>>   or specifically touches the RCC in C mode shouldn't go in between
>>   these two instructions?
>
> Yes.
>
>> Also, do I need to specify in the RCC clobber, exactly which flags are
>> clobbered, or should I use a set instead?
>
> No, the compiler will assume the entire register is changed, no matter
> what CCmode you place there.

Got it, so the only way to deal with the carry flag by itself would be to
represent the carry flag as a separate flags register. Although that would
require more than one flags register and it feels messy.

-- PMatos
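As a rough illustration of the arithmetic discussed in this message, the following C sketch models "nadd x, y" as -x + y on 8-bit values, so that negation becomes "nadd R,#0". The function names are hypothetical, and the carry model (carry set exactly when the negated operand is zero, mirroring the (eq (match_dup 1) (const_int 0)) in the negqi2 pattern) is an assumption for illustration, not the target's documented behaviour.

```c
#include <stdint.h>

/* Hypothetical model of "nadd x, y": computes -x + y, truncated to 8 bits. */
uint8_t nadd8(uint8_t x, uint8_t y)
{
    return (uint8_t)(y - x);
}

/* Negation expressed as "nadd R,#0", as described in the message. */
uint8_t neg8(uint8_t r)
{
    return nadd8(r, 0);
}

/* Assumed carry flag after the negation: set only when the operand was 0,
   which is what (eq (match_dup 1) (const_int 0)) expresses in negqi2. */
int neg8_carry(uint8_t r)
{
    return r == 0;
}
```

For example, neg8(5) wraps around to 251, the 8-bit two's complement of 5.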
Re: Expanding instructions with condition codes inter-deps
On 23/10/11 22:21, Richard Henderson wrote:
> On 10/21/2011 05:49 PM, paul_kon...@dell.com wrote:
>>> There are lots of parts of the compiler that don't optimize well when
>>> an insn has more than one output. For the normal insn, just clobber
>>> the flags; don't include a second SET.
>>
>> Yes, but... isn't the whole point of CC modeling that you can take
>> advantage of the CC left around by an instruction? Typically in
>> machines with condition codes, you can eliminate test instructions
>> (compare with zero) if the previous instruction has that variable as
>> its output. But if we're discouraged from writing insns with CC outputs
>> as normal practice, and if the compiler doesn't handle such constructs
>> well in optimization, what then?
>
> The solution is to have *two* insn patterns, one with a set of the flags
> and one with only a clobber.
>
> Have a look through i386.md and how the flags register is handled there.

In version 4.6.1, i386.md, I see things like:

(define_insn "addqi3_cc"
  [(set (reg:CC FLAGS_REG)
        (unspec:CC [(match_operand:QI 1 "nonimmediate_operand" "%0,0")
                    (match_operand:QI 2 "general_operand" "qn,qm")]
                   UNSPEC_ADD_CARRY))
   (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
        (plus:QI (match_dup 1) (match_dup 2)))]
  "ix86_binary_operator_ok (PLUS, QImode, operands)"
  "add{b}\t{%2, %0|%0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "QI")])

This seems to be exactly what we are doing. I can't see where there are
separate rules for the flags register.

-- PMatos
issue of store to stack after an instruction
Hello,

I would like to implement a compiler fix for a SPARC-cpu variant that does
the following: after each "fdivs" (SPARC single-float division), save the
destination FPU register to a stack memory location.

The sparc.md definition of fdivs is this one:

(define_insn "divsf3"
  [(set (match_operand:SF 0 "register_operand" "=f")
        (div:SF (match_operand:SF 1 "register_operand" "f")
                (match_operand:SF 2 "register_operand" "f")))]
  "TARGET_FPU"
  "fdivs\t%1, %2, %0"
  [(set_attr "type" "fpdivs")])

What is the best way to make sure that a "st %0, []" is issued directly
after the "fdivs\t%1, %2, %0", where the store address is a stack location
that is allocated for each "divsf3" insn? Is there something similar for
other architectures?

It seems I cannot use the sparc_reorg pass hook because the stack frame
might be > 4096 and a new register would have to be allocated beforehand
for the store address... Is there some hook before the reload pass that I
could use?

-- Thanks Konrad
Re: [4.6.1] ICE in size_binop_loc, at fold-const.c:1433
Indeed, ptr_mode!=Pmode for my target. I will try to figure out where such
a Pmode comes from.

Thanks,
Aurélien

2011/10/23 Richard Guenther:
> On Fri, Oct 21, 2011 at 4:53 PM, Aurelien Buhrig wrote:
>> Hi,
>>
>> I'm trying to port gcc 4.6.1 for a new target for which Pmode=PSI.
>> I have an ICE in size_binop_loc, at fold-const.c:1433 when compiling
>> gcc.c-torture/compile/92-1.c
>>
>> Here is the back trace
>> #1 0x0060f8f3 in size_binop_loc (loc=0, code=PLUS_EXPR,
>> arg0=0x2e8d8150, arg1=0x2e874dc0)
>> at /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/fold-const.c:1432
>> #2 0x007ddefd in dr_analyze_innermost (dr=0xf5b0f0) at
>> /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-data-ref.c:765
>> #3 0x007de881 in create_data_ref (nest=0x0,
>> loop=0x2e8917f8, memref=0x2e87c558, stmt=0x2e88ca00,
>> is_read=1 '\001')
>> at /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-data-ref.c:970
>> #4 0x007e7d0d in find_data_references_in_stmt (nest=0x0,
>> stmt=0x2e88ca00, datarefs=0xf57c98)
>> at /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-data-ref.c:4238
>> #5 0x007e810d in find_data_references_in_bb (loop=0x0,
>> bb=0x2e879548, datarefs=0xf57c98)
>> at /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-data-ref.c:4307
>> #6 0x007e8838 in compute_data_dependences_for_bb
>> (bb=0x2e879548, compute_self_and_read_read_dependences=1 '\001',
>> datarefs=0xf57c98,
>> dependence_relations=0xf57ca0) at
>> /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-data-ref.c:4493
>> #7 0x00a91c71 in vect_analyze_data_refs (loop_vinfo=0x0,
>> bb_vinfo=0xf57c80, min_vf=0x7fffd17c)
>> at
>> /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-vect-data-refs.c:2533
>> #8 0x0092705c in vect_slp_analyze_bb (bb=0x2e879548) at
>> /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-vect-slp.c:1704
>> #9 0x00929d1b in execute_vect_slp () at
>> /home/buhrig/work/embedded/sdk/gcc/v2/src/gcc/tree-vectorizer.c:256
>>
>> It seems dr_analyze_innermost calls size_binop (PLUS_EXPR, poffset,
>> TREE_OPERAND (base, 1))
>> with poffset an integer SImode, and TREE_OPERAND (base, 1) a pointer, PSI.
>> So int_binop_types_match_p fails in TYPE_MODE (type1) == TYPE_MODE (type2).
>>
>> I'm not sure where is the bug, if it is TREE_OPERAND (base, 1)) that
>> should be INTEGER (where should the conversion be?), or size_binop
>> that should work with PSI/SI ...
>
> You should be only seeing ptr_mode at the tree level, not Pmode (well,
> if they are
> not the same).
>
> Richard.
>
>> Thanks,
>> Aurélien
>>
>
A question abt finding all register uses in instruction
Hello,

I am trying to extract the register uses in instructions using the
note_uses function. When encountering the following instruction I do not
get r479 as a use, seemingly because of the following code in note_uses:

if (GET_CODE (dest) == ZERO_EXTRACT)
  {
    (*fun) (&XEXP (dest, 1), data);
    (*fun) (&XEXP (dest, 2), data);
  }

The instruction:

(insn 386 385 387 16 (set (zero_extract:SI (reg:SI 479)
            (const_int 16 [0x10])
            (const_int 16 [0x10]))
        (const_int 4112 [0x1010])) 343 {*arm_movtas_ze}
     (nil))

I would appreciate any advice on how to resolve this. Should I add
(*fun) (&XEXP (dest, 0), data); ?

Thanks,
Revital
Ann: MELT plugin 0.9.1 for GCC 4.6
Hello All,

It is my pleasure to announce the release of MELT plugin 0.9.1 for GCC 4.6.

MELT is a high level domain specific language to ease the development of
GCC extensions. See http://gcc-melt.org/ for more.

The MELT plugin 0.9.1 for GCC 4.6 (dedicated to the memory of Dennis M.
Ritchie) is available, as a gzipped source tar archive, from
http://gcc-melt.org/melt-0.9.1-plugin-for-gcc-4.6.tgz
of size 4127673 bytes and md5sum 5d342073af875296d9aad1bd25aa59f9
(October 24th, 2011). It is extracted from MELT branch svn revision 180378.

The version number 0.9.1 of the MELT plugin is unrelated to the version of
the GCC compiler (4.6) for which it is supposed to work. Bug reports and
patches are welcome (to the gcc-melt list).

October 24, 2011: Release of MELT plugin 0.9.1 for gcc-4.6
dedicated to the memory of Dennis M. Ritchie
http://en.wikipedia.org/wiki/Dennis_Ritchie

New features:

variadic MELT functions.
========================

A formal arguments list (i.e. formals for LAMBDA or DEFUN) ending with
:REST is for variadic functions with a variable number and type of
arguments (so :REST in MELT is similar to the ellipsis ... notation in C
prototypes). At least one first formal argument should be provided and
should be a value. The (VARIADIC ) macro is used to fetch actual variadic
arguments. A variadic cursor is internally maintained to parse the
variadic actual arguments. The VARIADIC macro has a sequence of variadic
cases. Each variadic case starts with an ordinary [non-variadic] formal
arguments list, and has a body which is evaluated for side effects if the
current arguments at the cursor position fit into the formals. The last
variadic case can also start with an :ELSE. See also
http://groups.google.com/group/gcc-melt/browse_thread/thread/c124ea6af940c08e

variadic (DEBUG ) macro.
========================

Debugging messages should go through the variadic (DEBUG ...) macro, which
accepts an arbitrary kind and number of arguments. The DEBUG_MSG macro is
obsolete.

variadic ADD2OUT function
=========================

The ADD2OUT variadic function adds arbitrary things to an output (either a
file value or a string buffer value).

Enjoy. Bug reports and patches are welcome. Please send questions and
improvements to gcc-m...@googlegroups.com & to the main author
bas...@starynkevitch.net

Regards.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
Support for DW_AT_return_addr in gcc
Hi All,

According to the DWARF spec, a subprogram 'may' have a DW_AT_return_addr
attribute. Please help me understand the following:

(1) Is it correct that GCC (latest) does not emit DW_AT_return_addr
    attributes in subprogram tags?
(2) If (1) is true, is that because the return address can be computed by
    unwinding the stack using debug_frame (so that generating
    DW_AT_return_addr would be redundant)?
(3) Would supporting DW_AT_return_addr in DW_TAG_subprogram be a good
    idea (more of a shortcut for DWARF consumers)?

Thanks
-- Anitha
Why doesn't GCC generate conditional move for COND_EXPR?
Hello,

I noticed that COND_EXPR is not expanded to a conditional move the way
MIN_EXPR/MAX_EXPR are (assuming movmodecc is available). I wonder why not?

I have a loop that fails tree vectorization, but still contains a
COND_EXPR from the tree ifcvt pass. In the end, the generated code is
worse than if I don't turn -ftree-vectorize on. This is on our private
port.

Thanks,
Bingfeng Mei
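For readers without such a port at hand, here is a minimal sketch of the kind of source involved. The function name and signature are hypothetical and not taken from the private port. A per-element select like this is the sort of construct that if-conversion can represent as a COND_EXPR; whether it ends up as a conditional move is an assumption that depends on the target and the compiler version.

```c
#include <stddef.h>

/* Hypothetical example: each iteration picks one of two values based on a
   comparison, with no other side effects in either arm. */
void select_loop(int *dst, const int *a, const int *b, size_t n, int x, int y)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (a[i] > b[i]) ? x : y;
}
```

Comparing the assembly produced with and without -ftree-vectorize for a loop of this shape is one way to check what a given port does.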
Re: A question abt finding all register uses in instruction
> I appreciate any advise of how to resolve this -- should I add
>
> (*fun) (&XEXP (dest, 0), data); ?

Actually I don't see why not - a zero_extract on the LHS of an expression
is supposed to be a bit field insert on that register. Isn't there an
implicit read of the destination register involved in this case, in that
the lower order bits of the register are left unchanged?

cheers
Ramana
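The implicit read can be seen in a small C model of a bit-field insert. This is only a sketch: the function name is hypothetical, and it assumes little-endian bit numbering with pos + width <= 32. The result depends on the old value of the destination, which is why the destination register counts as a use.

```c
#include <stdint.h>

/* Hypothetical model of (set (zero_extract dest width pos) src):
   replace `width` bits of `dest` starting at bit `pos` with the low bits
   of `src`, leaving every other bit of `dest` unchanged. */
uint32_t insert_bits(uint32_t dest, unsigned width, unsigned pos, uint32_t src)
{
    uint32_t field = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t mask = field << pos;

    /* The old value of dest is read here: bits outside the mask survive. */
    return (dest & ~mask) | ((src << pos) & mask);
}
```

With the operands from the insn quoted in the question (width 16, position 16, source 0x1010), an old register value of 0x12345678 becomes 0x10105678: the low half is preserved.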
Re: Why doesn't GCC generate conditional move for COND_EXPR?
On Mon, Oct 24, 2011 at 2:55 PM, Bingfeng Mei wrote:
> Hello,
> I noticed that COND_EXPR is not expanded to conditional move
> as MIN_EXPR/MAX_EXPR are (assuming movmodecc is available).
> I wonder why not?
>
> I have some loop that fails tree vectorization, but still contains
> COND_EXPR from tree ifcvt pass. In the end, the generated code
> is worse than if I don't turned -ftree-vectorize on. This
> is on our private port.

Because nobody has touched COND_EXPR expansion in ages.

> Thanks,
> Bingfeng Mei
>
>
Re: [Qemu-devel] gcc auto-omit-frame-pointer vs msvc longjmp
Kai Tietz wrote:
> Hi,
>
> For trunk-version I have a tentative patch for this issue. On 4.6.x
> and older branches this doesn't work, as here we can't differenciate
> that easy between ms- and sysv-abi.
>
> But could somebody give this patch a try?
>
> Regards,
> Kai
>
> ChangeLog
>
> * config/i386/i386.c (ix86_frame_pointer_required): Enforce use of
> frame-pointer for 32-bit ms-abi, if setjmp is used.
>
> Index: i386.c
> ===
> --- i386.c (revision 180099)
> +++ i386.c (working copy)
> @@ -8391,6 +8391,10 @@
>    if (SUBTARGET_FRAME_POINTER_REQUIRED)
>      return true;
>
> +  /* For older 32-bit runtimes setjmp requires valid frame-pointer. */
> +  if (TARGET_32BIT_MS_ABI && cfun->calls_setjmp)
> +    return true;
> +
>    /* In ix86_option_override_internal, TARGET_OMIT_LEAF_FRAME_POINTER
>       turns off the frame pointer by default. Turn it back on now if
>       we've not got a leaf function. */
>

For a gcc 4.7 snapshot, this does fix the longjmp problem that I
encountered. So aside from specifying -fno-omit-frame-pointer for
affected files, what can be done for 4.6?

Bob
Re: [Qemu-devel] gcc auto-omit-frame-pointer vs msvc longjmp
2011/10/24 Bob Breuer:
> Kai Tietz wrote:
>> Hi,
>>
>> For trunk-version I have a tentative patch for this issue. On 4.6.x
>> and older branches this doesn't work, as here we can't differenciate
>> that easy between ms- and sysv-abi.
>>
>> But could somebody give this patch a try?
>>
>> Regards,
>> Kai
>>
>> ChangeLog
>>
>> * config/i386/i386.c (ix86_frame_pointer_required): Enforce use of
>> frame-pointer for 32-bit ms-abi, if setjmp is used.
>>
>> Index: i386.c
>> ===
>> --- i386.c (revision 180099)
>> +++ i386.c (working copy)
>> @@ -8391,6 +8391,10 @@
>>    if (SUBTARGET_FRAME_POINTER_REQUIRED)
>>      return true;
>>
>> +  /* For older 32-bit runtimes setjmp requires valid frame-pointer. */
>> +  if (TARGET_32BIT_MS_ABI && cfun->calls_setjmp)
>> +    return true;
>> +
>>    /* In ix86_option_override_internal, TARGET_OMIT_LEAF_FRAME_POINTER
>>       turns off the frame pointer by default. Turn it back on now if
>>       we've not got a leaf function. */
>>
>
> For a gcc 4.7 snapshot, this does fix the longjmp problem that I
> encountered. So aside from specifying -fno-omit-frame-pointer for
> affected files, what can be done for 4.6?
>
> Bob

Well, for 4.6.x (or older) we can just use the mingw32.h header in
gcc/config/i386/ and define a subtarget macro there to indicate that. The
only incompatible point here might be Wine using the Linux compiler to
build Windows-related code.

I have attached a possible patch for 4.6 gcc versions to this mail.

Regards,
Kai

Index: mingw32.h
===
--- mingw32.h (revision 180393)
+++ mingw32.h (working copy)
@@ -239,3 +239,8 @@
 /* We should find a way to not have to update this manually. */
 #define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-12.dll"

+/* For 32-bit Windows we need valid frame-pointer for function using
+   setjmp. */
+#define SUBTARGET_SETJMP_NEED_FRAME_POINTER \
+  (!TARGET_64BIT && cfun->calls_setjmp)
+
Index: i386.c
===
--- i386.c (revision 180393)
+++ i386.c (working copy)
@@ -8741,6 +8741,12 @@
   if (SUBTARGET_FRAME_POINTER_REQUIRED)
     return true;

+#ifdef SUBTARGET_SETJMP_NEED_FRAME_POINTER
+  /* For older 32-bit runtimes setjmp requires valid frame-pointer. */
+  if (SUBTARGET_SETJMP_NEED_FRAME_POINTER)
+    return true;
+#endif
+
   /* In ix86_option_override_internal, TARGET_OMIT_LEAF_FRAME_POINTER
      turns off the frame pointer by default. Turn it back on now if
      we've not got a leaf function. */
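For context, the functions these patches target are those that call setjmp. A minimal sketch of such a function follows; the names are hypothetical and this is not code from QEMU or GCC. According to the thread, on the affected 32-bit Windows runtimes a function of this shape needs a valid frame pointer, which -fno-omit-frame-pointer provides as a workaround.

```c
#include <setjmp.h>

static jmp_buf recovery_point;
static volatile int last_error;

/* Records an error code and unwinds to the most recent setjmp. */
static void fail(int code)
{
    last_error = code;
    longjmp(recovery_point, 1);
}

/* The function that calls setjmp is the one the compiler would flag
   through cfun->calls_setjmp. Returns 0 on success, or the error code
   passed to fail(). */
int run_with_recovery(int code)
{
    if (setjmp(recovery_point) != 0)
        return last_error;      /* resumed here via longjmp */
    if (code != 0)
        fail(code);
    return 0;
}
```

Calling run_with_recovery(0) takes the straight path, while a non-zero argument exercises the longjmp path.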
Re: Why doesn't GCC generate conditional move for COND_EXPR?
On Mon, Oct 24, 2011 at 7:00 AM, Richard Guenther wrote:
> On Mon, Oct 24, 2011 at 2:55 PM, Bingfeng Mei wrote:
>> Hello,
>> I noticed that COND_EXPR is not expanded to conditional move
>> as MIN_EXPR/MAX_EXPR are (assuming movmodecc is available).
>> I wonder why not?
>>
>> I have some loop that fails tree vectorization, but still contains
>> COND_EXPR from tree ifcvt pass. In the end, the generated code
>> is worse than if I don't turned -ftree-vectorize on. This
>> is on our private port.
>
> Because nobody touched COND_EXPR expansion since ages.

I have a patch, which I will be submitting next week or so, that does this
expansion correctly. In fact I have a few patches which improve the
generation of COND_EXPR in simple cases (in PHI-OPT).

Thanks,
Andrew Pinski
Re: issue of store to stack after an instruction
> I would like to implement a compiler fix for a SPARC-cpu variant
> that does the following:
> After each "fdivs" (SPARC single-float division) save the destination
> FPU register to a stack memory location.

Do you need to reload it afterward or just save it?

-- Eric Botcazou
Re: AIX library issues
On Sun, Oct 23, 2011 at 8:33 PM, Perry Smith wrote:
> One more question on this quest (drifting a little more off topic).
> In my log files I see a lot of these errors:
>
> ld: 0711-768 WARNING: Object
> ../libsupc++/.libs/libsupc++convenience.a[eh_terminate.o], section 1,
> function .std::terminate():
> The branch at address 0x10c is not followed by a recognized no-op
> or TOC-reload instruction. The unrecognized instruction is 0x0.
>
> The build continues and completes. I just want to make sure that I can
> safely ignore them. Surfing the web, sometimes I see people flag the as
> errors and other times not.

G++ is probably performing an optimization because the terminate function
will not return, so I suspect the error message can be ignored.

GCC does not always generate completely conformant AIX assembly code, and
the AIX assembler does not always follow the rule to be liberal in what it
accepts.

The biggest problem is that AIX users of GCC ask questions here and on
other forums, but do not communicate to IBM AIX Brand executives that the
GNU Toolchain on AIX is important. The fact that GCC continues to function
at all on AIX seems to place it in the "out of sight, out of mind"
category. If you are a developer or ISV, or your company uses GCC on AIX,
tell your IBM sales representative or executive contact that it is
important to your business.

- David
extending fpmuls
While working on some test cases I noticed that the 'fsmuld' instruction
on sparc was not being matched by the combiner for things like:

double fsmuld (float a, float b)
{
  return a * b;
}

Combine does try to match:

(set x (float_extend:DF (mul:SF y z)))

instead of what backends (and in particular at least Sparc and Alpha) seem
to use canonically for this pattern, which is:

(set x (mul:DF (float_extend:DF y) (float_extend:DF z)))

Something similar happens for:

double fnsmuld (float a, float b)
{
  return -(a * b);
}

which combine should match to the *fnsmuld sparc.md pattern, but similar
to above combine tries:

(set x (float_extend:DF (mul:SF (neg:SF y) z)))

instead of:

(set x (mul:DF (neg:DF (float_extend:DF y)) (float_extend:DF z)))

Which is right? "Canonicalization of Instructions" in the internals
documentation doesn't give any guidance :-)
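One point that may matter for the question: the two RTL shapes are not numerically interchangeable in general. One rounds the product to single precision before widening, while the other forms the product in double precision. The C sketch below models both. The function names are hypothetical, the volatile is only there to force the intermediate result to be rounded to float, and the default round-to-nearest mode is assumed.

```c
/* Models (float_extend:DF (mul:SF y z)):
   multiply in single precision, then widen the rounded result. */
double mul_then_extend(float a, float b)
{
    volatile float product = a * b;   /* forces rounding to float */
    return (double)product;
}

/* Models (mul:DF (float_extend:DF y) (float_extend:DF z)):
   widen both operands first, then multiply in double precision. */
double extend_then_mul(float a, float b)
{
    return (double)a * (double)b;
}
```

With a = b = 4097.0f / 4096.0f (that is, 1 + 2^-12), the exact product 1 + 2^-11 + 2^-24 fits in a double but not in a float, so the first function returns 1 + 2^-11 while the second returns the exact value.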
libtool.m4 update?
All,

Earlier this year libtool.m4 and friends received an update to support the
upcoming FreeBSD10.x version. It now seems that this was not enough. We
need an additional update; in particular, the objformat detection needs a
fix. It is a one-liner in libtool.m4. (Around line 2273 of libtool.m4 we
have to make sure that FreeBSD10.x gets elf instead of aout.)

To be prepared for FreeBSD20.x there are a couple more lines, but for
these we have time :)

My question is how to proceed. I am somewhat wary of syncing the upstream
libtool.m4 and doing all the config steps, including testing for all
targets. I have done it for the mentioned changes, FreeBSD only, and
posted the results to the list.

Is it preferred to sync libtool.m4 completely? Or do we want to defer this
update to a later time? I'm aware that stage one is closing.

Gerald and I would like to see this happening on gcc-4.7/6/5.

I'd appreciate any comments on this topic.

TIA,
Andreas
Re: issue of store to stack after an instruction
Eric Botcazou wrote:
>> I would like to implement a compiler fix for a SPARC-cpu variant
>> that does the following:
>> After each "fdivs" (SPARC single-float division) save the destination
>> FPU register to a stack memory location.
>
> Do you need to reload it afterward or just save it?
>

I just need to save it. It needs to be saved so that the FPU pipeline is
flushed. It could be one stack location allocated per function, or one
stack location for each fdivs.

I was previously using define_expand to generate a different pattern plus
the stack location for divsf3, and then defining that pattern. It does
work; however, it feels like a hack...

-- Thanks Konrad

Like this:

;; handle divsf3
(define_expand "divsf3"
  [(set (match_operand:SF 0 "register_operand" "=f")
        (div:SF (match_operand:SF 1 "register_operand" "f")
                (match_operand:SF 2 "register_operand" "f")))]
  "TARGET_FPU && (!TARGET_NO_SF_DIVSQRT)"
  "{
     output_divsf3_emit (operands[0], operands[1], operands[2], 0);
     DONE;
   }")

(define_insn "divsf3_store"
  [(set (match_operand:SF 0 "register_operand" "=f")
        (div:SF (match_operand:SF 1 "register_operand" "f")
                (match_operand:SF 2 "register_operand" "f")))
   (use (match_operand:SI 3 "general_operand" "" ))]
  "TARGET_FPU && TARGET_STORE_AFTER_DIVSQRT && (!TARGET_NO_SF_DIVSQRT)"
  "fdivs\t%%1, %%2, %%0; st %%0, [%%3] "
  [(set_attr "type" "multi")
   (set_attr "length" "2")])

(define_insn "divsf3_nostore"
  [(set (match_operand:SF 0 "register_operand" "=f")
        (div:SF (match_operand:SF 1 "register_operand" "f")
                (match_operand:SF 2 "register_operand" "f")))]
  "TARGET_FPU && (!TARGET_STORE_AFTER_DIVSQRT) && (!TARGET_NO_SF_DIVSQRT)"
  "fdivs\t%1, %2, %0"
  [(set_attr "type" "fpdivs")])

void
output_divsf3_emit (rtx dest, rtx op0, rtx op1, rtx scratch)
{
  rtx slot0, div, divsave;

  div = gen_rtx_SET (VOIDmode, dest, gen_rtx_DIV (SFmode, op0, op1));
  if (TARGET_STORE_AFTER_DIVSQRT)
    {
      rtx m;
      slot0 = assign_stack_local (SFmode, 4, 4);
      m = copy_to_reg (XEXP (slot0, 0));
      emit_insn (gen_rtx_PARALLEL (VOIDmode,
                                   gen_rtvec (2, div,
                                              gen_rtx_USE (VOIDmode, m))));
    }
  else
    {
      emit_insn (div);
    }
}
Re: issue of store to stack after an instruction
> I just need to save it. It needs to be saved so that the FPU
> pipeline is flushed.

Then why not save it just below the stack pointer?

-- Eric Botcazou
Re: libtool.m4 update?
On 2011.10.25 at 06:39 +0200, Andreas Tobler wrote:
> Is it preferred to sync libtool.m4 completely? Or do we want to shift
> this update for a later time? I'm aware of the closing stage one.

A libtool update is also needed for bootstrap-lto with slim LTO object
files. So a complete sync with upstream would be the best option IMO.

-- Markus