Re: C++ support for decimal floating point
On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson wrote: > I've been implementing ISO/IEC TR 24733, "an extension for the > programming language C++ to support decimal floating-point arithmetic", > in GCC. It might be ready as an experimental feature for 4.5, but I > would particularly like to get in the compiler changes that are needed > for it. > > Most of the support for the TR is in new header files in libstdc++ that > depend on compiler support for decimal float scalar types. Most of that > compiler functionality was already available in G++ via mode attributes. > I've made a couple of small fixes and have a couple more to submit, and > when those are in I'll starting running dfp tests for C++ as well as C. > The suitable tests have already been moved from gcc.dg to c-c++-common. > > In order to provide interoperability with C, people on the C++ ABI > mailing list suggested that a C++ compiler should recognize the new > decimal classes defined in the TR and pass arguments of those types the > same as scalar decimal float types for a particular target. I had this > working in an ugly way using a langhook, but that broke with LTO. I'm > looking for the right places to record that an argument or return value > should be passed as if it were a different type, but could use some > advice about that. How do we (do we?) handle std::complex<> there? My first shot would be to make sure the aggregate type has the proper mode, but I guess most target ABIs would already pass them in registers, no? Richard.
Non-portable test?
Hi all,

This is my first post to the list, so please do not be too harsh :)

I had expected all c-torture tests to be highly portable, but I have recently run into a test which relies on int being 32 bits wide (execute/980526-2.c). The test runs to_kdev_t(0x12345678) (see below) and verifies that the result equals 0x15800078. But this is true only with 32-bit ints. With 64-bit ints we get 0x48d15800078.

static inline kdev_t to_kdev_t(int dev)
{
  int major, minor;

  if (sizeof(kdev_t) == 16)
    return (kdev_t)dev;
  major = (dev >> 8);
  minor = (dev & 0xff);
  return ((( major ) << 22 ) | ( minor )) ;
}

Shouldn't we modify the precondition in main:

  if (sizeof (int) < 4)
    exit (0);

to be

  if (sizeof (int) != 4)
    exit (0);

or better

  if( sizeof(int)*CHAR_BIT != 32 )
    exit(0)

?

Best regards,
Yuri
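To make the arithmetic in the message above concrete, here is a minimal sketch that redoes the to_kdev_t computation in unsigned 64-bit arithmetic and then truncates the result to a chosen width. The helper name pack_dev and its bits parameter are hypothetical and are not part of the test; unsigned types are used only to keep the example free of signed-overflow questions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from 980526-2.c: performs the same shifts and
   masks as to_kdev_t, then keeps only the low `bits` bits to model an int
   of that width.  */
static uint64_t pack_dev(uint64_t dev, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);

    uint64_t mask  = (bits == 64) ? ~(uint64_t)0 : (((uint64_t)1 << bits) - 1);
    uint64_t major = dev >> 8;     /* everything above the low byte */
    uint64_t minor = dev & 0xff;   /* the low byte */

    return ((major << 22) | minor) & mask;
}
```

Under these assumptions, pack_dev(0x12345678, 32) gives the value the test expects and pack_dev(0x12345678, 64) gives the larger value reported for 64-bit ints.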
Re: Non-portable test?
On 09/23/2009 10:44 AM, Yuri Gribov wrote: Hi all, This is my first post to the list so do not be too harsh) I have expected all c-torture tests to be highly portable but I have recently ran into test which relies on int being 32-bit (execute/980526-2.c). Yes, it's possible that 64-bit ints are not supported by the testsuite. Changes to fix that are welcome. Paolo
Re: Non-portable test?
> Yes, it's possible that 64-bit ints are not supported by the testsuite.
> Changes to fix that are welcome.

I am not a gcc developer. Could someone verify and commit this patch for testsuite/gcc.c-torture/execute/980526-2.c?

Best regards,
Yuri

Attachment: 980526-2.patch
Re: Non-portable test?
> Done. But if you have more cases, please report them. Not yet. Thx! -- Best regards, Yuri
Re: what does the calling for min_insn_conflict_delay mean
On Tue, Sep 22, 2009 at 11:50 PM, Vladimir Makarov wrote: > Ian Lance Taylor wrote: >> >> "Amker.Cheng" writes: >> >> >>> >>> In function new_ready, it calls to min_insn_conflict_delay with >>> "min_insn_conflict_delay (curr_state, next, next)". >>> But the function's comments say that it returns minimal delay of issue of >>> the 2nd insn after issuing the 1st in given state. >>> Why the last two parameter for the call are both "next"? >>> seems conflict with the comments. >>> >> >> > > Amker, thanks for finding this issue. It's great pleasure if can help anything. >> >> This change dates back to the first DFA scheduler patch. It does seem a >> little odd, particularly as the call in new_ready is the only use of >> min_insn_conflict_delay. CC'ing vmakarov in case he remembers anything >> about this old code. >> > > I've not remembered this. I guess it was a result of long period of > transition from the old pipeline hazard recognizier to the DFA one which > required to rewrite all old pipeline descriptions. > > Also after starring at this code for some time, I don't like this code. > Now I'd use min_issue_delay (curr_state, next) which is delay of issuing > next in the current function unit reservation state instead of > min_insn_conflict_delay (curr_state, next, next) which is a delay of > issuing the first insn (next) after issuing the second insn (next) on a free > processor (when all function units are free). Probably it was a typo. > Although I think that such change (in many other conditions to move insn > speculatively to the ready list) will not give a visible improvement for > most processors, I'll try it. > > It looks to me that probably I had also some plans for usage of > min_insn_conflict_delay, but I forgot them because it was long ago. > > Is it the delay of issuing next in the current reservation state which expected here? 
It seems the call to min_insn_conflict_delay does no harm, except that it may result in more or fewer speculative motions (all of which are valid). -- Best Regards.
Re: RFC: missed loop optimizations from loop induction variable copies
Hi,

> IVOpts cannot identify start_26, start_4 and ivtmp_32_7 to be copies.
> The root cause is that expression 'i + start' is identified as a common
> expression between the test in the header and the index operation in the
> latch. This is unified by copy propagation or FRE prior to loop
> optimizations
> and creates a new induction variable.
>
>
> Does this imply we try and not copy propagate or FRE potential induction
> variables? Or is this simply a missed case in IVOpts?

IIRC, at some point maybe a year or two ago Sebastian worked on enhancing scev to analyze such induction variables (thus enabling IVopts to handle them). But it seems the code did not make it to mainline.

Zdenek
the Right place to change a target default for a common compiler flag?
Hi,

Suppose a compiler flag in common.opt would best be served with different default values on different targets, i.e. a target-dependent Init(). Where can this be effected in the machinery? I can see how to make an override, but not a default.

cheers,
Iain
Re: Add new architechture in gcc build error
Thank you. I fixed the error. It was caused by this macro:

#define ELIMINABLE_REGS \
{\
  {ARG_POINTER_REGNUM,FRAME_POINTER_REGNUM}, \
  {ARG_POINTER_REGNUM,STACK_POINTER_REGNUM}, \
  {FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM} \
}

Every time gcc checks frame_pointer_needed and finds it false, the register to eliminate to is SP. But with the array above, gcc still got FP, so the error occurred. Now it is OK with the following:

#define ELIMINABLE_REGS \
{\
  {ARG_POINTER_REGNUM,STACK_POINTER_REGNUM}, \
  {ARG_POINTER_REGNUM,FRAME_POINTER_REGNUM}, \
  {FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM} \
}

I just exchanged the first two elements. Thanks, everyone.
libgcc doesn't support my target
Hi,

When I build gcc for the first time, with configure parameters like this:

../rice-gcc-4.3.0/configure --target=$TARGET --prefix=$PREFIX --enable-languages=c --without-headers --with-newlib --with-gnu-as --with-gnu-ld --disable-multilib --disable-libssp

the build fails. Binutils is OK and is installed in the $PREFIX path. The error output is:

checking for /home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/xgcc -B/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/ -B/usr/local/cross/rice-elf/rice-elf/bin/ -B/usr/local/cross/rice-elf/rice-elf/lib/ -isystem /usr/local/cross/rice-elf/rice-elf/include -isystem /usr/local/cross/rice-elf/rice-elf/sys-include option to accept ANSI C... none needed
checking how to run the C preprocessor... /home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/xgcc -B/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/ -B/usr/local/cross/rice-elf/rice-elf/bin/ -B/usr/local/cross/rice-elf/rice-elf/lib/ -isystem /usr/local/cross/rice-elf/rice-elf/include -isystem /usr/local/cross/rice-elf/rice-elf/sys-include -E
checking whether decimal floating point is supported... no
checking whether fixed-point is supported... no
*** Configuration rice-mavrix-elf not supported
make[1]: *** [configure-target-libgcc] Error 1
make[1]: Leaving directory `/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc'
make: *** [all] Error 2

In rice-mavrix-elf, rice is my target name. I searched the configure script in libgcc and there is no target information there. I also checked the CRX port, and it did not add any more information than I did. Can anybody give me a clue for debugging this? Any suggestion is appreciated.

Thank you very much.
Daniel.Tian
Re: libgcc doesn't support my target
Sorry, I just found and fixed the bug. It was in the config.host file in libgcc/. Sorry.
DImode operations
Hi:

Do I have to write the DImode operations in my *.md target description file? When I build my gcc for the first time, there is an error in libgcc2.c, in the __muldi3 function. The error is:

../../../rice-gcc-4.3.0/libgcc/../gcc/libgcc2.c: In function __muldi3:
../../../rice-gcc-4.3.0/libgcc/../gcc/libgcc2.c:557: internal compiler error: in emit_move_insn, at expr.c:3379

My target is a RISC32 chip. There are no 64-bit operations, and for now I don't want any 64-bit operations in my C programs. So do I have to implement the DImode operations?

Thank you very much.
Best Wishes.
daniel.tian
Re: the Right place to change a target default for a common compiler flag?
Quoting IainS : Hi, In the case that a compiler flag in common.opt would best be served with different default values on different targets. I.E. a target-dependent Init() Where can this be effected in the machinery ? I can see how to make an override - but not a default. Set the default to a special value that indicates that the variable has not been set by a user option. Then make the override set the variable to the target-specific default if the variable still has this special value.
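The suggestion above can be sketched in plain C as follows. This is only an illustration of the sentinel idea under stated assumptions: the names FLAG_UNSET, flag_example, handle_user_option and target_override_options are all hypothetical stand-ins, not GCC's actual variables or hooks.

```c
#include <assert.h>

/* Hypothetical sentinel meaning "no user option has touched this flag".
   In the scheme described above, this would be the Init() value.  */
#define FLAG_UNSET (-1)

/* Hypothetical stand-in for the variable behind a common option.  */
static int flag_example = FLAG_UNSET;

/* Stand-in for option parsing: an explicit user setting replaces the
   sentinel with a real value (0 or 1).  */
static void handle_user_option(int value)
{
    assert(value == 0 || value == 1);
    flag_example = value;
}

/* Stand-in for the target's override step: apply the target-specific
   default only if the user has not set the flag.  */
static void target_override_options(int target_default)
{
    if (flag_example == FLAG_UNSET)
        flag_example = target_default;
}
```

With this shape, a target whose default is 1 ends up with flag_example == 1 when the user says nothing, while an explicit user setting of 0 survives the override step.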
Re: DImode operations
daniel tian wrote: > Hi: > > Do I have to write the DImode operations on my *.md target description > file? Yes. movMM must be implemented for all types that you want the compiler to be able to handle at all; it's the only way it knows to move them around. (Technically, it's supposed to be able to treat DImode as BLKmode and break it down by pieces, but this code hasn't always been reliable and is definitely less efficient than implementing a proper movdi pattern in your backend.) > My target is a RISC32 chip. There is no 64bit operations. And now I > don't wanna any 64bit operations in my C programs. > So do I have to finish the DImode operations? I think you really should. Take a look at how other ports handle it; generally they use a define_expand for movdi, which emits the move as two separate SI-mode move insns. (Note in particular how they have to take care what order to emit the two word moves in, as it's possible for the register pairs used in input and output operations to overlap.) If you insisted, you could probably just hack the *di* routines out of the libgcc makefile and get through to the end of the build, but I really wouldn't recommend it, since "long long" is a standard C99 type. It's not a great deal of work to add the expander pattern and code that you'll need. cheers, DaveK
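The ordering concern mentioned above (overlapping register pairs) can be modelled in a few lines of C. This is a sketch of the idea only, not GCC code and not a machine description; the register array, NUM_REGS and move_pair are hypothetical, and a real port would express this in its movdi expander.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 16   /* hypothetical size of the modelled register file */

/* Copy the two-word value held in regs[src], regs[src + 1] into
   regs[dst], regs[dst + 1] using two single-word moves, choosing the
   order so that no source word is overwritten before it is read.  */
static void move_pair(uint32_t regs[NUM_REGS], int dst, int src)
{
    assert(dst >= 0 && dst + 1 < NUM_REGS);
    assert(src >= 0 && src + 1 < NUM_REGS);

    if (dst == src + 1) {
        /* The first destination word is the second source word, so the
           second word has to be moved out of the way first.  */
        regs[dst + 1] = regs[src + 1];
        regs[dst]     = regs[src];
    } else {
        /* No overlap, or dst == src - 1: first word first is safe.  */
        regs[dst]     = regs[src];
        regs[dst + 1] = regs[src + 1];
    }
}
```

Moving the pair (2, 3) to (3, 4) takes the first branch, and moving (6, 7) to (5, 6) takes the second; in both cases the destination pair ends up holding the original source values.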
Re: enable-build-with-cxx bootstrap compare broken by r149964
On Tue, 2009-09-22 at 09:40 -0400, Jason Merrill wrote: > On 09/22/2009 07:04 AM, Jerry Quinn wrote: > > On Mon, 2009-09-21 at 13:06 -0400, Jason Merrill wrote: > >> On 09/14/2009 11:54 AM, Jason Merrill wrote: > >>> I think the way to go with this is to revert the compiler bits of > >>> r149964, not mess with mangle.c at all, and insert the initial * if the > >>> typeinfo name won't have TREE_PUBLIC set, since that's precisely the > >>> property we want to mirror in comparison. > >> > >> Thoughts? Another concern I have is that adding an initial * breaks > >> simple demangling of type_info::name(), so I'd like to find another way > >> of marking it for pointer comparison. > > > > What if we have type_info::name() be smart? I.e. > > > > const char* name() { return name[0] == '*' ? name + 1 : name; } > > > > Then the * can still be a flag indicating compare by pointer. > > I like it. I'm trying the following in cp/rtti.c, but I get a segfault compiling testsuite/g++.dg/debug/dwarf2/pr41063.C Removing the TREE_PUBLIC code fixes the segfault, so it's definitely related. I also tried using an arbitrary string for name_string, but I get the same segfault. It seems like something is expecting the name to be exactly in synch with the decl. I'm not really sure how everything fits together here. Am I missing something obvious? tinfo_base_init (tinfo_s *ti, tree target) { tree init = NULL_TREE; tree name_decl; tree vtable_ptr; { tree name_name; /* Generate the NTBS array variable. */ tree name_type = build_cplus_array_type (build_qualified_type (char_type_node, TYPE_QUAL_CONST), NULL_TREE); tree name_string = tinfo_name (target); /* Determine the name of the variable -- and remember with which type it is associated. 
*/ name_name = mangle_typeinfo_string_for_type (target); TREE_TYPE (name_name) = target; name_decl = build_lang_decl (VAR_DECL, name_name, name_type); SET_DECL_ASSEMBLER_NAME (name_decl, name_name); DECL_ARTIFICIAL (name_decl) = 1; DECL_IGNORED_P (name_decl) = 1; TREE_READONLY (name_decl) = 1; TREE_STATIC (name_decl) = 1; DECL_EXTERNAL (name_decl) = 0; DECL_TINFO_P (name_decl) = 1; set_linkage_according_to_type (target, name_decl); import_export_decl (name_decl); if (!TREE_PUBLIC (name_decl)) { /* Inject '*' at start of name to force pointer comparison. */ int len = TREE_STRING_LENGTH (name_string); char* buf = (char*) XNEWVEC (char, len + 1); buf[0] = '*'; memcpy (buf + 1, TREE_STRING_POINTER (name_string), len); name_string = build_string (len + 1, buf); XDELETEVEC (buf); } DECL_INITIAL (name_decl) = name_string; mark_used (name_decl); pushdecl_top_level_and_finish (name_decl, name_string); }
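The scheme discussed in this thread (a leading '*' in the stored name means "compare by pointer only", and name() hides the marker) can be sketched as below. It is written in C rather than C++ to keep it self-contained, and visible_name and names_equal are hypothetical helpers, not the libstdc++ implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What a "smart" name() would return: the stored string with a leading
   '*' marker skipped, so demangling still sees the plain name.  */
static const char *visible_name(const char *stored)
{
    assert(stored != NULL);
    return stored[0] == '*' ? stored + 1 : stored;
}

/* Equality test honouring the marker: identical pointers always match;
   a '*'-marked name never matches a different pointer; otherwise fall
   back to comparing the characters.  */
static int names_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    if (a == b)
        return 1;
    if (a[0] == '*' || b[0] == '*')
        return 0;
    return strcmp(a, b) == 0;
}
```

Two distinct objects that both hold "*N1AE" therefore compare unequal, while visible_name still yields "N1AE" for either of them.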
Re: the Right place to change a target default for a common compiler flag?
Iain, I am currently bootstrapping on i686-apple-darwin9 with the current patch: diff -uN /opt/gcc/_gcc_clean/config/mh-intel-darwin /opt/gcc/gcc-4.5-work/config/mh-intel-darwin --- /opt/gcc/_gcc_clean/config/mh-intel-darwin 1970-01-01 01:00:00.0 +0100 +++ /opt/gcc/gcc-4.5-work/config/mh-intel-darwin2009-09-23 13:47:12.0 +0200 @@ -0,0 +1,3 @@ +# Set strict-dwarf for Darwin + +BOOT_CFLAGS += -gstrict-dwarf diff -uN /opt/gcc/_gcc_clean/config/mh-ppc-darwin /opt/gcc/gcc-4.5-work/config/mh-ppc-darwin --- /opt/gcc/_gcc_clean/config/mh-ppc-darwin2008-02-25 11:00:23.0 +0100 +++ /opt/gcc/gcc-4.5-work/config/mh-ppc-darwin 2009-09-23 12:07:12.0 +0200 @@ -2,4 +2,4 @@ # position-independent-code -- the usual default on Darwin. This fix speeds # compiles by 3-5%. -BOOT_CFLAGS += -mdynamic-no-pic +BOOT_CFLAGS += -mdynamic-no-pic -gstrict-dwarf --- /opt/gcc/_gcc_clean/configure 2009-09-22 20:04:27.0 +0200 +++ /opt/gcc/gcc-4.5-work/configure 2009-09-23 13:50:29.0 +0200 @@ -3655,6 +3655,12 @@ powerpc-*-darwin*) host_makefile_frag="config/mh-ppc-darwin" ;; + i[3456789]86-*-darwin*) +host_makefile_frag="config/mh-intel-darwin" +;; + x86_64-*-darwin[912]*) +host_makefile_frag="config/mh-intel-darwin" +;; powerpc-*-aix*) host_makefile_frag="config/mh-ppc-aix" ;; I am currently at stage 3 and I see -gstrict-dwarf in the log file. I don't know if it is the Right place, but it seems to work so far. Cheers, Dominique
Re: the Right place to change a target default for a common compiler flag?
Hi Dominique, I would expect you to need -gstrict-dwarf in CFLAGS_FOR_TARGET also but the point of my question is to find a way of having this on by default on Darwin (which is what we currently seem to need). (more research is need on the latter - to determine whether the problem lies in our emission of debug fragments - or in the tools cheers, Iain On 23 Sep 2009, at 14:42, Dominique Dhumieres wrote: Iain, I am currently bootstrapping on i686-apple-darwin9 with the current patch: diff -uN /opt/gcc/_gcc_clean/config/mh-intel-darwin /opt/gcc/ gcc-4.5-work/config/mh-intel-darwin --- /opt/gcc/_gcc_clean/config/mh-intel-darwin 1970-01-01 01:00:00.0 +0100 +++ /opt/gcc/gcc-4.5-work/config/mh-intel-darwin 2009-09-23 13:47:12.0 +0200 @@ -0,0 +1,3 @@ +# Set strict-dwarf for Darwin + +BOOT_CFLAGS += -gstrict-dwarf diff -uN /opt/gcc/_gcc_clean/config/mh-ppc-darwin /opt/gcc/gcc-4.5- work/config/mh-ppc-darwin --- /opt/gcc/_gcc_clean/config/mh-ppc-darwin 2008-02-25 11:00:23.0 +0100 +++ /opt/gcc/gcc-4.5-work/config/mh-ppc-darwin 2009-09-23 12:07:12.0 +0200 @@ -2,4 +2,4 @@ # position-independent-code -- the usual default on Darwin. This fix speeds # compiles by 3-5%. -BOOT_CFLAGS += -mdynamic-no-pic +BOOT_CFLAGS += -mdynamic-no-pic -gstrict-dwarf --- /opt/gcc/_gcc_clean/configure 2009-09-22 20:04:27.0 +0200 +++ /opt/gcc/gcc-4.5-work/configure 2009-09-23 13:50:29.0 +0200 @@ -3655,6 +3655,12 @@ powerpc-*-darwin*) host_makefile_frag="config/mh-ppc-darwin" ;; + i[3456789]86-*-darwin*) +host_makefile_frag="config/mh-intel-darwin" +;; + x86_64-*-darwin[912]*) +host_makefile_frag="config/mh-intel-darwin" +;; powerpc-*-aix*) host_makefile_frag="config/mh-ppc-aix" ;; I am currently at stage 3 and I see -gstrict-dwarf in the log file. I don't know if it is the Right place, but it seems to work so far. Cheers, Dominique
Re: the Right place to change a target default for a common compiler flag?
With the previous patch, bootstrap failed when building libgomp: -gstrict-dwarf was not passed during the configure stage. So it is not sufficient to pass it to BOOT_CFLAGS. Would repeating the trick for CFLAGS_FOR_TARGET have a chance to work? Dominique
SSA GIMPLE
Hello, I am looking for some more information of the SSA Gimple syntax and was wondering if there was BNF available? I am interested in the IR of gcc and am just looking for some further documentation/explanation of some of the syntax I am observing such as: OBJ_TYPE_REF(D.103787_32;D.103784_29->4) (D.103784_29, value__23); ** save_filt.1022_12 = <<>>; save_eptr.1021_13 = <<>>; resx; iftmp.256_17 = (int (*__vtbl_ptr_type) (void) *) D.52956_16; D.53402_2 = &this_1->m_cur_val; __base_ctor (&D.53467); __comp_ctor (&nm, if_typename__8, &D.53467); __cxa_atexit (__tcf_0, 0B, &__dso_handle); __static_initialization_and_destruction_0 (1, 65535); Does anyone know where I might find such information? Any help and/or pointers in the direction of information would be most welcome. I tried the gcc wiki but I couldn't find much on SSA Gimple/low-Gimple Thanks and regards all! Rob
Re: enable-build-with-cxx bootstrap compare broken by r149964
On 09/23/2009 09:22 AM, Jerry Quinn wrote: I'm not really sure how everything fits together here. Am I missing something obvious? I notice that you're missing the fix_string_type that tinfo_name does. But I'd rather not duplicate the code that creates the STRING_CST; better to delay the call to tinfo_name and add the * there. Jason
Re: SSA GIMPLE
On Wed, Sep 23, 2009 at 11:01, Rob Quigley wrote: > Does anyone know where I might find such information? Any help and/or > pointers in the direction of information would be most welcome. I > tried the gcc wiki but I couldn't find much on SSA Gimple/low-Gimple There are articles, slides and pointers to internal documentation at http://gcc.gnu.org/wiki/GettingStarted You can post specific questions here and/or drop by the IRC channel at irc.oftc.net/#gcc Diego.
Re: SSA GIMPLE
Rob Quigley writes: > I am looking for some more information of the SSA Gimple syntax and > was wondering if there was BNF available? There is no BNF. Sorry. > I am interested in the IR of gcc and am just looking for some further > documentation/explanation of some of the syntax I am observing such > as: This syntax is intended to be a C-like dump of the internal data structures. > Does anyone know where I might find such information? Any help and/or > pointers in the direction of information would be most welcome. I > tried the gcc wiki but I couldn't find much on SSA Gimple/low-Gimple There is some documentation in the gcc internals manual at the bottom of http://gcc.gnu.org/onlinedocs/ . Ian
Re: Lattice Mico32 port
+#define PSEUDO_REG_P(X) ((X)>=FIRST_PSEUDO_REGISTER)

There's already a HARD_REGISTER_NUM_P that's the exact inverse.

+#define G_REG_P(X) ((X)<32)

I suppose you're planning to add floating point registers?

+#define CONST_OK_FOR_LETTER_P(VALUE, C) \
+( (C) == 'J' ? (VALUE) == 0\
+ : (C) == 'K' ? MEDIUM_INT (VALUE) \
+ : (C) == 'L' ? MEDIUM_UINT (VALUE) \
+ : (C) == 'M' ? LARGE_INT (VALUE) \
+ : 0\
+)
+
+#define CONST_DOUBLE_OK_FOR_LETTER_P(VALUE, C) 0

These defines are replaced by define_constraint, typically in constraints.md.

+/* FIXME - This is not yet supported. */
+#define STATIC_CHAIN_REGNUM 3

While you don't actually support this yet, you'd do well to define it to one of the call-clobbered registers that isn't an argument register -- r9 or r10 by the looks of it.

+#define GO_IF_LEGITIMATE_ADDRESS(m,x,l) \

Use the TARGET_LEGITIMATE_ADDRESS_P target hook.

+#define ARM_LEGITIMIZE_ADDRESS(X, OLDX, MODE, WIN) \

Copy and paste?

+#define MEDIUM_INT(X) ((((HOST_WIDE_INT)(X)) >= -32768) && (((HOST_WIDE_INT)(X)) < 32768))
+#define MEDIUM_UINT(X) (((unsigned HOST_WIDE_INT)(X)) < 65536)

Use the IN_RANGE macro. And if you move these to define_constraints, as mentioned above, you won't need the cast to HOST_WIDE_INT.

> +#define LARGE_INT(X)\
> +((X) >= (-(HOST_WIDE_INT) 0x7fff - 1) \
> + && (X) <= (unsigned HOST_WIDE_INT) 0x)

Did you really want a signed low and an unsigned high on this? It would seem that at some point you're getting signed and unsigned values confused somewhere if you need this...

+__ashlsi3:
+/* Only use 5 LSBs, as that's all the h/w shifter uses. */
+andi r2, r2, 0x1f
+/* Get address of offset into unrolled shift loop to jump to. */
+#ifdef __PIC__
+orhi r3, r0, gotoffhi16(__ashlsi3_table)
+addi r3, r3, gotofflo16(__ashlsi3_table)
+add r3, r3, gp
+#else
+mvhi r3, hi(__ashlsi3_table)
+ori r3, r3, lo(__ashlsi3_table)
+#endif

Seems like avoiding the table and knowing that each entry is 4 bytes back would be a teeny bit faster.
  mvhi r3, hi(__ashlsi3_0)
  add r2, r2, r2
  ori r3, r3, lo(__ashlsi3_0)
  add r2, r2, r2
  sub r3, r3, r2
  b r3

Also, it would seem that you'd be able to arrange for these alternate entry points to be invoked directly. Something like

(define_insn "*ashlsi3_const"
  [(set (match_operand:SI 0 "register_operand" "=R1")
        (ashift:SI (match_operand:SI 1 "register_operand" "0")
                   (match_operand:SI 2 "const_5bit_operand" "i")))
   (clobber (match_scratch:SI 3 "=RA"))]
  "!TARGET_BARREL_SHIFT_ENABLED"
  "calli __ashlsi3_%2"
  [(set_attr "type" "call")])

Where R1 and RA are singleton register classes for those respective registers. Obviously you can delay this as an improvement for later.

+ /* Raise divide by zero exception. */
+ int eba;
+ __asm__ __volatile__ ("rcsr %0, EBA":"=r" (eba));
+ eba += 32 * 5;
+ __asm__ __volatile__ ("mv ea, ra");
+ __asm__ __volatile__ ("b %0"::"r" (eba));

You want to put __builtin_unreachable() there after the branch.

+ emit_insn (gen_movsi_imm_lo (operands[0], operands[0], GEN_INT (INTVAL (operands[1];

Line wrap. There are other instances too.

+(define_insn "movsi_kimm"
+(define_insn "movsi_limm"
+(define_insn "movsi_imm_hi"
+(define_insn "movsi_reloc_gprel"
+(define_insn "movsi_reloc_hi"
+(define_insn "*movsi_insn"

Having these as separate instruction patterns is an extremely bad idea. All moves of a given mode should be in the same pattern, so that reload can have the freedom to do its spilling as needed. While your unspecs are exempt from this, things that just use HIGH aren't.

Using HIGH and LO_SUM on integer constants is a bad idea. Much better to just go ahead and create a constraint letter; see for instance Alpha's define_constraint "L".

+(define_insn "*movqi_insn"
+ [(set (match_operand:QI 0 "register_or_memory_operand" "=r,r,m")
+(match_operand:QI 1 "register_or_memory_operand" "m,r,r"))]

Not having QImode or HImode constants is a mistake.
+static bool +lm32_frame_pointer_required (void) +{ + /* If the function contains dynamic stack allocations, we need to + use the frame pointer to access the static parts of the frame. */ + if (cfun->calls_alloca) +return true; alloca is handled for you by generic code. You shouldn't need to define this hook at all. r~
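As an illustration of the "use the IN_RANGE macro" advice in the review above, here is a small standalone sketch of a single-comparison range check. IN_RANGE_SKETCH, hwi, uhwi, medium_int_p and medium_uint_p are local, hypothetical stand-ins written for this example (with long long standing in for HOST_WIDE_INT); the bounds are the ones from the quoted MEDIUM_INT and MEDIUM_UINT macros.

```c
/* Local stand-ins for HOST_WIDE_INT and its unsigned counterpart.  */
typedef long long hwi;
typedef unsigned long long uhwi;

/* Range check done with one unsigned comparison: subtracting the lower
   bound maps [lower, upper] onto [0, upper - lower], and anything
   outside the range wraps around to a larger unsigned value.  */
#define IN_RANGE_SKETCH(value, lower, upper) \
  ((uhwi) (value) - (uhwi) (lower) <= (uhwi) (upper) - (uhwi) (lower))

/* Signed 16-bit immediate, as in the quoted MEDIUM_INT.  */
static int medium_int_p(hwi x)
{
    return IN_RANGE_SKETCH(x, -32768, 32767);
}

/* Unsigned 16-bit immediate, as in the quoted MEDIUM_UINT.  */
static int medium_uint_p(hwi x)
{
    return IN_RANGE_SKETCH(x, 0, 65535);
}
```

The appeal of this form is that both bounds are written once and the argument needs no separate cast at each use.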
question on dwarf2 debug-frame.
Hello,

I have this scenario: using "dwarfdump --debug-frame" on a very simple object generated with current trunk. I am trying to figure out (with the dwarf3 spec) whether the problem is in the tool (dwarfdump), or in what we're emitting. Can anyone more knowledgeable comment?

Iain.

-- File: simplistic.o { mach32-i386 } --
.debug_frame contents:

0x: CIE
  length: 0x0010
  CIE_id: 0x
  version: 0x01
  augmentation: ""
  code_align: 1
  data_align: -4
  ra_register: 0x08
  Initial Inst:
    DW_CFA_def_cfa (4, 4)
    DW_CFA_offset (8, 0)
    DW_CFA_nop
    DW_CFA_nop
  Init State: CFA( R4+4 ) R8=+0

0x0014: FDE
  length: 0x0028
  CIE_pointer: 0x
  start_addr: 0x
  range_size: 0x0012
  Instructions:
    0x: CFA( R4+4 ) R8=+0
      DW_CFA_advance_loc4 (1)
      DW_CFA_def_cfa_offset (8)
      DW_CFA_offset (5, -8)
    0x0001: CFA( R4+8 ) R5=-8 R8=+0
      DW_CFA_advance_loc4 (2)
      DW_CFA_def_cfa_register (5)
    0x0003: CFA( R5+8 ) R5=-8 R8=+0
      DW_CFA_advance_loc4 (14)
      DW_CFA_restore (5)

Assertion failed: (reg_state_pos != cie->initial_state.regs.end()), function ParseInstructions, file /SourceCache/dwarf_utilities/dwarf_utilities-49/source/DWARFDebugFrame.cpp, line 353.
Abort trap the -save-temps -dA output for this is: .section __DWARF,__debug_frame,regular,debug Lframe0: .set L$set$0,LECIE0-LSCIE0 .long L$set$0 # Length of Common Information Entry LSCIE0: .long 0x # CIE Identifier Tag .byte 0x1 # CIE Version .ascii "\0" # CIE Augmentation .byte 0x1 # uleb128 0x1; CIE Code Alignment Factor .byte 0x7c# sleb128 -4; CIE Data Alignment Factor .byte 0x8 # CIE RA Column .byte 0xc # DW_CFA_def_cfa .byte 0x4 # uleb128 0x4 .byte 0x4 # uleb128 0x4 .byte 0x88# DW_CFA_offset, column 0x8 .byte 0x1 # uleb128 0x1 .align 2 LECIE0: LSFDE0: .set L$set$1,LEFDE0-LASFDE0 .long L$set$1 # FDE Length LASFDE0: .set L$set$2,Lframe0-Lsection__debug_frame .long L$set$2 # FDE CIE offset .long LFB0# FDE initial location .set L$set$3,LFE0-LFB0 .long L$set$3 # FDE address range .byte 0x4 # DW_CFA_advance_loc4 .set L$set$4,LCFI0-LFB0 .long L$set$4 .byte 0xe # DW_CFA_def_cfa_offset .byte 0x8 # uleb128 0x8 .byte 0x85# DW_CFA_offset, column 0x5 .byte 0x2 # uleb128 0x2 .byte 0x4 # DW_CFA_advance_loc4 .set L$set$5,LCFI1-LCFI0 .long L$set$5 .byte 0xd # DW_CFA_def_cfa_register .byte 0x5 # uleb128 0x5 .byte 0x4 # DW_CFA_advance_loc4 .set L$set$6,LCFI3-LCFI1 .long L$set$6 .byte 0xc5# DW_CFA_restore, column 0x5 .byte 0xc # DW_CFA_def_cfa .byte 0x4 # uleb128 0x4 .byte 0x4 # uleb128 0x4 .align 2 LEFDE0:
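For readers following the byte values in the assembly listing above, here is a small sketch that decodes two of the encodings involved: a signed LEB128 value (the data alignment factor 0x7c) and the "primary" call frame opcodes, whose operation lives in the top two bits of the byte and whose operand (a register number for offset and restore) lives in the low six. The function and enum names are hypothetical; the encodings themselves are as I understand the DWARF call frame information format.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one signed LEB128 value starting at p and store the number of
   bytes consumed in *len.  */
static int64_t read_sleb128(const uint8_t *p, size_t *len)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;

    do {
        byte = p[i++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);

    /* Sign-extend if the sign bit of the last byte is set.  */
    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;

    *len = i;
    return (int64_t)result;
}

/* The top two bits of a CFA instruction byte.  */
enum cfa_primary {
    CFA_EXTENDED    = 0,  /* opcode is in the low six bits instead */
    CFA_ADVANCE_LOC = 1,
    CFA_OFFSET      = 2,
    CFA_RESTORE     = 3
};

static enum cfa_primary cfa_primary_op(uint8_t byte)
{
    return (enum cfa_primary)(byte >> 6);
}

/* The low six bits: the operand of a primary opcode.  */
static unsigned cfa_operand(uint8_t byte)
{
    return byte & 0x3f;
}
```

Read this way, 0x7c is the data alignment factor -4, 0x85 is DW_CFA_offset for column 5 (its uleb128 operand 2 times -4 gives the -8 shown in the dump), and 0xc5 is DW_CFA_restore for column 5, the instruction at which dwarfdump asserts.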
Why Ada always seems to want to devolve from ZCX back to SJLJ: the mystery explained [was Re: GNAT mysterious "missing stub for subunit" error. ]
Dave Korn wrote: > Eric Botcazou wrote: >> Your .diff contains this >> >> + EH_MECHANISM=-gcc >> >> so it looks as though the base compiler was SJLJ. > > Ah, bingo! Thanks Eric; yes, I have a recent build of an SJLJ Gnat from > HEAD lying around my PATH ahead of my old 4.3.2-with-ZCX. Getting that out of > the way should help! And although it turns out that was the case, it didn't actually solve the problem. It turns out to be a horribly subtle artifact of this factor: > switched it over to ZCX, and it worked well > enough to pass most of the testsuite, including EH. Now I'm changing the > target pairs on top of that and suddenly it's complaining, which is why I'm > confused; I thought that bit was stable. This was driving me mad, I had a perfectly working ZCX compiler but every time I tried to change anything, it mysteriously switched itself back to SJLJ for seemingly no reason at all and then failed building target-libada as a consequence. The thing was down to the particular way in which I was setting the LIBGNAT_TARGET_PAIRS variable; because of the way Cygwin and MinGW share most of their port implementation, I was doing this: LIBGNAT_TARGET_PAIRS = \ [ ... overrides only for mingw ... ] if ( ... target is cygwin ... ) # blank it out, no cygwin-only overrides yet LIBGNAT_TARGET_PAIRS = endif LIBGNAT_TARGET_PAIRS += \ [ ... common overrides ... ] And the result of doing it this way was that LIBGNAT_TARGET_PAIRS ended up with an embedded leading space. This wouldn't have mattered much, except for one little thing: later, in gcc-interface/Makefile.in, we have ... ifeq ($(filter-out a-except%,$(LIBGNAT_TARGET_PAIRS)),$(LIBGNAT_TARGET_PAIRS)) LIBGNAT_TARGET_PAIRS += \ a-except.ads
Re: question on dwarf2 debug-frame.
On 09/23/2009 11:00 AM, IainS wrote: DW_CFA_restore (5) Assertion failed: (reg_state_pos != cie->initial_state.regs.end()), function ParseInstructions, file /SourceCache/dwarf_utilities/dwarf_utilities-49/source/DWARFDebugFrame.cpp, line 353. Abort trap There could be some confusion in DW_CFA_restore vs DW_CFA_same_value, though I don't know on whose side it is. Certainly the existing consumers that I know treat a DW_CFA_restore for a register not mentioned by the CIE the same as "same_value". r~
Re: Why Ada always seems to want to devolve from ZCX back to SJLJ: the mystery explained [was Re: GNAT mysterious "missing stub for subunit" error. ]
> Is it just a bug for me to generate LIBGNAT_TARGET_PAIRS in a way that > has superfluous spaces (whether leading, trailing or embedded), or shall I > send a patch to add a $(strip) to the right-hand side of the ifeq > comparison? Or perhaps we should do > > LIBGNAT_TARGET_PAIRS:=$(strip $(LIBGNAT_TARGET_PAIRS)) > > right at the top-level, just after the per-target chunks, to ensure the > string is properly normalised before any further tests and comparisons we > might want to make? That indeed seems to be a good idea (with a little comment). -- Eric Botcazou
Re: C++ support for decimal floating point
On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote: > On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson wrote: > > I've been implementing ISO/IEC TR 24733, "an extension for the > > programming language C++ to support decimal floating-point arithmetic", > > in GCC. It might be ready as an experimental feature for 4.5, but I > > would particularly like to get in the compiler changes that are needed > > for it. > > > > Most of the support for the TR is in new header files in libstdc++ that > > depend on compiler support for decimal float scalar types. Most of that > > compiler functionality was already available in G++ via mode attributes. > > I've made a couple of small fixes and have a couple more to submit, and > > when those are in I'll starting running dfp tests for C++ as well as C. > > The suitable tests have already been moved from gcc.dg to c-c++-common. > > > > In order to provide interoperability with C, people on the C++ ABI > > mailing list suggested that a C++ compiler should recognize the new > > decimal classes defined in the TR and pass arguments of those types the > > same as scalar decimal float types for a particular target. I had this > > working in an ugly way using a langhook, but that broke with LTO. I'm > > looking for the right places to record that an argument or return value > > should be passed as if it were a different type, but could use some > > advice about that. > > How do we (do we?) handle std::complex<> there? My first shot would > be to make sure the aggregate type has the proper mode, but I guess > most target ABIs would already pass them in registers, no? std::complex<> is not interoperable with GCC's complex extension, which is generally viewed as "unfortunate". The class types for std::decimal::decimal32 and friends do have the proper modes. 
I suppose I could special-case aggregates of those modes but the plan was to pass these particular classes (and typedefs of them) the same as scalars, rather than _any_ class with those modes. I'll bring this up again on the C++ ABI mailing list. Perhaps most target ABIs pass single-member aggregates using the mode of the aggregate, but not all. In particular, not the 32-bit ELF ABI for Power. Janis
Re: C++ support for decimal floating point
On 09/23/2009 02:11 PM, Janis Johnson wrote:
> The class types for std::decimal::decimal32 and friends do have the
> proper modes. I suppose I could special-case aggregates of those modes
> but the plan was to pass these particular classes (and typedefs of
> them) the same as scalars, rather than _any_ class with those modes.
> I'll bring this up again on the C++ ABI mailing list.

You could special-case this in the C++ conversion to GENERIC by having the std::decimal classes decompose to scalars immediately.
Re: C++ support for decimal floating point
On Wed, Sep 23, 2009 at 4:11 PM, Janis Johnson wrote:
> On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote:
>> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson wrote:
>> > I've been implementing ISO/IEC TR 24733, "an extension for the
>> > programming language C++ to support decimal floating-point arithmetic",
>> > in GCC. It might be ready as an experimental feature for 4.5, but I
>> > would particularly like to get in the compiler changes that are needed
>> > for it.
>> >
>> > Most of the support for the TR is in new header files in libstdc++ that
>> > depend on compiler support for decimal float scalar types. Most of that
>> > compiler functionality was already available in G++ via mode attributes.
>> > I've made a couple of small fixes and have a couple more to submit, and
>> > when those are in I'll starting running dfp tests for C++ as well as C.
>> > The suitable tests have already been moved from gcc.dg to c-c++-common.
>> >
>> > In order to provide interoperability with C, people on the C++ ABI
>> > mailing list suggested that a C++ compiler should recognize the new
>> > decimal classes defined in the TR and pass arguments of those types the
>> > same as scalar decimal float types for a particular target. I had this
>> > working in an ugly way using a langhook, but that broke with LTO. I'm
>> > looking for the right places to record that an argument or return value
>> > should be passed as if it were a different type, but could use some
>> > advice about that.
>>
>> How do we (do we?) handle std::complex<> there? My first shot would
>> be to make sure the aggregate type has the proper mode, but I guess
>> most target ABIs would already pass them in registers, no?
>
> std::complex<> is not interoperable with GCC's complex extension, which
> is generally viewed as "unfortunate".

Could you expand on why std::complex<> is not interoperable with GCC's complex extension? I ask because I would like to know better where the incompatibilities come from; I've tried to remove any.

> The class types for std::decimal::decimal32 and friends do have the
> proper modes. I suppose I could special-case aggregates of those modes
> but the plan was to pass these particular classes (and typedefs of
> them) the same as scalars, rather than _any_ class with those modes.
> I'll bring this up again on the C++ ABI mailing list.

I introduced the notion of 'literal types' in C++0x precisely so that compilers can pretend that user-defined types are like builtin types and provide appropriate support. The decimal types are literal types. So is std::complex<T> for T = builtin arithmetic types.

> Perhaps most target ABIs pass single-member aggregates using the
> mode of the aggregate, but not all. In particular, not the 32-bit
> ELF ABI for Power.
>
> Janis
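As a rough illustration of what "literal type" buys in practice, the sketch below uses a hypothetical user-defined type `fixed2` (not from the TR or from libstdc++) alongside std::complex<double>. It assumes a C++11 or later compiler. Both objects are built and inspected entirely at compile time, much as a builtin arithmetic constant would be.

```cpp
#include <cassert>
#include <complex>

// Hypothetical literal type: a fixed-point amount stored in hundredths.
struct fixed2 {
    long long hundredths;
    constexpr explicit fixed2(long long h) : hundredths(h) {}
    constexpr long long units() const { return hundredths / 100; }
};

// Constant-initialized user-defined object, usable in constant expressions.
constexpr fixed2 price(1995);
static_assert(price.units() == 19, "evaluated by the compiler");

// std::complex over a builtin floating-point type is also a literal type.
constexpr std::complex<double> z(1.0, 2.0);
static_assert(z.real() == 1.0, "real part known at compile time");
static_assert(z.imag() == 2.0, "imaginary part known at compile time");

// Small helper so the compile-time value can also be observed at run time.
constexpr long long price_units() { return price.units(); }
```

Being a literal type concerns constant evaluation only; it does not by itself settle how the type is passed, which remains an ABI question.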
Re: Why Ada always seems to want to devolve from ZCX back to SJLJ: the mystery explained [was Re: GNAT mysterious "missing stub for subunit" error. ]
Eric Botcazou wrote: >> Is it just a bug for me to generate LIBGNAT_TARGET_PAIRS in a way that >> has superfluous spaces (whether leading, trailing or embedded), or shall I >> send a patch to add a $(strip) to the right-hand side of the ifeq >> comparison? Or perhaps we should do >> >> LIBGNAT_TARGET_PAIRS:=$(strip $(LIBGNAT_TARGET_PAIRS)) >> >> right at the top-level, just after the per-target chunks, to ensure the >> string is properly normalised before any further tests and comparisons we >> might want to make? > > That indeed seems to be a good idea (with a little comment). > Actually, the test logic is kinda backwards. We want to know if LIBGNAT_TARGET_PAIRS contains anything matching a certain pattern, so we remove anything matching that pattern and then see if the string has changed or not? It would seem a bit more direct and to-the-point to just have used $(filter) instead of $(filter-out) and compare against an empty string, and that way would have been robust in the face of whitespace changes. Maybe I'll rewrite that test as well in the patch. cheers, DaveK
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
Hi Richard,

I finally got around to getting the data you wanted. Thanks for the response. Please find my comments below.

On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther wrote:
> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>> Hi,
>>
>> Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>
> The patch misses testcases.

Added.

> Why does zee run after register allocation?
> Your examples suggest that it will free hard registers so doing it before
> regalloc looks odd.

Originally, I had written this patch to have ZEE run before IRA. However, I noticed that IRA generates poorer code when my patch is turned on. Here is an example of how badly RA can be hurt. I show a piece of code around a zero-extend that got eliminated: first the baseline, then the code after eliminating zero-extends. The code is pretty much the same except for the extra move commented below. IRA is not able to coalesce %esi and %r15d.

Base line :

48b760: imul $0x9e406cb5,%r15d,%esi
48b767: mov %rax,%rcx
48b76a: shr $0x12,%esi
48b76d: and %r12d,%esi
48b770: mov %edi,%eax
48b772: add $0x1,%edi
48b775: shr $0x5,%eax
48b778: mov %eax,%eax   # redundant zero extend
48b77a: lea (%rcx,%rax,1),%rax
48b77e: cmp %rax,%r9

-fzee :

48b7d0: imul $0x9e406cb5,%r15d,%r15d   # The destination should have just been esi.
48b7d7: mov %rax,%rcx
48b7da: shr $0x12,%r15d
48b7de: mov %r15d,%esi   # This move is useless if r15d and esi can be coalesced into esi.
48b7e1: and %r12d,%esi
48b7e4: mov %edi,%eax
48b7e6: add $0x1,%edi
48b7e9: shr $0x5,%eax   # Ok, zero-extend eliminated.
48b7ec: lea (%rcx,%rax,1),%rax
48b7f0: cmp %rax,%r9

Going after IRA preserves code quality and the useless extension gets removed.

> What is the compile-time impact of your patch on say, gcc bootstrap?
> How many percent of instructions are removed as useless zero-extensions
> during gcc bootstrap?
> How much do CSiBE numbers improve?

CSiBE numbers :
Total number of zero-extension instructions before : 667.
Total number of zero-extension instructions after : 122.
Performance : no measurable impact.

GCC bootstrap :
Total number of zero-extension instructions before : 1456
Total number of zero-extension instructions after : 5814
No impact on boot-strap time.

I have attached the latest patch :

On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther wrote:
> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>> Hi,
>>
>> Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>
> The patch misses testcases. Why does zee run after register allocation?
> Your examples suggest that it will free hard registers so doing it before
> regalloc looks odd.
>
> What is the compile-time impact of your patch on say, gcc bootstrap?
> How many percent of instructions are removed as useless zero-extensions
> during gcc bootstrap? How much do CSiBE numbers improve?
>
> Thanks,
> Richard.
>
>>
>> Problem Description :
>> -
>>
>> This pass is intended to be applicable only to targets that implicitly
>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>> For instance, x86_64 zero-extends the upper bits of a register
>> implicitly whenever an instruction writes to its lower 32-bit half.
>> For example, the instruction *add edi,eax* also zero-extends the upper
>> 32-bits of rax after doing the addition. These zero extensions come
>> for free and GCC does not always exploit this well. That is, it has
>> been observed that there are plenty of cases where GCC explicitly
>> zero-extends registers for x86_64 that are actually useless because
>> these registers were already implicitly zero-extended in a prior
>> instruction. This pass tries to eliminate such useless zero extension
>> instructions.
>> >> Motivating Example I : >> -- >> For this program : >> ** >> bad_code.c >> >> int mask[1000]; >> >> int foo(unsigned x) >> { >> if (x < 10) >> x = x * 45; >> else >> x = x * 78; >> return mask[x]; >> } >> ** >> >> $ gcc -O2 bad_code.c >> >> 400315: b8 4e 00 00 00 mov $0x4e,%eax >> 40031a: 0f af f8 imul %eax,%edi >> 40031d: 89 ff mov %edi,%edi >> ---> Useless zero extend. >> 40031f: 8b 04 bd 60 19 40 00 mov 0x401960(,%rdi,4),%eax >> 400326: c3 retq >> .. >> 400330: ba 2d 00 00 00 mov $0x2d,%edx >> 400335: 0f af fa imul
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
On Sat, Aug 8, 2009 at 2:59 PM, Sriraman Tallam wrote: > Hi, > > Here is a patch to eliminate redundant zero-extension instructions > on x86_64. > > Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified > that the results are the same with/without this patch. > > > Problem Description : > - > > This pass is intended to be applicable only to targets that implicitly > zero-extend 64-bit registers after writing to their lower 32-bit half. > For instance, x86_64 zero-extends the upper bits of a register > implicitly whenever an instruction writes to its lower 32-bit half. > For example, the instruction *add edi,eax* also zero-extends the upper > 32-bits of rax after doing the addition. These zero extensions come > for free and GCC does not always exploit this well. That is, it has > been observed that there are plenty of cases where GCC explicitly > zero-extends registers for x86_64 that are actually useless because > these registers were already implicitly zero-extended in a prior > instruction. This pass tries to eliminate such useless zero extension > instructions. > Does this fix: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17387 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34653 -- H.J.
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
> > GCC bootstrap : > > Total number of zero-extension instructions before : 1456 > Total number of zero-extension instructions after : 5814 > No impact on boot-strap time. You sure you have these numbers the right way around ? Shouldn't the number of zero-extension instructions after the patch be less than the number of zero-extension instructions before or is this a regression ? Thanks, Ramana > > > I have attached the latest patch : > > > On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther > wrote: >> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote: >>> Hi, >>> >>> Here is a patch to eliminate redundant zero-extension instructions >>> on x86_64. >>> >>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified >>> that the results are the same with/without this patch. >> >> The patch misses testcases. Why does zee run after register allocation? >> Your examples suggest that it will free hard registers so doing it before >> regalloc looks odd. >> >> What is the compile-time impact of your patch on say, gcc bootstrap? >> How many percent of instructions are removed as useless zero-extensions >> during gcc bootstrap? How much do CSiBE numbers improve? >> >> Thanks, >> Richard. >> >>> >>> Problem Description : >>> - >>> >>> This pass is intended to be applicable only to targets that implicitly >>> zero-extend 64-bit registers after writing to their lower 32-bit half. >>> For instance, x86_64 zero-extends the upper bits of a register >>> implicitly whenever an instruction writes to its lower 32-bit half. >>> For example, the instruction *add edi,eax* also zero-extends the upper >>> 32-bits of rax after doing the addition. These zero extensions come >>> for free and GCC does not always exploit this well. That is, it has >>> been observed that there are plenty of cases where GCC explicitly >>> zero-extends registers for x86_64 that are actually useless because >>> these registers were already implicitly zero-extended in a prior >>> instruction. 
This pass tries to eliminate such useless zero extension >>> instructions. >>> >>> Motivating Example I : >>> -- >>> For this program : >>> ** >>> bad_code.c >>> >>> int mask[1000]; >>> >>> int foo(unsigned x) >>> { >>> if (x < 10) >>> x = x * 45; >>> else >>> x = x * 78; >>> return mask[x]; >>> } >>> ** >>> >>> $ gcc -O2 bad_code.c >>> >>> 400315: b8 4e 00 00 00 mov $0x4e,%eax >>> 40031a: 0f af f8 imul %eax,%edi >>> 40031d: 89 ff mov %edi,%edi >>> ---> Useless zero extend. >>> 40031f: 8b 04 bd 60 19 40 00 mov 0x401960(,%rdi,4),%eax >>> 400326: c3 retq >>> .. >>> 400330: ba 2d 00 00 00 mov $0x2d,%edx >>> 400335: 0f af fa imul %edx,%edi >>> 400338: 89 ff mov %edi,%edi ---> >>> Useless zero extend. >>> 40033a: 8b 04 bd 60 19 40 00 mov 0x401960(,%rdi,4),%eax >>> 400341: c3 retq >>> >>> $ gcc -O2 -fzee bad_code.c >>> .. >>> 400315: 6b ff 4e imul $0x4e,%edi,%edi >>> 400318: 8b 04 bd 40 19 40 00 mov 0x401940(,%rdi,4),%eax >>> 40031f: c3 retq >>> 400320: 6b ff 2d imul $0x2d,%edi,%edi >>> 400323: 8b 04 bd 40 19 40 00 mov 0x401940(,%rdi,4),%eax >>> 40032a: c3 retq >>> >>> >>> >>> Thanks, >>> >>> Sriraman M Tallam. >>> Google, Inc. >>> tmsri...@google.com >>> >> >
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
Sorry, it is the other way around.

Total number of zero-extension instructions before : 5814
Total number of zero-extension instructions after : 1456

Thanks for pointing it out.

On Wed, Sep 23, 2009 at 4:10 PM, Ramana Radhakrishnan wrote:
>> GCC bootstrap :
>>
>> Total number of zero-extension instructions before : 1456
>> Total number of zero-extension instructions after : 5814
>> No impact on boot-strap time.
>
> You sure you have these numbers the right way around ? Shouldn't the
> number of zero-extension instructions after the patch be less than the
> number of zero-extension instructions before or is this a regression
> ?
>
> Thanks,
> Ramana
>
>> I have attached the latest patch :
>>
>> On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther wrote:
>>> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>>>> Hi,
>>>>
>>>> Here is a patch to eliminate redundant zero-extension instructions
>>>> on x86_64.
>>>>
>>>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified
>>>> that the results are the same with/without this patch.
>>>
>>> The patch misses testcases. Why does zee run after register allocation?
>>> Your examples suggest that it will free hard registers so doing it before
>>> regalloc looks odd.
>>>
>>> What is the compile-time impact of your patch on say, gcc bootstrap?
>>> How many percent of instructions are removed as useless zero-extensions
>>> during gcc bootstrap? How much do CSiBE numbers improve?
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Problem Description :
>>>> -
>>>>
>>>> This pass is intended to be applicable only to targets that implicitly
>>>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>>>> For instance, x86_64 zero-extends the upper bits of a register
>>>> implicitly whenever an instruction writes to its lower 32-bit half.
>>>> For example, the instruction *add edi,eax* also zero-extends the upper
>>>> 32-bits of rax after doing the addition. These zero extensions come
>>>> for free and GCC does not always exploit this well. That is, it has
>>>> been observed that there are plenty of cases where GCC explicitly
>>>> zero-extends registers for x86_64 that are actually useless because
>>>> these registers were already implicitly zero-extended in a prior
>>>> instruction. This pass tries to eliminate such useless zero extension
>>>> instructions.
>>>>
>>>> Motivating Example I :
>>>> --
>>>> For this program :
>>>> **
>>>> bad_code.c
>>>>
>>>> int mask[1000];
>>>>
>>>> int foo(unsigned x)
>>>> {
>>>>   if (x < 10)
>>>>     x = x * 45;
>>>>   else
>>>>     x = x * 78;
>>>>   return mask[x];
>>>> }
>>>> **
>>>>
>>>> $ gcc -O2 bad_code.c
>>>>
>>>> 400315: b8 4e 00 00 00        mov $0x4e,%eax
>>>> 40031a: 0f af f8              imul %eax,%edi
>>>> 40031d: 89 ff                 mov %edi,%edi   ---> Useless zero extend.
>>>> 40031f: 8b 04 bd 60 19 40 00  mov 0x401960(,%rdi,4),%eax
>>>> 400326: c3                    retq
>>>> ..
>>>> 400330: ba 2d 00 00 00        mov $0x2d,%edx
>>>> 400335: 0f af fa              imul %edx,%edi
>>>> 400338: 89 ff                 mov %edi,%edi   ---> Useless zero extend.
>>>> 40033a: 8b 04 bd 60 19 40 00  mov 0x401960(,%rdi,4),%eax
>>>> 400341: c3                    retq
>>>>
>>>> $ gcc -O2 -fzee bad_code.c
>>>> ..
>>>> 400315: 6b ff 4e              imul $0x4e,%edi,%edi
>>>> 400318: 8b 04 bd 40 19 40 00  mov 0x401940(,%rdi,4),%eax
>>>> 40031f: c3                    retq
>>>> 400320: 6b ff 2d              imul $0x2d,%edi,%edi
>>>> 400323: 8b 04 bd 40 19 40 00  mov 0x401940(,%rdi,4),%eax
>>>> 40032a: c3                    retq
>>>>
>>>> Thanks,
>>>>
>>>> Sriraman M Tallam.
>>>> Google, Inc.
>>>> tmsri...@google.com
Re: C++ support for decimal floating point
On Wed, 2009-09-23 at 16:27 -0500, Gabriel Dos Reis wrote: > On Wed, Sep 23, 2009 at 4:11 PM, Janis Johnson wrote: > > On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote: > >> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson wrote: > >> > I've been implementing ISO/IEC TR 24733, "an extension for the > >> > programming language C++ to support decimal floating-point arithmetic", > >> > in GCC. It might be ready as an experimental feature for 4.5, but I > >> > would particularly like to get in the compiler changes that are needed > >> > for it. > >> > > >> > Most of the support for the TR is in new header files in libstdc++ that > >> > depend on compiler support for decimal float scalar types. Most of that > >> > compiler functionality was already available in G++ via mode attributes. > >> > I've made a couple of small fixes and have a couple more to submit, and > >> > when those are in I'll starting running dfp tests for C++ as well as C. > >> > The suitable tests have already been moved from gcc.dg to c-c++-common. > >> > > >> > In order to provide interoperability with C, people on the C++ ABI > >> > mailing list suggested that a C++ compiler should recognize the new > >> > decimal classes defined in the TR and pass arguments of those types the > >> > same as scalar decimal float types for a particular target. I had this > >> > working in an ugly way using a langhook, but that broke with LTO. I'm > >> > looking for the right places to record that an argument or return value > >> > should be passed as if it were a different type, but could use some > >> > advice about that. > >> > >> How do we (do we?) handle std::complex<> there? My first shot would > >> be to make sure the aggregate type has the proper mode, but I guess > >> most target ABIs would already pass them in registers, no? > > > > std::complex<> is not interoperable with GCC's complex extension, which > > is generally viewed as "unfortunate". 
> > Could you expand on why std::complex<> is not interoperable with GCC's > complex extension. The reason is that I would like to know better where > the incompatibilities come from -- I've tried to remove any. I was just repeating what I had heard from C++ experts. On powerpc-linux they are currently passed and mangled differently. > > The class types for std::decimal::decimal32 and friends do have the > > proper modes. I suppose I could special-case aggregates of those modes > > but the plan was to pass these particular classes (and typedefs of > > them) the same as scalars, rather than _any_ class with those modes. > > I'll bring this up again on the C++ ABI mailing list. > > I introduced the notion of 'literal types' in C++0x precisely so that > compilers can pretend that user-defined types are like builtin types > and provide appropriate support. decimal types are literal types. So > are std::complex for T = builtin arithmetic types. I'm looking at these now. > > Perhaps most target ABIs pass single-member aggregates using the > > mode of the aggregate, but not all. In particular, not the 32-bit > > ELF ABI for Power. > > > > Janis > >
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
On Wed, Sep 23, 2009 at 3:57 PM, H.J. Lu wrote:
> On Sat, Aug 8, 2009 at 2:59 PM, Sriraman Tallam wrote:
>> Hi,
>>
>> Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>>
>>
>> Problem Description :
>> -
>>
>> This pass is intended to be applicable only to targets that implicitly
>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>> For instance, x86_64 zero-extends the upper bits of a register
>> implicitly whenever an instruction writes to its lower 32-bit half.
>> For example, the instruction *add edi,eax* also zero-extends the upper
>> 32-bits of rax after doing the addition. These zero extensions come
>> for free and GCC does not always exploit this well. That is, it has
>> been observed that there are plenty of cases where GCC explicitly
>> zero-extends registers for x86_64 that are actually useless because
>> these registers were already implicitly zero-extended in a prior
>> instruction. This pass tries to eliminate such useless zero extension
>> instructions.
>>
>
> Does this fix:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17387

Yes, this patch fixes this problem. All the mov %eax, %eax are removed.

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34653

No, this patch does not fix this problem.

>
> --
> H.J.
>
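For readers following the problem description, the register behaviour the pass relies on can be modelled in plain C++. This is only a sketch of the semantics, not code from the patch: `write_low32` and `index_for` are hypothetical names, and the 64-bit "register" is just an integer. It mirrors the arithmetic of bad_code.c and shows why the explicit `mov %edi,%edi` changes nothing once a 32-bit write has already happened.

```cpp
#include <cassert>
#include <cstdint>

// Models a 32-bit write to a 64-bit register on x86_64: the low half takes
// the new value and the upper half is cleared, whatever it held before.
inline std::uint64_t write_low32(std::uint64_t /*old_reg*/, std::uint32_t value) {
    return static_cast<std::uint64_t>(value);
}

// Mirrors the index computation in the motivating example (bad_code.c).
inline std::uint64_t index_for(std::uint32_t x) {
    // 32-bit multiply, wrapping modulo 2^32 like the imul in the listing.
    std::uint32_t product = (x < 10) ? x * 45u : x * 78u;

    // The multiply's 32-bit destination write already zero-extends; start
    // from a register with garbage in the upper half to make that visible.
    std::uint64_t reg = write_low32(0xFFFFFFFF00000000ull, product);

    // Models the explicit "mov %edi,%edi": masking to 32 bits is a no-op here.
    std::uint64_t explicit_ext = reg & 0xFFFFFFFFull;
    assert(explicit_ext == reg);

    return reg;
}
```

The internal assertion is the whole point: the second extension never changes the value, so an instruction that performs it can be dropped.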
Re: C++ support for decimal floating point
On Wed, Sep 23, 2009 at 6:23 PM, Janis Johnson wrote: > On Wed, 2009-09-23 at 16:27 -0500, Gabriel Dos Reis wrote: >> On Wed, Sep 23, 2009 at 4:11 PM, Janis Johnson wrote: >> > On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote: >> >> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson >> >> wrote: >> >> > I've been implementing ISO/IEC TR 24733, "an extension for the >> >> > programming language C++ to support decimal floating-point arithmetic", >> >> > in GCC. It might be ready as an experimental feature for 4.5, but I >> >> > would particularly like to get in the compiler changes that are needed >> >> > for it. >> >> > >> >> > Most of the support for the TR is in new header files in libstdc++ that >> >> > depend on compiler support for decimal float scalar types. Most of that >> >> > compiler functionality was already available in G++ via mode attributes. >> >> > I've made a couple of small fixes and have a couple more to submit, and >> >> > when those are in I'll starting running dfp tests for C++ as well as C. >> >> > The suitable tests have already been moved from gcc.dg to c-c++-common. >> >> > >> >> > In order to provide interoperability with C, people on the C++ ABI >> >> > mailing list suggested that a C++ compiler should recognize the new >> >> > decimal classes defined in the TR and pass arguments of those types the >> >> > same as scalar decimal float types for a particular target. I had this >> >> > working in an ugly way using a langhook, but that broke with LTO. I'm >> >> > looking for the right places to record that an argument or return value >> >> > should be passed as if it were a different type, but could use some >> >> > advice about that. >> >> >> >> How do we (do we?) handle std::complex<> there? My first shot would >> >> be to make sure the aggregate type has the proper mode, but I guess >> >> most target ABIs would already pass them in registers, no? 
>> >
>> > std::complex<> is not interoperable with GCC's complex extension, which
>> > is generally viewed as "unfortunate".
>>
>> Could you expand on why std::complex<> is not interoperable with GCC's
>> complex extension. The reason is that I would like to know better where
>> the incompatibilities come from -- I've tried to remove any.
>
> I was just repeating what I had heard from C++ experts. On
> powerpc-linux they are currently passed and mangled differently.

I've been careful not to define a copy constructor or a destructor for the specializations of std::complex so that they get treated as PODs, with the hope that the compiler will do the right thing. At least on my x86-64 box running openSUSE, I don't see a difference.

I've also left the copy assignment operator at the discretion of the compiler:

    // The compiler knows how to do this efficiently
    // complex& operator=(const complex&);

So, if there is any difference on powerpc-*-linux, then that should be blamed on a poor ABI choice rather than on anything intrinsic to std::complex (or C++). Where possible, we should look into how to fix that. In many ways, it is assumed that std::complex is isomorphic to the GNU extension.

-- Gaby
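The layout side of that "isomorphic" assumption can be checked from user code. The sketch below is an illustration under stated assumptions, not part of libstdc++: the helper names `real_via_array` and `imag_via_array` are made up, and the trivially-copyable assertion reflects what common standard libraries provide. The array-style access itself is what C++11 and later guarantee for std::complex<double>.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// No user-provided copy constructor or destructor: copies are plain byte copies.
static_assert(std::is_trivially_copyable<std::complex<double>>::value,
              "std::complex<double> is expected to be trivially copyable");

// Two doubles and nothing else, matching a C-style double[2] pair.
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "no padding beyond the real and imaginary parts");

// C++11 allows viewing a complex<double> as double[2]: [0] real, [1] imaginary.
inline double real_via_array(std::complex<double>& z) {
    return reinterpret_cast<double(&)[2]>(z)[0];
}

inline double imag_via_array(std::complex<double>& z) {
    return reinterpret_cast<double(&)[2]>(z)[1];
}
```

Matching layout is necessary for interoperability but not sufficient; as noted above, argument passing and name mangling can still differ per target.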
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
On 08/08/2009 11:59 PM, Sriraman Tallam wrote:
> Hi,
>
> Here is a patch to eliminate redundant zero-extension instructions
> on x86_64.

The code looks nice! However, since it is very specific to x86 (and x86 patterns), I'd rather see it in the i386 machine-dependent reorg pass.

Thanks!

Paolo
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
Paolo Bonzini writes: > On 08/08/2009 11:59 PM, Sriraman Tallam wrote: >> Hi, >> >> Here is a patch to eliminate redundant zero-extension instructions >> on x86_64. > > The code looks nice! However, since it is very specific to x86 (and > x86 patterns), I'd rather see it in the i386 machine-dependent reorg > pass. I don't agree with this. If we want this code to be x86_64 specific, then it should be done by having the i386 backend add the pass to the pass manager, much as plugins can add a pass. Adding stuff to md-reorg is a step backward. In any case it seems to me that this pass should run before regrename and sched2. Ian
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
On 09/24/2009 08:14 AM, Ian Lance Taylor wrote:
> I don't agree with this. If we want this code to be x86_64 specific,
> then it should be done by having the i386 backend add the pass to the
> pass manager, much as plugins can add a pass. Adding stuff to
> md-reorg is a step backward.

That's true. However, time is ticking for 4.5 and this could be a decent interim solution while for 4.6 the appropriate hooks could be added. I proposed md-reorg only because the patch does not include any special data-flow.

Paolo
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
Paolo Bonzini writes: > On 09/24/2009 08:14 AM, Ian Lance Taylor wrote: >> I don't agree with this. If we want this code to be x86_64 specific, >> then it should be done by having the i386 backend add the pass to the >> pass manager, much as plugins can add a pass. Adding stuff to >> md-reorg is a step backward. > > That's true. However, time is ticking for 4.5 and this could be a > decent interim solution while for 4.6 the appropriate hooks could be > added. We already have the hooks, they have just been stuck in plugin.c when they should really be in the generic backend. See register_pass. (Sigh, every time I looked at this I said "the pass control has to be generic" but it still wound up in plugin.c.) Ian
Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)
On 09/24/2009 08:24 AM, Ian Lance Taylor wrote:
> We already have the hooks, they have just been stuck in plugin.c when
> they should really be in the generic backend. See register_pass.
>
> (Sigh, every time I looked at this I said "the pass control has to be
> generic" but it still wound up in plugin.c.)

Then I'll rephrase and say only that the pass should be in config/i386/.

Paolo