Re: C++ support for decimal floating point

2009-09-23 Thread Richard Guenther
On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson  wrote:
> I've been implementing ISO/IEC TR 24733, "an extension for the
> programming language C++ to support decimal floating-point arithmetic",
> in GCC.  It might be ready as an experimental feature for 4.5, but I
> would particularly like to get in the compiler changes that are needed
> for it.
>
> Most of the support for the TR is in new header files in libstdc++ that
> depend on compiler support for decimal float scalar types.  Most of that
> compiler functionality was already available in G++ via mode attributes.
> I've made a couple of small fixes and have a couple more to submit, and
> when those are in I'll start running dfp tests for C++ as well as C.
> The suitable tests have already been moved from gcc.dg to c-c++-common.
>
> In order to provide interoperability with C, people on the C++ ABI
> mailing list suggested that a C++ compiler should recognize the new
> decimal classes defined in the TR and pass arguments of those types the
> same as scalar decimal float types for a particular target.  I had this
> working in an ugly way using a langhook, but that broke with LTO.  I'm
> looking for the right places to record that an argument or return value
> should be passed as if it were a different type, but could use some
> advice about that.

How do we (do we?) handle std::complex<> there?  My first shot would
be to make sure the aggregate type has the proper mode, but I guess
most target ABIs would already pass them in registers, no?

Richard.


Non-portable test?

2009-09-23 Thread Yuri Gribov
Hi all,

This is my first post to the list so do not be too harsh)

I had expected all c-torture tests to be highly portable, but I
recently ran into a test that relies on int being 32 bits wide
(execute/980526-2.c).

The test runs to_kdev_t(0x12345678) (see below) and verifies that the
result equals 0x15800078. But this is true only with 32-bit ints. With
64-bit ints we get 0x48d15800078.

static inline kdev_t to_kdev_t(int dev)
{
   int major, minor;

   if (sizeof(kdev_t) == 16)
   return (kdev_t)dev;
   major = (dev >> 8);
   minor = (dev & 0xff);
   return ((( major ) << 22 ) | (  minor )) ;
}

Shouldn't we modify the precondition in main from
   if (sizeof (int) < 4)
     exit (0);
to
   if (sizeof (int) != 4)
     exit (0);
or, better,
   if (sizeof (int) * CHAR_BIT != 32)
     exit (0);
?
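
To see where the two values come from, the packing step can be
reproduced with fixed-width unsigned types. This is only an
illustrative sketch, not the test itself: pack32 and pack64 are
hypothetical helpers, and unsigned arithmetic is used so that the
truncation is well defined.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of 980526-2.c): the same major/minor
   packing as to_kdev_t, evaluated at two fixed widths.  Unsigned types
   keep the left shift well defined.  */
static uint32_t pack32(uint32_t dev)
{
    uint32_t major = dev >> 8;
    uint32_t minor = dev & 0xff;
    return (major << 22) | minor;   /* high bits of major fall off */
}

static uint64_t pack64(uint64_t dev)
{
    uint64_t major = dev >> 8;
    uint64_t minor = dev & 0xff;
    return (major << 22) | minor;   /* nothing is truncated */
}
```

With dev = 0x12345678, pack32 gives 0x15800078 and pack64 gives
0x48d15800078, matching the numbers above.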

Best regards,
Yuri


Re: Non-portable test?

2009-09-23 Thread Paolo Bonzini

On 09/23/2009 10:44 AM, Yuri Gribov wrote:

Hi all,

This is my first post to the list so do not be too harsh)

I had expected all c-torture tests to be highly portable, but I
recently ran into a test that relies on int being 32 bits wide
(execute/980526-2.c).


Yes, it's possible that 64-bit ints are not supported by the testsuite. 
 Changes to fix that are welcome.


Paolo


Re: Non-portable test?

2009-09-23 Thread Yuri Gribov
> Yes, it's possible that 64-bit ints are not supported by the testsuite.
>  Changes to fix that are welcome.

I am not a gcc developer. Could someone verify and commit this patch
for testsuite/gcc.c-torture/execute/980526-2.c?

Best regards,
Yuri


980526-2.patch
Description: Binary data


Re: Non-portable test?

2009-09-23 Thread Yuri Gribov
> Done.  But if you have more cases, please report them.
Not yet. Thx!

-- 
Best regards,
Yuri


Re: what does the calling for min_insn_conflict_delay mean

2009-09-23 Thread Amker.Cheng
On Tue, Sep 22, 2009 at 11:50 PM, Vladimir Makarov  wrote:
> Ian Lance Taylor wrote:
>>
>> "Amker.Cheng"  writes:
>>
>>
>>>
>>>   In function new_ready, it calls to min_insn_conflict_delay with
>>> "min_insn_conflict_delay (curr_state, next, next)".
>>> But the function's comments say that it returns minimal delay of issue of
>>> the 2nd insn after issuing the 1st in given state.
>>> Why the last two parameter for the call are both "next"?
>>> seems conflict with the comments.
>>>
>>
>>
>
> Amker, thanks for finding this issue.
It's a great pleasure if I can help with anything.

>>
>> This change dates back to the first DFA scheduler patch.  It does seem a
>> little odd, particularly as the call in new_ready is the only use of
>> min_insn_conflict_delay.  CC'ing vmakarov in case he remembers anything
>> about this old code.
>>
>
> I don't remember this.  I guess it was a result of the long period of
> transition from the old pipeline hazard recognizer to the DFA one, which
> required rewriting all the old pipeline descriptions.
>
> Also, after staring at this code for some time, I don't like it.
>  Now I'd use min_issue_delay (curr_state, next), which is the delay of issuing
>  next in the current function unit reservation state, instead of
>  min_insn_conflict_delay (curr_state, next, next), which is the delay of
> issuing the first insn (next) after issuing the second insn (next) on a free
> processor (when all function units are free).  Probably it was a typo.
>  Although I think that such a change (among many other conditions for moving
> an insn speculatively to the ready list) will not give a visible improvement
> for most processors, I'll try it.
>
> It looks to me that I probably also had some plans for using
> min_insn_conflict_delay, but I forgot them because it was long ago.
>
>

Is the delay of issuing next in the current reservation state what is
expected here?

It seems the call to min_insn_conflict_delay does no harm, except that
it may result in more or fewer speculative motions (all of which are
valid).

-- 
Best Regards.


Re: RFC: missed loop optimizations from loop induction variable copies

2009-09-23 Thread Zdenek Dvorak
Hi,



> IVOpts cannot identify start_26, start_4 and ivtmp_32_7 to be copies.
> The root cause is that expression 'i + start' is identified as a common
> expression between the test in the header and the index operation in the
> latch. This is unified by copy propagation or FRE prior to loop
> optimizations
> and creates a new induction variable.
> 
> 
> Does this imply we try and not copy propagate or FRE potential induction
> variables? Or is this simply a missed case in IVOpts?

IIRC, at some point maybe a year or two ago Sebastian worked on enhancing
scev to analyze such induction variables (thus enabling IVopts to handle them).
But it seems the code did not make it to mainline.

Zdenek


the Right place to change a target default for a common compiler flag?

2009-09-23 Thread IainS

Hi,

Suppose a compiler flag in common.opt would best be served
with different default values on different targets,
i.e. a target-dependent Init().

Where can this be effected in the machinery?

I can see how to make an override - but not a default.

cheers,
Iain


Re: Add new architechture in gcc build error

2009-09-23 Thread daniel tian
Thank you. I fixed the error. It was caused by this macro:
#define ELIMINABLE_REGS \
{\
 {ARG_POINTER_REGNUM,FRAME_POINTER_REGNUM}, \
 {ARG_POINTER_REGNUM,STACK_POINTER_REGNUM}, \
 {FRAME_POINTER_REGNUM,  STACK_POINTER_REGNUM} \
}

Every time gcc checks frame_pointer_needed and finds it false, the
register to eliminate to should be SP. But with the array above, gcc
still got FP, so the error occurred.

Now it is OK with the following:
#define ELIMINABLE_REGS \
{\
 {ARG_POINTER_REGNUM,STACK_POINTER_REGNUM}, \
 {ARG_POINTER_REGNUM,FRAME_POINTER_REGNUM}, \
 {FRAME_POINTER_REGNUM,  STACK_POINTER_REGNUM} \
}
I just exchanged the first two elements.

So thanks, guys.


libgcc doesn't support my target

2009-09-23 Thread daniel tian
Hi,

When I build gcc for the first time, the configure parameters are like this:

../rice-gcc-4.3.0/configure --target=$TARGET --prefix=$PREFIX
--enable-languages=c  --without-headers --with-newlib --with-gnu-as
--with-gnu-ld --disable-multilib --disable-libssp

Binutils is OK and installed in the $PREFIX path.

Error information is like this:

checking for /home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/xgcc
-B/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/
-B/usr/local/cross/rice-elf/rice-elf/bin/
-B/usr/local/cross/rice-elf/rice-elf/lib/ -isystem
/usr/local/cross/rice-elf/rice-elf/include -isystem
/usr/local/cross/rice-elf/rice-elf/sys-include option to accept ANSI
C... none needed
checking how to run the C preprocessor...
/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/xgcc
-B/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc/./gcc/
-B/usr/local/cross/rice-elf/rice-elf/bin/
-B/usr/local/cross/rice-elf/rice-elf/lib/ -isystem
/usr/local/cross/rice-elf/rice-elf/include -isystem
/usr/local/cross/rice-elf/rice-elf/sys-include -E
checking whether decimal floating point is supported... no
checking whether fixed-point is supported... no
*** Configuration rice-mavrix-elf not supported
make[1]: *** [configure-target-libgcc] Error 1
make[1]: Leaving directory
`/home/daniel.tian/gcc_rice_dev/rice-binutils/build-gcc'
make: *** [all] Error 2

 rice-mavrix-elf :  rice is my target name.
I searched the configure script in libgcc, and there is no target
information there. I also checked the CRX port, and it doesn't add any
more information than I did.

Can anybody give me a clue on how to debug this?

Any suggestion is appreciated.
Thank you very much.


   Daniel.Tian


Re: libgcc doesn't support my target

2009-09-23 Thread daniel tian
Sorry, I just found and fixed the bug. It was in the config.host file in /libgcc/.
Sorry.


DImode operations

2009-09-23 Thread daniel tian
Hi:

Do I have to write the DImode operations in my *.md target description file?
I am now building my gcc for the first time, and there is an error in
libgcc2.c, in the __muldi3 function.
The error information is:
../../../rice-gcc-4.3.0/libgcc/../gcc/libgcc2.c: In function __muldi3:
../../../rice-gcc-4.3.0/libgcc/../gcc/libgcc2.c:557: internal compiler
error: in emit_move_insn, at expr.c:3379

My target is a RISC32 chip. There are no 64-bit operations, and for now I
don't want any 64-bit operations in my C programs.
So do I have to finish the DImode operations?

Thank you very much.
Best Wishes.

  daniel.tian


Re: the Right place to change a target default for a common compiler flag?

2009-09-23 Thread Joern Rennecke

Quoting IainS :


Hi,

In the case that a compiler flag in common.opt would best be served
with different default values on different targets.

I.E. a target-dependent Init()

Where can this be effected in the machinery ?

I can see how to make an override - but not a default.


Set the default to a special value that indicates that the variable
has not been set by a user option.
Then make the override set the variable to the target-specific
default if the variable still has this special value.
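
A minimal sketch of that scheme in plain C, with made-up names:
flag_example, FLAG_UNSET, TARGET_DEFAULT_EXAMPLE and both functions are
hypothetical, standing in for the variable declared in common.opt,
option parsing, and the target's override hook.

```c
/* Hypothetical names throughout; this only illustrates the pattern.  */
#define FLAG_UNSET (-1)             /* sentinel used as the Init() value */
#define TARGET_DEFAULT_EXAMPLE 1    /* what this target wants by default */

static int flag_example = FLAG_UNSET;

/* Stand-in for option parsing: an explicit user setting replaces the
   sentinel.  */
static void example_handle_user_option(int value)
{
    flag_example = value;
}

/* Stand-in for the target's override hook, run after option parsing.
   It only fills in the default when the user said nothing.  */
static void example_override_options(void)
{
    if (flag_example == FLAG_UNSET)
        flag_example = TARGET_DEFAULT_EXAMPLE;
}
```

The sentinel has to be a value the user can never select, otherwise an
explicit setting would be indistinguishable from "not set".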


Re: DImode operations

2009-09-23 Thread Dave Korn
daniel tian wrote:
> Hi:
> 
> Do I have to write the DImode operations on my *.md target description 
> file?

  Yes.  movMM must be implemented for all types that you want the compiler to
be able to handle at all; it's the only way it knows to move them around.
(Technically, it's supposed to be able to treat DImode as BLKmode and break it
down by pieces, but this code hasn't always been reliable and is definitely
less efficient than implementing a proper movdi pattern in your backend.)

> My target is a RISC32 chip. There is no 64bit operations. And now I
> don't wanna any 64bit operations in my C programs.
> So do I have to finish the DImode operations?

  I think you really should.  Take a look at how other ports handle it;
generally they use a define_expand for movdi, which emits the move as two
separate SI-mode move insns.  (Note in particular how they have to take care
what order to emit the two word moves in, as it's possible for the register
pairs used in input and output operations to overlap.)
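
  The ordering concern can be shown with a toy model in plain C.  This is
not GCC code: regs[] is a hypothetical 32-bit register file in which a
64-bit value occupies the pair (n, n + 1), and move_pair is a made-up name
for what the expander has to decide.

```c
#include <stdint.h>

/* Toy model, not GCC code: a hypothetical 32-bit register file where a
   64-bit value occupies the register pair (n, n + 1).  */
static uint32_t regs[8];

/* Copy the pair starting at src to the pair starting at dst using two
   word moves, choosing an order that never overwrites a source word
   before it has been read.  */
static void move_pair(int dst, int src)
{
    if (dst == src + 1) {
        /* dst's first word is src's second word: move the second word
           first.  */
        regs[dst + 1] = regs[src + 1];
        regs[dst] = regs[src];
    } else {
        /* Covers the disjoint case and dst + 1 == src.  */
        regs[dst] = regs[src];
        regs[dst + 1] = regs[src + 1];
    }
}
```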

  If you insisted, you could probably just hack the *di* routines out of the
libgcc makefile and get through to the end of the build, but I really wouldn't
recommend it, since "long long" is a standard C99 type.  It's not a great deal
of work to add the expander pattern and code that you'll need.

cheers,
  DaveK



Re: enable-build-with-cxx bootstrap compare broken by r149964

2009-09-23 Thread Jerry Quinn
On Tue, 2009-09-22 at 09:40 -0400, Jason Merrill wrote:
> On 09/22/2009 07:04 AM, Jerry Quinn wrote:
> > On Mon, 2009-09-21 at 13:06 -0400, Jason Merrill wrote:
> >> On 09/14/2009 11:54 AM, Jason Merrill wrote:
> >>> I think the way to go with this is to revert the compiler bits of
> >>> r149964, not mess with mangle.c at all, and insert the initial * if the
> >>> typeinfo name won't have TREE_PUBLIC set, since that's precisely the
> >>> property we want to mirror in comparison.
> >>
> >> Thoughts?  Another concern I have is that adding an initial * breaks
> >> simple demangling of type_info::name(), so I'd like to find another way
> >> of marking it for pointer comparison.
> >
> > What if we have type_info::name() be smart?  I.e.
> >
> > const char* name() { return name[0] == '*' ? name + 1 : name; }
> >
> > Then the * can still be a flag indicating compare by pointer.
> 
> I like it.
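
The idea agreed on above can be sketched outside the library in plain
C. The function names are hypothetical and this is not the libstdc++
implementation; it only shows a leading '*' acting as a
compare-by-pointer flag while staying hidden from callers.

```c
#include <string.h>

/* Hypothetical helpers, not libstdc++ code.  raw is the stored NTBS,
   possibly carrying a leading '*' marker.  */
static const char *display_name(const char *raw)
{
    return raw[0] == '*' ? raw + 1 : raw;
}

/* A marked name is only equal to itself (same address); unmarked names
   may also be compared by content.  */
static int names_equal(const char *a, const char *b)
{
    if (a == b)
        return 1;
    if (a[0] == '*' || b[0] == '*')
        return 0;
    return strcmp(a, b) == 0;
}
```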

I'm trying the following in cp/rtti.c, but I get a segfault compiling 
testsuite/g++.dg/debug/dwarf2/pr41063.C

Removing the TREE_PUBLIC code fixes the segfault, so it's definitely
related.  I also tried using an arbitrary string for name_string, but I
get the same segfault.  It seems like something is expecting the name to
be exactly in synch with the decl.

I'm not really sure how everything fits together here.  Am I missing
something obvious?

tinfo_base_init (tinfo_s *ti, tree target)
{
  tree init = NULL_TREE;
  tree name_decl;
  tree vtable_ptr;

  {
    tree name_name;

    /* Generate the NTBS array variable.  */
    tree name_type = build_cplus_array_type
                     (build_qualified_type (char_type_node, TYPE_QUAL_CONST),
                      NULL_TREE);
    tree name_string = tinfo_name (target);

    /* Determine the name of the variable -- and remember with which
       type it is associated.  */
    name_name = mangle_typeinfo_string_for_type (target);
    TREE_TYPE (name_name) = target;

    name_decl = build_lang_decl (VAR_DECL, name_name, name_type);
    SET_DECL_ASSEMBLER_NAME (name_decl, name_name);
    DECL_ARTIFICIAL (name_decl) = 1;
    DECL_IGNORED_P (name_decl) = 1;
    TREE_READONLY (name_decl) = 1;
    TREE_STATIC (name_decl) = 1;
    DECL_EXTERNAL (name_decl) = 0;
    DECL_TINFO_P (name_decl) = 1;
    set_linkage_according_to_type (target, name_decl);
    import_export_decl (name_decl);
    if (!TREE_PUBLIC (name_decl))
      {
        /* Inject '*' at start of name to force pointer comparison.  */
        int len = TREE_STRING_LENGTH (name_string);
        char* buf = (char*) XNEWVEC (char, len + 1);
        buf[0] = '*';
        memcpy (buf + 1, TREE_STRING_POINTER (name_string), len);
        name_string = build_string (len + 1, buf);
        XDELETEVEC (buf);
      }
    DECL_INITIAL (name_decl) = name_string;
    mark_used (name_decl);
    pushdecl_top_level_and_finish (name_decl, name_string);
  }




Re: the Right place to change a target default for a common compiler flag?

2009-09-23 Thread Dominique Dhumieres
Iain,

I am currently bootstrapping on i686-apple-darwin9 with the current patch:

diff -uN /opt/gcc/_gcc_clean/config/mh-intel-darwin /opt/gcc/gcc-4.5-work/config/mh-intel-darwin
--- /opt/gcc/_gcc_clean/config/mh-intel-darwin  1970-01-01 01:00:00.0 +0100
+++ /opt/gcc/gcc-4.5-work/config/mh-intel-darwin  2009-09-23 13:47:12.0 +0200
@@ -0,0 +1,3 @@
+# Set strict-dwarf for Darwin
+
+BOOT_CFLAGS += -gstrict-dwarf
diff -uN /opt/gcc/_gcc_clean/config/mh-ppc-darwin /opt/gcc/gcc-4.5-work/config/mh-ppc-darwin
--- /opt/gcc/_gcc_clean/config/mh-ppc-darwin  2008-02-25 11:00:23.0 +0100
+++ /opt/gcc/gcc-4.5-work/config/mh-ppc-darwin  2009-09-23 12:07:12.0 +0200
@@ -2,4 +2,4 @@
 # position-independent-code -- the usual default on Darwin. This fix speeds
 # compiles by 3-5%.
 
-BOOT_CFLAGS += -mdynamic-no-pic
+BOOT_CFLAGS += -mdynamic-no-pic -gstrict-dwarf
--- /opt/gcc/_gcc_clean/configure   2009-09-22 20:04:27.0 +0200
+++ /opt/gcc/gcc-4.5-work/configure 2009-09-23 13:50:29.0 +0200
@@ -3655,6 +3655,12 @@
   powerpc-*-darwin*)
 host_makefile_frag="config/mh-ppc-darwin"
 ;;
+  i[3456789]86-*-darwin*)
+host_makefile_frag="config/mh-intel-darwin"
+;;
+  x86_64-*-darwin[912]*)
+host_makefile_frag="config/mh-intel-darwin"
+;;
   powerpc-*-aix*)
 host_makefile_frag="config/mh-ppc-aix"
 ;;

I am currently at stage 3 and I see -gstrict-dwarf in the log file.

I don't know if it is the Right place, but it seems to work so far.

Cheers,

Dominique


Re: the Right place to change a target default for a common compiler flag?

2009-09-23 Thread IainS

Hi Dominique,

I would expect you to need -gstrict-dwarf in CFLAGS_FOR_TARGET also,

but the point of my question is to find a way of having this on by
default on Darwin (which is what we currently seem to need).


(More research is needed on the latter, to determine whether the
problem lies in our emission of debug fragments or in the tools.)


cheers,
Iain





Re: the Right place to change a target default for a common compiler flag?

2009-09-23 Thread Dominique Dhumieres
With the previous patch, bootstrap failed when building libgomp: -gstrict-dwarf
was not passed during the configure stage. So it is not sufficient to pass it to
BOOT_CFLAGS. Would repeating the trick for CFLAGS_FOR_TARGET have a chance to
work?

Dominique


SSA GIMPLE

2009-09-23 Thread Rob Quigley
Hello,

I am looking for some more information on the SSA GIMPLE syntax and
was wondering if there is a BNF available.

I am interested in the IR of gcc and am just looking for some further
documentation/explanation of some of the syntax I am observing such
as:

OBJ_TYPE_REF(D.103787_32;D.103784_29->4) (D.103784_29, value__23); **
save_filt.1022_12 = <<>>;
save_eptr.1021_13 = <<>>;
resx;
iftmp.256_17 = (int (*__vtbl_ptr_type) (void) *) D.52956_16;
D.53402_2 = &this_1->m_cur_val;
__base_ctor  (&D.53467);
__comp_ctor  (&nm, if_typename__8, &D.53467);
__cxa_atexit (__tcf_0, 0B, &__dso_handle);
__static_initialization_and_destruction_0 (1, 65535);

Does anyone know where I might find such information? Any help and/or
pointers in the direction of information would be most welcome. I
tried the gcc wiki but I couldn't find much on SSA Gimple/low-Gimple

Thanks and regards all!


Rob


Re: enable-build-with-cxx bootstrap compare broken by r149964

2009-09-23 Thread Jason Merrill

On 09/23/2009 09:22 AM, Jerry Quinn wrote:

I'm not really sure how everything fits together here.  Am I missing
something obvious?


I notice that you're missing the fix_string_type that tinfo_name does. 
But I'd rather not duplicate the code that creates the STRING_CST; 
better to delay the call to tinfo_name and add the * there.


Jason


Re: SSA GIMPLE

2009-09-23 Thread Diego Novillo
On Wed, Sep 23, 2009 at 11:01, Rob Quigley  wrote:

> Does anyone know where I might find such information? Any help and/or
> pointers in the direction of information would be most welcome. I
> tried the gcc wiki but I couldn't find much on SSA Gimple/low-Gimple

There are articles, slides and pointers to internal documentation at
http://gcc.gnu.org/wiki/GettingStarted
You can post specific questions here and/or drop by the IRC channel at
irc.oftc.net/#gcc


Diego.


Re: SSA GIMPLE

2009-09-23 Thread Ian Lance Taylor
Rob Quigley  writes:

> I am looking for some more information of the SSA Gimple syntax and
> was wondering if there was  BNF available?

There is no BNF.  Sorry.

> I am interested in the IR of gcc and am just looking for some further
> documentation/explanation of some of the syntax I am observing such
> as:

This syntax is intended to be a C-like dump of the internal data
structures.

> Does anyone know where I might find such information? Any help and/or
> pointers in the direction of information would be most welcome. I
> tried the gcc wiki but I couldn't find much on SSA Gimple/low-Gimple

There is some documentation in the gcc internals manual at the bottom of
http://gcc.gnu.org/onlinedocs/ .

Ian


Re: Lattice Mico32 port

2009-09-23 Thread Richard Henderson

+#define PSEUDO_REG_P(X) ((X)>=FIRST_PSEUDO_REGISTER)


There's already a HARD_REGISTER_NUM_P that's the exact inverse.


+#define G_REG_P(X)  ((X)<32)


I suppose you're planning to add floating point registers?


+#define CONST_OK_FOR_LETTER_P(VALUE, C) \
+(  (C) == 'J' ? (VALUE) == 0\
+ : (C) == 'K' ? MEDIUM_INT (VALUE)  \
+ : (C) == 'L' ? MEDIUM_UINT (VALUE) \
+ : (C) == 'M' ? LARGE_INT (VALUE)   \
+ : 0\
+)
+
+#define CONST_DOUBLE_OK_FOR_LETTER_P(VALUE, C)  0


These defines are replaced by define_constraint,
typically in constraints.md.


+/* FIXME - This is not yet supported.  */
+#define STATIC_CHAIN_REGNUM 3


While you don't actually support this yet, you'd do well to
define it to one of the call-clobbered registers that isn't
an argument register -- r9 or r10 by the looks of it.


+#define GO_IF_LEGITIMATE_ADDRESS(m,x,l) \


Use the TARGET_LEGITIMATE_ADDRESS_P target hook.


+#define ARM_LEGITIMIZE_ADDRESS(X, OLDX, MODE, WIN)  \


Copy and paste?


+#define MEDIUM_INT(X)  ((((HOST_WIDE_INT)(X)) >= -32768) && (((HOST_WIDE_INT)(X)) < 32768))
+#define MEDIUM_UINT(X) (((unsigned HOST_WIDE_INT)(X)) < 65536)


Use the IN_RANGE macro.  And if you move these to define_constraints, as 
mentioned above, you won't need the cast to HOST_WIDE_INT.
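
As a rough illustration of the suggestion, here is a standalone sketch.
IN_RANGE_SKETCH is a local stand-in written for this example (GCC's own
IN_RANGE is not reproduced here), and long is an assumption standing in
for HOST_WIDE_INT.

```c
/* Local stand-in for illustration only; both bounds are inclusive.  */
#define IN_RANGE_SKETCH(VALUE, LOWER, UPPER) \
    ((VALUE) >= (LOWER) && (VALUE) <= (UPPER))

/* The two range checks from the patch, rewritten with explicit bounds.
   long is an assumption standing in for HOST_WIDE_INT.  */
static int medium_int(long x)
{
    return IN_RANGE_SKETCH(x, -32768L, 32767L);
}

static int medium_uint(long x)
{
    return IN_RANGE_SKETCH(x, 0L, 65535L);
}
```

Writing both bounds out makes the intended range visible at a glance,
which the pair of casts and comparisons does not.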


> +#define LARGE_INT(X)\
> +((X) >= (-(HOST_WIDE_INT) 0x7fff - 1)   \
> + && (X) <= (unsigned HOST_WIDE_INT) 0x)

Did you really want a signed low and an unsigned high on this?  It would 
seem that at some point you're getting signed and unsigned values 
confused somewhere if you need this...



+__ashlsi3:
+/* Only use 5 LSBs, as that's all the h/w shifter uses.  */
+andi r2, r2, 0x1f
+/* Get address of offset into unrolled shift loop to jump to.  */
+#ifdef __PIC__
+orhi r3, r0, gotoffhi16(__ashlsi3_table)
+addi r3, r3, gotofflo16(__ashlsi3_table)
+add r3, r3, gp
+#else
+mvhi r3, hi(__ashlsi3_table)
+ori r3, r3, lo(__ashlsi3_table)
+#endif


Seems like avoiding the table and knowing that each entry is 4 bytes 
back would be a teeny bit faster.


mvhi r3, hi(__ashlsi3_0)
add r2, r2, r2
ori r3, r3, lo(__ashlsi3_0)
add r2, r2, r2
sub r3, r3, r2
b   r3

Also, it would seem that you'd be able to arrange for these alternate 
entry points to be invoked directly.  Something like


(define_insn "*ashlsi3_const"
  [(set (match_operand:SI 0 "register_operand" "=R1")
(ashift:SI (match_operand:SI 1 "register_operand" "0")
   (match_operand:SI 2 "const_5bit_operand" "i")))
   (clobber (match_scratch:SI 3 "=RA"))]
  "!TARGET_BARREL_SHIFT_ENABLED"
  "calli   __ashlsi3_%2"
  [(set_attr "type" "call")])

Where R1 and RA are singleton register classes for those respective 
registers.  Obviously you can delay this as an improvement for later.



+  /* Raise divide by zero exception.  */
+  int eba;
+  __asm__ __volatile__ ("rcsr %0, EBA":"=r" (eba));
+  eba += 32 * 5;
+  __asm__ __volatile__ ("mv ea, ra");
+  __asm__ __volatile__ ("b %0"::"r" (eba));


You want to put __builtin_unreachable() there after the branch.


+  emit_insn (gen_movsi_imm_lo (operands[0], operands[0], GEN_INT 
(INTVAL (operands[1];


Line wrap.  There are other instances too.


+(define_insn "movsi_kimm"
+(define_insn "movsi_limm"
+(define_insn "movsi_imm_hi"
+(define_insn "movsi_reloc_gprel"
+(define_insn "movsi_reloc_hi"
+(define_insn "*movsi_insn"


Having these as separate instruction patterns is an extremely bad idea.
All moves of a given mode should be in the same pattern, so that
reload can have the freedom to do its spilling as needed.  While your
unspecs are exempt from this, things that just use HIGH aren't.


Using HIGH and LO_SUM on integer constants is a bad idea.  Much better 
to just go ahead and create a constraint letter; see for instance 
Alpha's define_constraint "L".



+(define_insn "*movqi_insn"
+  [(set (match_operand:QI 0 "register_or_memory_operand" "=r,r,m")
+(match_operand:QI 1 "register_or_memory_operand" "m,r,r"))]


Not having QImode or HImode constants is a mistake.


+static bool
+lm32_frame_pointer_required (void)
+{
+  /* If the function contains dynamic stack allocations, we need to
+ use the frame pointer to access the static parts of the frame.  */
+  if (cfun->calls_alloca)
+return true;


alloca is handled for you by generic code.
You shouldn't need to define this hook at all.



r~



question on dwarf2 debug-frame.

2009-09-23 Thread IainS

Hello,

I have this scenario:
 using  "dwarfdump --debug-frame" on a very simple object generated
with current trunk.
I am trying to figure out (with the dwarf3 spec) whether the problem
is in the tool (dwarfdump), or in what we're emitting.

Can anyone more knowledgeable comment?

Iain.

--
 File: simplistic.o { mach32-i386 }
--

.debug_frame contents:

0x: CIE
length: 0x0010
CIE_id: 0x
   version: 0x01
  augmentation: ""
code_align: 1
data_align: -4
   ra_register: 0x08
  Initial Inst: DW_CFA_def_cfa (4, 4)
DW_CFA_offset (8, 0)
DW_CFA_nop
DW_CFA_nop
Init State: CFA( R4+4  )   R8=+0


0x0014: FDE
length: 0x0028
   CIE_pointer: 0x
start_addr: 0x
range_size: 0x0012
  Instructions: 0x: CFA( R4+4  )   R8=+0
DW_CFA_advance_loc4 (1)
DW_CFA_def_cfa_offset (8)
DW_CFA_offset (5, -8)
0x0001: CFA( R4+8  )   R5=-8   R8=+0
DW_CFA_advance_loc4 (2)
DW_CFA_def_cfa_register (5)
0x0003: CFA( R5+8  )   R5=-8   R8=+0
DW_CFA_advance_loc4 (14)
DW_CFA_restore (5)
Assertion failed: (reg_state_pos != cie->initial_state.regs.end()),  
function ParseInstructions, file /SourceCache/dwarf_utilities/ 
dwarf_utilities-49/source/DWARFDebugFrame.cpp, line 353.

Abort trap


the -save-temps -dA output for this is:

.section __DWARF,__debug_frame,regular,debug
Lframe0:
.set L$set$0,LECIE0-LSCIE0
.long L$set$0   # Length of Common Information Entry
LSCIE0:
.long   0x  # CIE Identifier Tag
.byte   0x1 # CIE Version
.ascii "\0"   # CIE Augmentation
.byte   0x1 # uleb128 0x1; CIE Code Alignment Factor
.byte   0x7c# sleb128 -4; CIE Data Alignment Factor
.byte   0x8 # CIE RA Column
.byte   0xc # DW_CFA_def_cfa
.byte   0x4 # uleb128 0x4
.byte   0x4 # uleb128 0x4
.byte   0x88# DW_CFA_offset, column 0x8
.byte   0x1 # uleb128 0x1
.align 2
LECIE0:
LSFDE0:
.set L$set$1,LEFDE0-LASFDE0
.long L$set$1   # FDE Length
LASFDE0:
.set L$set$2,Lframe0-Lsection__debug_frame
.long L$set$2   # FDE CIE offset
.long   LFB0# FDE initial location
.set L$set$3,LFE0-LFB0
.long L$set$3   # FDE address range
.byte   0x4 # DW_CFA_advance_loc4
.set L$set$4,LCFI0-LFB0
.long L$set$4
.byte   0xe # DW_CFA_def_cfa_offset
.byte   0x8 # uleb128 0x8
.byte   0x85# DW_CFA_offset, column 0x5
.byte   0x2 # uleb128 0x2
.byte   0x4 # DW_CFA_advance_loc4
.set L$set$5,LCFI1-LCFI0
.long L$set$5
.byte   0xd # DW_CFA_def_cfa_register
.byte   0x5 # uleb128 0x5
.byte   0x4 # DW_CFA_advance_loc4
.set L$set$6,LCFI3-LCFI1
.long L$set$6
.byte   0xc5# DW_CFA_restore, column 0x5
.byte   0xc # DW_CFA_def_cfa
.byte   0x4 # uleb128 0x4
.byte   0x4 # uleb128 0x4
.align 2
LEFDE0:



Why Ada always seems to want to devolve from ZCX back to SJLJ: the mystery explained [was Re: GNAT mysterious "missing stub for subunit" error. ]

2009-09-23 Thread Dave Korn
Dave Korn wrote:
> Eric Botcazou wrote:
>> Your .diff contains this
>>
>> +  EH_MECHANISM=-gcc
>>
>> so it looks as though the base compiler was SJLJ.
>
>   Ah, bingo!  Thanks Eric; yes, I have a recent build of an SJLJ Gnat from
> HEAD lying around my PATH ahead of my old 4.3.2-with-ZCX.  Getting that out of
> the way should help!

  And although it turns out that was the case, it didn't actually solve the
problem.  It turns out to be a horribly subtle artifact of this factor:

> switched it over to ZCX, and it worked well
> enough to pass most of the testsuite, including EH.  Now I'm changing the
> target pairs on top of that and suddenly it's complaining, which is why I'm
> confused; I thought that bit was stable.

  This was driving me mad, I had a perfectly working ZCX compiler but every
time I tried to change anything, it mysteriously switched itself back to SJLJ
for seemingly no reason at all and then failed building target-libada as a
consequence.  The thing was down to the particular way in which I was setting
the LIBGNAT_TARGET_PAIRS variable; because of the way Cygwin and MinGW share
most of their port implementation, I was doing this:

  LIBGNAT_TARGET_PAIRS = \
[ ... overrides only for mingw ... ]
  if (  ... target is cygwin ... )
# blank it out, no cygwin-only overrides yet
LIBGNAT_TARGET_PAIRS =
  endif
  LIBGNAT_TARGET_PAIRS += \
[ ... common overrides ... ]

  And the result of doing it this way was that LIBGNAT_TARGET_PAIRS ended up
with an embedded leading space.  This wouldn't have mattered much, except for
one little thing: later, in gcc-interface/Makefile.in, we have ...

ifeq ($(filter-out a-except%,$(LIBGNAT_TARGET_PAIRS)),$(LIBGNAT_TARGET_PAIRS))
  LIBGNAT_TARGET_PAIRS += \
a-except.ads

Re: question on dwarf2 debug-frame.

2009-09-23 Thread Richard Henderson

On 09/23/2009 11:00 AM, IainS wrote:

DW_CFA_restore (5)
Assertion failed: (reg_state_pos != cie->initial_state.regs.end()),
function ParseInstructions, file
/SourceCache/dwarf_utilities/dwarf_utilities-49/source/DWARFDebugFrame.cpp,
line 353.
Abort trap


There could be some confusion in DW_CFA_restore vs DW_CFA_same_value, 
though I don't know on whose side it is.  Certainly the existing 
consumers that I know treat a DW_CFA_restore for a register not 
mentioned by the CIE the same as "same_value".
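
A toy model of that consumer behaviour, in plain C.  This is neither
dwarfdump nor GCC code and every name in it is hypothetical; it only
shows the fallback to "same value" when the CIE's initial state has no
rule for the restored register, where the tool above asserts instead.

```c
/* Toy model, not dwarfdump or GCC code.  All names are hypothetical.  */
#define NUM_REGS 16

enum rule_kind { RULE_SAME_VALUE, RULE_OFFSET };

struct reg_rule {
    enum rule_kind kind;
    long offset;            /* only meaningful for RULE_OFFSET */
};

struct cie_state {
    int has_rule[NUM_REGS];             /* nonzero if the CIE set a rule */
    struct reg_rule rule[NUM_REGS];
};

/* Rule to apply on DW_CFA_restore: the CIE's initial rule if there is
   one, otherwise fall back to "same value" instead of asserting.  */
static struct reg_rule rule_after_restore(const struct cie_state *cie, int reg)
{
    struct reg_rule same = { RULE_SAME_VALUE, 0 };
    return cie->has_rule[reg] ? cie->rule[reg] : same;
}
```

In the dump quoted above the CIE only sets a rule for column 8, so a
restore of column 5 would take the fallback path.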



r~


Re: Why Ada always seems to want to devolve from ZCX back to SJLJ: the mystery explained [was Re: GNAT mysterious "missing stub for subunit" error. ]

2009-09-23 Thread Eric Botcazou
>   Is it just a bug for me to generate LIBGNAT_TARGET_PAIRS in a way that
> has superfluous spaces (whether leading, trailing or embedded), or shall I
> send a patch to add a $(strip) to the right-hand side of the ifeq
> comparison?  Or perhaps we should do
>
> LIBGNAT_TARGET_PAIRS:=$(strip $(LIBGNAT_TARGET_PAIRS))
>
> right at the top-level, just after the per-target chunks, to ensure the
> string is properly normalised before any further tests and comparisons we
> might want to make?

That indeed seems to be a good idea (with a little comment).

-- 
Eric Botcazou


Re: C++ support for decimal floating point

2009-09-23 Thread Janis Johnson
On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote:
> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson  wrote:
> > I've been implementing ISO/IEC TR 24733, "an extension for the
> > programming language C++ to support decimal floating-point arithmetic",
> > in GCC.  It might be ready as an experimental feature for 4.5, but I
> > would particularly like to get in the compiler changes that are needed
> > for it.
> >
> > Most of the support for the TR is in new header files in libstdc++ that
> > depend on compiler support for decimal float scalar types.  Most of that
> > compiler functionality was already available in G++ via mode attributes.
> > I've made a couple of small fixes and have a couple more to submit, and
> > when those are in I'll start running dfp tests for C++ as well as C.
> > The suitable tests have already been moved from gcc.dg to c-c++-common.
> >
> > In order to provide interoperability with C, people on the C++ ABI
> > mailing list suggested that a C++ compiler should recognize the new
> > decimal classes defined in the TR and pass arguments of those types the
> > same as scalar decimal float types for a particular target.  I had this
> > working in an ugly way using a langhook, but that broke with LTO.  I'm
> > looking for the right places to record that an argument or return value
> > should be passed as if it were a different type, but could use some
> > advice about that.
> 
> How do we (do we?) handle std::complex<> there?  My first shot would
> be to make sure the aggregate type has the proper mode, but I guess
> most target ABIs would already pass them in registers, no?

std::complex<> is not interoperable with GCC's complex extension, which
is generally viewed as "unfortunate".

The class types for std::decimal::decimal32 and friends do have the
proper modes.  I suppose I could special-case aggregates of those modes
but the plan was to pass these particular classes (and typedefs of
them) the same as scalars, rather than _any_ class with those modes.
I'll bring this up again on the C++ ABI mailing list.

Perhaps most target ABIs pass single-member aggregates using the
mode of the aggregate, but not all.  In particular, not the 32-bit
ELF ABI for Power.

Janis
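
To make the point about single-member aggregates concrete, here is a minimal C++ sketch.  The class wrapped_scalar is hypothetical and uses float only as a portable stand-in for a decimal float scalar; it is not the TR 24733 class.  It shows that the properties the language can check (size, alignment, trivial copyability) may match the scalar exactly while saying nothing about how the class is passed, which is decided by the target ABI.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a TR 24733 class: a class with one scalar
// data member.  float is used only because it is portable.
class wrapped_scalar {
public:
    explicit wrapped_scalar(float v) : value_(v) {}
    float get() const { return value_; }
private:
    float value_;
};

// Properties the language can check.  They hold on common targets but
// do not determine the calling convention.
static_assert(sizeof(wrapped_scalar) == sizeof(float), "same size");
static_assert(alignof(wrapped_scalar) == alignof(float), "same alignment");
static_assert(std::is_trivially_copyable<wrapped_scalar>::value,
              "bitwise copyable");

// Reads the object representation back as the underlying scalar.
float bits_as_scalar(const wrapped_scalar& w) {
    float f;
    std::memcpy(&f, &w, sizeof f);
    return f;
}

// Small self-check of the sketch.
void check_wrapped_scalar() {
    wrapped_scalar w(1.5f);
    assert(bits_as_scalar(w) == w.get());
}
```

Whether such a class travels in the same register as a bare scalar is exactly what differs between ABIs, so the compiler has to be told explicitly when the two must agree.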





Re: C++ support for decimal floating point

2009-09-23 Thread Richard Henderson

On 09/23/2009 02:11 PM, Janis Johnson wrote:

> The class types for std::decimal::decimal32 and friends do have the
> proper modes.  I suppose I could special-case aggregates of those modes
> but the plan was to pass these particular classes (and typedefs of
> them) the same as scalars, rather than _any_ class with those modes.
> I'll bring this up again on the C++ ABI mailing list.


You could special-case this in the C++ conversion to GENERIC
by having the std::decimal classes decompose to scalars immediately.


Re: C++ support for decimal floating point

2009-09-23 Thread Gabriel Dos Reis
On Wed, Sep 23, 2009 at 4:11 PM, Janis Johnson  wrote:
> On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote:
>> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson  wrote:
>> > I've been implementing ISO/IEC TR 24733, "an extension for the
>> > programming language C++ to support decimal floating-point arithmetic",
>> > in GCC.  It might be ready as an experimental feature for 4.5, but I
>> > would particularly like to get in the compiler changes that are needed
>> > for it.
>> >
>> > Most of the support for the TR is in new header files in libstdc++ that
>> > depend on compiler support for decimal float scalar types.  Most of that
>> > compiler functionality was already available in G++ via mode attributes.
>> > I've made a couple of small fixes and have a couple more to submit, and
>> > when those are in I'll start running dfp tests for C++ as well as C.
>> > The suitable tests have already been moved from gcc.dg to c-c++-common.
>> >
>> > In order to provide interoperability with C, people on the C++ ABI
>> > mailing list suggested that a C++ compiler should recognize the new
>> > decimal classes defined in the TR and pass arguments of those types the
>> > same as scalar decimal float types for a particular target.  I had this
>> > working in an ugly way using a langhook, but that broke with LTO.  I'm
>> > looking for the right places to record that an argument or return value
>> > should be passed as if it were a different type, but could use some
>> > advice about that.
>>
>> How do we (do we?) handle std::complex<> there?  My first shot would
>> be to make sure the aggregate type has the proper mode, but I guess
>> most target ABIs would already pass them in registers, no?
>
> std::complex<> is not interoperable with GCC's complex extension, which
> is generally viewed as "unfortunate".

Could you expand on why std::complex<> is not interoperable with GCC's
complex extension?  I would like to understand better where the
incompatibilities come from -- I've tried to remove any.

>
> The class types for std::decimal::decimal32 and friends do have the
> proper modes.  I suppose I could special-case aggregates of those modes
> but the plan was to pass these particular classes (and typedefs of
> them) the same as scalars, rather than _any_ class with those modes.
> I'll bring this up again on the C++ ABI mailing list.

I introduced the notion of 'literal types' in C++0x precisely so that
compilers can pretend that user-defined types are like builtin types
and provide appropriate support.  decimal types are literal types.  So
are std::complex<T> for T = builtin arithmetic types.

>
> Perhaps most target ABIs pass single-member aggregates using the
> mode of the aggregate, but not all.  In particular, not the 32-bit
> ELF ABI for Power.
>
> Janis
>
>
>
>


Re: Why Ada always seems to want to devolve from ZCX back to SJLJ: the mystery explained [was Re: GNAT mysterious "missing stub for subunit" error. ]

2009-09-23 Thread Dave Korn
Eric Botcazou wrote:
>>   Is it just a bug for me to generate LIBGNAT_TARGET_PAIRS in a way that
>> has superfluous spaces (whether leading, trailing or embedded), or shall I
>> send a patch to add a $(strip) to the right-hand side of the ifeq
>> comparison?  Or perhaps we should do
>>
>> LIBGNAT_TARGET_PAIRS:=$(strip $(LIBGNAT_TARGET_PAIRS))
>>
>> right at the top-level, just after the per-target chunks, to ensure the
>> string is properly normalised before any further tests and comparisons we
>> might want to make?
> 
> That indeed seems to be a good idea (with a little comment).
> 

  Actually, the test logic is kinda backwards.  We want to know if
LIBGNAT_TARGET_PAIRS contains anything matching a certain pattern, so we
remove anything matching that pattern and then see if the string has changed
or not?  It would seem a bit more direct and to-the-point to just have used
$(filter) instead of $(filter-out) and compare against an empty string, and
that way would have been robust in the face of whitespace changes.  Maybe I'll
rewrite that test as well in the patch.

cheers,
  DaveK
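
The whitespace problem can be sketched outside of make.  The C++ below is only a model, assuming (as the discussion implies) that make's text functions split on whitespace and return words joined by single spaces, and treating a "prefix%" pattern as a plain prefix match.  All function names are made up for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split on whitespace, the way make breaks a variable into words.
std::vector<std::string> words(const std::string& s) {
    std::istringstream in(s);
    std::vector<std::string> out;
    for (std::string w; in >> w;) out.push_back(w);
    return out;
}

// Hypothetical stand-in for a "prefix%" pattern: match on the prefix.
bool has_prefix(const std::string& w, const std::string& prefix) {
    return w.compare(0, prefix.size(), prefix) == 0;
}

// Keep the words that match (keep_matches) or that do not match, and
// join the survivors with single spaces.
std::string select(const std::string& prefix, const std::string& text,
                   bool keep_matches) {
    std::string out;
    for (const auto& w : words(text)) {
        if (has_prefix(w, prefix) != keep_matches) continue;
        if (!out.empty()) out += ' ';
        out += w;
    }
    return out;
}

// Existing style of test: nothing matched if filter-out changes nothing.
bool no_match_via_filter_out(const std::string& prefix,
                             const std::string& text) {
    return select(prefix, text, false) == text;
}

// Proposed style of test: nothing matched if filter returns nothing.
bool no_match_via_filter(const std::string& prefix,
                         const std::string& text) {
    return select(prefix, text, true).empty();
}

// Small self-check: a doubled space fools only the filter-out version.
void check_filter_model() {
    const std::string spaced = "a-foo.adb  a-bar.adb";
    assert(!no_match_via_filter_out("a-except", spaced));
    assert(no_match_via_filter("a-except", spaced));
}
```

In this model the filter-out comparison fails on a doubled space even though no word was removed, while the emptiness test is unaffected.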


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Sriraman Tallam
Hi Richard,

 I finally got around to getting the data you wanted. Thanks for
the response. Please find my comments below.


On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
 wrote:
> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>> Hi,
>>
>>Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>
> The patch misses testcases.

Added.

> Why does zee run after register allocation?
> Your examples suggest that it will free hard registers so doing it before
> regalloc looks odd.

Originally, I had written this patch to have ZEE run before IRA.
However, I noticed that IRA generates poorer code when my patch is
turned on.

Here is an example of how badly RA can hurt. I show a piece of code
around a zero-extend that got eliminated. The second listing is the
code after eliminating zero-extends. The code is pretty much the same
except for the extra move marked with a comment. IRA is not able to
coalesce %esi and %r15d.

Base line :

48b760: imul   $0x9e406cb5,%r15d,%esi
48b767: mov    %rax,%rcx
48b76a: shr    $0x12,%esi
48b76d: and    %r12d,%esi
48b770: mov    %edi,%eax
48b772: add    $0x1,%edi
48b775: shr    $0x5,%eax
48b778: mov    %eax,%eax                # redundant zero extend
48b77a: lea    (%rcx,%rax,1),%rax
48b77e: cmp    %rax,%r9


-fzee :

48b7d0: imul   $0x9e406cb5,%r15d,%r15d  # The destination should have
                                        # just been esi.
48b7d7: mov    %rax,%rcx
48b7da: shr    $0x12,%r15d
48b7de: mov    %r15d,%esi               # This move is useless if r15d and
                                        # esi can be coalesced into esi.
48b7e1: and    %r12d,%esi
48b7e4: mov    %edi,%eax
48b7e6: add    $0x1,%edi
48b7e9: shr    $0x5,%eax
                                        # Ok, zero-extend eliminated.
48b7ec: lea    (%rcx,%rax,1),%rax
48b7f0: cmp    %rax,%r9

Running ZEE after IRA preserves code quality, and the useless extension
still gets removed.

>
> What is the compile-time impact of your patch on say, gcc bootstrap?
> How many percent of instructions are removed as useless zero-extensions
> during gcc bootstrap?  How much do CSiBE numbers improve?

CSiBE numbers :

Total number of zero-extension instructions before : 667.
Total number of zero-extension instructions after   : 122.
Performance : no measurable impact.

GCC bootstrap :

Total number of zero-extension instructions before  : 1456
Total number of zero-extension instructions after:  5814
No impact on boot-strap time.


I have attached the latest patch :


On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
 wrote:
> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>> Hi,
>>
>>    Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>
> The patch misses testcases.  Why does zee run after register allocation?
> Your examples suggest that it will free hard registers so doing it before
> regalloc looks odd.
>
> What is the compile-time impact of your patch on say, gcc bootstrap?
> How many percent of instructions are removed as useless zero-extensions
> during gcc bootstrap?  How much do CSiBE numbers improve?
>
> Thanks,
> Richard.
>
>>
>> Problem Description :
>> -
>>
>> This pass is intended to be applicable only to targets that implicitly
>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>> For instance, x86_64 zero-extends the upper bits of a register
>> implicitly whenever an instruction writes to its lower 32-bit half.
>> For example, the instruction *add edi,eax* also zero-extends the upper
>> 32-bits of rax after doing the addition.  These zero extensions come
>> for free and GCC does not always exploit this well.  That is, it has
>> been observed that there are plenty of cases where GCC explicitly
>> zero-extends registers for x86_64 that are actually useless because
>> these registers were already implicitly zero-extended in a prior
>> instruction.  This pass tries to eliminate such useless zero extension
>> instructions.
>>
>> Motivating Example I :
>> --
>> For this program :
>> **
>> bad_code.c
>>
>> int mask[1000];
>>
>> int foo(unsigned x)
>> {
>>  if (x < 10)
>>    x = x * 45;
>>  else
>>    x = x * 78;
>>  return mask[x];
>> }
>> **
>>
>> $ gcc -O2 bad_code.c
>>  
>>  400315:       b8 4e 00 00 00            mov    $0x4e,%eax
>>  40031a:       0f af f8                        imul   %eax,%edi
>>  40031d:       89 ff                             mov    %edi,%edi
>> ---> Useless zero extend.
>>  40031f:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>  400326:       c3                                 retq
>>  ..
>>  400330:       ba 2d 00 00 00          mov    $0x2d,%edx
>>  400335:       0f af fa                      imul   

Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread H.J. Lu
On Sat, Aug 8, 2009 at 2:59 PM, Sriraman Tallam  wrote:
> Hi,
>
>    Here is a patch to eliminate redundant zero-extension instructions
> on x86_64.
>
> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
> that the results are the same with/without this patch.
>
>
> Problem Description :
> -
>
> This pass is intended to be applicable only to targets that implicitly
> zero-extend 64-bit registers after writing to their lower 32-bit half.
> For instance, x86_64 zero-extends the upper bits of a register
> implicitly whenever an instruction writes to its lower 32-bit half.
> For example, the instruction *add edi,eax* also zero-extends the upper
> 32-bits of rax after doing the addition.  These zero extensions come
> for free and GCC does not always exploit this well.  That is, it has
> been observed that there are plenty of cases where GCC explicitly
> zero-extends registers for x86_64 that are actually useless because
> these registers were already implicitly zero-extended in a prior
> instruction.  This pass tries to eliminate such useless zero extension
> instructions.
>

Does this fix:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17387
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34653

-- 
H.J.


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Ramana Radhakrishnan
>
> GCC bootstrap :
>
> Total number of zero-extension instructions before  : 1456
> Total number of zero-extension instructions after    :  5814
> No impact on boot-strap time.


You sure you have these numbers the right way around ? Shouldn't the
number of zero-extension instructions after the patch be less than the
number of zero-extension instructions before or is this a regression
?

Thanks,
Ramana

>
>
> I have attached the latest patch :
>
>
> On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
>  wrote:
>> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>>> Hi,
>>>
>>>    Here is a patch to eliminate redundant zero-extension instructions
>>> on x86_64.
>>>
>>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>>> that the results are the same with/without this patch.
>>
>> The patch misses testcases.  Why does zee run after register allocation?
>> Your examples suggest that it will free hard registers so doing it before
>> regalloc looks odd.
>>
>> What is the compile-time impact of your patch on say, gcc bootstrap?
>> How many percent of instructions are removed as useless zero-extensions
>> during gcc bootstrap?  How much do CSiBE numbers improve?
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Problem Description :
>>> -
>>>
>>> This pass is intended to be applicable only to targets that implicitly
>>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>>> For instance, x86_64 zero-extends the upper bits of a register
>>> implicitly whenever an instruction writes to its lower 32-bit half.
>>> For example, the instruction *add edi,eax* also zero-extends the upper
>>> 32-bits of rax after doing the addition.  These zero extensions come
>>> for free and GCC does not always exploit this well.  That is, it has
>>> been observed that there are plenty of cases where GCC explicitly
>>> zero-extends registers for x86_64 that are actually useless because
>>> these registers were already implicitly zero-extended in a prior
>>> instruction.  This pass tries to eliminate such useless zero extension
>>> instructions.
>>>
>>> Motivating Example I :
>>> --
>>> For this program :
>>> **
>>> bad_code.c
>>>
>>> int mask[1000];
>>>
>>> int foo(unsigned x)
>>> {
>>>  if (x < 10)
>>>    x = x * 45;
>>>  else
>>>    x = x * 78;
>>>  return mask[x];
>>> }
>>> **
>>>
>>> $ gcc -O2 bad_code.c
>>>  
>>>  400315:       b8 4e 00 00 00            mov    $0x4e,%eax
>>>  40031a:       0f af f8                        imul   %eax,%edi
>>>  40031d:       89 ff                             mov    %edi,%edi
>>> ---> Useless zero extend.
>>>  40031f:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>  400326:       c3                                 retq
>>>  ..
>>>  400330:       ba 2d 00 00 00          mov    $0x2d,%edx
>>>  400335:       0f af fa                      imul   %edx,%edi
>>>  400338:       89 ff                           mov    %edi,%edi  --->
>>> Useless zero extend.
>>>  40033a:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>  400341:       c3                      retq
>>>
>>> $ gcc -O2 -fzee bad_code.c
>>>  ..
>>>  400315:       6b ff 4e                imul   $0x4e,%edi,%edi
>>>  400318:       8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>  40031f:       c3                      retq
>>>  400320:       6b ff 2d                imul   $0x2d,%edi,%edi
>>>  400323:       8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>  40032a:       c3                      retq
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Sriraman M Tallam.
>>> Google, Inc.
>>> tmsri...@google.com
>>>
>>
>


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Sriraman Tallam
Sorry, it is the other way around.

Total number of zero-extension instructions before  :  5814
Total number of zero-extension instructions after   :  1456

Thanks for pointing it out.

On Wed, Sep 23, 2009 at 4:10 PM, Ramana Radhakrishnan
 wrote:
>>
>> GCC bootstrap :
>>
>> Total number of zero-extension instructions before  : 1456
>> Total number of zero-extension instructions after    :  5814
>> No impact on boot-strap time.
>
>
> You sure you have these numbers the right way around ? Shouldn't the
> number of zero-extension instructions after the patch be less than the
> number of zero-extension instructions before or is this a regression
> ?
>
> Thanks,
> Ramana
>
>>
>>
>> I have attached the latest patch :
>>
>>
>> On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
>>  wrote:
>>> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam wrote:
>>>> Hi,
>>>>
>>>>    Here is a patch to eliminate redundant zero-extension instructions
>>>> on x86_64.
>>>>
>>>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>>>> that the results are the same with/without this patch.
>>>
>>> The patch misses testcases.  Why does zee run after register allocation?
>>> Your examples suggest that it will free hard registers so doing it before
>>> regalloc looks odd.
>>>
>>> What is the compile-time impact of your patch on say, gcc bootstrap?
>>> How many percent of instructions are removed as useless zero-extensions
>>> during gcc bootstrap?  How much do CSiBE numbers improve?
>>>
>>> Thanks,
>>> Richard.
>>>

>>>> Problem Description :
>>>> -
>>>>
>>>> This pass is intended to be applicable only to targets that implicitly
>>>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>>>> For instance, x86_64 zero-extends the upper bits of a register
>>>> implicitly whenever an instruction writes to its lower 32-bit half.
>>>> For example, the instruction *add edi,eax* also zero-extends the upper
>>>> 32-bits of rax after doing the addition.  These zero extensions come
>>>> for free and GCC does not always exploit this well.  That is, it has
>>>> been observed that there are plenty of cases where GCC explicitly
>>>> zero-extends registers for x86_64 that are actually useless because
>>>> these registers were already implicitly zero-extended in a prior
>>>> instruction.  This pass tries to eliminate such useless zero extension
>>>> instructions.
>>>>
>>>> Motivating Example I :
>>>> --
>>>> For this program :
>>>> **
>>>> bad_code.c
>>>>
>>>> int mask[1000];
>>>>
>>>> int foo(unsigned x)
>>>> {
>>>>  if (x < 10)
>>>>    x = x * 45;
>>>>  else
>>>>    x = x * 78;
>>>>  return mask[x];
>>>> }
>>>> **
>>>>
>>>> $ gcc -O2 bad_code.c
>>>>
>>>>  400315:       b8 4e 00 00 00          mov    $0x4e,%eax
>>>>  40031a:       0f af f8                imul   %eax,%edi
>>>>  40031d:       89 ff                   mov    %edi,%edi
>>>> ---> Useless zero extend.
>>>>  40031f:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>>  400326:       c3                      retq
>>>>  ..
>>>>  400330:       ba 2d 00 00 00          mov    $0x2d,%edx
>>>>  400335:       0f af fa                imul   %edx,%edi
>>>>  400338:       89 ff                   mov    %edi,%edi  --->
>>>> Useless zero extend.
>>>>  40033a:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>>  400341:       c3                      retq
>>>>
>>>> $ gcc -O2 -fzee bad_code.c
>>>>  ..
>>>>  400315:       6b ff 4e                imul   $0x4e,%edi,%edi
>>>>  400318:       8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>>  40031f:       c3                      retq
>>>>  400320:       6b ff 2d                imul   $0x2d,%edi,%edi
>>>>  400323:       8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>>  40032a:       c3                      retq
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Sriraman M Tallam.
>>>> Google, Inc.
>>>> tmsri...@google.com

>>>
>>
>


Re: C++ support for decimal floating point

2009-09-23 Thread Janis Johnson
On Wed, 2009-09-23 at 16:27 -0500, Gabriel Dos Reis wrote:
> On Wed, Sep 23, 2009 at 4:11 PM, Janis Johnson  wrote:
> > On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote:
> >> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson  wrote:
> >> > I've been implementing ISO/IEC TR 24733, "an extension for the
> >> > programming language C++ to support decimal floating-point arithmetic",
> >> > in GCC.  It might be ready as an experimental feature for 4.5, but I
> >> > would particularly like to get in the compiler changes that are needed
> >> > for it.
> >> >
> >> > Most of the support for the TR is in new header files in libstdc++ that
> >> > depend on compiler support for decimal float scalar types.  Most of that
> >> > compiler functionality was already available in G++ via mode attributes.
> >> > I've made a couple of small fixes and have a couple more to submit, and
> >> > when those are in I'll start running dfp tests for C++ as well as C.
> >> > The suitable tests have already been moved from gcc.dg to c-c++-common.
> >> >
> >> > In order to provide interoperability with C, people on the C++ ABI
> >> > mailing list suggested that a C++ compiler should recognize the new
> >> > decimal classes defined in the TR and pass arguments of those types the
> >> > same as scalar decimal float types for a particular target.  I had this
> >> > working in an ugly way using a langhook, but that broke with LTO.  I'm
> >> > looking for the right places to record that an argument or return value
> >> > should be passed as if it were a different type, but could use some
> >> > advice about that.
> >>
> >> How do we (do we?) handle std::complex<> there?  My first shot would
> >> be to make sure the aggregate type has the proper mode, but I guess
> >> most target ABIs would already pass them in registers, no?
> >
> > std::complex<> is not interoperable with GCC's complex extension, which
> > is generally viewed as "unfortunate".
> 
> Could you expand on why std::complex<> is not interoperable with GCC's
> complex extension?  I would like to understand better where the
> incompatibilities come from -- I've tried to remove any.

I was just repeating what I had heard from C++ experts.  On
powerpc-linux they are currently passed and mangled differently.

> > The class types for std::decimal::decimal32 and friends do have the
> > proper modes.  I suppose I could special-case aggregates of those modes
> > but the plan was to pass these particular classes (and typedefs of
> > them) the same as scalars, rather than _any_ class with those modes.
> > I'll bring this up again on the C++ ABI mailing list.
> 
> I introduced the notion of 'literal types' in C++0x precisely so that
> compilers can pretend that user-defined types are like builtin types
> and provide appropriate support.  decimal types are literal types.  So
> are std::complex<T> for T = builtin arithmetic types.

I'm looking at these now.

> > Perhaps most target ABIs pass single-member aggregates using the
> > mode of the aggregate, but not all.  In particular, not the 32-bit
> > ELF ABI for Power.
> >
> > Janis
> >



Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Sriraman Tallam
On Wed, Sep 23, 2009 at 3:57 PM, H.J. Lu  wrote:
> On Sat, Aug 8, 2009 at 2:59 PM, Sriraman Tallam  wrote:
>> Hi,
>>
>>    Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>>
>>
>> Problem Description :
>> -
>>
>> This pass is intended to be applicable only to targets that implicitly
>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>> For instance, x86_64 zero-extends the upper bits of a register
>> implicitly whenever an instruction writes to its lower 32-bit half.
>> For example, the instruction *add edi,eax* also zero-extends the upper
>> 32-bits of rax after doing the addition.  These zero extensions come
>> for free and GCC does not always exploit this well.  That is, it has
>> been observed that there are plenty of cases where GCC explicitly
>> zero-extends registers for x86_64 that are actually useless because
>> these registers were already implicitly zero-extended in a prior
>> instruction.  This pass tries to eliminate such useless zero extension
>> instructions.
>>
>
> Does this fix:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17387

Yes, this patch fixes this problem. All the mov %eax, %eax are removed.
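
Why such a move is removable can be sketched with a tiny model of the register rule from the problem description.  This is only an illustration in C++; the function names are made up, and the real pass works on RTL, not on values like this.

```cpp
#include <cassert>
#include <cstdint>

// Model of the x86_64 rule: writing a 32-bit result to the low half of
// a 64-bit register clears the upper 32 bits, whatever they held before.
std::uint64_t write_low32(std::uint64_t /*old_value*/, std::uint32_t result) {
    return static_cast<std::uint64_t>(result);
}

// Model of an explicit zero extension such as "mov %eax,%eax".
std::uint64_t zero_extend32(std::uint64_t reg) {
    return reg & 0xffffffffULL;
}

// An explicit extension that directly follows a 32-bit write leaves
// the register unchanged, so it can be dropped.
bool extension_is_redundant(std::uint64_t old_value, std::uint32_t result) {
    const std::uint64_t after_write = write_low32(old_value, result);
    return zero_extend32(after_write) == after_write;
}

// Small self-check of the sketch.
void check_zero_extension_model() {
    assert(extension_is_redundant(0xffffffffffffffffULL, 0x12345678u));
}
```

The hard part for the pass is not this identity but proving that the defining instruction really was a 32-bit write on every path that reaches the extension.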


> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34653

No, this patch does not fix this problem.

>
> --
> H.J.
>


Re: C++ support for decimal floating point

2009-09-23 Thread Gabriel Dos Reis
On Wed, Sep 23, 2009 at 6:23 PM, Janis Johnson  wrote:
> On Wed, 2009-09-23 at 16:27 -0500, Gabriel Dos Reis wrote:
>> On Wed, Sep 23, 2009 at 4:11 PM, Janis Johnson  wrote:
>> > On Wed, 2009-09-23 at 10:29 +0200, Richard Guenther wrote:
>> >> On Wed, Sep 23, 2009 at 2:38 AM, Janis Johnson  
>> >> wrote:
>> >> > I've been implementing ISO/IEC TR 24733, "an extension for the
>> >> > programming language C++ to support decimal floating-point arithmetic",
>> >> > in GCC.  It might be ready as an experimental feature for 4.5, but I
>> >> > would particularly like to get in the compiler changes that are needed
>> >> > for it.
>> >> >
>> >> > Most of the support for the TR is in new header files in libstdc++ that
>> >> > depend on compiler support for decimal float scalar types.  Most of that
>> >> > compiler functionality was already available in G++ via mode attributes.
>> >> > I've made a couple of small fixes and have a couple more to submit, and
>> >> > when those are in I'll start running dfp tests for C++ as well as C.
>> >> > The suitable tests have already been moved from gcc.dg to c-c++-common.
>> >> >
>> >> > In order to provide interoperability with C, people on the C++ ABI
>> >> > mailing list suggested that a C++ compiler should recognize the new
>> >> > decimal classes defined in the TR and pass arguments of those types the
>> >> > same as scalar decimal float types for a particular target.  I had this
>> >> > working in an ugly way using a langhook, but that broke with LTO.  I'm
>> >> > looking for the right places to record that an argument or return value
>> >> > should be passed as if it were a different type, but could use some
>> >> > advice about that.
>> >>
>> >> How do we (do we?) handle std::complex<> there?  My first shot would
>> >> be to make sure the aggregate type has the proper mode, but I guess
>> >> most target ABIs would already pass them in registers, no?
>> >
>> > std::complex<> is not interoperable with GCC's complex extension, which
>> > is generally viewed as "unfortunate".
>>
>> Could you expand on why std::complex<> is not interoperable with GCC's
>> complex extension?  I would like to understand better where the
>> incompatibilities come from -- I've tried to remove any.
>
> I was just repeating what I had heard from C++ experts.  On
> powerpc-linux they are currently passed and mangled differently.

I've been careful not to define a copy constructor or a destructor
for the specializations of std::complex so that they get treated as PODs,
in the hope that the compiler will do the right thing.  At least on
my x86-64 box running openSUSE, I don't see a difference.  I've also
left the copy-assignment operator to the discretion of the compiler:

  // The compiler knows how to do this efficiently
  // complex& operator=(const complex&);

So, if there is any difference on powerpc-*-linux, it should be blamed on
a poor ABI choice rather than on anything intrinsic to std::complex (or C++).
Where possible, we should look into how to fix that.

In many ways, it is assumed that std::complex is isomorphic to the
GNU extension.

-- Gaby
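
The layout side of that assumption can be checked from within the language.  The sketch below is illustrative only and its function names are made up.  It relies on the array-oriented access that C++11 and later guarantee for std::complex<double>; whether the type is also trivially copyable is reported rather than asserted, since older standards do not promise it.  None of this shows how a given ABI passes the type.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Follows from the array-oriented access rule: two doubles, no padding.
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "complex<double> is laid out as two doubles");

// Expected to be true when no copy constructor or destructor is
// user-defined, as described above; reported, not asserted.
constexpr bool complex_is_trivially_copyable() {
    return std::is_trivially_copyable<std::complex<double>>::value;
}

// Views the object as double[2] and compares with real() and imag().
bool array_view_matches(double re, double im) {
    std::complex<double> z(re, im);
    double (&parts)[2] = reinterpret_cast<double (&)[2]>(z);
    return parts[0] == z.real() && parts[1] == z.imag();
}

// Small self-check of the sketch.
void check_complex_layout() {
    assert(array_view_matches(3.0, -4.0));
}
```

Matching layout is necessary for interoperability with the GNU extension but not sufficient; argument passing and mangling are still up to the target and the C++ ABI.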


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Paolo Bonzini

On 08/08/2009 11:59 PM, Sriraman Tallam wrote:

> Hi,
>
>  Here is a patch to eliminate redundant zero-extension instructions
> on x86_64.


The code looks nice!  However, since it is very specific to x86 (and x86 
patterns), I'd rather see it in the i386 machine-dependent reorg pass.


Thanks!

Paolo


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Ian Lance Taylor
Paolo Bonzini  writes:

> On 08/08/2009 11:59 PM, Sriraman Tallam wrote:
>> Hi,
>>
>>  Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>
> The code looks nice!  However, since it is very specific to x86 (and
> x86 patterns), I'd rather see it in the i386 machine-dependent reorg
> pass.

I don't agree with this.  If we want this code to be x86_64 specific,
then it should be done by having the i386 backend add the pass to the
pass manager, much as plugins can add a pass.  Adding stuff to
md-reorg is a step backward.

In any case it seems to me that this pass should run before regrename
and sched2.

Ian


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Paolo Bonzini

On 09/24/2009 08:14 AM, Ian Lance Taylor wrote:

> I don't agree with this.  If we want this code to be x86_64 specific,
> then it should be done by having the i386 backend add the pass to the
> pass manager, much as plugins can add a pass.  Adding stuff to
> md-reorg is a step backward.


That's true.  However, time is ticking for 4.5 and this could be a 
decent interim solution while for 4.6 the appropriate hooks could be added.


I proposed md-reorg only because the patch does not include any special 
data-flow.


Paolo


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Ian Lance Taylor
Paolo Bonzini  writes:

> On 09/24/2009 08:14 AM, Ian Lance Taylor wrote:
>> I don't agree with this.  If we want this code to be x86_64 specific,
>> then it should be done by having the i386 backend add the pass to the
>> pass manager, much as plugins can add a pass.  Adding stuff to
>> md-reorg is a step backward.
>
> That's true.  However, time is ticking for 4.5 and this could be a
> decent interim solution while for 4.6 the appropriate hooks could be
> added.

We already have the hooks, they have just been stuck in plugin.c when
they should really be in the generic backend.  See register_pass.

(Sigh, every time I looked at this I said "the pass control has to be
generic" but it still wound up in plugin.c.)

Ian


Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)

2009-09-23 Thread Paolo Bonzini

On 09/24/2009 08:24 AM, Ian Lance Taylor wrote:

> We already have the hooks, they have just been stuck in plugin.c when
> they should really be in the generic backend.  See register_pass.
>
> (Sigh, every time I looked at this I said "the pass control has to be
> generic" but it still wound up in plugin.c.)


Then I'll rephrase and say only that the pass should be in config/i386/.

Paolo