Re: Experimental Patchwork setup

2010-06-10 Thread Paolo Bonzini

On 06/10/2010 06:28 AM, Jeremy Kerr wrote:

Hi Paolo,


The hash would be different for git diff and svn diff due to the
different headers.


The headers are not included in the hash. However, the filenames will need to
be the same - patchwork expects '-p1' patches, but normalises the top-level
directory.

For example, at http://patchwork.ozlabs.org/patch/55140/

--- gcc/config/rs6000/e500.h(revision 160245)
+++ gcc/config/rs6000/e500.h(working copy)

The parser normalises this to:

--- a/config/rs6000/e500.h
+++ b/config/rs6000/e500.h

which may or may not be what you want here (svn outputs -p0?).


svn outputs relative to where you invoke "svn diff" so it can be -p0 but 
also "-p minus something" (i.e. -p0 _and_ you have to invoke patch from 
the right point in the tree).  But this is fine, I think, coupled maybe 
with some magic in the hook.


However, it never emits -p1.  What does the parser do for

--- configure(revision blah)
+++ configure(working copy)

?  Maybe for svn patches (detected through the === line) it's better 
to *prepend* a/ b/ instead of replacing it...
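
The two options discussed above (replace the top-level directory, or prepend a/ and b/ to it) can be sketched as follows. This is only an illustration in C, not patchwork's parser; the function names replace_top and prepend_top are hypothetical, and the fallback for a path with no directory part is an assumption, since the thread leaves that case open.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace the first path component with PREFIX ("a" or "b").
   Assumption: a path without any '/' is handled by prepending instead. */
int replace_top(const char *path, const char *prefix, char *out, size_t n)
{
    assert(path && prefix && out);
    const char *slash = strchr(path, '/');
    const char *rest = slash ? slash + 1 : path;
    int len = snprintf(out, n, "%s/%s", prefix, rest);
    return (len >= 0 && (size_t)len < n) ? 0 : -1;   /* -1 on truncation */
}

/* Keep the whole path and put PREFIX in front of it. */
int prepend_top(const char *path, const char *prefix, char *out, size_t n)
{
    assert(path && prefix && out);
    int len = snprintf(out, n, "%s/%s", prefix, path);
    return (len >= 0 && (size_t)len < n) ? 0 : -1;   /* -1 on truncation */
}
```

With these, "gcc/config/rs6000/e500.h" becomes "a/config/rs6000/e500.h" under replace_top, while prepend_top turns "configure" into "a/configure" and keeps every original component.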



The only difficulty is that the parser does not specify an order for the files
in a patch; I believe git orders the file changes alphabetically by filename,
but svn does not. This may cause different hashes.


Yes, that's basically why git users likely will have to put in place 
their own post-commit hook.


Thanks for the information!

Paolo


alpha-dec-osf5.1 4.5 built/installed

2010-06-10 Thread Jay K

per http://gcc.gnu.org/install/finalinstall.html

Built/installed 4.5 on alpha-dec-osf.

alphaev67-dec-osf5.1

bash-4.1$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/jayk/libexec/gcc/alphaev67-dec-osf5.1/4.5.0/lto-wrapper
Target: alphaev67-dec-osf5.1
Configured with: /home/jayk/src/gcc-4.5.0/configure -disable-nls 
-prefix=/home/jayk
Thread model: posix
gcc version 4.5.0 (GCC) 

C and C++
Though I meant to let it do "all".

This isn't a "normal modern" 5.1, like 5.1a or 5.1b, but is old 5.1 rev 732.

Not easy to get the prerequisites for running tests: 
http://gcc.gnu.org/ml/gcc-testresults/2010-06/msg00967.html
More followup still to do.

and I only ran make check in the gcc directory, to skip gmp/mpfr/mpc.


 - Jay
  


internal compiler error in elim_reg_cond

2010-06-10 Thread Boris Boesler
I get an internal compiler error with gcc-4.2.1 and my own back-end
when I support conditional execution:

../build/gcc/cc1 -Wall -O1 -o bug.O1.s bug.c

bug.c: In function ‘cond_assign_les0’:
bug.c:13: internal compiler error: in elim_reg_cond, at flow.c:3486


The test C file "bug.c" is:

int cond_assign_les0(int cond, int a, int b)
{
  int res;

  if(cond <= 0) {
    res = a;
  }
  else {
    res = b;
  }
  return(res);
}

int main(int argc, char **argv)
{
  int res = cond_assign_les0(1, 0, 1);
  return(res);
}


My "research" so far:
* This error appeared in older gcc versions and is marked as fixed.
* Support for conditional execution works for EQ and NE, but not for
  the other kinds of condition, eg LE
* The ARM back-end works on the test code above without problems, and
  I implemented support for conditional execution in my back-end the
  same way.
* The RTX code x in elim_reg_cond is "UnKnown" (0), regno=48 is the
  first virtual (not hardware) register (see trace below)


The backtrace is:

#0  elim_reg_cond (x=0x140b4ad90, regno=48) at ../../gcc-4.2.1/gcc/flow.c:3486
#1  0x0001001bc58f in flush_reg_cond_reg_1 (node=0x1406092c0, data=) at 
../../gcc-4.2.1/gcc/flow.c:3192
#2  0x00010034fa19 in splay_tree_foreach_helper (sp=0x140608e50, 
node=0x1406092c0, fn=0x1001bc530 , data=0x7fff5fbfe4b0) 
at ../../gcc-4.2.1/libiberty/splay-tree.c:218
#3  0x0001001bd484 in flush_reg_cond_reg [inlined] () at 
/some/path/to/somewhere/gcc-4.2.1/gcc/flow.c:3217
#4  0x0001001bd484 in mark_regno_cond_dead [inlined] () at 
/some/path/to/somewhere/gcc-4.2.1/gcc/flow.c:3095
#5  0x0001001bd484 in mark_set_1 (pbi=0x140608df0, code=, reg=0x140bfc760, cond=0x0, 
insn=0x140b2d5f0, flags=16) at ../../gcc-4.2.1/gcc/flow.c:2895
#6  0x0001001bf8a4 in propagate_one_insn (pbi=0x140608df0, 
insn=0x140b2d5f0) at ../../gcc-4.2.1/gcc/flow.c:1864
#7  0x0001001c009d in propagate_block (bb=0x140bfd680, live=, local_set=, cond_local_set=, flags=1085462000) at 
../../gcc-4.2.1/gcc/flow.c:2218
#8  0x0001001c0ad3 in update_life_info (blocks=0x140601990, 
extent=UPDATE_LIFE_GLOBAL_RM_NOTES, prop_flags=25) at 
../../gcc-4.2.1/gcc/flow.c:1345
#9  0x0001001c14b6 in update_life_info_in_dirty_blocks 
(extent=UPDATE_LIFE_GLOBAL_RM_NOTES, prop_flags=25) at 
../../gcc-4.2.1/gcc/flow.c:763
#10 0x0001002095eb in if_convert (x_life_data_ok=) at ../../gcc-4.2.1/gcc/ifcvt.c:3922
#11 0x000100209722 in rest_of_handle_if_after_reload () at 
../../gcc-4.2.1/gcc/ifcvt.c:4041
#12 0x0001002da597 in execute_one_pass (pass=0x10042a080) at 
../../gcc-4.2.1/gcc/passes.c:881
#13 0x0001002da6f8 in execute_pass_list (pass=0x10042a080) at 
../../gcc-4.2.1/gcc/passes.c:932
#14 0x0001002da70a in execute_pass_list (pass=0x10042b600) at 
../../gcc-4.2.1/gcc/passes.c:933
#15 0x0001002da70a in execute_pass_list (pass=0x10042b5a0) at 
../../gcc-4.2.1/gcc/passes.c:933
#16 0x00010007d5b2 in tree_rest_of_compilation (fndecl=0x140bec1c0) at 
../../gcc-4.2.1/gcc/tree-optimize.c:463
#17 0x0001a945 in c_expand_body (fndecl=0x140bec1c0) at 
../../gcc-4.2.1/gcc/c-decl.c:6836
#18 0x000100313663 in cgraph_expand_function (node=0x140bfa000) at 
../../gcc-4.2.1/gcc/cgraphunit.c:1244
#19 0x000100314c3e in cgraph_expand_all_functions [inlined] () at 
/some/path/to/somewhere/gcc-4.2.1/gcc/cgraphunit.c:1309
#20 0x000100314c3e in cgraph_optimize () at 
../../gcc-4.2.1/gcc/cgraphunit.c:1588
#21 0x0001d5c2 in c_write_global_declarations () at 
../../gcc-4.2.1/gcc/c-decl.c:7951
#22 0x0001002b8058 in toplev_main (argc=, argv=) 
at ../../gcc-4.2.1/gcc/toplev.c:1046
#23 0x00011604 in start ()


 So, I think it's not a GCC problem, but that I am doing something wrong in my
back-end. What could I have done wrong? The register description?

Thanks in advance,
Boris



Please support coo.h

2010-06-10 Thread yuanbin
Coo - C, Object Oriented
http://sourceforge.net/projects/coo/

-coo.h--
#ifndef __COO_H__
#define __COO_H__

typedef struct VTable /*root of virtual table class*/
{
long offset; /*serves for FREE*/
} VTable;

#define EXTENDS(s) \
union \
{ \
s s; \
s; \
};
#define VT(v) const v* vt;
#define EXTENDS2(s,v) \
union \
{ \
VT(v) \
s s; \
s; \
};
#ifndef offsetof
#define offsetof(s,m) ((long)&((s*)0)->m)
#endif
#define SUPER(s,m,p) ((s*)((char*)(p)-offsetof(s,m)))
#define FREE(p,vt) free((char*)(p)-(vt)->offset)

#endif

--
initialization of the enum:
enum {
  int i;
  float f;
} t={1.5}; //t.f
because of EXTENDS2 in coo.h, the compiler needs
to initialize the last member of the enum.
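
For readers unfamiliar with the SUPER macro in the header above, here is a small sketch of what it computes: given a pointer to an embedded member, it steps back by the member's offset to reach the enclosing object. The types struct base and struct derived and the helper derived_of are made up for this illustration, and the standard offsetof from <stddef.h> is used rather than the fallback definition in coo.h.

```c
#include <assert.h>
#include <stddef.h>   /* standard offsetof */

/* Same shape as the SUPER macro in coo.h. */
#define SUPER(s, m, p) ((s *)((char *)(p) - offsetof(s, m)))

/* Hypothetical types: "derived" embeds a "base" after another field. */
struct base { int i; };
struct derived { int j; struct base base; };

/* Recover the enclosing object from a pointer to its embedded member. */
struct derived *derived_of(struct base *b)
{
    assert(b != NULL);
    return SUPER(struct derived, base, b);
}
```

The pointer handed to derived_of must really point into a struct derived; the macro has no way to check that.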


RE: externally_visible and resolution file

2010-06-10 Thread Bingfeng Mei


> -----Original Message-----
> From: Cary Coutant [mailto:ccout...@google.com]
> Sent: 09 June 2010 18:43
> To: Richard Guenther
> Cc: Bingfeng Mei; Jan Hubicka; gcc@gcc.gnu.org
> Subject: Re: externally_visible and resoultion file
> 
> >> Yes, this is also what I saw without plugin. I just wonder why "v"
> >> is linked with plugin if resolution file is not used to eliminate
> >> need of externally_visible attribute here.
> >
> > Probably because of the same linker-plugin bug that causes bar
> > to be resolved.
> 
> Just to make sure I understand the problem:
> 
> - The IR file for a.c contains definitions for v and bar.
> - The linker identifies that both symbols are referenced from outside
> the LTO world (PREVAILING_DEF rather than PREVAILING_DEF_IRONLY), but
> gcc isn't (yet) reading that info from the resolution file.

> - WPA eliminates bar() and makes v static in the replacement object
> file.
> - There are still references to those symbols in b.o, which was
> compiled outside LTO.
> - The linker should be complaining about undefined symbols in both
> cases, but isn't (perhaps because it's still seeing defs left over
> from the IR files). The symbol bar has a value of 0, while the
> reference to v seems to have the right address.
> 
> Is that about right? What you're expecting is a link-time error
> reporting both bar and v as undefined, right?
> 
> -cary

Yes, I expect link-time errors of undefined reference for both
bar and vv. Instead, bar is linked to a bogus one (address 0)
and vv is linked correctly. 






SH optimized software floating point routines

2010-06-10 Thread Naveen H. S
Hi,

The patches at the following links implement software floating point (libgcc)
routines for SH:
http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00063.html
http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00614.html
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00624.html

There were some discussions regarding the testing of these routines.
We had briefly tested those routines and found that they did not have
any major issues.
http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00791.html
Please let me know whether these routines can be used in SH toolchain.

Please let me know whether we should invoke these routines by default.
Currently, we are thinking of invoking these routines only on specifying
command line options. 

Regards,
Naveen.H.S





Re: Please support coo.h

2010-06-10 Thread Paolo Bonzini

On 06/10/2010 10:57 AM, yuanbin wrote:

initialization of the enum:

 you mean union.

enum {
   int i;
   float f;
} t={1.5}; //t.f


The above makes no sense, what if you have int and char?

You have to say

union { ... } t = { .f = 1.5 };

and that already works in GCC.  If anything else is required by "coo", 
that means "coo" is not written in standard C.
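
A short sketch of the difference, using a hypothetical union type named number: without a designator the initializer goes to the first member, and with a C99 designator it goes to the member that is named.

```c
#include <assert.h>

/* Hypothetical union with the same members as in the question. */
union number {
    int i;
    float f;
};

/* No designator: the initializer applies to the first member, i. */
union number first = { 1 };

/* C99 designated initializer: the named member f is initialized. */
union number named = { .f = 1.5f };

/* Sanity check of which member each initializer reached. */
void check_union_init(void)
{
    assert(first.i == 1);
    assert(named.f == 1.5f);
}
```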


Paolo


complex arithmetics

2010-06-10 Thread roy rosen
Hi All,

I was wondering if there is any architecture which implemented complex
arithmetic in GCC i.e. used modes like CHI or HC.
I would really like to look at an example for that.

Thanks, Roy.


Announce: GNU MPFR 3.0.0 is released

2010-06-10 Thread Vincent Lefevre
GNU MPFR 3.0.0 ("boudin aux pommes") is now available for download
from the MPFR web site:

  http://www.mpfr.org/mpfr-3.0.0/

from INRIAGForge:

  https://gforge.inria.fr/projects/mpfr/

and from the GNU FTP site:

  http://ftp.gnu.org/gnu/mpfr/

Thanks very much to those who sent us bug reports and/or tested
the release candidates.

The MD5's:
f45bac3584922c8004a10060ab1a8f9f  mpfr-3.0.0.tar.bz2
2458a1616f05b7f04f049de2d209c6a7  mpfr-3.0.0.tar.gz
8ab3bef2864b8c6e6a291f5603141bbd  mpfr-3.0.0.tar.xz
8b4cb6fafab2ec5a52548ebcf246234a  mpfr-3.0.0.zip

Changes from versions 2.4.* to version 3.0.0:
- MPFR 3.0.0 is binary incompatible with previous versions but (almost)
  API compatible.  More precisely the obsolete functions mpfr_random
  and mpfr_random2 have been removed, the meaning of the return type
  of the function mpfr_get_f has changed, and the return type of the
  function mpfr_get_z is now int instead of void.  In practice, this
  should not break any existing code.
- MPFR is now distributed under the GNU Lesser General Public License
  version 3 or later (LGPL v3+).
- Rounding modes GMP_RNDx are now MPFR_RNDx (GMP_RNDx kept for
  compatibility).
- A new rounding mode (MPFR_RNDA) is available to round away from zero.
- The rounding mode type is now mpfr_rnd_t (as in previous versions,
  both mpfr_rnd_t and mp_rnd_t are accepted, but mp_rnd_t may be
  removed in the future).
- The precision type is now mpfr_prec_t (as in previous versions, both
  mpfr_prec_t and mp_prec_t are accepted, but mp_prec_t may be removed
  in the future) and it is now signed (it was unsigned in MPFR 2.*, but
  this was not documented). In practice, this change should not affect
  existing code that assumed nothing on the precision type.
- MPFR now has its own exponent type mpfr_exp_t, which is currently
  the same as GMP's mp_exp_t.
- Functions mpfr_random and mpfr_random2 have been removed.
- mpfr_get_f and mpfr_get_z now return a ternary value.
- mpfr_strtofr now accepts bases from 37 to 62.
- mpfr_custom_get_mantissa was renamed to mpfr_custom_get_significand
  (mpfr_custom_get_mantissa is still available via a #define).
- Functions mpfr_get_si, mpfr_get_ui, mpfr_get_sj, mpfr_get_uj,
  mpfr_get_z and mpfr_get_z_2exp no longer have cases with undefined
  behavior; in these cases, the behavior is now specified, and in
  particular, the erange flag is set.
- New functions mpfr_buildopt_tls_p and mpfr_buildopt_decimal_p giving
  information about options used at MPFR build time.
- New function mpfr_regular_p.
- New function mpfr_set_zero.
- New function mpfr_digamma.
- New function mpfr_ai (incomplete, experimental).
- New functions mpfr_set_flt and mpfr_get_flt to convert from/to the
  float type.
- New function mpfr_urandom.
- New function mpfr_set_z_2exp (companion to mpfr_get_z_2exp, which
  was renamed from mpfr_get_z_exp in previous versions).
- Speed improvement for large operands in the trigonometric functions
  (mpfr_sin, mpfr_cos, mpfr_tan, mpfr_sin_cos): speedup of about 2.5
  for 10^5 digits, of about 5 for 10^6 digits.
- Speed improvement for large operands of the inverse trigonometric
  functions (arcsin, arccos, arctan): about 2 for 10^3 digits, up to
  2.7 for 10^6 digits.
- Some documentation files are installed in $docdir.
- The detection of a GMP build directory (more precisely, the internal
  header files of GMP) was previously done separately from the use of
  the --with-gmp-build configure option. This was not consistent with
  the documentation and with other parts of the configure script. So,
  as of MPFR 3.0.0, the internal header files of GMP are now used if
  and only if the --with-gmp-build configure option is given.
- The configure script recognizes some extra "long double" formats
  (double big endian, double little endian, double-double big endian).
- MPFR manual: added "API Compatibility" section.
- Test coverage: 97.1% lines of code.
- Bug fixes.

You can send success and failure reports to , and give
us the canonical system name (by running the "./config.guess" script),
the processor and the compiler version, in order to complete the
"Platforms Known to Support MPFR" section of the MPFR 3.0.0 web page.

Regards,

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)


Re: No output files on 4.6/Cygwin

2010-06-10 Thread Piotr Wyderski
Dave Korn wrote:

>> I've just updated my repo and will schedule a nightly build
>> of trunk with configure settings taken from the bundled gcc4
>> compiler from Cygwin pack in order to see what will happen.
>
>  That's the simplest way to guarantee compatibility.

And now the compiler works correctly, so it is tempting
to assume that the problem was related to SJLJ exceptions,
as you conjectured. Thanks!

Best regards
Piotr Wyderski


Re: Please support coo.h

2010-06-10 Thread yuanbin
--
initialization of the union:
union {
  int i;
  float f;
} t={1.5}; //t.f
because of EXTENDS2 in coo.h, the compiler needs
to initialize the last member of the union.

#include 
typedef struct VBase {} VBase;
typedef struct CBase { VT(VBase) int i; } CBase;
typedef struct VThis {} VThis;
typedef struct CThis { EXTENDS2(CBase,VThis) int j; } CThis;
CThis t={{.CBase={0, 1}}, 1}; //complex
int main() { return 0; }

gcc -fms-extensions
but i want default format:
CThis t={0, 1, 1}; //simple


Re: Please support coo.h

2010-06-10 Thread yuanbin
initialization of global variable?


2010/6/10 Andreas Schwab :
> yuanbin  writes:
>
>> but i want default format:
>> CThis t={0, 1, 1}; //simple
>
> Define a suitable constructor.
>
> Andreas.
>
> --
> Andreas Schwab, sch...@redhat.com
> GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
> "And now for something completely different."
>


Re: Please support coo.h

2010-06-10 Thread Andreas Schwab
yuanbin  writes:

> but i want default format:
> CThis t={0, 1, 1}; //simple

Define a suitable constructor.

Andreas.

-- 
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."


Re: internal compiler error in elim_reg_cond

2010-06-10 Thread Ian Lance Taylor
Boris Boesler  writes:

> I get an internal compiler error with gcc-4.2.1 and my own back-end
> when I support conditional execution:
>
> ../build/gcc/cc1 -Wall -O1 -o bug.O1.s bug.c
>
> bug.c: In function ‘cond_assign_les0’:
> bug.c:13: internal compiler error: in elim_reg_cond, at flow.c:3486

What is 'x' when that error occurs?  From a quick glance at the code
that can only happen if your backend has somehow built a conditional
with a component which is not a conditional.

Note that all this code is gone in current gcc.  It was removed in gcc
4.3, replaced by the data flow framework.

Ian


Re: internal compiler error in elim_reg_cond

2010-06-10 Thread Boris Boesler

Am 10.06.2010 um 15:27 schrieb Ian Lance Taylor:

> Boris Boesler  writes:
> 
>> I get an internal compiler error with gcc-4.2.1 and my own back-end
>> when I support conditional execution:
>> 
>> ../build/gcc/cc1 -Wall -O1 -o bug.O1.s bug.c
>> 
>> bug.c: In function ‘cond_assign_les0’:
>> bug.c:13: internal compiler error: in elim_reg_cond, at flow.c:3486
> 
> What is 'x' when that error occurs?  From a quick glance at the code
> that can only happen if your backend has somehow built a conditional
> with a component which is not a conditional.

 I don't really know what 'x' is (but its code is 0/UnKnown); it's generated
by GCC from my conditional execution specification, which I derived from the
manual and the ARM backend:

;;
(define_attr "predicable" "no,yes" (const_string "no"))

;; True if this operator is valid for predication.
(define_special_predicate "predicate_operator"
  ;; fails in tests (match_code "eq,ne,le,lt,ge,gt,geu,gtu,leu,ltu")
  (match_code "eq,ne") ;; works
  ;; fails in tests (match_code "le")
)

(define_cond_exec
  [(match_operator 0 "predicate_operator"
   [(match_operand 1 "cc_register" "")
   (const_int 0)])]
  " ! TARGET_DONT_USE_CONDITIONAL_EXECUTION "
;; this sets a C string, that will be emitted in instructions by %?
  "%J0"
)

(define_insn "addsi3_mem"
  [(set (match_operand:SI 0 "memory_operand" "=m")
        (plus:SI (match_operand:SI 1 "memory_operand" "%m")
                 (match_operand:SI 2 "immediate_operand" " i")))
   ]
  ""
  "ADDI%?\t%0, %1, %2"
  [(set_attr "length" "4")
   (set_attr "predicable" "yes")
   ]
)

 This works for the predicate operators "eq,ne". Is there anything
wrong with it? Maybe the ARM backend does something that I haven't seen.


> Note that all this code is gone in current gcc.  It was removed in gcc
> 4.3, replaced by the data flow framework.

 Ok, then I will try to move to gcc 4.3

Boris



Re: internal compiler error in elim_reg_cond

2010-06-10 Thread Ian Lance Taylor
Boris Boesler  writes:

> Am 10.06.2010 um 15:27 schrieb Ian Lance Taylor:
>
>> Boris Boesler  writes:
>> 
>>> I get an internal compiler error with gcc-4.2.1 and my own back-end
>>> when I support conditional execution:
>>> 
>>> ../build/gcc/cc1 -Wall -O1 -o bug.O1.s bug.c
>>> 
>>> bug.c: In function ‘cond_assign_les0’:
>>> bug.c:13: internal compiler error: in elim_reg_cond, at flow.c:3486
>> 
>> What is 'x' when that error occurs?  From a quick glance at the code
>> that can only happen if your backend has somehow built a conditional
>> with a component which is not a conditional.
>
>  I don't really know what 'x' is (but its code is 0/UnKnown);

The first step to debugging this kind of problem is figuring what 'x'
is.  If you use gdb, then your gcc object directory will have a
.gdbinit file.  When you run gdb on cc1, you can say "print x" and
then "pr" to see the value of 'x'.  More at
http://gcc.gnu.org/wiki/DebuggingGCC .

I don't see anything wrong with your code.

Ian


Re: a typo in ira-emit.c?

2010-06-10 Thread Vladimir Makarov

Amker.Cheng wrote:

Yes, I think it can be NULL in some complicated cases when a loop exit edge
comes not in the parent loop.


By that, you mean the case an regno lives on edges which transfer
between adjacent loops,
and not lives in parent loop?
  
Yes.  But there are even more complicated cases when an exit edge can go 
in a nested loop of an adjacent loop or even from a nested loop to ...

So, the fprintf would access null pointer in this case.

Thanks for explanation.
  




GCC porting questions

2010-06-10 Thread Radu Hobincu
Hello again,

I have written here a few weeks ago regarding some tutorials on GCC
porting and got some very interesting replies. However, I seem to have
gotten stuck with a couple of issues in spite of my massive Googling, and
I was wondering if anyone could spare a couple of minutes for some
clarifications.

I am having troubles with the condition codes (cc_status). I have looked
over a couple of architectures and I do not seem to understand how they
work.

The machine I am porting GCC for has 1 4bit status register for carry,
zero, less than and equal. I do not have explicit comparison instructions,
all of the ALU instructions modify one or more flags.

What I figured out so far looking over AVR and Cris machine descriptions
is that each instruction that modifies the flags contain an attr
declaration which specify what flags it is changing. Also, there is a
macro called NOTICE_UPDATE_CC which sets up the cc_status accordingly by
reading this attr. This is the part of the code I do not understand. There
are certain functions for which I could not find any descriptions, like
"single_set" and macros like "SET_DEST" and "SET_SRC". Also, looking over
conditions.h, I see that the CC_STATUS structure contains 2 rtx fields:
"value1" and "value2", and also an int called "flags". What do they
represent? Is "flags" the contents of the machine's flag register?

Thanks in advance,
Radu



Re: Please support coo.h

2010-06-10 Thread Wojciech Meyer
On Thu, Jun 10, 2010 at 2:10 PM, yuanbin  wrote:
> initialization of global variable?

No, just define a macro.

>
>
> 2010/6/10 Andreas Schwab :
>> yuanbin  writes:
>>
>>> but i want default format:
>>> CThis t={0, 1, 1}; //simple
>>
>> Define a suitable constructor.
>

Wojciech
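
One way to read the macro suggestion, as a sketch only: hide the nested designated initializer behind a macro so the call site keeps the flat three-value form. CTHIS_INIT and example_obj are made-up names, the struct layout is a simplified stand-in for what EXTENDS2 generates, and a C11 anonymous union is used so that -fms-extensions is not needed.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the types built with coo.h. */
typedef struct VBase { long offset; } VBase;
typedef struct VThis { long offset; } VThis;
typedef struct CBase { const VBase *vt; int i; } CBase;
typedef struct CThis {
    union {                 /* C11 anonymous union */
        const VThis *vt;
        CBase CBase;
    };
    int j;
} CThis;

/* Hypothetical macro: flat arguments, nested initializer underneath. */
#define CTHIS_INIT(vt_, i_, j_) { { .CBase = { (vt_), (i_) } }, (j_) }

CThis example_obj = CTHIS_INIT(NULL, 1, 1);

/* Sanity check that the values landed in the intended members. */
void check_cthis(void)
{
    assert(example_obj.CBase.vt == NULL);
    assert(example_obj.CBase.i == 1);
    assert(example_obj.j == 1);
}
```

The macro works for file-scope objects too, because it expands to a plain constant initializer.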


Scheduling x86 dispatch windows

2010-06-10 Thread reza yazdani
Hi,



We are in the process of adding a feature to GCC to take advantage of a new 
hardware feature in the latest AMD micro processor. This feature requires a 
certain mix, ordering and alignments in instruction sequences to obtain the 
expected hardware performance.



I am asking the community to review this high level implementation design and 
give me direction or advice. 



The new hardware issues two windows of the size N bytes of instructions in 
every cycle. It goes into accelerate mode if the windows have the right 
combination of instructions or alignments. Our goal is to maximize the IPC by 
proper instruction scheduling and alignments. 



Here is a summary of the most important requirements:



a) Maximum of N instructions per window.

b) An instruction may cross the first window.

c) Each window can have a maximum of x memory loads and y memory stores.

d) The total number of immediate constants in the instructions of a window 
should not exceed k.

e) The first window must be aligned on 16 byte boundary.

f) A Window set terminates when a branch exists in a window.

g) The number of allowed prefixes varies for instructions.

h) A window set needs to be padded by prefixes in instructions or terminated by 
nops to ensure adherence to the rules.



We have the following implementation plan for GCC:



1) Modify the Haifa scheduler to make the desired arrangement of instructions 
for the two dispatch windows. The scheduler is called once before and once 
after register allocation as usual. In both cases it performs dispatch 
scheduling along with its normal job of instruction scheduling. 



The advantage of doing it before register allocation is avoiding extra 
dependencies caused by register allocation which may become an obstacle to 
movement of instructions.  The advantage of doing it after register allocation 
is a consideration for spilling code which may be generated by the register 
allocator.



The algorithm we use is:

a) Considering the current dispatch window set, choose the first instruction 
from ready queue that does not violate dispatch rules.

b) When an instruction is selected and scheduled, inform the dispatcher code 
about the instruction. This step keeps track of the instruction content of 
windows for future evaluation. It also manages the window set by closing and 
opening new virtual dispatch windows.
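
Steps a and b above can be sketched roughly as below. This is a simplified illustration, not the proposed GCC code: it tracks a single window rather than a window set, ignores byte sizes, prefixes and alignment, and all struct, field and function names are hypothetical. The limits stand in for N, x, y and k, whose real values are not given in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-window limits (stand-ins for N, x, y and k). */
struct limits { int max_insns, max_loads, max_stores, max_imms; };

/* What the sketch needs to know about one ready instruction. */
struct insn { int loads, stores, imms; bool is_branch; bool scheduled; };

/* Running totals for the currently open window. */
struct window { int insns, loads, stores, imms; };

static bool fits(const struct window *w, const struct insn *in,
                 const struct limits *lim)
{
    return w->insns + 1 <= lim->max_insns
        && w->loads + in->loads <= lim->max_loads
        && w->stores + in->stores <= lim->max_stores
        && w->imms + in->imms <= lim->max_imms;
}

/* Step a: index of the first unscheduled ready instruction that does
   not violate the limits, or -1 if none fits. */
int pick_insn(const struct window *w, const struct insn *ready, size_t n,
              const struct limits *lim)
{
    assert(w && ready && lim);
    for (size_t i = 0; i < n; i++)
        if (!ready[i].scheduled && fits(w, &ready[i], lim))
            return (int)i;
    return -1;
}

/* Step b: record the chosen instruction; a branch closes the window
   and a fresh, empty one is opened. */
void commit_insn(struct window *w, struct insn *in)
{
    assert(w && in);
    in->scheduled = true;
    w->insns++;
    w->loads += in->loads;
    w->stores += in->stores;
    w->imms += in->imms;
    if (in->is_branch)
        *w = (struct window){0};
}
```

When pick_insn returns -1, a real scheduler would have to close the window (padding it as rule h requires) and retry; that part is left out here.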



2) Insertion of alignment code. 

In x86 alignment is done by inserting prefixes or by generating nops. As the 
object code is generated by the assembler in GCC, some information such as 
sizes of branches is unknown until assembly or link time. To do alignments 
related to dispatch correctly in GCC, we need to iteratively compute prefixes 
and branch sizes until they converge. This pass currently does not exist in 
GCC, but it exists in the assembler. 
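
The iterate-until-convergence idea can be illustrated with a generic branch relaxation loop. This is a textbook-style sketch, not the assembler's actual implementation: the 2-byte and 5-byte branch sizes and the signed 8-bit displacement range are illustrative assumptions, prefix padding is not modelled, and every name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative encodings: short branch with 8-bit displacement,
   long branch with 32-bit displacement. */
enum { SHORT_BRANCH = 2, LONG_BRANCH = 5 };

struct insn {
    int size;     /* current encoded size in bytes */
    int target;   /* index of the branch target, or -1 if not a branch */
};

/* Compute the start address of every instruction; addr has n + 1
   slots, the last one holding the total code size. */
static void layout(const struct insn *code, size_t n, int *addr)
{
    addr[0] = 0;
    for (size_t i = 0; i < n; i++)
        addr[i + 1] = addr[i] + code[i].size;
}

/* Grow short branches whose displacement no longer fits, and repeat
   until a full pass changes nothing.  Sizes only ever grow, so the
   loop terminates.  Returns the number of passes performed. */
int relax(struct insn *code, size_t n, int *addr)
{
    assert(code && addr);
    int passes = 0;
    bool changed;
    do {
        changed = false;
        layout(code, n, addr);
        for (size_t i = 0; i < n; i++) {
            if (code[i].target < 0 || code[i].size != SHORT_BRANCH)
                continue;
            /* Displacement is measured from the end of the branch. */
            int disp = addr[code[i].target] - addr[i + 1];
            if (disp < -128 || disp > 127) {
                code[i].size = LONG_BRANCH;
                changed = true;
            }
        }
        passes++;
    } while (changed);
    return passes;
}
```

Growing one branch can push another one out of range, which is why a single pass is not enough and why alignment padding that depends on these sizes has to be recomputed along with them.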



There are two possible approaches to solve alignment problem.

a)  Let the assembler perform the alignments and padding needed to adhere to 
the new machine dispatching rules and avoid an extra pass in GCC.

b)  Add a new pass to GCC that mimics what the assembler does before generating 
the assembly listing, and insert the required alignments.



I appreciate your comments on the proposed implementation procedure and the 
choices a or b above.



Reza Yazdani







Re: GCC porting questions

2010-06-10 Thread Ian Lance Taylor
"Radu Hobincu"  writes:

> I have written here a few weeks ago regarding some tutorials on GCC
> porting and got some very interesting replies. However, I seem to have
> gotten stuck with a couple of issues in spite of my massive Googling, and
> I was wondering if anyone could spare a couple of minutes for some
> clarifications.
>
> I am having troubles with the condition codes (cc_status). I have looked
> over a couple of architectures and I do not seem to understand how they
> work.
>
> The machine I am porting GCC for has 1 4bit status register for carry,
> zero, less than and equal. I do not have explicit comparison instructions,
> all of the ALU instructions modify one or more flags.
>
> What I figured out so far looking over AVR and Cris machine descriptions
> is that each instruction that modifies the flags contain an attr
> declaration which specify what flags it is changing. Also, there is a
> macro called NOTICE_UPDATE_CC which sets up the cc_status accordingly by
> reading this attr. This is the part of the code I do not understand. There
> are certain functions for which I could not find any descriptions, like
> "single_set" and macros like "SET_DEST" and "SET_SRC". Also, looking over
> conditions.h, I see that the CC_STATUS structure contains 2 rtx fields:
> "value1" and "value2", and also an int called "flags". What do they
> represent? Is "flags" the contents of the machine's flag register?

For a new port I recommend that you avoid cc0, cc_status, and
NOTICE_UPDATE_CC.  Instead, model the condition codes as 1 or 4
pseudo-registers.  In your define_insn statements, include SET
expressions which show how the condition code is updated.  This is how
the i386 backend works; see uses of FLAGS_REG in i386.md.

As far as things like single_set, SET_DEST, and SET_SRC, you have
reached the limits of the internal documentation.  You have to open
the source code and look at the comments.  Similarly, the description
of the CC_STATUS fields may be found in the comments above the
definition of CC_STATUS in conditions.h.

Ian


Re: Please support coo.h

2010-06-10 Thread yuanbin
This compiler's extension is valuable


2010/6/10 Wojciech Meyer :
> On Thu, Jun 10, 2010 at 2:10 PM, yuanbin  wrote:
>> initialization of global variable?
>
> No, just define a macro.
>
>>
>>
>> 2010/6/10 Andreas Schwab :
>>> yuanbin  writes:
>>>
>>>> but i want default format:
>>>> CThis t={0, 1, 1}; //simple
>>>
>>> Define a suitable constructor.
>>
>
> Wojciech
>


Re: Scheduling x86 dispatch windows

2010-06-10 Thread Quentin Neill
Cross-posting Reza's call for feedback to the binutils list since it
is relevant -
see the last few paragraphs regarding how to "solve the alignment problem".

Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402

Not sure if followups should occur on one list or both.
--
Quentin Neill


On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani  wrote:
> Hi,
>
> We are in the process of adding a feature to GCC to take advantage
> of a new hardware feature in the latest AMD micro processor. This
> feature requires a certain mix, ordering and alignments in
> instruction sequences to obtain the expected hardware performance.
>
> I am asking the community to review this high level implementation
> design and give me direction or advice.
>
> The new hardware issues two windows of the size N bytes of
> instructions in every cycle. It goes into accelerate mode if the
> windows have the right combination of instructions or alignments. Our
> goal is to maximize the IPC by proper instruction scheduling and
> alignments.
>
> Here is a summary of the most important requirements:
>
> a) Maximum of N instructions per window.
> b) An instruction may cross the first window.
> c) Each window can have maximum of x memory loads and y memory
>stores .
> d) The total number of immediate constants in the instructions
>of a window should not exceed k.
> e) The first window must be aligned on 16 byte boundary.
> f) A Window set terminates when a branch exists in a window.
> g) The number of allowed prefixes varies for instructions.
> h) A window set needs to be padded by prefixes in instructions
>or terminated by nops to ensure adherence to the rules.
>
> We have the following implementation plan for GCC:
>
> 1) Modify the Haifa scheduler to make the desired arrangement of
>instructions for the two dispatch windows. The scheduler is called
>once before and once after register allocation as usual. In both
>cases it performs dispatch scheduling along with its normal job of
>instruction scheduling.
>
> The advantage of doing it before register allocation is avoiding
> extra dependencies caused by register allocation which may become
> an obstacle to movement of instructions.  The advantage of doing
> it after register allocation is a consideration for spilling code
> which may be generated by the register allocator.
>
> The algorithm we use is:
>
> a) Considering the current dispatch window set, choose the first
>instruction from ready queue that does not violate dispatch rules.
> b) When an instruction is selected and scheduled, inform the
>dispatcher code about the instruction. This step keeps track of the
>instruction content of windows for future evaluation. It also manages
>the window set by closing and opening new virtual dispatch windows.
>
> 2) Insertion of alignment code.
>
> In x86 alignment is done by inserting prefixes or by generating
> nops. As the object code is generated by the assembler in GCC, some
> information such as sizes of branches are unknown until assembly or
> link time. To do alignments related to dispatch correctly in GCC,
> we need to iteratively compute prefixes and branch sizes until
> its convergence. This pass currently does not exist in GCC, but it
> exists in the assembler.
>
> There are two possible approaches to solve alignment problem.
>
> a)  Let the assembler performs the alignments and padding needed
> to adhere with the new machine dispatching rules and avoid an extra
> pass in GCC.
> b)  Add a new pass to mimic what assembler does before generating
> the assembly listing in GCC and insert the required alignments.
>
> I appreciate your comments on the proposed implementation procedure
> and the choices a or b above.
>
> Reza Yazdani


Re: Patch pinging

2010-06-10 Thread Gerald Pfeifer
On Wed, 9 Jun 2010, Dave Korn wrote:
>> Here are a few of the people with access to the copyright list: me, Ian,
>> Benjamin Koznik, David Edelsohn, Andreas Schwab, Joseph Myers, Ralf
>> Wildenhues.  This is not a complete list, just people that I remember.
> I also have access and am happy to be asked to check the list to help 
> get a patch cleared.

Same here, and I am doing this somewhat regularly, in fact. :-)

Gerald


Re: Scheduling x86 dispatch windows

2010-06-10 Thread H.J. Lu
On Thu, Jun 10, 2010 at 1:59 PM, Quentin Neill
 wrote:
> On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law  wrote:
>> On 06/10/10 13:52, H.J. Lu wrote:
>>> On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
>>>   wrote:
 Cross-posting Reza's call for feedback to the binutils list since it
 is relevant - see the last few paragraphs regarding how to
 "solve the alignment problem".

 Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402

 On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani
  wrote:
> Hi,
>
> We are in the process of adding a feature to GCC to take advantage
> of a new hardware feature in the latest AMD micro processor. This
> feature requires a certain mix, ordering and alignments in
> instruction sequences to obtain the expected hardware performance.
>
> I am asking the community to review this high level implementation
> design and give me direction or advice.
>
> The new hardware issues two windows of the size N bytes of
> instructions in every cycle. It goes into accelerate mode if the
> windows have the right combination of instructions or alignments. Our
> goal is to maximize the IPC by proper instruction scheduling and
> alignments.
>
> Here is a summary of the most important requirements:
>
> a) Maximum of N instructions per window.
> b) An instruction may cross the first window.
> c) Each window can have maximum of x memory loads and y memory
>    stores .
> d) The total number of immediate constants in the instructions
>    of a window should not exceed k.
> e) The first window must be aligned on 16 byte boundary.
> f) A Window set terminates when a branch exists in a window.
> g) The number of allowed prefixes varies for instructions.
> h) A window set needs to be padded by prefixes in instructions
>    or terminated by nops to ensure adherence to the rules.
>
> We have the following implementation plan for GCC:
>
> 1) Modify the Haifa scheduler to make the desired arrangement of
>    instructions for the two dispatch windows. The scheduler is called
>    once before and once after register allocation as usual. In both
>    cases it performs dispatch scheduling along with its normal job of
>    instruction scheduling.
>
> The advantage of doing it before register allocation is avoiding
> extra dependencies caused by register allocation which may become
> an obstacle to movement of instructions.  The advantage of doing
> it after register allocation is a consideration for spilling code
> which may be generated by the register allocator.
>
> The algorithm we use is:
>
> a) Considering the current dispatch window set, choose the first
>    instruction from ready queue that does not violate dispatch rules.
> b) When an instruction is selected and scheduled, inform the
>    dispatcher code about the instruction. This step keeps track of the
>    instruction content of windows for future evaluation. It also manages
>    the window set by closing and opening new virtual dispatch windows.
>
> 2) Insertion of alignment code.
>
> In x86 alignment is done by inserting prefixes or by generating
> nops. As the object code is generated by the assembler in GCC, some
> information such as sizes of branches are unknown until assembly or
> link time. To do alignments related to dispatch correctly in GCC,
> we need to iteratively compute prefixes and branch sizes until
> its convergence. This pass currently does not exist in GCC, but it
> exists in the assembler.
>
> There are two possible approaches to solve alignment problem.
>
> a)  Let the assembler performs the alignments and padding needed
>     to adhere with the new machine dispatching rules and avoid an extra
>     pass in GCC.
> b)  Add a new pass to mimic what assembler does before generating
>     the assembly listing in GCC and insert the required alignments.
>
> I appreciate your comments on the proposed implementation procedure
> and the choices a or b above.
>
>>>
>>> I don't think this should be done in assembler. Assembler should just assemble
>>> the assembly input.
>>
>> That adds quite a bit of complication to the compiler though -- getting the
>> instruction lengths right (and thus proper packing & alignment) can be
>> extremely difficult.  I did some experiments with this on a target with
>> *fixed* instruction lengths a while back and even though the port tried hard
>> to get lengths right, it would routinely miss something.  Ultimately I
>> decided that forcing the compiler to know instruction lengths with a very
>> high degree of accuracy wasn't a sane thing to do.    Dealing with variable
>> instruction lengths just adds yet another complexity to the situation.  Then
>> add the complication of needing to

Re: Scheduling x86 dispatch windows

2010-06-10 Thread H.J. Lu
On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
 wrote:
> Cross-posting Reza's call for feedback to the binutils list since it
> is relevant -
> see the last few paragraphs regarding how to "solve the alignment problem".
>
> Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
>
> Not sure if followups should occur on one list or both.
> --
> Quentin Neill
>
>
> On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani  wrote:
>> Hi,
>>
>> We are in the process of adding a feature to GCC to take advantage
>> of a new hardware feature in the latest AMD micro processor. This
>> feature requires a certain mix, ordering and alignments in
>> instruction sequences to obtain the expected hardware performance.
>>
>> I am asking the community to review this high level implementation
>> design and give me direction or advice.
>>
>> The new hardware issues two windows of the size N bytes of
>> instructions in every cycle. It goes into accelerate mode if the
>> windows have the right combination of instructions or alignments. Our
>> goal is to maximize the IPC by proper instruction scheduling and
>> alignments.
>>
>> Here is a summary of the most important requirements:
>>
>> a) Maximum of N instructions per window.
>> b) An instruction may cross the first window.
>> c) Each window can have maximum of x memory loads and y memory
>>    stores .
>> d) The total number of immediate constants in the instructions
>>    of a window should not exceed k.
>> e) The first window must be aligned on 16 byte boundary.
>> f) A Window set terminates when a branch exists in a window.
>> g) The number of allowed prefixes varies for instructions.
>> h) A window set needs to be padded by prefixes in instructions
>>    or terminated by nops to ensure adherence to the rules.
>>
>> We have the following implementation plan for GCC:
>>
>> 1) Modify the Haifa scheduler to make the desired arrangement of
>>    instructions for the two dispatch windows. The scheduler is called
>>    once before and once after register allocation as usual. In both
>>    cases it performs dispatch scheduling along with its normal job of
>>    instruction scheduling.
>>
>> The advantage of doing it before register allocation is avoiding
>> extra dependencies caused by register allocation which may become
>> an obstacle to movement of instructions.  The advantage of doing
>> it after register allocation is a consideration for spilling code
>> which may be generated by the register allocator.
>>
>> The algorithm we use is:
>>
>> a) Considering the current dispatch window set, choose the first
>>    instruction from ready queue that does not violate dispatch rules.
>> b) When an instruction is selected and scheduled, inform the
>>    dispatcher code about the instruction. This step keeps track of the
>>    instruction content of windows for future evaluation. It also manages
>>    the window set by closing and opening new virtual dispatch windows.
>>
>> 2) Insertion of alignment code.
>>
>> In x86 alignment is done by inserting prefixes or by generating
>> nops. As the object code is generated by the assembler in GCC, some
>> information such as sizes of branches are unknown until assembly or
>> link time. To do alignments related to dispatch correctly in GCC,
>> we need to iteratively compute prefixes and branch sizes until
>> its convergence. This pass currently does not exist in GCC, but it
>> exists in the assembler.
>>
>> There are two possible approaches to solve alignment problem.
>>
>> a)  Let the assembler performs the alignments and padding needed
>>     to adhere with the new machine dispatching rules and avoid an extra
>>     pass in GCC.
>> b)  Add a new pass to mimic what assembler does before generating
>>     the assembly listing in GCC and insert the required alignments.
>>
>> I appreciate your comments on the proposed implementation procedure
>> and the choices a or b above.

I don't think this should be done in assembler. Assembler should just assemble
the assembly input.
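
For reference, steps (a) and (b) of the scheduling algorithm quoted above
can be sketched in C roughly as follows.  This is only an illustration
under stated assumptions, not the proposed GCC code: the limits stand in
for the unspecified N, x, y and k, the names insn_info, window, pick_ready
and note_scheduled are hypothetical, and only a single window is modelled
(no byte sizes, prefixes or alignment).

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder limits: the proposal leaves N, x, y and k unspecified. */
enum { MAX_INSNS = 4, MAX_LOADS = 2, MAX_STORES = 1, MAX_IMMS = 4 };

/* Hypothetical per-instruction summary used by the dispatch rules. */
struct insn_info { int loads, stores, imms; bool is_branch; };

/* Running content of one virtual dispatch window. */
struct window { int insns, loads, stores, imms; bool closed; };

/* True if adding IN to W keeps every modelled dispatch rule satisfied. */
static bool fits(const struct window *w, const struct insn_info *in)
{
    return !w->closed
        && w->insns + 1 <= MAX_INSNS
        && w->loads + in->loads <= MAX_LOADS
        && w->stores + in->stores <= MAX_STORES
        && w->imms + in->imms <= MAX_IMMS;
}

/* Step (a): index of the first ready instruction that does not violate
   the rules, or -1 if none fits the current window. */
int pick_ready(const struct window *w, const struct insn_info *ready, int n)
{
    assert(w && ready && n >= 0);
    for (int i = 0; i < n; i++)
        if (fits(w, &ready[i]))
            return i;
    return -1;
}

/* Step (b): tell the dispatcher what was scheduled.  A branch, or a full
   window, closes the window so that the caller opens a new one. */
void note_scheduled(struct window *w, const struct insn_info *in)
{
    assert(w && in);
    w->insns++;
    w->loads += in->loads;
    w->stores += in->stores;
    w->imms += in->imms;
    if (in->is_branch || w->insns == MAX_INSNS)
        w->closed = true;
}
```

A caller would loop over the ready queue, call pick_ready, schedule the
chosen instruction, call note_scheduled, and start a fresh zeroed window
whenever the current one is closed.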

-- 
H.J.


Re: Patch pinging

2010-06-10 Thread Quentin Neill
On Tue, Jun 8, 2010 at 6:30 AM, Jonathan Wakely  wrote:
> On 7 June 2010 22:43, Ian Lance Taylor wrote:
>>
>> The patch tracker (http://gcc.gnu.org/wiki/GCC_Patch_Tracking) is not
>> currently operating.
>>
>> Would anybody like to volunteer to get it working again?
>
> I'm not volunteering, but I might look into it one day
>
> If dberlin doesn't still have the code, it shouldn't be too hard...
>
> ... a script which periodically crawls the gcc-patches archive might 
> suffice...

I have a python script which crawls, caches, and parses the gcc-cvs
(and binutils-cvs) email archive pages.  I wrote it to help another
script that correlates patch revisions in a branch (where the
Changelog refers to revisions on the trunk) back to the useful
Changelog entries in the trunk.

I could submit that to contrib; it could be modified to scrape most of
the information above into a single monthly report.

Any interest?
-- 
Quentin Neill


Re: Scheduling x86 dispatch windows

2010-06-10 Thread Joern Rennecke

Quoting Jeff Law :


That adds quite a bit of complication to the compiler though -- getting
the instruction lengths right (and thus proper packing & alignment) can
be extremely difficult.  I did some experiments with this on a target
with *fixed* instruction lengths a while back and even though the port
tried hard to get lengths right, it would routinely miss something.
Ultimately I decided that forcing the compiler to know instruction
lengths with a very high degree of accuracy wasn't a sane thing to do.
  Dealing with variable instruction lengths just adds yet another
complexity to the situation.  Then add the complication of needing to
add specific prefixes or nops and it just gets downright ugly.


I did add alignment-aware & exact branch shortening to the ARCompact port,
but ultimately the added complexity due to this was also a factor why the
port couldn't go into mainline without an active maintainer.
The code is available on branches.
See PR target/39303.
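
To illustrate what "iterating until convergence" means for branch sizes,
here is a minimal C sketch.  It is not the ARCompact code and not what the
assembler does internally; the struct insn layout and relax_branches are
hypothetical, and the 2-byte/5-byte sizes are borrowed from the short and
near forms of an unconditional x86 jmp purely as an example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical instruction record, for illustration only. */
struct insn {
    int size;    /* current encoded size in bytes */
    int target;  /* index of the branch target, or -1 if not a branch */
};

enum { SHORT_BRANCH = 2, LONG_BRANCH = 5 };

/* Byte offset of instruction I under the current size assignment. */
static int offset_of(const struct insn *code, int i)
{
    int off = 0;
    for (int k = 0; k < i; k++)
        off += code[k].size;
    return off;
}

/* Start with every branch short and widen any branch whose displacement
   does not fit in a signed byte.  Widening moves later code, which can
   push other branches out of range, so repeat until a pass changes
   nothing.  Sizes only grow, so the loop terminates.  Returns the number
   of passes made. */
int relax_branches(struct insn *code, int n)
{
    int passes = 0;
    bool changed;
    do {
        changed = false;
        passes++;
        for (int i = 0; i < n; i++) {
            if (code[i].target < 0 || code[i].size == LONG_BRANCH)
                continue;
            assert(code[i].target < n);
            /* Displacement is relative to the end of the branch. */
            int disp = offset_of(code, code[i].target)
                       - (offset_of(code, i) + code[i].size);
            if (disp < -128 || disp > 127) {
                code[i].size = LONG_BRANCH;
                changed = true;
            }
        }
    } while (changed);
    return passes;
}
```

An alignment-aware version would also have to recompute padding on every
pass, since padding depends on the offsets and the offsets depend on the
branch sizes; that coupling is where the added complexity comes from.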


hot/cold pointer annotation

2010-06-10 Thread Andi Kleen
Hi Honza,

Here's an idea to make it easier to manually annotate
large C code bases for hot/cold functions where
it's too difficult to use profile feedback.

It's fairly common here to call functions through
function pointers in manual method tables.

A lot of code is targeted by a few function pointers
(think of backends or drivers).

Some of these function pointers always point to cold
code (e.g. init/exit code) while others are usually
hot.

As an alternative to manually annotating the hot/cold
functions, it would be much simpler to annotate the function
pointers and let the functions that get assigned to them
inherit that.

So for example 

struct ops {
void (*init)() __attribute__((cold));
void (*exit)() __attribute__((cold));
void (*hot_op)() __attribute__((hot));
};

void init_a(void) {} 
void exit_a(void) {}
void hot_op_a(void) {}

const struct ops objecta = {
.init = init_a,
.exit = exit_a,
.hot_op = hot_op_a
};

/* lots of similar objects with struct ops method tables */

init_a, exit_a and their callees (if they are not
called by anything else) would automatically become all cold,
and hot_op_a (and unique callees) hot, because they
are assigned to a cold or hot function pointer.

Basically the hot/coldness would be inherited from
a function pointer assignment too.
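
For comparison, this is roughly what the manual, per-function annotation
looks like with the hot and cold function attributes GCC already has.
It is only a sketch: the counter and run_object are hypothetical additions
so that the example does something observable, and the pointer-level
annotation proposed above is not used because it does not exist.

```c
#include <assert.h>

struct ops {
    void (*init)(void);
    void (*exit)(void);
    void (*hot_op)(void);
};

static int hot_calls;  /* counter so the effect of a run is observable */

/* Manual per-function annotation: every implementation of every method
   table needs its own attribute. */
static __attribute__((cold)) void init_a(void) { hot_calls = 0; }
static __attribute__((cold)) void exit_a(void) { }
static __attribute__((hot))  void hot_op_a(void) { hot_calls++; }

static const struct ops objecta = {
    .init = init_a,
    .exit = exit_a,
    .hot_op = hot_op_a
};

/* Drive one object through its method table and return the number of
   hot-path calls that were made. */
int run_object(const struct ops *o, int iterations)
{
    assert(o && iterations >= 0);
    o->init();
    for (int i = 0; i < iterations; i++)
        o->hot_op();
    o->exit();
    return hot_calls;
}
```

With many objects sharing struct ops, the three attributes above have to
be repeated for each of them, which is the tedium the pointer annotation
is meant to remove.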

Do you think a scheme like this would be possible to implement?

Thanks,
-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.


Re: Scheduling x86 dispatch windows

2010-06-10 Thread Quentin Neill
On Thu, Jun 10, 2010 at 4:08 PM, H.J. Lu  wrote:
> On Thu, Jun 10, 2010 at 1:59 PM, Quentin Neill
>  wrote:
>> On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law  wrote:
>>> On 06/10/10 13:52, H.J. Lu wrote:
 On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
   wrote:
> Cross-posting Reza's call for feedback to the binutils list since it
> is relevant - see the last few paragraphs regarding how to
> "solve the alignment problem".
>
> Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
>
> On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani
>  wrote:
>> Hi,
>>
>> We are in the process of adding a feature to GCC to take advantage
>> of a new hardware feature in the latest AMD micro processor. This
>> feature requires a certain mix, ordering and alignments in
>> instruction sequences to obtain the expected hardware performance.
>>
>> I am asking the community to review this high level implementation
>> design and give me direction or advice.
>>
>> The new hardware issues two windows of the size N bytes of
>> instructions in every cycle. It goes into accelerate mode if the
>> windows have the right combination of instructions or alignments. Our
>> goal is to maximize the IPC by proper instruction scheduling and
>> alignments.
>>
>> Here is a summary of the most important requirements:
>>
>> a) Maximum of N instructions per window.
>> b) An instruction may cross the first window.
>> c) Each window can have maximum of x memory loads and y memory
>>    stores .
>> d) The total number of immediate constants in the instructions
>>    of a window should not exceed k.
>> e) The first window must be aligned on 16 byte boundary.
>> f) A Window set terminates when a branch exists in a window.
>> g) The number of allowed prefixes varies for instructions.
>> h) A window set needs to be padded by prefixes in instructions
>>    or terminated by nops to ensure adherence to the rules.
>>
>> We have the following implementation plan for GCC:
>>
>> 1) Modify the Haifa scheduler to make the desired arrangement of
>>    instructions for the two dispatch windows. The scheduler is called
>>    once before and once after register allocation as usual. In both
>>    cases it performs dispatch scheduling along with its normal job of
>>    instruction scheduling.
>>
>> The advantage of doing it before register allocation is avoiding
>> extra dependencies caused by register allocation which may become
>> an obstacle to movement of instructions.  The advantage of doing
>> it after register allocation is a consideration for spilling code
>> which may be generated by the register allocator.
>>
>> The algorithm we use is:
>>
>> a) Considering the current dispatch window set, choose the first
>>    instruction from ready queue that does not violate dispatch rules.
>> b) When an instruction is selected and scheduled, inform the
>>    dispatcher code about the instruction. This step keeps track of the
>>    instruction content of windows for future evaluation. It also manages
>>    the window set by closing and opening new virtual dispatch windows.
>>
>> 2) Insertion of alignment code.
>>
>> In x86 alignment is done by inserting prefixes or by generating
>> nops. As the object code is generated by the assembler in GCC, some
>> information such as sizes of branches are unknown until assembly or
>> link time. To do alignments related to dispatch correctly in GCC,
>> we need to iteratively compute prefixes and branch sizes until
>> its convergence. This pass currently does not exist in GCC, but it
>> exists in the assembler.
>>
>> There are two possible approaches to solve alignment problem.
>>
>> a)  Let the assembler performs the alignments and padding needed
>>     to adhere with the new machine dispatching rules and avoid an extra
>>     pass in GCC.
>> b)  Add a new pass to mimic what assembler does before generating
>>     the assembly listing in GCC and insert the required alignments.
>>
>> I appreciate your comments on the proposed implementation procedure
>> and the choices a or b above.
>>

 I don't think this should be done in assembler. Assembler should just assemble
 the assembly input.
>>>
>>> That adds quite a bit of complication to the compiler though -- getting the
>>> instruction lengths right (and thus proper packing & alignment) can be
>>> extremely difficult.  I did some experiments with this on a target with
>>> *fixed* instruction lengths a while back and even though the port tried hard
>>> to get lengths right, it would routinely miss something.  Ultimately I
>>> decided that forcing the compiler to know instruction lengths with a very
>>> high degree of accuracy wasn't a sane thing

Re: Minor issue with recent code to twiddle costs of pseudos with invariant equivalents

2010-06-10 Thread Bernd Schmidt
On 06/10/2010 10:37 PM, Jeff Law wrote:
> 
> Compile the attached with -O2 on x86-unknown-linux-gnu and review the
> .ira dump for main()
> 
> starting the processing of deferred insns
> ending the processing of deferred insns
> df_analyze called
> Building IRA IR
> starting the processing of deferred insns
> ending the processing of deferred insns
> df_analyze called
> init_insns for 59: (insn_list:REG_DEP_TRUE 5 (nil))
> Reg 59 has equivalence, initial gains 4000

[...]

> r59: preferred NO_REGS, alternative NO_REGS, cover NO_REGS

[...]
> Disposition:
> 0:r59  l0   mem

> Ultimately I think reload is cleaning this up, but it seems awful
> strange to have a pseudo/allocno which clearly should be allocated to a
> hard GPR preferring NO_REGS and from an allocation standpoint living in
> memory.

From the above, I don't see the problem.  Reg 59 is detected as
reg_equiv_invariant, which means if we don't allocate a hard reg to it,
we can substitute the invariant everywhere and save the initializing
instruction.  As far as I can tell this is working exactly as intended.


Bernd



Re: Scheduling x86 dispatch windows

2010-06-10 Thread Jeff Law

On 06/10/10 13:52, H.J. Lu wrote:

On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
  wrote:
   

Cross-posting Reza's call for feedback to the binutils list since it
is relevant -
see the last few paragraphs regarding how to "solve the alignment problem".

Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402

Not sure if followups should occur on one list or both.
--
Quentin Neill


On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani  wrote:
 

Hi,

We are in the process of adding a feature to GCC to take advantage
of a new hardware feature in the latest AMD micro processor. This
feature requires a certain mix, ordering and alignments in
instruction sequences to obtain the expected hardware performance.

I am asking the community to review this high level implementation
design and give me direction or advice.

The new hardware issues two windows of the size N bytes of
instructions in every cycle. It goes into accelerate mode if the
windows have the right combination of instructions or alignments. Our
goal is to maximize the IPC by proper instruction scheduling and
alignments.

Here is a summary of the most important requirements:

a) Maximum of N instructions per window.
b) An instruction may cross the first window.
c) Each window can have maximum of x memory loads and y memory
stores .
d) The total number of immediate constants in the instructions
of a window should not exceed k.
e) The first window must be aligned on 16 byte boundary.
f) A Window set terminates when a branch exists in a window.
g) The number of allowed prefixes varies for instructions.
h) A window set needs to be padded by prefixes in instructions
or terminated by nops to ensure adherence to the rules.

We have the following implementation plan for GCC:

1) Modify the Haifa scheduler to make the desired arrangement of
instructions for the two dispatch windows. The scheduler is called
once before and once after register allocation as usual. In both
cases it performs dispatch scheduling along with its normal job of
instruction scheduling.

The advantage of doing it before register allocation is avoiding
extra dependencies caused by register allocation which may become
an obstacle to movement of instructions.  The advantage of doing
it after register allocation is a consideration for spilling code
which may be generated by the register allocator.

The algorithm we use is:

a) Considering the current dispatch window set, choose the first
instruction from ready queue that does not violate dispatch rules.
b) When an instruction is selected and scheduled, inform the
dispatcher code about the instruction. This step keeps track of the
instruction content of windows for future evaluation. It also manages
the window set by closing and opening new virtual dispatch windows.

2) Insertion of alignment code.

In x86 alignment is done by inserting prefixes or by generating
nops. As the object code is generated by the assembler in GCC, some
information such as sizes of branches are unknown until assembly or
link time. To do alignments related to dispatch correctly in GCC,
we need to iteratively compute prefixes and branch sizes until
its convergence. This pass currently does not exist in GCC, but it
exists in the assembler.

There are two possible approaches to solve alignment problem.

a)  Let the assembler performs the alignments and padding needed
 to adhere with the new machine dispatching rules and avoid an extra
 pass in GCC.
b)  Add a new pass to mimic what assembler does before generating
 the assembly listing in GCC and insert the required alignments.

I appreciate your comments on the proposed implementation procedure
and the choices a or b above.
   

I don't think this should be done in assembler. Assembler should just assemble
the assembly input.
   
That adds quite a bit of complication to the compiler though -- getting 
the instruction lengths right (and thus proper packing & alignment) can 
be extremely difficult.  I did some experiments with this on a target 
with *fixed* instruction lengths a while back and even though the port 
tried hard to get lengths right, it would routinely miss something.  
Ultimately I decided that forcing the compiler to know instruction 
lengths with a very high degree of accuracy wasn't a sane thing to 
do.  Dealing with variable instruction lengths just adds yet another 
complexity to the situation.  Then add the complication of needing to 
add specific prefixes or nops and it just gets downright ugly.


I'd probably approach this by having the compiler emit a directive which 
states what the desired alignment at a particular point should be, then 
allow the assembler to select the best method to get the desired alignment.
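
As a rough illustration of the assembler's side of that split: once the
directive gives the desired alignment, the amount of padding follows from
the current offset alone, and the assembler is free to realise it with
prefixes or nops.  GNU as already accepts .p2align for plain alignment; a
dispatch-specific directive would be new.  The C helper below is a
hypothetical sketch of the arithmetic only, not assembler code.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of padding needed to advance OFFSET to the next multiple of
   ALIGN.  ALIGN must be a power of two (e.g. 16 for the first window). */
size_t padding_for(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (offset & (align - 1))) & (align - 1);
}
```

For example, at offset 13 with a 16-byte boundary the helper yields 3
bytes of padding, and at an offset that is already aligned it yields 0.
The compiler never needs to know the offset; only the assembler does.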


jeff





Re: Please support coo.h

2010-06-10 Thread Dave Korn
On 10/06/2010 18:07, yuanbin wrote:
> This compiler's extension is valuable

  No, it isn't very valuable, sorry to be blunt.  I think you are following a
really wrong path here.  You are trying to implement a C++-alike
object-oriented system in C.  That makes sense as far as it goes, but if you
find yourself having to propose modifying the C compiler in a direction that
basically makes it speak C++, you might as well just use C++ in the first
place.  You want the compiler to automatically choose one of several different
ways to initialise a union according to the data type of the argument you use
to initialise it with; basically, that means you want overloaded constructors.
 So you should just use C++, which already is C with overloaded constructors.
 And it also already has all the other features that you'll discover you need
in the compiler as you carry along this path.

  By the time you get to the end of your journey, "coo.h" will be an empty
file and all the functionality will have been added to the C compiler until it
turns into a C++ compiler.  I think you need to choose a different plan.

cheers,
  DaveK



Re: Scheduling x86 dispatch windows

2010-06-10 Thread H.J. Lu
On Thu, Jun 10, 2010 at 3:09 PM, Quentin Neill
 wrote:
> On Thu, Jun 10, 2010 at 4:08 PM, H.J. Lu  wrote:
>> On Thu, Jun 10, 2010 at 1:59 PM, Quentin Neill
>>  wrote:
>>> On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law  wrote:
 On 06/10/10 13:52, H.J. Lu wrote:
> On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
>   wrote:
>> Cross-posting Reza's call for feedback to the binutils list since it
>> is relevant - see the last few paragraphs regarding how to
>> "solve the alignment problem".
>>
>> Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
>>
>> On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani
>>  wrote:
>>> Hi,
>>>
>>> We are in the process of adding a feature to GCC to take advantage
>>> of a new hardware feature in the latest AMD micro processor. This
>>> feature requires a certain mix, ordering and alignments in
>>> instruction sequences to obtain the expected hardware performance.
>>>
>>> I am asking the community to review this high level implementation
>>> design and give me direction or advice.
>>>
>>> The new hardware issues two windows of the size N bytes of
>>> instructions in every cycle. It goes into accelerate mode if the
>>> windows have the right combination of instructions or alignments. Our
>>> goal is to maximize the IPC by proper instruction scheduling and
>>> alignments.
>>>
>>> Here is a summary of the most important requirements:
>>>
>>> a) Maximum of N instructions per window.
>>> b) An instruction may cross the first window.
>>> c) Each window can have maximum of x memory loads and y memory
>>>    stores .
>>> d) The total number of immediate constants in the instructions
>>>    of a window should not exceed k.
>>> e) The first window must be aligned on 16 byte boundary.
>>> f) A Window set terminates when a branch exists in a window.
>>> g) The number of allowed prefixes varies for instructions.
>>> h) A window set needs to be padded by prefixes in instructions
>>>    or terminated by nops to ensure adherence to the rules.
>>>
>>> We have the following implementation plan for GCC:
>>>
>>> 1) Modify the Haifa scheduler to make the desired arrangement of
>>>    instructions for the two dispatch windows. The scheduler is called
>>>    once before and once after register allocation as usual. In both
>>>    cases it performs dispatch scheduling along with its normal job of
>>>    instruction scheduling.
>>>
>>> The advantage of doing it before register allocation is avoiding
>>> extra dependencies caused by register allocation which may become
>>> an obstacle to movement of instructions.  The advantage of doing
>>> it after register allocation is a consideration for spilling code
>>> which may be generated by the register allocator.
>>>
>>> The algorithm we use is:
>>>
>>> a) Considering the current dispatch window set, choose the first
>>>    instruction from ready queue that does not violate dispatch rules.
>>> b) When an instruction is selected and scheduled, inform the
>>>    dispatcher code about the instruction. This step keeps track of the
>>>    instruction content of windows for future evaluation. It also manages
>>>    the window set by closing and opening new virtual dispatch windows.
>>>
>>> 2) Insertion of alignment code.
>>>
>>> In x86 alignment is done by inserting prefixes or by generating
>>> nops. As the object code is generated by the assembler in GCC, some
>>> information such as sizes of branches are unknown until assembly or
>>> link time. To do alignments related to dispatch correctly in GCC,
>>> we need to iteratively compute prefixes and branch sizes until
>>> its convergence. This pass currently does not exist in GCC, but it
>>> exists in the assembler.
>>>
>>> There are two possible approaches to solve alignment problem.
>>>
>>> a)  Let the assembler performs the alignments and padding needed
>>>     to adhere with the new machine dispatching rules and avoid an extra
>>>     pass in GCC.
>>> b)  Add a new pass to mimic what assembler does before generating
>>>     the assembly listing in GCC and insert the required alignments.
>>>
>>> I appreciate your comments on the proposed implementation procedure
>>> and the choices a or b above.
>>>
>
> I don't think this should be done in assembler. Assembler should just assemble
> the assembly input.

 That adds quite a bit of complication to the compiler though -- getting the
 instruction lengths right (and thus proper packing & alignment) can be
 extremely difficult.  I did some experiments with this on a target with
 *fixed* instruction lengths a while back and even though the port tried 
 hard
 to get lengths right, it would routinel

Minor issue with recent code to twiddle costs of pseudos with invariant equivalents

2010-06-10 Thread Jeff Law


Compile the attached with -O2 on x86-unknown-linux-gnu and review the 
.ira dump for main()


starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
Building IRA IR
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
init_insns for 59: (insn_list:REG_DEP_TRUE 5 (nil))
Reg 59 has equivalence, initial gains 4000

Pass 0 for finding pseudo/allocno costs

a0 (r59,l0) best NO_REGS, cover NO_REGS

  a0(r59,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 
DIREG:-1000,-1000 AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 
NON_Q_REGS:0,0 LEGACY_REGS:0,0 GENERAL_REGS:0,0 
SSE_FIRST_REG:26000,26000 SSE_REGS:26000,26000 MMX_REGS:26000,26000 MEM:7000


Pass 1 for finding pseudo/allocno costs

r59: preferred NO_REGS, alternative NO_REGS, cover NO_REGS

  a0(r59,l0) costs: AREG:8000,8000 DREG:8000,8000 CREG:8000,8000 
BREG:8000,8000 SIREG:8000,8000 DIREG:3000,3000 AD_REGS:8000,8000 
CLOBBERED_REGS:8000,8000 Q_REGS:8000,8000 NON_Q_REGS:8000,8000 
LEGACY_REGS:8000,8000 GENERAL_REGS:8000,8000 SSE_FIRST_REG:34000,34000 
SSE_REGS:34000,34000 MMX_REGS:34000,34000 MEM:15000


[ ... ]
  Loop 0 (parent -1, header bb0, depth 0)
bbs: 2
all: 0r59
modified regnos: 59
border:
Pressure: GENERAL_REGS=4
  Spill a0(r59,l0)
Disposition:
0:r59  l0   mem

Ultimately I think reload is cleaning this up, but it seems awful 
strange to have a pseudo/allocno which clearly should be allocated to a 
hard GPR preferring NO_REGS and from an allocation standpoint living in 
memory.


Jeff
typedef unsigned char __u_char;
typedef unsigned short int __u_short;
typedef unsigned int __u_int;
typedef unsigned long int __u_long;
typedef signed char __int8_t;
typedef unsigned char __uint8_t;
typedef signed short int __int16_t;
typedef unsigned short int __uint16_t;
typedef signed int __int32_t;
typedef unsigned int __uint32_t;
typedef signed long int __int64_t;
typedef unsigned long int __uint64_t;
typedef long int __quad_t;
typedef unsigned long int __u_quad_t;
typedef unsigned long int __dev_t;
typedef unsigned int __uid_t;
typedef unsigned int __gid_t;
typedef unsigned long int __ino_t;
typedef unsigned long int __ino64_t;
typedef unsigned int __mode_t;
typedef unsigned long int __nlink_t;
typedef long int __off_t;
typedef long int __off64_t;
typedef int __pid_t;
typedef struct { int __val[2]; } __fsid_t;
typedef long int __clock_t;
typedef unsigned long int __rlim_t;
typedef unsigned long int __rlim64_t;
typedef unsigned int __id_t;
typedef long int __time_t;
typedef unsigned int __useconds_t;
typedef long int __suseconds_t;
typedef int __daddr_t;
typedef long int __swblk_t;
typedef int __key_t;
typedef int __clockid_t;
typedef void * __timer_t;
typedef long int __blksize_t;
typedef long int __blkcnt_t;
typedef long int __blkcnt64_t;
typedef unsigned long int __fsblkcnt_t;
typedef unsigned long int __fsblkcnt64_t;
typedef unsigned long int __fsfilcnt_t;
typedef unsigned long int __fsfilcnt64_t;
typedef long int __ssize_t;
typedef __off64_t __loff_t;
typedef __quad_t *__qaddr_t;
typedef char *__caddr_t;
typedef long int __intptr_t;
typedef unsigned int __socklen_t;
typedef long unsigned int size_t;
typedef __time_t time_t;
struct timespec
  {
__time_t tv_sec;
long int tv_nsec;
  };
typedef __pid_t pid_t;
struct sched_param
  {
int __sched_priority;
  };
extern int clone (int (*__fn) (void *__arg), void *__child_stack,
int __flags, void *__arg, ...) __attribute__ ((__nothrow__));
extern int unshare (int __flags) __attribute__ ((__nothrow__));
extern int sched_getcpu (void) __attribute__ ((__nothrow__));
struct __sched_param
  {
int __sched_priority;
  };
typedef unsigned long int __cpu_mask;
typedef struct
{
  __cpu_mask __bits[1024 / (8 * sizeof (__cpu_mask))];
} cpu_set_t;
extern int __sched_cpucount (size_t __setsize, const cpu_set_t *__setp)
  __attribute__ ((__nothrow__));
extern cpu_set_t *__sched_cpualloc (size_t __count) __attribute__ ((__nothrow__)) ;
extern void __sched_cpufree (cpu_set_t *__set) __attribute__ ((__nothrow__));
extern int sched_setparam (__pid_t __pid, __const struct sched_param *__param)
 __attribute__ ((__nothrow__));
extern int sched_getparam (__pid_t __pid, struct sched_param *__param) __attribute__ ((__nothrow__));
extern int sched_setscheduler (__pid_t __pid, int __policy,
  __const struct sched_param *__param) __attribute__ ((__nothrow__));
extern int sched_getscheduler (__pid_t __pid) __attribute__ ((__nothrow__));
extern int sched_yield (void) __attribute__ ((__nothrow__));
extern int sched_get_priority_max (int __algorithm) __attribute__ ((__nothrow__));
extern int sched_get_priority_min (int __algorithm) __attribute__ ((__nothrow__));
extern int sched_rr_get_interval (__pid_t __pid, struct timespec *__t) __attribute__ ((__nothrow__));
typedef __clock_t clock_t;
typedef __clockid_t clockid_t;
typedef __timer_t timer_t;
struct tm
{
  int tm_sec;
  int tm_min;

Re: Scheduling x86 dispatch windows

2010-06-10 Thread Quentin Neill
On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law  wrote:
> On 06/10/10 13:52, H.J. Lu wrote:
>> On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
>>   wrote:
>>> Cross-posting Reza's call for feedback to the binutils list since it
>>> is relevant - see the last few paragraphs regarding how to
>>> "solve the alignment problem".
>>>
>>> Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
>>>
>>> On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani
>>>  wrote:
 Hi,

 We are in the process of adding a feature to GCC to take advantage
 of a new hardware feature in the latest AMD micro processor. This
 feature requires a certain mix, ordering and alignments in
 instruction sequences to obtain the expected hardware performance.

 I am asking the community to review this high level implementation
 design and give me direction or advice.

 The new hardware issues two windows of the size N bytes of
 instructions in every cycle. It goes into accelerate mode if the
 windows have the right combination of instructions or alignments. Our
 goal is to maximize the IPC by proper instruction scheduling and
 alignments.

 Here is a summary of the most important requirements:

 a) Maximum of N instructions per window.
 b) An instruction may cross the first window.
 c) Each window can have a maximum of x memory loads and y memory
    stores.
 d) The total number of immediate constants in the instructions
    of a window should not exceed k.
 e) The first window must be aligned on 16 byte boundary.
 f) A Window set terminates when a branch exists in a window.
 g) The number of allowed prefixes varies for instructions.
 h) A window set needs to be padded by prefixes in instructions
    or terminated by nops to ensure adherence to the rules.

 We have the following implementation plan for GCC:

 1) Modify the Haifa scheduler to make the desired arrangement of
    instructions for the two dispatch windows. The scheduler is called
    once before and once after register allocation as usual. In both
    cases it performs dispatch scheduling along with its normal job of
    instruction scheduling.

 The advantage of doing it before register allocation is avoiding
 extra dependencies caused by register allocation which may become
 an obstacle to movement of instructions.  The advantage of doing
 it after register allocation is a consideration for spilling code
 which may be generated by the register allocator.

 The algorithm we use is:

 a) Considering the current dispatch window set, choose the first
    instruction from ready queue that does not violate dispatch rules.
 b) When an instruction is selected and scheduled, inform the
    dispatcher code about the instruction. This step keeps track of the
    instruction content of windows for future evaluation. It also manages
    the window set by closing and opening new virtual dispatch windows.

 2) Insertion of alignment code.

 On x86, alignment is done by inserting prefixes or by generating
 nops. Because GCC leaves object code generation to the assembler, some
 information, such as the sizes of branches, is unknown until assembly or
 link time. To do alignments related to dispatch correctly in GCC,
 we need to iteratively compute prefixes and branch sizes until
 they converge. Such a pass does not currently exist in GCC, but it
 exists in the assembler.

 There are two possible approaches to solve the alignment problem.

 a)  Let the assembler perform the alignments and padding needed
     to adhere to the new machine dispatching rules and avoid an extra
     pass in GCC.
 b)  Add a new pass in GCC to mimic what the assembler does before
     generating the assembly listing, and insert the required alignments.

 I appreciate your comments on the proposed implementation procedure
 and the choices a or b above.
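
 For readers following the thread, steps a) and b) of the selection
 algorithm above can be sketched roughly as follows. This is only an
 illustration, not Haifa scheduler code: the struct layouts, the function
 names and the limits MAX_INSNS, MAX_LOADS, MAX_STORES and MAX_IMMS (standing
 in for N, x, y and k, whose real values are not given in the message) are
 all hypothetical.

```c
#include <stddef.h>

/* Hypothetical limits standing in for N, x, y and k in the requirements. */
enum { MAX_INSNS = 4, MAX_LOADS = 2, MAX_STORES = 1, MAX_IMMS = 4 };

/* Hypothetical summary of one instruction waiting in the ready queue. */
struct insn {
    int loads, stores, imms;
    int is_branch;
};

/* Running totals for the dispatch window currently being filled. */
struct window {
    int insns, loads, stores, imms;
};

/* Nonzero if adding the instruction keeps the window within every limit. */
static int fits(const struct window *w, const struct insn *i)
{
    return w->insns + 1 <= MAX_INSNS
        && w->loads + i->loads <= MAX_LOADS
        && w->stores + i->stores <= MAX_STORES
        && w->imms + i->imms <= MAX_IMMS;
}

/* Step a): index of the first ready instruction that violates no rule,
   or -1 if none of them fits the current window. */
int pick_insn(const struct window *w, const struct insn *ready, size_t n)
{
    for (size_t idx = 0; idx < n; idx++)
        if (fits(w, &ready[idx]))
            return (int)idx;
    return -1;
}

/* Step b): tell the window tracker what was scheduled.  A branch, or a
   full window, closes the window and opens an empty one. */
void note_scheduled(struct window *w, const struct insn *i)
{
    w->insns++;
    w->loads += i->loads;
    w->stores += i->stores;
    w->imms += i->imms;
    if (i->is_branch || w->insns == MAX_INSNS) {
        struct window empty = {0, 0, 0, 0};
        *w = empty;
    }
}
```

 The sketch tracks a single window and ignores byte sizes, prefixes and
 alignment, which is exactly the part the message proposes to settle with
 approach a or b.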

>>
>> I don't think this should be done in the assembler. The assembler should
>> just assemble the assembly input.
>
> That adds quite a bit of complication to the compiler though -- getting the
> instruction lengths right (and thus proper packing & alignment) can be
> extremely difficult.  I did some experiments with this on a target with
> *fixed* instruction lengths a while back and even though the port tried hard
> to get lengths right, it would routinely miss something.  Ultimately I
> decided that forcing the compiler to know instruction lengths with a very
> high degree of accuracy wasn't a sane thing to do.  Dealing with variable
> instruction lengths just adds yet another complexity to the situation.  Then
> add the complication of needing to add specific prefixes or nops and it just
> gets downright ugly.
>
> I'd probably approach this by having the compiler emit a directive which
> states wha

Re: Scheduling x86 dispatch windows

2010-06-10 Thread Daniel Jacobowitz
On Thu, Jun 10, 2010 at 02:03:03PM -0600, Jeff Law wrote:
> That adds quite a bit of complication to the compiler though --
> getting the instruction lengths right (and thus proper packing &
> alignment) can be extremely difficult.  I did some experiments with
> this on a target with *fixed* instruction lengths a while back and
> even though the port tried hard to get lengths right, it would
> routinely miss something.  Ultimately I decided that forcing the
> compiler to know instruction lengths with a very high degree of
> accuracy wasn't a sane thing to do.

FWIW, my opinion (and I think Jakub has expressed a similar opinion
and/or tool in the past) is that there is a sane way to do this: put
assertions in the assembler output and have the assembler validate
them.

On the other hand, I'm not going to argue that it's a lot of work.

-- 
Daniel Jacobowitz
CodeSourcery


gcc-4.5-20100610 is now available

2010-06-10 Thread gccadmin
Snapshot gcc-4.5-20100610 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20100610/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch 
revision 160582

You'll find:

gcc-4.5-20100610.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.5-20100610.tar.bz2 C front end and core compiler

gcc-ada-4.5-20100610.tar.bz2  Ada front end and runtime

gcc-fortran-4.5-20100610.tar.bz2  Fortran front end and runtime

gcc-g++-4.5-20100610.tar.bz2  C++ front end and runtime

gcc-java-4.5-20100610.tar.bz2 Java front end and runtime

gcc-objc-4.5-20100610.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.5-20100610.tar.bz2  The GCC testsuite

Diffs from 4.5-20100603 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Please support coo.h

2010-06-10 Thread yuanbin
typedef struct CBase { int i; } CBase;
typedef struct CT1 { EXTENDS(CBase) ... } CT1;
typedef struct CT2 { EXTENDS(CT1) ... } CT2;
...
typedef struct CTN { EXTENDS(CTN_1) ... } CTN;
CTN t;
t.i=1; //need not t.CTN_1CT2.CT1.CBase.i ---complex
CBase* p=&t.CBase; //need not t.CTN_1CT2.CT1.CBase, need not (CBase*)&t ---not safe


2010/6/11 Dave Korn :
> On 10/06/2010 18:07, yuanbin wrote:
>> This compiler's extension is valuable
>
>  No, it isn't very valuable, sorry to be blunt.  I think you are following a
> really wrong path here.  You are trying to implement a C++-alike
> object-oriented system in C.  That makes sense as far as it goes, but if you
> find yourself having to propose modifying the C compiler in a direction that
> basically makes it speak C++, you might as well just use C++ in the first
> place.  You want the compiler to automatically choose one of several different
> ways to initialise a union according to the data type of the argument you use
> to initialise it with; basically, that means you want overloaded constructors.
>  So you should just use C++, which already is C with overloaded constructors.
>  And it also already has all the other features that you'll discover you need
> in the compiler as you carry along this path.
>
>  By the time you get to the end of your journey, "coo.h" will be an empty
> file and all the functionality will have been added to the C compiler until it
> turns into a C++ compiler.  I think you need to choose a different plan.
>
>    cheers,
>      DaveK
>
>


Re: Scheduling x86 dispatch windows

2010-06-10 Thread Quentin Neill
On Thu, Jun 10, 2010 at 5:40 PM, Daniel Jacobowitz wrote:
> On Thu, Jun 10, 2010 at 02:03:03PM -0600, Jeff Law wrote:
>> That adds quite a bit of complication to the compiler though --
>> getting the instruction lengths right (and thus proper packing &
>> alignment) can be extremely difficult.  I did some experiments with
>> this on a target with *fixed* instruction lengths a while back and
>> even though the port tried hard to get lengths right, it would
>> routinely miss something.  Ultimately I decided that forcing the
>> compiler to know instruction lengths with a very high degree of
>> accuracy wasn't a sane thing to do.
>
> FWIW, my opinion (and I think Jakub has expressed a similar opinion
> and/or tool in the past) is that there is a sane way to do this: put
> assertions in the assembler output and have the assembler validate
> them.
>
> On the other hand, I'm not going to argue that it's a lot of work.
> --
> Daniel Jacobowitz
> CodeSourcery

When you say "put assertions in the assembler output" I understood it
to mean "in the assembly source code output by the compiler", not "the
output produced by the assembler".

Does this qualify as a form of what you are suggesting?  Because this
is exactly what is being proposed:

.balign 8        # start window
insn op, op      # 67 67 XX YY ZZ - padded with 2 prefixes to make 8
insn2 op, op     # AA BB CC
.padalign 8      # window boundary
insn4 op
. . .
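
To make the arithmetic in the example concrete, here is a minimal sketch in
C of the padding computation that either the compiler or the assembler would
have to do for one window. It assumes the group starts on a window boundary
and that padding is counted in whole bytes. The function name window_pad is
made up for illustration, and .padalign above is the proposed directive, not
something the sketch depends on.

```c
/* Bytes of prefix (or nop) padding needed so that a group of instructions
   starting on a window boundary ends exactly on the next boundary.
   len[] holds the unpadded encoded length of each instruction.
   Returns -1 if the group does not fit in one window. */
int window_pad(const int *len, int n, int window)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += len[i];
    return total <= window ? window - total : -1;
}
```

With the two 3-byte instructions from the example and an 8-byte window this
gives 2, matching the two 67 prefixes shown on the first instruction. The
hard part discussed in this thread is knowing the lengths in the first place.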

-- 
Quentin Neill


Re: Please support coo.h

2010-06-10 Thread Magnus Fromreide
On Fri, 2010-06-11 at 08:44 +0800, yuanbin wrote:
> typedef struct CBase { int i; } CBase;
> typedef struct CT1 { EXTENDS(CBase) ... } CT1;
> typedef struct CT2 { EXTENDS(CT1) ... } CT2;
> ...
> typedef struct CTN { EXTENDS(CTN_1) ... } CTN;
> CTN t;
> t.i=1; //need not t.CTN_1CT2.CT1.CBase.i ---complex
> CBase* p=&t.CBase; //need not t.CTN_1CT2.CT1.CBase, need not
> (CBase*)&t ---not safe

struct CBase { int i; };
struct CT1 : CBase { ... };
struct CT2 : CT1 { ... };
struct CTN : CTN_1 { ... };
CTN t;

t.i = 1; // assumes this is in function scope
CBase* p = &t; // Even simpler than your proposal and still safe.

Yep, this is valid C++. I think Dave is right, you really want C++.

/MF

> 
> 2010/6/11 Dave Korn :
> > On 10/06/2010 18:07, yuanbin wrote:
> >> This compiler's extension is valuable
> >
> >  No, it isn't very valuable, sorry to be blunt.  I think you are following a
> > really wrong path here.  You are trying to implement a C++-alike
> > object-oriented system in C.  That makes sense as far as it goes, but if you
> > find yourself having to propose modifying the C compiler in a direction that
> > basically makes it speak C++, you might as well just use C++ in the first
> > place.  You want the compiler to automatically choose one of several
> > different ways to initialise a union according to the data type of the
> > argument you use to initialise it with; basically, that means you want
> > overloaded constructors.
> >  So you should just use C++, which already is C with overloaded
> > constructors.
> >  And it also already has all the other features that you'll discover you
> > need in the compiler as you carry along this path.
> >
> >  By the time you get to the end of your journey, "coo.h" will be an empty
> > file and all the functionality will have been added to the C compiler until
> > it turns into a C++ compiler.  I think you need to choose a different plan.
> >
> >    cheers,
> >      DaveK
> >
> >




Re: Please support coo.h

2010-06-10 Thread Ian Lance Taylor
yuanbin  writes:

> typedef struct CBase { int i; } CBase;
> typedef struct CT1 { EXTENDS(CBase) ... } CT1;
> typedef struct CT2 { EXTENDS(CT1) ... } CT2;
> ...
> typedef struct CTN { EXTENDS(CTN_1) ... } CTN;
> CTN t;
> t.i=1; //need not t.CTN_1CT2.CT1.CBase.i ---complex
> CBase* p=&t.CBase; //need not t.CTN_1CT2.CT1.CBase, need not
> (CBase*)&t ---not safe

http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Unnamed-Fields.html
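
The page linked above documents unnamed struct/union fields. As a rough
sketch of how they cover the quoted use case, the following uses only an
anonymous union and an anonymous struct, so it should build without extra
flags on a compiler that accepts them. The layout of CT1 and the function
demo are illustrative assumptions, not code from coo.h. With -fms-extensions
GCC also accepts a previously defined struct as an unnamed field, which is
closer to what EXTENDS() seems to want.

```c
struct CBase { int i; };

struct CT1 {
    union {
        struct CBase CBase;   /* named view: &t.CBase is a struct CBase * */
        struct { int i; };    /* anonymous view: t.i reaches the same int */
    };
    int j;
};

/* Writes through the short name, reads back through the base pointer. */
int demo(void)
{
    struct CT1 t;
    t.i = 1;                        /* no need to spell t.CBase.i */
    t.j = 2;
    struct CBase *p = &t.CBase;     /* no cast required */
    return p->i;
}
```

Both views share the same initial member, so t.i and t.CBase.i name the same
storage.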

Ian