Re: Possible Bug with darwin_asm_named_section() in gcc/config/darwin.c

2010-07-02 Thread Andrew Pinski



On Jul 1, 2010, at 11:29 PM, Eric Siroker  wrote:


Hello Darwin port maintainers,

I'm working on getting the frontend for the Go programming language to work
on Darwin.  I've encountered what looks like a bug in Darwin-specific gcc
code.

The Go frontend saves language-specific export information in a special
section of the object file.  This works fine on Linux/ELF.
On Darwin/Mach-O, the following assembly is generated for the section
header (.go_export is the segment name):

.section .go_export


You need the segment name in addition to the section name.  Mach-O differs
from ELF in that it has both segments and sections.  Look at how the LTO
sections are defined on trunk and use that as a template.
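
For illustration, here is a minimal sketch of the kind of directive the
Darwin assembler accepts, with both a segment and a section name (the
__GO_EXPORT segment name and the little helper below are hypothetical; the
LTO sections on trunk are the real template to copy):

  /* Hypothetical sketch, loosely in the style of darwin_asm_named_section():
     Mach-O wants ".section segname,sectname[,type[,attributes]]".
     The "__GO_EXPORT" segment name is an assumption for illustration only.  */
  #include <stdio.h>

  static void
  emit_go_export_section (FILE *asm_out_file)
  {
    fprintf (asm_out_file, "\t.section __GO_EXPORT,__go_export,regular\n");
  }

  int
  main ()
  {
    emit_go_export_section (stdout);   /* print the directive for inspection */
    return 0;
  }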





However, the Darwin assembler installed on my Mac (cctools-773~33, GNU
assembler version 1.38) doesn't like this statement.  It produces this
error message when fed the assembly above:

Expected comma after segment-name

If I modify darwin_asm_named_section() in gcc/config/darwin.c:

- fprintf (asm_out_file, "\t.section %s\n", name);
+ fprintf (asm_out_file, "\t.section %s,\n", name);

The following assembly is produced by gcc and accepted by the  
assembler:


.section .go_export,

Apple's documentation confirms a comma is required for .section:
http://developer.apple.com/mac/library/documentation/DeveloperTools/Reference/Assembler/040-Assembler_Directives/asm_directives.html

Could you please either explain why darwin_asm_named_section() is
correct as written, or apply this fix?  Thank you.

If you're interested in observing the issue yourself, the Go branch is
located in the main gcc source repository in this subdirectory:

/branches/gccgo

Here is a sample Go program that, when compiled with gccgo, will
exhibit the issue:

package main
func Main() {}

I use the following configure arguments: --disable-bootstrap
--enable-languages=go

Eric


Re: Trunk frozen for mem-ref2 merge starting Wed, Jun 28th 20:00 UTC

2010-07-02 Thread Richard Guenther
On Mon, 28 Jun 2010, Richard Guenther wrote:

> The trunk is frozen for all changes starting this Wednesday, 20:00 UTC
> in preparation for merging the mem-ref2 branch.  The freeze is expected
> to last until early Friday morning.  An explicit un-freeze mail will
> be sent as a reply to this mail.

The trunk is now open again for development.

Thanks,
Richard.


Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Frank Ch. Eigler
Joern Rennecke  writes:

> [...]  But if the function is very simple, the only reason to keep
> it would be if its address was taken somewhere, or if we tailcall
> it.

... or to make it available from gdb as an inferior call.

- FChE


Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Richard Guenther
On Fri, Jul 2, 2010 at 4:06 PM, Frank Ch. Eigler  wrote:
> Joern Rennecke  writes:
>
>> [...]  But if the function is very simple, the only reason to keep
>> it would be if its address was taken somewhere, or if we tailcall
>> it.
>
> ... or to make it available from gdb as an inferior call.

We do end up eliminating it from the TU's object file; the inline
limits just try to limit program growth.  Now, the issue with the past
heuristic is that if you have a very large COMDAT function that is
called once in one TU and once in another TU, we thought that
inlining it in both TUs would decrease the size of the program.
But in fact it does not, as it wasn't really a function called once.
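
For concreteness, a minimal sketch of that situation (the function and names
are made up): in each TU heavy() looks like a function called once, yet
inlining it everywhere duplicates its body instead of removing it from the
binary.

  // Hypothetical example.  Imagine heavy() defined in a shared header, so it
  // is implicitly inline and emitted as a COMDAT in every TU that uses it.
  inline int
  heavy (int x)
  {
    int s = 0;
    for (int i = 0; i < 1000; ++i)   // stands in for a large function body
      s += x * i;
    return s;
  }

  // In the real scenario f() and g() live in two different translation
  // units, each containing the only call to heavy() that its TU sees.
  int f () { return heavy (1); }
  int g () { return heavy (2); }

  int
  main ()
  {
    return f () + g ();
  }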

Richard.

> - FChE
>


Re: Plug-ins on Windows

2010-07-02 Thread Kyle Girard

> The main utility of plugins is that they make developing, testing and
> deploying modifications to gcc easier.  I don't think that MS windows users
> will miss too much if they can't dynamically load the plugins, as long
> as their sysadmin can provide them with a linked-in version - the sysadmin
> might actually be happier about the greater control, as plugins are
> potential virus/trojan vectors.
> It makes sense to write a plugin in such a way that it can alternatively
> be used as a GCC patch - in the sense that you add the code alongside
> the main gcc code and it gets linked in.
> Maybe we should create an interface for linked-in plugin code, i.e.
> arrays of plugin names and their plugin_init function so that they can
> be activated with a matching -fplugin=built-in:NAME (or propose a different
> syntax if you think of a better one) and get passed any plugin options
> like a dynamically loaded one would.
> 

A generic linked-in plugin ability would definitely solve my
plugin-on-windows problem.  From what I've been reading on this list, it
looks like I'm going to have to do some sort of similar hack to gcc to
get my plugin working on windows, at least in the short term.

Is there so little demand for dynamically loaded plugins on the windows
platform that a generic linked-in plugin framework could be how plugins
are supported on windows?  Just drop the plugin source and a config of
some sort into a directory of the main gcc build and it gets pulled in
auto-magically?
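
To make the idea concrete, a linked-in plugin table along the lines proposed
above might look roughly like the sketch below (every name here is made up;
this is not an existing GCC interface, just an illustration of the
registration idea):

  /* Illustrative sketch only: a possible shape for a table of linked-in
     plugins.  None of these names are an existing GCC interface.  */
  #include <stdio.h>
  #include <string.h>

  typedef int (*builtin_plugin_init_fn) (const char *name, int argc,
                                         const char **argv);

  static int
  my_plugin_init (const char *name, int argc, const char **argv)
  {
    (void) argv;
    printf ("initializing built-in plugin %s with %d option(s)\n", name, argc);
    return 0;
  }

  struct builtin_plugin
  {
    const char *name;
    builtin_plugin_init_fn init;
  };

  /* The table the hypothetical -fplugin=built-in:NAME option would search.  */
  static const builtin_plugin builtin_plugins[] = {
    { "my_plugin", my_plugin_init },
  };

  static int
  activate_builtin_plugin (const char *name, int argc, const char **argv)
  {
    for (const builtin_plugin &p : builtin_plugins)
      if (strcmp (p.name, name) == 0)
        return p.init (name, argc, argv);
    return -1;   /* unknown built-in plugin name */
  }

  int
  main ()
  {
    return activate_builtin_plugin ("my_plugin", 0, nullptr);
  }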

Kyle



Re: Power/PowerPC RIOS/RIOS2 obsolescence

2010-07-02 Thread Segher Boessenkool

[I proposed removing RIOS support, since it gets heavily in the way of
my project to expose the XER[CA] flag.]


My argument is simply this; sorry if it wasn't clear in the last
email.  Bottom line up front:
- Rather than evicting support now, it can just as easily be removed in
the future if it stays broken for more than one release.


I can guarantee you that the changes I am trying to make _will_ break
RIOS support.  RIOS handles the carry flag quite differently from
PowerPC and Power Architecture, and I have no way to test RIOS either.
Since there is no significant community using these chips anymore,
removing RIOS support seemed like the best course of action.


- It shouldn't add unwieldy maintenance overhead.


It already HAS been an unwieldy maintenance overhead for years.


  The old stuff can
be walled off, conditionally built, and otherwise removed from the
main focus.


You obviously haven't looked at the code in detail, if you think this.


- The code is already written and just needs a maintainer.
- I have the hardware and desire to maintain it.


Feel free to split off a new backend then.  The current intertwined
mess cannot be maintained properly.


Segher



Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Jan Hubicka
> Quoting Richard Guenther :
>
>> That is, we no longer optimistically assume that comdat functions
>> can be eliminated if there are no callers in the local TU in 4.5
>> (but we did in previous releases).
>
> But if the function is very simple, the only reason to keep it would be
> if its address was taken somewhere, or if we tailcall it.

Since there seems to be a bit of confusion, perhaps it would make sense to
summarize how the whole process works.

The inliner estimates the whole-program size change from inlining all
invocations of each function (overall_growth) and inlines every function for
which this results in expected shrinkage.

The process is as follows.  In the inline_param3 dump we get time and size
estimates for every statement:

  Analyzing function body size: Container::Container()
freq:  1000 size:  3 time: 12 D.2145_1 = operator new (4);
freq:  1000 size:  1 time:  1 MEM[(struct Container *)this_3(D)].member = D.2145_1;
  Likely eliminated
freq:  1000 size:  0 time:  0 return;
  Likely eliminated
  Overall function body time: 13-1 size: 4-1

So in this simplified view, the Container() function will occupy 4 units of
size and execute for 13 units of time (not directly related to real bytes or
cycles, since our IL is too high-level at this point).

Some statements are assumed to go away after inlining.  This is the case for
the memory store, which we expect will somehow get combined away after
inlining.  This is just a guess that attempts to convince the inliner to
remove more of the C++ abstraction penalty and allow more scalar replacement.
So we believe that by inlining the function we save the store to the .member
field.
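
For reference, here is a minimal reconstruction of the kind of testcase this
dump seems to correspond to (my guess from the dump output and the caller
named later, not the original source):

  // Reconstructed sketch only: a COMDAT (implicitly inline) constructor that
  // allocates one 4-byte member, matching the "operator new (4)" call and
  // the store to .member seen in the dump.
  struct Container
  {
    int *member;
    Container () { member = new int; }   // operator new (4) + one store
  };

  int
  gimme ()                               // the caller named in the .inline dump
  {
    Container c;
    return 0;
  }

  int
  main ()
  {
    return gimme ();
  }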

Next, the call overhead of the function is accounted for (since inlining
removes one call) and we get:

  With function call overhead time: 13-12 size: 4-3

So the inliner thinks that by inlining we save 12 units of execution time
and increase code size by 1 unit (4-3).  The overall time (13) is not
really used.

The one extra size unit is for passing the value 4 into the new() call.

When inlining for size (which happens for all calls not considered hot,
which at -Os is just all calls), the heuristics actually compute the
estimated program size change and inline a function only when inlining it
into a specific caller reduces code size.  This never happens for
Container() because the caller's code grows (it needs to pass the extra
value of 4).

Next we try to see if inlining into all callers would reduce program size by
eliminating the offline copy.  This would trigger for Container() if it were
static, because it is called just once and the 1-unit growth in the caller
is smaller than the overall size of Container().  Because Container() is
COMDAT, we don't do that, so we never inline it.  This is seen later in the
.inline dump:

Considering Container::Container() with 4 size
 to be inlined into int gimme() in t.C:26
 Estimated growth after inlined into all callees is +1 insns.
 Estimated badness is 2, frequency 1.00.
 inline_failed:call is unlikely and code size would grow.

The behaviour change concerns COMDAT functions that are larger than the call
overhead but are either called just once or small enough that the code
growth caused by inlining is smaller than the function body size itself.
In these cases previous GCC releases assumed that the overall program size
would shrink, and inlined.

This assumption is not correct (it is correct for static functions, and also
for the size of the .o file, but not for the whole binary), and the problem
can be demonstrated by making a very large COMDAT function that is used once
in very many units.  Thus I've changed the behaviour in GCC 4.5, since it is
safer.

So to get around this, one needs either -fwhole-program or the always_inline
attribute; or, if the actual size of the .o file shrinks after inlining
because of other optimizations, we can see whether we can extend the
heuristics to forecast this and account for it in inlining decisions.
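
As a concrete illustration of the second workaround, here is a sketch using
GCC's always_inline attribute on the constructor (the Container/gimme names
just mirror the reconstruction above; this forces inlining regardless of the
size heuristics):

  // Sketch: GCC's always_inline attribute overrides the size-based decision.
  struct Container
  {
    int *member;
    __attribute__ ((always_inline)) Container () { member = new int; }
  };

  int
  gimme ()
  {
    Container c;   // the constructor is now inlined even at -Os
    return 0;
  }

  int
  main ()
  {
    return gimme ();
  }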

The last alternative is what I would be happy to look into, but in this
testcase we don't get any simplification, so the local behaviour of the
inliner is correct.

I guess we might experiment with allowing some very limited code size growth
for inlining COMDAT functions if this turns out to be a real problem.  Also
we might add some bias into the logic accounting for removal of the offline
copy: obviously the offline copy is a little bit bigger than the
instructions themselves, since it has a prologue/epilogue and alignment.
This would help static functions, but accounting for this realistically is
tricky because the costs are architecture dependent.

I might also make a patch for you to revert this behaviour.  However, it
would be interesting to have a -finline-limit testcase.  It is a bit
surprising that this changes behaviour for you: -finline-limit is now an
obsolete way of controlling the inline-insns-single and inline-insns-auto
parameters.  Setting it to 50 has the effect of reducing them to 25 (from
400 and 50 respectively).

Those limits cap the size of functions considered as inlining candidates.
At -Os you should always get inlining throttled by these limits.

Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Joern Rennecke

Quoting Jan Hubicka :


The behaviour change concerns COMDAT functions that are larger than the call
overhead but are either called just once or small enough that the code
growth caused by inlining is smaller than the function body size itself.
In these cases previous GCC releases assumed that the overall program size
would shrink, and inlined.


I.e., we assumed that the number of TUs this function was used in was 1.

This assumption is not correct (it is correct for static functions, and also
for the size of the .o file, but not for the whole binary), and the problem
can be demonstrated by making a very large COMDAT function that is used once
in very many units.  Thus I've changed the behaviour in GCC 4.5, since it is
safer.


It seems to be equivalent to assuming the number of TUs this function is
used in is infinite.

How about a parameter for the number of TUs we guess the function will be
needed in?


Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Jan Hubicka
> Quoting Jan Hubicka :
>
>> The behaviour change concerns COMDAT functions that are larger than the
>> call overhead but are either called just once or small enough that the
>> code growth caused by inlining is smaller than the function body size
>> itself.  In these cases previous GCC releases assumed that the overall
>> program size would shrink, and inlined.
>
> I.e., we assumed that the number of TUs this function was used in was 1.

Yes, we assumed that all COMDAT functions were static, thus somewhat
defeating the purpose of COMDAT in the first place.
>
>> This assumption is not correct (it is correct for static functions, and
>> also for the size of the .o file, but not for the whole binary), and the
>> problem can be demonstrated by making a very large COMDAT function that
>> is used once in very many units.  Thus I've changed the behaviour in
>> GCC 4.5, since it is safer.
>
> It seems to be equivalent to assuming the number of TUs this function is
> used in is infinite.
>
> How about a parameter for the number of TUs we guess the function will be
> needed in?

You are interested in the number of call sites of a specific function;
having some generic measurement like "the usual COMDAT function is called
300 times in a Mozilla build" won't help you, since setting this parameter
to 2 or more would disable most of this inlining again.

There are cases where a function called twice is still inlined, since the
offline copy plus the expected benefit compensates for the caller size
growth, but that is rather rare.

Honza


gengtype & many GTY tags for same union component?

2010-07-02 Thread Basile Starynkevitch
Hello All

My understanding of the description of the tag GTY option in
http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html#GTY-Options is that a
given discriminated union case can have several tags.  So in MELT I wrote
the following (file gcc/melt-runtime.h, near line 1108 of rev 161710):

[The // C++-style comments are comments in this mail, not in the MELT sources.]


  /*** our union for everything ***/
  /* never use an array of melt_un, only array of pointers melt_ptr_t */
  typedef union
  GTY ((desc ("%0.u_discr->object_magic")))
  melt_un
  {
    meltobject_ptr_t GTY ((skip)) u_discr;
    struct meltforward_st GTY ((skip)) u_forward;
    struct meltobject_st GTY ((tag ("OBMAG_OBJECT"))) u_object;
    /// some cases skipped
    struct meltpair_st GTY ((tag ("OBMAG_PAIR"))) u_pair;
    struct meltspecial_st
      GTY ((tag ("OBMAG_SPEC_FILE"),
            tag ("OBMAG_SPEC_RAWFILE"),
            tag ("OBMAG_SPEC_MPFR"),
            tag ("OBMAG_SPECPPL_COEFFICIENT"),
            tag ("OBMAG_SPECPPL_LINEAR_EXPRESSION"),
            tag ("OBMAG_SPECPPL_CONSTRAINT"),
            tag ("OBMAG_SPECPPL_CONSTRAINT_SYSTEM"),
            tag ("OBMAG_SPECPPL_GENERATOR"),
            tag ("OBMAG_SPECPPL_GENERATOR_SYSTEM"),
            tag ("OBMAG_SPECPPL_POLYHEDRON")))
      u_special;
    struct meltstring_st GTY ((tag ("OBMAG_STRING"))) u_string;
    /// some other cases skipped
  } melt_un_t;

Notice that I used several tags for the u_special case.  I thought this was
permissible, and I was expecting the gengtype-generated marking routine to
mark the u_special union component for all the OBMAG_SPEC* cases listed
above.

However, this is not the case. The generated marking routine (in
gcc/gtype-desc.c in the build tree) contains

  void
  gt_ggc_mx_melt_un (void *x_p)
  {
    union melt_un * const x = (union melt_un *)x_p;
    if (ggc_test_and_set_mark (x))
      {
        switch ((*x).u_discr->object_magic)
          {
          case OBMAG_OBJECT:
            gt_ggc_m_13meltobject_st ((*x).u_object.obj_class);
            {
              size_t i0;
              size_t l0 = (size_t)(((*x).u_object).obj_len);
              for (i0 = 0; i0 != l0; i0++) {
                gt_ggc_m_7melt_un ((*x).u_object.obj_vartab[i0]);
              }
            }
            break;
          // some cases skipped
          case OBMAG_PAIR:
            gt_ggc_m_13meltobject_st ((*x).u_pair.discr);
            gt_ggc_m_7melt_un ((*x).u_pair.hd);
            gt_ggc_m_11meltpair_st ((*x).u_pair.tl);
            break;
          case OBMAG_SPEC_FILE:
            gt_ggc_m_13meltobject_st ((*x).u_special.discr);
            break;
          //  cases OBMAG_SPEC_RAWFILE and others are missing!
          case OBMAG_STRING:
            gt_ggc_m_13meltobject_st ((*x).u_string.discr);
            break;
          /// some other cases skipped
          default:
            break;
          }
      }
  }


Notice that all the OBMAG_SPEC_RAWFILE, OBMAG_SPEC_MPFR etc... cases
are absent in the generated marking routine gt_ggc_mx_melt_un above. I
was painfully surprised to find this (this is the hardest bug in MELT
I've got so far).


My workaround was to give each tag its own union member of type
meltspecial_st:

  /* never use an array of melt_un, only array of pointers melt_ptr_t */
  typedef union
  GTY ((desc ("%0.u_discr->object_magic")))
  melt_un
  {
    meltobject_ptr_t GTY ((skip)) u_discr;
    struct meltforward_st GTY ((skip)) u_forward;
    struct meltobject_st GTY ((tag ("OBMAG_OBJECT"))) u_object;
    /// some cases skipped
    struct meltbox_st GTY ((tag ("OBMAG_BOX"))) u_box;
    struct meltpair_st GTY ((tag ("OBMAG_PAIR"))) u_pair;
    /* The struct meltspecial_st shares several GTY tags, but gengtype needs
       one union case per tag!  */
    struct meltspecial_st GTY ((tag ("OBMAG_SPEC_FILE"))) u_special_file;
    struct meltspecial_st GTY ((tag ("OBMAG_SPEC_RAWFILE"))) u_special_rawfile;
    struct meltspecial_st GTY ((tag ("OBMAG_SPEC_MPFR"))) u_special_mpfr;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_COEFFICIENT"))) u_special_ppl_coefficient;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_LINEAR_EXPRESSION"))) u_special_ppl_linear_expression;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_CONSTRAINT"))) u_special_ppl_constraint;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_CONSTRAINT_SYSTEM"))) u_special_ppl_constraint_system;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_GENERATOR"))) u_special_ppl_generator;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_GENERATOR_SYSTEM"))) u_special_ppl_generator_system;
    struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_POLYHEDRON"))) u_special_ppl_polyhedron;
    /* for simplicity and compatibility with previous code, we can just
       write u_special */
    struct meltspecial_st GTY ((skip)) u_special;
    struct meltstring_st GTY ((tag ("OBMAG_STRING"))) u_string;
    /// some other cases skipped
  } melt_un_t;

With this correction, the gengtype-generated marking routine indeed has all
the cases I was expecting.


Do you think it is only my misunderstanding of the GTY documentation, or is
this a limitation of gengtype that should be documented or fixed?