Re: Possible Bug with darwin_asm_named_section() in gcc/config/darwin.c
On Jul 1, 2010, at 11:29 PM, Eric Siroker wrote:

> Hello Darwin port maintainers,
>
> I'm getting the frontend for the Go programming language to work on Darwin.
> I encountered what looks like a bug in Darwin-specific gcc code.
>
> The Go frontend saves language-specific export information in the object
> file inside a special section.  This works fine on Linux/ELF.  On
> Darwin/Mach-O, the following assembly is generated for the section header
> (.go_export is the segment name):
>
>   .section .go_export

You need the segment name besides just the section name.  Mach-O is different
from ELF in that it has segments as well as sections.  Look in the trunk to
see how the LTO sections are defined and use that as a template.

> The Darwin assembler installed on my Mac (cctools-773~33, GNU assembler
> version 1.38), however, doesn't like this statement.  It produces this
> error message when fed the aforementioned assembly:
>
>   Expected comma after segment-name
>
> If I modify darwin_asm_named_section() in gcc/config/darwin.c:
>
>   - fprintf (asm_out_file, "\t.section %s\n", name);
>   + fprintf (asm_out_file, "\t.section %s,\n", name);
>
> the following assembly is produced by gcc and accepted by the assembler:
>
>   .section .go_export,
>
> Apple's documentation confirms a comma is required for .section:
> http://developer.apple.com/mac/library/documentation/DeveloperTools/Reference/Assembler/040-Assembler_Directives/asm_directives.html
>
> Can you please either explain why darwin_asm_named_section() is correct as
> written, or apply this fix?  Thank you.
>
> If you're interested in observing the issue yourself, the Go branch is
> located in the main gcc source repository in this subdirectory:
>
>   /branches/gccgo
>
> Here is a sample Go program that, when compiled with gccgo, will exhibit
> the issue:
>
>   package main
>   func Main() {}
>
> I use the following configure arguments:
>
>   --disable-bootstrap --enable-languages=go
>
> Eric
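For reference, a hedged sketch only (not the actual fix) of the kind of change
the maintainer's suggestion points toward; the segment and section names below
are made up for illustration, and the real change should copy whatever segment
the trunk uses for the LTO sections:

  /* Hedged sketch: Mach-O's .section directive wants
     "segname , sectname [ , type [ , attributes ] ]", so the Go export
     section needs an explicit segment name.  "__GNU_GO" and "__go_export"
     are purely illustrative names.  */
  fprintf (asm_out_file, "\t.section __GNU_GO,__go_export,regular\n");

A bare ".section .go_export" with no segment/section pair is what triggers the
"Expected comma after segment-name" error from the Darwin assembler.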
Re: Trunk frozen for mem-ref2 merge starting Wed, Jun 28th 20:00 UTC
On Mon, 28 Jun 2010, Richard Guenther wrote:

> The trunk is frozen for all changes starting this Wednesday, 20:00 UTC
> in preparation for merging the mem-ref2 branch.  The freeze is expected
> to last until early Friday morning.  An explicit un-freeze mail will
> be sent as a reply to this mail.

The trunk is now open again for development.

Thanks,
Richard.
Re: Crucial C++ inlining broken under -Os
Joern Rennecke writes:

> [...] But if the function is very simple, the only reason to keep
> it would be if its address was taken somewhere, or if we tailcall
> it.

... or to make it available from gdb as an inferior call.

- FChE
Re: Crucial C++ inlining broken under -Os
On Fri, Jul 2, 2010 at 4:06 PM, Frank Ch. Eigler wrote:
> Joern Rennecke writes:
>
>> [...] But if the function is very simple, the only reason to keep
>> it would be if its address was taken somewhere, or if we tailcall
>> it.
>
> ... or to make it available from gdb as an inferior call.

We do end up eliminating it from the TU's object file; the inline limits just
try to limit program growth.  Now the issue with the past heuristic is that if
you have a very large .comdat function that is called once in one TU and once
in another TU, we think that inlining it in both TUs will decrease the size of
the program.  But in fact it does not, as it wasn't really a function called
once.

Richard.

> - FChE
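To make that concrete, a tiny made-up illustration (the header name, functions
and body are invented; a real problematic case would have a much larger body):

  // big.h -- an inline function that ends up as a comdat shared by TUs
  inline int big_comdat (int x)
  {
    // imagine a large body here; the point is that exactly one out-of-line
    // comdat copy would be shared by the whole program
    return x * x + 3;
  }

  // a.C:  #include "big.h"   int fa (int x) { return big_comdat (x); }
  // b.C:  #include "big.h"   int fb (int x) { return big_comdat (x); }

Inlining big_comdat into both fa() and fb() duplicates its body in both object
files, whereas not inlining leaves a single shared comdat copy in the final
binary, so "called once per TU" is not the same as "called once in the
program".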
Re: Plug-ins on Windows
> The main utility of plugins is that they make developing, testing and
> deploying modifications to gcc easier.  I don't think that MS Windows users
> will miss too much if they can't dynamically load the plugins, as long
> as their sysadmin can provide them with a linked-in version - the sysadmin
> might actually be happier about the greater control, as plugins are
> potential virus/trojan vectors.
>
> It makes sense to write a plugin in such a way that it can alternatively
> be used as a GCC patch - in the sense that you add the code alongside
> the main gcc code and it gets linked in.
>
> Maybe we should create an interface for linked-in plugin code, i.e.
> arrays of plugin names and their plugin_init functions so that they can
> be activated with a matching -fplugin=built-in:NAME (or propose a different
> syntax if you think of a better one) and get passed any plugin options
> like a dynamically loaded one would.

A generic linked-in plugin ability would definitely solve my plugin-on-Windows
problem.  From what I've been reading on this list, it looks like I'm going to
have to do some similar hack to gcc to get my plugin working on Windows, at
least in the short term.

Is there so little demand for dynamically loaded plugins on the Windows
platform that a generic linked-in plugin framework could be how plugins are
supported on Windows?  Just drop the plugin source and a config of some sort
in a directory of the main gcc build and it's pulled in auto-magically?

Kyle
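To make the proposal above concrete, a hedged sketch of what such a table
might look like.  The builtin_plugin type, the builtin_plugins table, the
"my_plugin" name and my_plugin_init are all hypothetical; only the shape of
plugin_init follows the existing plugin API in gcc-plugin.h:

  /* Hedged sketch of a "built-in plugin" table for -fplugin=built-in:NAME. */
  #include "gcc-plugin.h"

  /* Hypothetical init function of a plugin compiled into cc1 itself.  */
  extern int my_plugin_init (struct plugin_name_args *plugin_info,
                             struct plugin_gcc_version *version);

  struct builtin_plugin
  {
    const char *name;                 /* name given after "built-in:" */
    int (*init) (struct plugin_name_args *, struct plugin_gcc_version *);
  };

  static const struct builtin_plugin builtin_plugins[] =
  {
    { "my_plugin", my_plugin_init },
    { NULL, NULL }                    /* end of table */
  };

  /* -fplugin=built-in:my_plugin would then be resolved by scanning this
     table instead of calling dlopen/LoadLibrary, with the plugin options
     passed to init() exactly as for a dynamically loaded plugin.  */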
Re: Power/PowerPC RIOS/RIOS2 obsolescence
[I proposed removing RIOS support, since it heavily gets in the way for my
project exposing the XER[CA] flag].

> My argument is simply this, sorry if it wasn't clear in the last email,
> bottom line up front:
>
> - It can just as easily be removed in the future if it is broken for more
>   than one release rather than evicting support.

I can guarantee you that the changes I am trying to make _will_ break RIOS
support.  RIOS handles the carry flag quite differently from PowerPC and
Power Architecture, and I have no way to test RIOS either.  Since there is no
significant community using these chips anymore, removing RIOS support seemed
like the best course of action.

> - It shouldn't add unwieldy maintenance overhead.

It already HAS been unwieldy maintenance overhead for years.

> The old stuff can be walled off, conditionally built, and otherwise removed
> from the main focus.

You obviously haven't looked at the code in detail, if you think this.

> - The code is already written and just needs a maintainer.
> - I have the hardware and desire to maintain it.

Feel free to split off a new backend then.  The current intertwined mess
cannot be maintained properly.

Segher
Re: Crucial C++ inlining broken under -Os
> Quoting Richard Guenther :
>
>> That is, we no longer optimistically assume that comdat functions
>> can be eliminated if there are no callers in the local TU in 4.5
>> (but we did in previous releases).
>
> But if the function is very simple, the only reason to keep it would be
> if its address was taken somewhere, or if we tailcall it.

Since there seems to be a bit of confusion, perhaps it would make sense to
summarize how the whole process works.  The inliner estimates the whole
program size change from inlining all invocations of each function
(overall_growth) and inlines all functions for which this results in an
expected shrinking.  The process is as follows.

In the inline_param3 dump we get time and size estimates for every statement:

  Analyzing function body size: Container::Container()
  freq: 1000 size: 3 time: 12   D.2145_1 = operator new (4);
  freq: 1000 size: 1 time: 1    MEM[(struct Container *)this_3(D)].member = D.2145_1;
                                  Likely eliminated
  freq: 1000 size: 0 time: 0    return;
                                  Likely eliminated
  Overall function body time: 13-1 size: 4-1

So in our simplified vision, the Container() function will occupy 4 units of
size and execute for 13 units of time (not directly related to real bytes or
cycles, since our IL is too high-level at this point).  Some statements are
assumed to go away after inlining.  This is the case for the memory store,
which we expect will somehow get combined after inlining.  This is just a
guess that attempts to convince the inliner to get rid of more C++ abstraction
penalty and allow more scalar replacement.  So we believe that by inlining the
function we save the store to the .member field.

Next the function call overhead is accounted for (since inlining removes one
call) and we get:

  With function call overhead time: 13-12 size: 4-3

So the inliner thinks that by inlining we save 12 units of execution time and
increase code size by 1 unit (4-3).  The overall time (13) is not really used.
The one extra size unit is for passing the value 4 to the new() call.

When inlining for size (which happens for all calls considered hot, i.e. all
calls at -Os), the heuristic actually computes the estimated program size
change and inlines a function when inlining it into a specific caller reduces
code size.  This never happens for Container() because the code of the caller
grows (it needs to pass the extra value of 4).

Next we check whether inlining into all callers would reduce program size by
eliminating the offline copy.  This would trigger for Container() if it were
static, because it is called just once and the 1-unit growth in the caller is
smaller than the overall size of Container().  Because Container() is COMDAT,
we don't do that, so we never inline it.  This is seen later in the .inline
dump:

  Considering Container::Container() with 4 size
    to be inlined into int gimme() in t.C:26
    Estimated growth after inlined into all callees is +1 insns.
    Estimated badness is 2, frequency 1.00.
    inline_failed: call is unlikely and code size would grow.

The behaviour change is about COMDAT functions that are larger than the call
overhead but either called just once or small enough that the code growth
caused by inlining is smaller than the function body size itself.  In these
cases we made the assumption that the overall program size would shrink and
inlined in previous GCC releases.  This assumption is not correct (it is
correct for static functions and also for the size of the .o file, but not for
the whole binary), and the problem can be demonstrated by making a very large
comdat function that is used once in very many units.  Thus I've changed the
behaviour in GCC 4.5 since it is safer.
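(For readers without the original testcase: a guessed reconstruction of the
kind of source the dumps above describe.  The reporter's t.C is not shown in
this thread, so the member type and the body of gimme() are invented to match
"operator new (4)" and the single call site.)

  struct Container
  {
    int *member;
    Container () { member = new int; }  // the "operator new (4)" + store to .member
  };

  int gimme ()
  {
    Container c;        // the single call to Container::Container() in this TU
    *c.member = 0;
    return *c.member;
  }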
So to get around this one needs either -fwhole-program, or the always_inline
attribute; or, if the actual size of the .o file shrinks after inlining
because of other optimizations, we can see if we can extend the heuristics to
forecast this and account for it in inlining decisions.  The last alternative
is what I would be happy to look into, but in this testcase we don't get any
simplification, so the local behaviour of the inliner is correct.

I guess we might experiment with allowing some very limited code size growth
for inlining COMDAT functions if this turns out to be a real problem.  Also we
might add some bias into the logic accounting for removal of the offline copy:
obviously the offline copy is a little bit bigger than the instructions
themselves, having prologue/epilogue and alignment.  This would help static
functions, but accounting for this realistically is tricky because the costs
are architecture dependent.  I might also make a patch for you to revert this
behaviour.

However it would be interesting to have the -finline-limit testcase.  It is a
bit surprising this changes behaviour for you: -finline-limit is now an
obsolete way of controlling the inline-insns-single and inline-insns-auto
parameters.  Setting it to 50 has the effect of reducing them to 25 (from 400
and 50 respectively).  Those parameters limit the size of functions to be
considered as inlining candidates.  At -Os you should always get inlining
throttled
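One of the workarounds mentioned above, sketched on the guessed testcase from
earlier in this mail (the attribute is standard GCC; the class itself is still
the reconstructed example, not the reporter's real code):

  struct Container
  {
    int *member;
    // force inlining of the COMDAT constructor regardless of the -Os
    // size heuristics
    __attribute__ ((always_inline)) Container () { member = new int; }
  };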
Re: Crucial C++ inlining broken under -Os
Quoting Jan Hubicka :

> The behaviour change is about COMDAT functions that are larger than call
> overhead but either called just once or small enough so that the code
> growth caused by inlining is smaller than the function body size itself.
> In these cases we made the assumption that overall program size will shrink
> and inlined in previous GCC releases.

I.e., we assumed that the number of TUs this function was used in was 1.

> This assumption is not correct (it is correct for static functions and also
> for the size of the .o file, but not for the whole binary) and the problem
> can be demonstrated by making a very large comdat function that is used
> once in very many units.  Thus I've changed the behaviour in GCC 4.5 since
> it is safer.

It seems to be equivalent to assuming the number of TUs this function is used
in is infinity.

How about a parameter for the number of TUs we guess the function will be
needed in?
Re: Crucial C++ inlining broken under -Os
> Quoting Jan Hubicka :
>
>> The behaviour change is about COMDAT functions that are larger than call
>> overhead but either called just once or small enough so that the code
>> growth caused by inlining is smaller than the function body size itself.
>> In these cases we made the assumption that overall program size will
>> shrink and inlined in previous GCC releases.
>
> I.e., we assumed that the number of TUs this function was used in was 1.

Yes, we assumed that all COMDAT functions were static, thus somewhat defeating
the purpose of COMDAT in the first place.

>> This assumption is not correct (it is correct for static functions and
>> also for the size of the .o file, but not for the whole binary) and the
>> problem can be demonstrated by making a very large comdat function that is
>> used once in very many units.  Thus I've changed the behaviour in GCC 4.5
>> since it is safer.
>
> It seems to be equivalent to assuming the number of TUs this function is
> used in is infinity.
>
> How about a parameter for the number of TUs we guess the function will be
> needed in?

You are interested in the number of call sites of a specific function; having
some generic measurement like "the usual comdat function is called 300 times
in a Mozilla build" won't help you, since setting this parameter to 2 or more
would disable most of this inlining again.  There are cases where a function
called twice is still inlined, since the removal of the offline copy plus the
expected benefit compensates for the caller size growth, but it is rather
rare.

Honza
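To put rough numbers on that last point, reusing the figures from Honza's
earlier mail in this thread (Container(): size 4, call overhead 3, so each
inlined call grows its caller by 4 - 3 = 1 unit, while eliminating the offline
copy saves 4 units); the call-site counts below are made up for illustration:

  2 call sites:  2 * 1 - 4 = -2   (expected net shrink, inlining can still win)
  5 call sites:  5 * 1 - 4 = +1   (expected net growth, the inliner gives up)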
gengtype & many GTY tags for same union component?
Hello All,

My understanding of the description of the tag GTY option in
http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html#GTY-Options is that a
given discriminated union case can have several tags.

So in MELT I wrote the following code (file gcc/melt-runtime.h near line 1108
of rev 161710).  [The // C++-like comments are comments in this mail, not in
the MELT sources.]

/*** our union for everything ***/
/* never use an array of melt_un, only array of pointers melt_ptr_t */
typedef union GTY ((desc ("%0.u_discr->object_magic"))) melt_un
{
  meltobject_ptr_t GTY ((skip)) u_discr;
  struct meltforward_st GTY ((skip)) u_forward;
  struct meltobject_st GTY ((tag ("OBMAG_OBJECT"))) u_object;
  // some cases skipped
  struct meltpair_st GTY ((tag ("OBMAG_PAIR"))) u_pair;
  struct meltspecial_st GTY ((tag ("OBMAG_SPEC_FILE"),
                              tag ("OBMAG_SPEC_RAWFILE"),
                              tag ("OBMAG_SPEC_MPFR"),
                              tag ("OBMAG_SPECPPL_COEFFICIENT"),
                              tag ("OBMAG_SPECPPL_LINEAR_EXPRESSION"),
                              tag ("OBMAG_SPECPPL_CONSTRAINT"),
                              tag ("OBMAG_SPECPPL_CONSTRAINT_SYSTEM"),
                              tag ("OBMAG_SPECPPL_GENERATOR"),
                              tag ("OBMAG_SPECPPL_GENERATOR_SYSTEM"),
                              tag ("OBMAG_SPECPPL_POLYHEDRON")))
    u_special;
  struct meltstring_st GTY ((tag ("OBMAG_STRING"))) u_string;
  // some other cases skipped
} melt_un_t;

Notice that I used several tags for the u_special case.  I thought this was
permissible, and I was expecting the gengtype-generated marking routine to
mark the u_special union component for all the OBMAG_SPEC* cases listed above.
However, this is not the case.  The generated marking routine (in
gcc/gtype-desc.c in the build tree) contains:

void
gt_ggc_mx_melt_un (void *x_p)
{
  union melt_un * const x = (union melt_un *)x_p;
  if (ggc_test_and_set_mark (x))
    {
      switch ((*x).u_discr->object_magic)
        {
        case OBMAG_OBJECT:
          gt_ggc_m_13meltobject_st ((*x).u_object.obj_class);
          {
            size_t i0;
            size_t l0 = (size_t)(((*x).u_object).obj_len);
            for (i0 = 0; i0 != l0; i0++)
              {
                gt_ggc_m_7melt_un ((*x).u_object.obj_vartab[i0]);
              }
          }
          break;
        // some cases skipped
        case OBMAG_PAIR:
          gt_ggc_m_13meltobject_st ((*x).u_pair.discr);
          gt_ggc_m_7melt_un ((*x).u_pair.hd);
          gt_ggc_m_11meltpair_st ((*x).u_pair.tl);
          break;
        case OBMAG_SPEC_FILE:
          gt_ggc_m_13meltobject_st ((*x).u_special.discr);
          break;
        // cases OBMAG_SPEC_RAWFILE and the others are missing!
        case OBMAG_STRING:
          gt_ggc_m_13meltobject_st ((*x).u_string.discr);
          break;
        // some other cases skipped
        default:
          break;
        }
    }
}

Notice that all the OBMAG_SPEC_RAWFILE, OBMAG_SPEC_MPFR, etc. cases are absent
from the generated marking routine gt_ggc_mx_melt_un above.  I was painfully
surprised to find this (this is the hardest bug in MELT I've had so far).  So
I rewrote the union with one member per tag:

/* never use an array of melt_un, only array of pointers melt_ptr_t */
typedef union GTY ((desc ("%0.u_discr->object_magic"))) melt_un
{
  meltobject_ptr_t GTY ((skip)) u_discr;
  struct meltforward_st GTY ((skip)) u_forward;
  struct meltobject_st GTY ((tag ("OBMAG_OBJECT"))) u_object;
  // some cases skipped
  struct meltbox_st GTY ((tag ("OBMAG_BOX"))) u_box;
  struct meltpair_st GTY ((tag ("OBMAG_PAIR"))) u_pair;
  /* The struct meltspecial_st shares several GTY tag-s, but gengtype needs
     to have one case per tag!  */
  struct meltspecial_st GTY ((tag ("OBMAG_SPEC_FILE"))) u_special_file;
  struct meltspecial_st GTY ((tag ("OBMAG_SPEC_RAWFILE"))) u_special_rawfile;
  struct meltspecial_st GTY ((tag ("OBMAG_SPEC_MPFR"))) u_special_mpfr;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_COEFFICIENT")))
    u_special_ppl_coefficient;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_LINEAR_EXPRESSION")))
    u_special_ppl_linear_expression;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_CONSTRAINT")))
    u_special_ppl_constraint;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_CONSTRAINT_SYSTEM")))
    u_special_ppl_constraint_system;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_GENERATOR")))
    u_special_ppl_generator;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_GENERATOR_SYSTEM")))
    u_special_ppl_generator_system;
  struct meltspecial_st GTY ((tag ("OBMAG_SPECPPL_POLYHEDRON")))
    u_special_ppl_polyhedron;
  /* for simplicity and compatibility with previous code, we can just
     write u_special */
  struct meltspecial_st GTY ((skip)) u_special;
  struct meltstring_st GTY ((tag ("OBMAG_STRING"))) u_string;
  // some other cases skipped
} melt_un_t;

By making this correction, the gengtype-generated marking routine does indeed
have all the cases I was expecting.

Do you think it is only my misunderstanding