[RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-09-24 Thread David Wohlferd
Hans-Peter Nilsson: I should have listened to you back when you raised 
concerns about this.  My apologies for ever doubting you.


In summary:

- The "trick" in the docs for using an arbitrarily sized struct to force 
register flushes for inline asm does not work.
- Placing the inline asm in a separate routine can sometimes mask the 
problem with the trick not working.
- The sample that has been in the docs forever performs an unhelpful, 
unexpected, and probably unwanted stack allocation + memcpy.


Details:

Here is the text from the docs:

---
One trick to avoid [using the "memory" clobber] is available if the size 
of the memory being accessed is known at compile time. For example, if 
accessing ten bytes of a string, use a memory input like:


"m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )
---

When I did the re-write of gcc's inline asm docs, I left the description 
for this (essentially) untouched.  I just took it on faith that "magic 
happens" and the right code gets generated.  But reading a recent post 
raised questions for me, so I tried it.  And what I found was that not 
only does this not work, it actually just makes a mess.


I started with some code that I knew required some memory clobbering:

#include <stdio.h>

int main(int argc, char* argv[])
{
  struct
  {
int a;
int b;
  } c;

  c.a = 1;
  c.b = 2;

  int Count = sizeof(c);
  void *Dest;

  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count)
   : "0" (&c), "a" (0)
   //: "memory"
  );

  printf("%u %u\n", c.a, c.b);
}

As written, this x64 code (compiled with -O2) will print out "1 2", even 
though someone might (incorrectly) expect the asm to overwrite the 
struct with zeros.  Adding the memory clobber allows this code to work 
as expected (printing "0 0").


Now that I have code I can use to see if registers are getting flushed, 
I removed the memory clobber, and tried just 'clobbering' the struct:


#include <stdio.h>

int main(int argc, char* argv[])
{
  struct
  {
int a;
int b;
  } c;

  c.a = 1;
  c.b = 2;

  int Count = sizeof(c);
  void *Dest;

  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count)
   : "0" (&c), "a" (0),
   "m" ( ({ struct foo { char x[8]; } *p = (struct foo *)&c ; 
*p; }) )

  );

  printf("%u %u\n", c.a, c.b);
}

I'm using a named struct (foo) to avoid some compiler messages, but 
other than that, I believe this is the same as what's in the docs. And 
it doesn't work.  I still get "1 2".


At this point I realized that code I've seen using this trick usually 
has the asm in its own routine.  When I try this, it still fails.  
Unless I start cranking up the size of x from 8 to ~250.  At ~250, 
suddenly it starts working.  Apparently this is because at this point, 
gcc decides not to inline the routine anymore, and flushes the registers 
before calling the non-inline code.
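
A rough sketch of that experiment (the wrapper name and signature are mine,
not from the original test):

struct S { int a; int b; };

/* gcc happily inlines this wrapper at -O2, so the registers caching
   c->a / c->b at the call site are still not flushed.  Only once the
   char array grows to roughly 250 bytes does the function stay out of
   line, and the flush then happens as a side effect of the call - which
   masks the problem rather than fixing it.  */
static void
zero_struct (struct S *c)
{
  int Count = sizeof (*c);
  void *Dest;

  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count)
   : "0" (c), "a" (0),
     "m" ( ({ struct foo { char x[8]; } *p = (struct foo *)c ; *p; }) )
  );
}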


And why does changing the size of the structure we are pointing to 
result in increases in the size of the routine?  Reading the -S output, 
the "*p" at the end of this constraint generates a call to memcpy the 
250 characters onto the stack, which it passes to the asm as %4, which 
is never used.  Argh!
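
In other words, the trailing "*p" is evaluated as an rvalue, so what the
compiler effectively does is roughly this (a sketch of the effect on the
example above, not the actual emitted code):

  struct foo { char x[250]; };

  struct foo tmp;                    /* the surprise stack allocation */
  memcpy (&tmp, &c, sizeof tmp);     /* the memcpy visible in the -S output */
  /* tmp is then passed to the asm as %4, which the template never uses,
     and nothing forces the cached values of c back to memory.  */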


Conclusion:

What I expected when using that sample code from the docs was that any 
registers that contain values from the struct would get flushed to 
memory.  This was intended to be a 'cheaper' alternative to doing a 
full-on "memory" clobber.  What I got instead was an unexpected/unneeded 
stack allocation and memcpy, and STILL didn't get the values flushed.  
Yeah, not exactly the 'cheaper' I was hoping for.


Is the example in the docs just written incorrectly?  Did this get 
broken somewhere along the line?  Or am I just using it wrong?


I'm using gcc version 4.9.0 (x86_64-win32-seh-rev2, Built by MinGW-W64 
project).  Remember to compile these x64 samples with -O2.


dw


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Richard Biener
On Wed, Sep 24, 2014 at 7:46 AM, Jan Hubicka  wrote:
>
> Hi,
> This patch is something I was playing around with, with assistance from Ian Taylor.
> It seems I need a bit more help though :)
>
> It adds support for direct output of slim LTO files by the compiler binary.
> It works as a proof of concept, but there are two key parts missing:
>  1) extension of libiberty's simple-object to handle output of symbols into COMMON.
> This is needed to output __gnu_lto_v1 and __gnu_lto_slim.
> Search for TODO in the patch below.
>  2) Support in the driver to properly execute the *1 binary.
>
> I also disabled outputting the ident directive, but I think that one may not be
> necessary because the files are identified by the gnu_lto_v1 symbols. We could
> add it later.
>
> Currently the path bypassing asm stage can be tested as follows:
>
> jan@linux-ujxe:~/trunk/build/gcc> cat a.c
> main ()
> {
>   printf ("Hello world\n");
> }
> jan@linux-ujxe:~/trunk/build/gcc> ./xgcc -B ./ -O3 a.c -flto -S 
> -fbypass-asm=crtbegin.o  -o a.o
> jan@linux-ujxe:~/trunk/build/gcc> ./xgcc -B ./ -O2 a.o -flto
> jan@linux-ujxe:~/trunk/build/gcc> ./a.out
> Hello world
>
> The implementation is pretty straightforward except for -fbypass-asm requiring
> one existing OBJ file to fetch the target's file attributes from.  This is
> definitely not optimal, but libiberty currently can't build output files from
> scratch. As Ian suggested, I plan to simply arrange for the driver to pass
> crtbegin around, at least to start with. We may want to bypass this later and
> store proper attributes in the binary.
>
> Ian, would you be so kind and implement ability to output those two symbols
> into lto-object-simple?  I think we can start with ELF only support.
>
> The large chunk just moves lto-object around with very small changes in it, 
> so the
> patch is fairly easy.
>
> I did just a quick benchmark with an unoptimized cc1 binary compiling the file
> above.
> For 1000 invocations with bypass I get:
>
> real    0m14.186s
> user    0m10.957s
> sys     0m2.424s
>
> While the default path gets:
>
> real    0m21.913s
> user    0m13.856s
> sys     0m5.705s
>
> With the OpenSUSE 13.1 default GCC 4.8.3 build:
>
> real    0m15.160s
> user    0m8.481s
> sys     0m5.159s
>
> (the difference here is most likely the optimizer WRT the unoptimized binary;
> perf shows contains_struct_check quite near the top, so startup overhead still
> dominates)
>
> And with clang-3.4:
>
> real    0m30.097s
> user    0m22.012s
> sys     0m6.649s
>
> That is a fairly nice speedup IMO.  With an optimized build the difference should
> be more visible because cc1 startup issues will become less important.
> I definitely see asm file overhead as a measurable issue with real-world
> benchmarks (libreoffice build). Clearly we produce several GBs of object files
> going through a crappy and bloated text encoding just for the sake of doing it.

Shouldn't -fbypass-asm be simply "mangled" by the driver?  That is,
the user simply specifies -fbypass-asm and via spec magic the driver
substitutes this with -fbypass-asm=crtbegin.o?  That way at least
the user interface should be stable (as we're supposedly removing
the requirement for that existing object file at some point).

Btw, with early debug info we also need to store dwarf somewhere.
Either we drop the support for fat LTO objects and thus can store
the dwarf alongside the GIMPLE IL and simply link with these
files at the end or we need to support a separate set of files to
store the DWARF.  If we need separate files then why not store
the GIMPLE IL data into separate objects in the first place and
output a reference to it into the main object file?  That way we
don't need any special "attributes" - the linker plugin simply
opens the main object file, extracts the reference to the IL file
and passes that along.

Btw, the patch is very hard to read as it moves (and modifies?) files
at the same time.  What's this magic "file attributes" we need?

Thanks,
Richard.

> Honza
>
> Index: Makefile.in
> ===
> --- Makefile.in (revision 215518)
> +++ Makefile.in (working copy)
> @@ -1300,6 +1300,7 @@
> lto-section-out.o \
> lto-opts.o \
> lto-compress.o \
> +   lto-object.o \
> mcf.o \
> mode-switching.o \
> modulo-sched.o \
> Index: common.opt
> ===
> --- common.opt  (revision 215518)
> +++ common.opt  (working copy)
> @@ -923,6 +923,9 @@
>  Common Report Var(flag_btr_bb_exclusive) Optimization
>  Restrict target load migration not to re-use registers in any basic block
>
> +fbypass-asm=
> +Common Joined Var(flag_bypass_asm)
> +
>  fcall-saved-
>  Common Joined RejectNegative Var(common_deferred_options) Defer
>  -fcall-saved-<register>	Mark <register> as being preserved across
> functions
> Index: langhooks.c
> =

Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-09-24 Thread Richard Biener
On Wed, Sep 24, 2014 at 9:43 AM, David Wohlferd  wrote:
> Hans-Peter Nilsson: I should have listened to you back when you raised
> concerns about this.  My apologies for ever doubting you.
>
> In summary:
>
> - The "trick" in the docs for using an arbitrarily sized struct to force
> register flushes for inline asm does not work.
> - Placing the inline asm in a separate routine can sometimes mask the
> problem with the trick not working.
> - The sample that has been in the docs forever performs an unhelpful,
> unexpected, and probably unwanted stack allocation + memcpy.
>
> Details:
>
> Here is the text from the docs:
>
> ---
> One trick to avoid [using the "memory" clobber] is available if the size of
> the memory being accessed is known at compile time. For example, if
> accessing ten bytes of a string, use a memory input like:
>
> "m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )

Well - this can't work because you are essentially using a _value_
here (looking at the GIMPLE - I'm not sure a statement expression
even evaluates to an lvalue).

It should work if you simply do this without a stmt expression:

  "m" (*(struct { char x[10]; } *)ptr)

because that's clearly an lvalue (and the GIMPLE correctly says so):

  :
  c.a = 1;
  c.b = 2;
  __asm__ __volatile__("rep; stosb" : "=D" Dest_4, "=c" Count_5 : "0"
&c, "a" 0, "m" MEM[(struct foo *)&c], "1" 8);
  printf ("%u %u\n", 1, 2);

note that we still constant propagated 1 and 2 for the reason that
the asm didn't get any VDEF.  That's because you do not have any
memory output!  So while it keeps 'c' live it doesn't consider it
modified by the asm.  You'd still need to clobber the memory,
but "m" clobbers are not supported, only "memory".

Thus fixed asm:


  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count)
   : "0" (&c), "a" (0),
   "m" (*( struct foo { char x[8]; } *)&c)
   : "memory"
  );

where I'm not 100% sure if the "m" input is now pointless (that is,
if a "memory" clobber also constitutes a use of all memory).

Richard.

> ---
>
> When I did the re-write of gcc's inline asm docs, I left the description for
> this (essentially) untouched.  I just took it on faith that "magic happens"
> and the right code gets generated.  But reading a recent post raised
> questions for me, so I tried it.  And what I found was that not only does
> this not work, it actually just makes a mess.
>
> I started with some code that I knew required some memory clobbering:
>
> #include <stdio.h>
>
> int main(int argc, char* argv[])
> {
>   struct
>   {
> int a;
> int b;
>   } c;
>
>   c.a = 1;
>   c.b = 2;
>
>   int Count = sizeof(c);
>   void *Dest;
>
>   __asm__ __volatile__ ("rep; stosb"
>: "=D" (Dest), "+c" (Count)
>: "0" (&c), "a" (0)
>//: "memory"
>   );
>
>   printf("%u %u\n", c.a, c.b);
> }
>
> As written, this x64 code (compiled with -O2) will print out "1 2", even
> though someone might (incorrectly) expect the asm to overwrite the struct
> with zeros.  Adding the memory clobber allows this code to work as expected
> (printing "0 0").
>
> Now that I have code I can use to see if registers are getting flushed, I
> removed the memory clobber, and tried just 'clobbering' the struct:
>
> #include <stdio.h>
>
> int main(int argc, char* argv[])
> {
>   struct
>   {
> int a;
> int b;
>   } c;
>
>   c.a = 1;
>   c.b = 2;
>
>   int Count = sizeof(c);
>   void *Dest;
>
>   __asm__ __volatile__ ("rep; stosb"
>: "=D" (Dest), "+c" (Count)
>: "0" (&c), "a" (0),
>"m" ( ({ struct foo { char x[8]; } *p = (struct foo *)&c ; *p; })
> )
>   );
>
>   printf("%u %u\n", c.a, c.b);
> }
>
> I'm using a named struct (foo) to avoid some compiler messages, but other
> than that, I believe this is the same as what's in the docs. And it doesn't
> work.  I still get "1 2".
>
> At this point I realized that code I've seen using this trick usually has
> the asm in its own routine.  When I try this, it still fails.  Unless I
> start cranking up the size of x from 8 to ~250.  At ~250, suddenly it starts
> working.  Apparently this is because at this point, gcc decides not to
> inline the routine anymore, and flushes the registers before calling the
> non-inline code.
>
> And why does changing the size of the structure we are pointing to result in
> increases in the size of the routine?  Reading the -S output, the "*p" at
> the end of this constraint generates a call to memcpy the 250 characters
> onto the stack, which it passes to the asm as %4, which is never used.
> Argh!
>
> Conclusion:
>
> What I expected when using that sample code from the docs was that any
> registers that contain values from the struct would get flushed to memory.
> This was intended to be a 'cheaper' alternative to doing a full-on "memory"
> clobbe

Re: [GSoC] Status - 20140901 FINAL

2014-09-24 Thread Roman Gareev
> Hi Community!
>
> Google Summer of Code 2014 has come to an end.  We've got some very good 
> results this year -- with code from 4 out of 5 projects checked in to either 
> GCC trunk or topic branch.  Congratulations to students and mentors for their 
> great work!
>
> Even more impressive is the fact that [according to student self-evaluations] 
> most of the students intend to continue GCC development outside of the 
> program.
>
> I encourage both mentors and students to echo their feedback about GCC's GSoC 
> in this thread.  The evaluations you posted on the GSoC website is visible to 
> only a few people, and there are good comments and thoughts there that 
> deserve a wider audience.

I participated in GSoC 2014, working on «Integration of ISL code
generator into Graphite». I would like to thank Tobias Grosser,
Richard Biener, Maxim Kuvyrkov and the gcc community for your advice,
ideas, comments, reviews and the opportunity to work on this project!

You can find the description of the project at the following link:
https://gcc.gnu.org/wiki/ISLCodeGenerator.

I’m very happy to announce that all the required deliverables from my
proposal were implemented: «Graphite must become fully independent of
the CLooG library», «GCC should be able to bootstrap», «Pass regression
tests», «Add new tests to the testsuite».

It should also be noted that the number of code lines decreased by
41.42 per cent compared with the previous version of the generator. The
amount of comments accounts for 20.6349 per cent of the text of the
generator (by comparison, this number is 15.8073 per cent for the
previous version).

I’m planning to continue working on this project. We have the
following goals:
1) Finish the computation of types (once its integration into ISL is
   complete).
2) Profile the generator.
3) Make execution-time and compile-time performance comparisons between
   CLooG and ISL code generation in Graphite.
4) Use the full/partial tile separation for the tiles generated by the
   isl scheduler.

--
Cheers, Roman Gareev.


Re: Denormals and underflow control (gradual vs. aburpt) in soft-fp library

2014-09-24 Thread Uros Bizjak
On Tue, Sep 23, 2014 at 7:13 PM, Joseph S. Myers
 wrote:

>> Joseph, is there any support for underflow control in soft-fp library?
>> From a private correspondence with FX about implementing gfortran IEEE
>> support for extended modes, soft-fp that implements 128bit support on
>> x86 could read this setting from FPU control registers and handle
>> denormals accordingly.
>
> My current series of soft-fp patches pending review on libc-alpha includes
> one for control of handling of *input* subnormals
> , because that
> feature is in the kernel version (and David Miller volunteered to help get
> the Linux kernel using the current version of soft-fp, given the features
> from the kernel version added to it
> ).
>
> As in the kernel version, that implementation sets the denormal operand
> exception in this case, although it appears SSE does not do that (so
> further conditioning would need to be added for this to be an accurate x86
> emulation; different architectures do different things in this case).
>
> Neither version has anything to control underflow on *output*.  A
> flush-to-zero mode as on x86 appears to be different in directed rounding
> modes from abruptUnderflow as defined in IEEE 754-2008 (flush-to-zero
> always produces zero, abruptUnderflow can produce the smallest normal
> result depending on the rounding mode).  Such flush-to-zero or
> abruptUnderflow control could of course be implemented; there might be a
> need for further conditions on whether the underflow and inexact exception
> flags are set on flushing to zero (again, there are architecture
> variations).
>
> (Note that such underflow control on output, at least in the architecture
> manuals I checked, means underflow when the architecture-specific tininess
> condition is met - not when the rounding result ends up subnormal - so
> it's not possible to implement it with a late check of the rounded result;
> soft-fp would need to check at the point where it decides whether the
> result is tiny.)
>
> I don't know whether you'd want flush-to-zero emulating architecture
> semantics, abruptUnderflow, or both.

Looking at the standard at [1], the mode is flush-to-zero on output,
which fits SSE as well.

> (When used on x86 for operations involving binary128, there's also the
> point that only SSE not x87 has such modes, so you'd need to consider when
> it's correct to apply them to binary128 operations - is it appropriate for
> conversions between TFmode and XFmode or not?  Of course applying to all
> operations is easiest and avoids needing the soft-fp conditionals to
> depend on the operation / types involved, or having different
> sfp-machine.h definitions used in different source files.)

[1] http://j3-fortran.org/doc/year/03/03-131r1.txt

Uros.


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Ian Lance Taylor
Richard Biener  writes:

> Btw, the patch is very hard to read as it moves (and modifies?) files
> at the same time.  What's this magic "file attributes" we need?

The file attributes issue is the ELF machine number, class, OSABI,
flags, and endianness.  When generating an ELF file it has to have this
information, and it has to match the objects generated by the assembler.
If it doesn't, the linker won't accept it and pass it to the plugin as
we require.  We could of course build a large table of those numbers and
keep it updated for each target.  But it's simpler to extract the
numbers from an existing object file that we know must be valid.
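
For the curious, those attributes all live in the ELF header, so extracting
them from a known-good object is only a few lines.  A rough standalone sketch
using <elf.h> (not the libiberty code):

#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Print the "file attributes" (class, endianness, OSABI, machine, flags)
   found in the ELF header of an existing object file.  Error handling is
   minimal; this only illustrates what gets copied over.  */
int
main (int argc, char **argv)
{
  unsigned char ehdr[sizeof (Elf64_Ehdr)];
  FILE *f;

  if (argc < 2 || (f = fopen (argv[1], "rb")) == NULL)
    return 1;
  if (fread (ehdr, 1, sizeof ehdr, f) != sizeof ehdr
      || memcmp (ehdr, ELFMAG, SELFMAG) != 0)
    {
      fclose (f);
      return 1;
    }

  printf ("class=%u data=%u osabi=%u\n",
          ehdr[EI_CLASS], ehdr[EI_DATA], ehdr[EI_OSABI]);
  if (ehdr[EI_CLASS] == ELFCLASS64)
    {
      Elf64_Ehdr h;
      memcpy (&h, ehdr, sizeof h);
      printf ("machine=%u flags=0x%x\n", (unsigned) h.e_machine,
              (unsigned) h.e_flags);
    }
  else
    {
      Elf32_Ehdr h;
      memcpy (&h, ehdr, sizeof h);
      printf ("machine=%u flags=0x%x\n", (unsigned) h.e_machine,
              (unsigned) h.e_flags);
    }
  fclose (f);
  return 0;
}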

Ian


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Richard Biener
On Wed, Sep 24, 2014 at 2:40 PM, Ian Lance Taylor  wrote:
> Richard Biener  writes:
>
>> Btw, the patch is very hard to read as it moves (and modifies?) files
>> at the same time.  What's this magic "file attributes" we need?
>
> The file attributes issue is the ELF machine number, class, OSABI,
> flags, and endianness.  When generating an ELF file it has to have this
> information, and it has to match the objects generated by the assembler.
> If it doesn't, the linker won't accept it and pass it to the plugin as
> we require.  We could of course build a large table of those numbers and
> keep it updated for each target.  But it's simpler to extract the
> numbers from an existing object file that we know must be valid.

I see.  Thanks for the explanation.

Richard.

> Ian


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
> 
> Shouldn't -fbypass-asm be simply "mangled" by the driver?  That is,
> the user simply specifies -fbypass-asm and via spec magic the driver
> substitutes this with -fbypass-asm=crtbegin.o?  That way at least
> the user interface should be stable (as we're supposedly removing
> the requirement for that existing object file at some point).

The idea is to make -fbypass-asm internal and never expose it to the user.
That is, default to it with slim LTO unless the user asks for assembler
output via -S.
> 
> Btw, with early debug info we also need to store dwarf somewhere.
> Either we drop the support for fat LTO objects and thus can store

I think fat LTO files are useful for LIPO, which hopefully will one day hit
mainline, and for other tricks, so I think we want to keep them.
Hopefully pickling the dwarf so the two of them can coexist won't be that
hard.

> the dwarf alongside the GIMPLE IL and simply link with these
> files at the end or we need to support a separate set of files to
> store the DWARF.  If we need separate files then why not store
> the GIMPLE IL data into separate objects in the first place and
> output a reference to it into the main object file?  That way we
> don't need any special "attributes" - the linker plugin simply
> opens the main object file, extracts the reference to the IL file
> and passes that along.

I do not much like the idea of separate files, as make clean will not
be happy.  Having everything in one file seems to make sense.
The attributes are needed to make the file acceptable to the linker/archiver.
> 
> Btw, the patch is very hard to read as it moves (and modifies?) files

Basically no modifications there (I believe I did try to set attributes there
and then reverted the change); I will send an explicit diff for that file.

> at the same time.  What's this magic "file attributes" we need?

What type of ELF you produce (32bit/64bit etc.)

Honza
> 
> Thanks,
> Richard.
> 
> > Honza
> >
> > Index: Makefile.in
> > ===
> > --- Makefile.in (revision 215518)
> > +++ Makefile.in (working copy)
> > @@ -1300,6 +1300,7 @@
> > lto-section-out.o \
> > lto-opts.o \
> > lto-compress.o \
> > +   lto-object.o \
> > mcf.o \
> > mode-switching.o \
> > modulo-sched.o \
> > Index: common.opt
> > ===
> > --- common.opt  (revision 215518)
> > +++ common.opt  (working copy)
> > @@ -923,6 +923,9 @@
> >  Common Report Var(flag_btr_bb_exclusive) Optimization
> >  Restrict target load migration not to re-use registers in any basic block
> >
> > +fbypass-asm=
> > +Common Joined Var(flag_bypass_asm)
> > +
> >  fcall-saved-
> >  Common Joined RejectNegative Var(common_deferred_options) Defer
> >  -fcall-saved-<register>	Mark <register> as being preserved across
> > functions
> > Index: langhooks.c
> > ===
> > --- langhooks.c (revision 215518)
> > +++ langhooks.c (working copy)
> > @@ -40,6 +40,10 @@
> >  #include "cgraph.h"
> >  #include "timevar.h"
> >  #include "output.h"
> > +#include "tree-ssa-alias.h"
> > +#include "gimple-expr.h"
> > +#include "gimple.h"
> > +#include "lto-streamer.h"
> >
> >  /* Do nothing; in many cases the default hook.  */
> >
> > @@ -653,6 +657,19 @@
> >  {
> >section *section;
> >
> > +  if (flag_bypass_asm)
> > +{
> > +  static int initialized = false;
> > +  if (!initialized)
> > +   {
> > + gcc_assert (asm_out_file == NULL);
> > +  lto_set_current_out_file (lto_obj_file_open (asm_file_name, 
> > true));
> > + initialized = true;
> > +   }
> > +  lto_obj_begin_section (name);
> > +  return;
> > +}
> > +
> >/* Save the old section so we can restore it in lto_end_asm_section.  */
> >gcc_assert (!saved_section);
> >saved_section = in_section;
> > @@ -669,8 +686,13 @@
> > implementation just calls assemble_string.  */
> >
> >  void
> > -lhd_append_data (const void *data, size_t len, void *)
> > +lhd_append_data (const void *data, size_t len, void *v)
> >  {
> > +  if (flag_bypass_asm)
> > +{
> > +  lto_obj_append_data (data, len, v);
> > +  return;
> > +}
> >if (data)
> >  assemble_string ((const char *)data, len);
> >  }
> > @@ -683,6 +705,11 @@
> >  void
> >  lhd_end_section (void)
> >  {
> > +  if (flag_bypass_asm)
> > +{
> > +  lto_obj_end_section ();
> > +  return;
> > +}
> >if (saved_section)
> >  {
> >switch_to_section (saved_section);
> > Index: lto/Make-lang.in
> > ===
> > --- lto/Make-lang.in(revision 215518)
> > +++ lto/Make-lang.in(working copy)
> > @@ -22,7 +22,7 @@
> >  # The name of the LTO compiler.
> >  LTO_EXE = lto1$(exeext)
> >  # The LTO-specific object files inclued in $(LTO_EXE).
> > -LTO_OBJS = lto/lto-lang.o lto/lto.o lto/lto-object.o attribs.o 
> > 

Re: [RFC] Dealing with ODR violations in GCC

2014-09-24 Thread Jonathan Wakely
On 12 September 2014 06:40, Jan Hubicka wrote:
> Hi,
> I went through the exercise of running LTO bootstrap with ODR verification on.
> There are some typename clashes
> I guess we want to fix.  I wonder what approach is preferred, do we want to 
> introduce anonymous
> namespaces for those?
> /usr/bin/ld.gold.real: warning: using 'GLIBCXX_3.4' as version for 
> '_ZNKSt15basic_stringbufIwSt11char_traitsIwESaIwEE3strEv' which is also named 
> in version 'GLIBCXX_3.4.6' in script
> /usr/bin/ld.gold.real: warning: using 'GLIBCXX_3.4' as version for 
> '_ZNKSs11_M_disjunctEPKc' which is also named in version 'GLIBCXX_3.4.5' in 
> script
> /usr/bin/ld.gold.real: warning: using 'GLIBCXX_3.4' as version for 
> '_ZNKSbIwSt11char_traitsIwESaIwEE11_M_disjunctEPKw' which is also named in 
> version 'GLIBCXX_3.4.5' in script

I think these are caused by two bugs in the libstdc++ linker script.
It looks like the duplicates should have been guarded by #ifndef
HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT (as per the attached
patch), but it's too late now and we export the symbols at GLIBCXX_3.4
unconditionally. Applying the patch now would alter the symbol version
for the wstringbuf::str() symbol.

Maybe we should just remove the duplicates in the later symbol version
sections, but I'm not confident enough to do it!
commit bd23198d87a6c04ea55410221f85ed79786b6478
Author: Jonathan Wakely 
Date:   Fri Sep 12 11:27:32 2014 +0100

* config/abi/pre/gnu.ver: Prevent symbols appearing in two versions,
to fix warning from gold linker.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
b/libstdc++-v3/config/abi/pre/gnu.ver
index 58c90d6..5f96a09 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -317,7 +317,10 @@ GLIBCXX_3.4 {
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9]seek*;
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9]set*;
 _ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv;
+
+#ifndef HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT
 _ZNKSt15basic_stringbufIwSt11char_traitsIwESaIwEE3strEv;
+#endif
 _ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strERKSs;
 _ZNSt15basic_stringbufIwSt11char_traitsIwESaIwEE3strERKSbIwS1_S2_E;
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9][t-z]*;
@@ -864,6 +867,7 @@ GLIBCXX_3.4.4 {
 
 GLIBCXX_3.4.5 {
 
+#ifndef HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT
 # std::string
 _ZNKSs11_M_disjunctEPKc;
 _ZNKSs15_M_check_lengthE[jmy][jmy]PKc;
@@ -888,6 +892,7 @@ GLIBCXX_3.4.5 {
 _ZNSt13basic_istreamIwSt11char_traitsIwEE6ignoreE[ilvx];
 
 _ZNSt11char_traitsI[cw]E2eqERK[cw]S2_;
+#endif
 
 # Those template instantiations weren't exported on Solaris in GCC 4.6
 # and aren't necessary for correct operation, so don't emit them now


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Andi Kleen
Jan Hubicka  writes:

Nice patch.

> The implementation is pretty straightforward except for -fbypass-asm requiring
> one existing OBJ file to fetch the target's file attributes from.  This is
> definitely not optimal, but libiberty currently can't build output files from
> scratch. As Ian suggested, I plan to simply arrange for the driver to pass
> crtbegin around, at least to start with. We may want to bypass this later and
> store proper attributes in the binary.

I wonder how hard it would be to fix simple-object to be able to create
from scratch. From a quick look it would be mostly adding the right
values into the header? That would need some defines per target.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Ian Lance Taylor
On Wed, Sep 24, 2014 at 7:47 AM, Andi Kleen  wrote:
>
> I wonder how hard it would be to fix simple-object to be able to create
> from scratch. From a quick look it would be mostly adding the right
> values into the header? That would need some defines per target.

It could be done, of course.  It would mean maintaining a new set of
tables and updating them for each target.  The specific table to use
would depend on the command line options.  It turns into yet another
data structure to update.
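
For illustration, such a table would have to carry roughly the following per
target (a hypothetical sketch using <elf.h> constants; not proposed code):

#include <elf.h>

/* One entry per supported target (and per ABI variant selected by
   command-line options) - this is the maintenance burden in question.  */
struct target_elf_attrs
{
  const char *triplet;
  unsigned char ei_class;      /* ELFCLASS32 / ELFCLASS64   */
  unsigned char ei_data;       /* ELFDATA2LSB / ELFDATA2MSB */
  unsigned char ei_osabi;      /* usually ELFOSABI_NONE     */
  unsigned short e_machine;    /* EM_*                      */
  unsigned int e_flags;        /* ABI flags, e.g. on ARM    */
};

static const struct target_elf_attrs elf_attrs_table[] =
{
  { "x86_64-linux-gnu", ELFCLASS64, ELFDATA2LSB, ELFOSABI_NONE, EM_X86_64, 0 },
  { "i686-linux-gnu",   ELFCLASS32, ELFDATA2LSB, ELFOSABI_NONE, EM_386,    0 },
  { "powerpc64-linux",  ELFCLASS64, ELFDATA2MSB, ELFOSABI_NONE, EM_PPC64,  0 },
  /* ...and so on for every target GCC supports.  */
};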

Ian


Re: Enable EBX for x86 in 32bits PIC code

2014-09-24 Thread Jeff Law

On 09/24/14 00:56, Ilya Enkovich wrote:

2014-09-23 20:10 GMT+04:00 Jeff Law :

On 09/23/14 10:03, Jakub Jelinek wrote:


On Tue, Sep 23, 2014 at 10:00:00AM -0600, Jeff Law wrote:


On 09/23/14 08:34, Jakub Jelinek wrote:


On Tue, Sep 23, 2014 at 05:54:37PM +0400, Ilya Enkovich wrote:


use fixed EBX at least until we make sure pseudo PIC doesn't harm debug
info generation.  If we have such option then gcc.target/i386/pic-1.c
and



For debug info, it seems you are already handling this in
delegitimize_address target hook, I'd suggest just building some very
large
shared library at -O2 -g -fpic on i?86 and either look at the
sizes of .debug_info/.debug_loc sections with/without the patch,
or use the locstat utility from elfutils (talk to Petr Machata if
needed).


Can't hurt, but I really don't see how changing from a fixed to an
allocatable register is going to muck up debug info in any significant
way.



What matters is if the delegitimize_address target hook is as efficient in
delegitimization as before.  E.g. if it previously matched only when
seeing
%ebx + gotoff or similar, and wouldn't match anything now, some vars could
have debug locations including UNSPEC and be dropped on the floor.


Ah, yea, that makes sense.

jeff



After register allocation we have no idea where GOT address is and
therefore delegitimize_address target hook becomes less efficient and
cannot remove UNSPECs. That's what I see now when build GCC with patch
applied:

In theory this shouldn't be too hard to fix.

I haven't looked at the code, but it might be something looking 
explicitly for ebx by register #, or something similar.  Which case 
within delegitimize_address isn't firing as it should after your changes?


jeff



Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
> On Wed, Sep 24, 2014 at 7:47 AM, Andi Kleen  wrote:
> >
> > I wonder how hard it would be to fix simple-object to be able to create
> > from scratch. From a quick look it would be mostly adding the right
> > values into the header? That would need some defines per target.
> 
> It could be done, of course.  It would mean maintaining a new set of
> tables and updating them for each target.  The specific table to use
> would depend on the command line options.  It turns into yet another
> data structure to update.

Yep, I think the crtstuff hack is pretty good for now (well, under the assumption
that I won't have too hard a time getting it working in the driver).  I think the
only real blocker is the lack of a simple-object API to create the two common
symbols we need to make the object files compliant. I really hope Ian
will help me on this, please ;)

Honza
> 
> Ian


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
> > On Wed, Sep 24, 2014 at 7:47 AM, Andi Kleen  wrote:
> > >
> > > I wonder how hard it would be to fix simple-object to be able to create
> > > from scratch. From a quick look it would be mostly adding the right
> > > values into the header? That would need some defines per target.
> > 
> > It could be done, of course.  It would mean maintaining a new set of
> > tables and updating them for each target.  The specific table to use
> > would depend on the command line options.  It turns into yet another
> > data structure to update.
> 
> Yep, i think the crtstuff hack is pretty good for now (well under assumption
> I won't have too hard time to get it working in the driver).  I think the only
> real blocker is the lack of simple-object API to create the two common
> symbols we need to make the object fiels compliant. I really hope Ian
> will help me on this, please;)

Just for some data, I did compile-time comparisons on libreoffice
http://hubicka.blogspot.ca/2014/09/linktime-optimization-in-gcc-part-3.html
and firefox
http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html

My general plan is to try to make LTO compile time faster than non-LTO and
possibly clang's on my setup (i.e. with WHOPR parallelism).  It is already
faster than clang's LTO. Also SPEC build times are now faster than the non-LTO
ones.

Libreoffice shows that GCC needs about twice as much system time. According
to profiles, a good part is the ugly way we pass stuff down to the assembler and
another part is memory use during the compilation stage.
I fixed most of the bottlenecks seen in GCC 4.9 - inefficiencies in hashing
for streaming, unnecessary initialization of the backend, the inliner and other
stuff.

Funnily enough, I benchmarked an LTO build with mainline and GCC 4.9 and the times
are almost exactly the same on both Firefox and libreoffice. There are some
slowdowns too - the speculative devirtualization issues I plan to fix today,
extra streaming needed, and slowdowns in the C++ FE/preprocessor...  I will
benchmark the last two a bit more carefully ;) But this also means that non-LTO
got slower in 5.0, so I am probably closer to reaching the goal.

Honza
> 
> Honza
> > 
> > Ian


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Steven Bosscher
On Wed, Sep 24, 2014 at 6:32 PM, Jan Hubicka  wrote:
> Libreoffice shows that GCC needs about twice as much of system time. According
> to profiles, good part is the ugly way we pass stuff down to assembler and
> other part is memory use during the compilation stage.

Are you using -pipe? AFAIR this still isn't the default, even on
GNU/Linux, but it is typically a lot faster than without.

Ciao!
Steven


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
> On Wed, Sep 24, 2014 at 6:32 PM, Jan Hubicka  wrote:
> > Libreoffice shows that GCC needs about twice as much of system time. 
> > According
> > to profiles, good part is the ugly way we pass stuff down to assembler and
> > other part is memory use during the compilation stage.
> 
> Are you using -pipe? AFAIR this still isn't the default, even on
> GNU/Linux, but it is typically a lot faster than without.

I use libreoffice's default flags. I will check what they do.
Given that -pipe has been around for many years and works well, what about
making it the default, to help justify the GCC 5 release?

honza
> 
> Ciao!
> Steven


insert global variable declaration with gcc plugin

2014-09-24 Thread Pedro Paredes
I would like to know if it's possible to insert a global variable declaration
with a gcc plugin. For example, if I have the following code:

---test.c

int main(void) {
  return 0;
}



and I want to transform it with a plugin into:



int fake_var;

int main(void) {
  return 0;
}



Is that possible? If it is, in which pass and how can I do it?
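
For what it's worth, the transformation amounts to building a file-scope
VAR_DECL and handing it to the varpool early in the compilation, e.g. from a
PLUGIN_START_UNIT callback.  A rough, untested sketch against the GCC 4.9-era
internals (header names and the varpool entry point differ between releases,
so treat every identifier here as an assumption to verify against your tree):

#include "gcc-plugin.h"
#include "tree.h"
#include "stringpool.h"
#include "cgraph.h"
#include "plugin-version.h"

int plugin_is_GPL_compatible;

/* Create "int fake_var = 0;" at file scope before the unit is compiled.  */
static void
create_fake_var (void *gcc_data, void *user_data)
{
  tree var = build_decl (UNKNOWN_LOCATION, VAR_DECL,
                         get_identifier ("fake_var"), integer_type_node);
  TREE_STATIC (var) = 1;                 /* has storage                */
  TREE_PUBLIC (var) = 1;                 /* externally visible         */
  DECL_INITIAL (var) = build_int_cst (integer_type_node, 0);
  varpool_add_new_variable (var);        /* let the middle end emit it */
}

int
plugin_init (struct plugin_name_args *plugin_info,
             struct plugin_gcc_version *version)
{
  if (!plugin_default_version_check (version, &gcc_version))
    return 1;
  register_callback (plugin_info->base_name, PLUGIN_START_UNIT,
                     create_fake_var, NULL);
  return 0;
}

Build it with g++ -shared -fPIC -fno-rtti against the headers from
`gcc -print-file-name=plugin`/include and load it with -fplugin=./plugin.so.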


Re: Enable EBX for x86 in 32bits PIC code

2014-09-24 Thread Ilya Enkovich
2014-09-24 19:27 GMT+04:00 Jeff Law :
> On 09/24/14 00:56, Ilya Enkovich wrote:
>>
>> 2014-09-23 20:10 GMT+04:00 Jeff Law :
>>>
>>> On 09/23/14 10:03, Jakub Jelinek wrote:


 On Tue, Sep 23, 2014 at 10:00:00AM -0600, Jeff Law wrote:
>
>
> On 09/23/14 08:34, Jakub Jelinek wrote:
>>
>>
>> On Tue, Sep 23, 2014 at 05:54:37PM +0400, Ilya Enkovich wrote:
>>>
>>>
>>> use fixed EBX at least until we make sure pseudo PIC doesn't harm
>>> debug
>>> info generation.  If we have such option then gcc.target/i386/pic-1.c
>>> and
>>
>>
>>
>> For debug info, it seems you are already handling this in
>> delegitimize_address target hook, I'd suggest just building some very
>> large
>> shared library at -O2 -g -fpic on i?86 and either look at the
>> sizes of .debug_info/.debug_loc sections with/without the patch,
>> or use the locstat utility from elfutils (talk to Petr Machata if
>> needed).
>
>
> Can't hurt, but I really don't see how changing from a fixed to an
> allocatable register is going to muck up debug info in any significant
> way.



 What matters is if the delegitimize_address target hook is as efficient
 in
 delegitimization as before.  E.g. if it previously matched only when
 seeing
 %ebx + gotoff or similar, and wouldn't match anything now, some vars
 could
 have debug locations including UNSPEC and be dropped on the floor.
>>>
>>>
>>> Ah, yea, that makes sense.
>>>
>>> jeff
>>
>>
>>
>> After register allocation we have no idea where GOT address is and
>> therefore delegitimize_address target hook becomes less efficient and
>> cannot remove UNSPECs. That's what I see now when build GCC with patch
>> applied:
>
> In theory this shouldn't be too hard to fix.
>
> I haven't looked at the code, but it might be something looking explicitly
> for ebx by register #, or something similar.  Which case within
> delegitimize_address isn't firing as it should after your changes?

It is the case I had to fix:

@@ -14415,7 +14433,8 @@ ix86_delegitimize_address (rtx x)
 ...
 movl foo@GOTOFF(%ecx), %edx
 in which case we return (%ecx - %ebx) + foo.  */
-  if (pic_offset_table_rtx)
+  if (pic_offset_table_rtx
+ && (!reload_completed || !ix86_use_pseudo_pic_reg ()))
 result = gen_rtx_PLUS (Pmode, gen_rtx_MINUS (Pmode, copy_rtx (addend),
 pic_offset_table_rtx),
   result);

Originally, if there is a UNSPEC_GOTOFFSET but no EBX usage, then we
just remove this UNSPEC and subtract the EBX value.  With a pseudo PIC reg
we should use the PIC register instead of EBX, but it is unclear what to
use after register allocation.

Ilya

>
> jeff
>


Re: Enable EBX for x86 in 32bits PIC code

2014-09-24 Thread Jeff Law

On 09/24/14 14:32, Ilya Enkovich wrote:

2014-09-24 19:27 GMT+04:00 Jeff Law :

On 09/24/14 00:56, Ilya Enkovich wrote:




After register allocation we have no idea where GOT address is and
therefore delegitimize_address target hook becomes less efficient and
cannot remove UNSPECs. That's what I see now when build GCC with patch
applied:


In theory this shouldn't be too hard to fix.

I haven't looked at the code, but it might be something looking explicitly
for ebx by register #, or something similar.  Which case within
delegitimize_address isn't firing as it should after your changes?


It is the case I had to fix:

@@ -14415,7 +14433,8 @@ ix86_delegitimize_address (rtx x)
  ...
  movl foo@GOTOFF(%ecx), %edx
  in which case we return (%ecx - %ebx) + foo.  */
-  if (pic_offset_table_rtx)
+  if (pic_offset_table_rtx
+ && (!reload_completed || !ix86_use_pseudo_pic_reg ()))
  result = gen_rtx_PLUS (Pmode, gen_rtx_MINUS (Pmode, copy_rtx (addend),
  pic_offset_table_rtx),
result);

Originally if there is a UNSPEC_GOTOFFSET but no EBX usage then we
just remove this UNSPEC and substract EBX value.  With pseudo PIC reg
we should use PIC register instead of EBX but it is unclear what to
use after register allocation.
What's the RTL before & after allocation?  Feel free to just pass along 
the dump files for sum_r4 that you referenced in a prior message.


jeff


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Ian Lance Taylor
On Wed, Sep 24, 2014 at 10:04 AM, Steven Bosscher  wrote:
> On Wed, Sep 24, 2014 at 6:32 PM, Jan Hubicka  wrote:
>> Libreoffice shows that GCC needs about twice as much of system time. 
>> According
>> to profiles, good part is the ugly way we pass stuff down to assembler and
>> other part is memory use during the compilation stage.
>
> Are you using -pipe? AFAIR this still isn't the default, even on
> GNU/Linux, but it is typically a lot faster than without.

Is that true even when TMPDIR is on a ram disk?  There's no obvious
reason that it should be true in a parallel build.  Using -pipe
effectively constrains communication between the compiler and the
assembler to work in PIPE_BUF blocks.  Using TMPDIR introduces no such
constraints, and in a big program a parallel build should obscure the
fact that the compiler and assembler are serialized for each
individual compilation unit.

Ian


Problems building the latest gcc

2014-09-24 Thread George R Goffe
Hi,

I'm having trouble building the latest gcc on my fedora 19 x86_64 system. It's 
probably something I'm doing wrong but I can't seem to find what. Maybe it is a 
bug? Could I get someone to look at the problem please? I have a complete build 
log if that's necessary.

Regards and THANKS for your help,

George...



ept.c
../../gcc/gcc/emit-rtl.c: In function ‘rtx_insn* try_split(rtx, rtx, int)’:
../../gcc/gcc/emit-rtl.c:3810:16: error: ‘class rtx_insn’ has no member named 
‘deleted’
 if (! tem->deleted () && INSN_P (tem))
^
In file included from ../../gcc/gcc/emit-rtl.c:35:0:
../../gcc/gcc/emit-rtl.c: In function ‘void add_insn_after_nobb(rtx_insn*, 
rtx_insn*)’:
../../gcc/gcc/emit-rtl.c:3987:36: error: ‘class rtx_insn’ has no member named 
‘deleted’
   gcc_assert (!optimize || !after->deleted ());
^
../../gcc/gcc/system.h:697:14: note: in definition of macro ‘gcc_assert’
((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0 : 0))
  ^
../../gcc/gcc/emit-rtl.c: In function ‘void add_insn_before_nobb(rtx_insn*, 
rtx_insn*)’:
../../gcc/gcc/emit-rtl.c:4016:37: error: ‘class rtx_insn’ has no member named 
‘deleted’
   gcc_assert (!optimize || !before->deleted ());
 ^
../../gcc/gcc/system.h:697:14: note: in definition of macro ‘gcc_assert’
((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0 : 0))
  ^
make[3]: *** [emit-rtl.o] Error 1
make[3]: *** Waiting for unfinished jobs
In file included from ../../gcc/gcc/dwarf2out.c:59:0:
../../gcc/gcc/dwarf2out.c: In function ‘void gen_label_die(tree, dw_die_ref)’:
../../gcc/gcc/dwarf2out.c:19053:42: error: ‘class rtx_insn’ has no member named 
‘deleted’
gcc_assert (!as_a (insn)->deleted ());
  ^
../../gcc/gcc/system.h:697:14: note: in definition of macro ‘gcc_assert’
((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0 : 0))
  ^
../../gcc/gcc/dwarf2out.c: In function ‘void dwarf2out_var_location(rtx_insn*)’:
../../gcc/gcc/dwarf2out.c:21330:21: error: ‘class rtx_insn’ has no member named 
‘deleted’
   || next_note->deleted ()
 ^
make[3]: *** [dwarf2out.o] Error 1
rm gcov-tool.pod gcov.pod fsf-funding.pod cpp.pod gfdl.pod gcc.pod
make[3]: Leaving directory 
`/sdc1/exphome/clipper/export/home/tools/gcc/obj-i686-pc-linux-gnu/gcc'
make[2]: *** [all-stage1-gcc] Error 2
make[2]: Leaving directory 
`/sdc1/exphome/clipper/export/home/tools/gcc/obj-i686-pc-linux-gnu'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory 
`/sdc1/exphome/clipper/export/home/tools/gcc/obj-i686-pc-linux-gnu'
make: *** [all] Error 2


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
> On Wed, Sep 24, 2014 at 10:04 AM, Steven Bosscher  
> wrote:
> > On Wed, Sep 24, 2014 at 6:32 PM, Jan Hubicka  wrote:
> >> Libreoffice shows that GCC needs about twice as much of system time. 
> >> According
> >> to profiles, good part is the ugly way we pass stuff down to assembler and
> >> other part is memory use during the compilation stage.
> >
> > Are you using -pipe? AFAIR this still isn't the default, even on
> > GNU/Linux, but it is typically a lot faster than without.
> 
> Is that true even when TMPDIR is on a ram disk?  There's no obvious
> reason that it should be true in a parallel build.  Using -pipe
> effectively constrains communication between the compiler and the
> assembler to work in PIPE_BUF blocks.  Using TMPDIR introduces no such
> constraints, and in a big program a parallel build should obscure the
> fact that the compiler and assembler are serialized for each
> individual compilation unit.

Actually I mount /tmp as tmpfs, so this should not be an issue.
Obviously for slim LTO we get more benefit from outputting binary data directly
rather than spending time printf-ing and scanf-ing it ;)

Honza


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Steven Bosscher
On Wed, Sep 24, 2014 at 11:47 PM, Ian Lance Taylor wrote:
> On Wed, Sep 24, 2014 at 10:04 AM, Steven Bosscher wrote:
>> Are you using -pipe? AFAIR this still isn't the default, even on
>> GNU/Linux, but it is typically a lot faster than without.
>
> Is that true even when TMPDIR is on a ram disk?  There's no obvious
> reason that it should be true in a parallel build.  Using -pipe
> effectively constrains communication between the compiler and the
> assembler to work in PIPE_BUF blocks.  Using TMPDIR introduces no such
> constraints, and in a big program a parallel build should obscure the
> fact that the compiler and assembler are serialized for each
> individual compilation unit.

I've done my most recent timings on a machine that has /dev/md3
mounted on /tmp. That's gcc110 on the compile farm. With/without -pipe
made a significant difference.

If TMPDIR is a tmpfs or other kind of ram disk, I suppose the benefits
would be less (to the point of vanishing). Unfortunately I can't test
it...

Ciao!
Steven


gcc-4.9-20140924 is now available

2014-09-24 Thread gccadmin
Snapshot gcc-4.9-20140924 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20140924/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 215571

You'll find:

 gcc-4.9-20140924.tar.bz2 Complete GCC

  MD5=9609f1056eb487bd0e6f9a582d1a81e8
  SHA1=7c4cabd3792b0a2aa56575045e3cab0f406bc6cc

Diffs from 4.9-20140917 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Problems building the latest gcc

2014-09-24 Thread Jonathan Wakely
On 24 September 2014 22:49, George R Goffe wrote:
> Hi,
>
> I'm having trouble building the latest gcc on my fedora 19 x86_64 system.

This mailing list is for discussing development of gcc itself, please
use the gcc-help list for help building or using gcc.

Please send your question there instead, and be sure to include how
you configured GCC and what exactly you mean by "the latest gcc" (the
latest release, the subversion trunk or something else).


Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
> On Wed, Sep 24, 2014 at 11:47 PM, Ian Lance Taylor wrote:
> > On Wed, Sep 24, 2014 at 10:04 AM, Steven Bosscher wrote:
> >> Are you using -pipe? AFAIR this still isn't the default, even on
> >> GNU/Linux, but it is typically a lot faster than without.
> >
> > Is that true even when TMPDIR is on a ram disk?  There's no obvious
> > reason that it should be true in a parallel build.  Using -pipe
> > effectively constrains communication between the compiler and the
> > assembler to work in PIPE_BUF blocks.  Using TMPDIR introduces no such
> > constraints, and in a big program a parallel build should obscure the
> > fact that the compiler and assembler are serialized for each
> > individual compilation unit.
> 
> I've done my most recent timings on a machine that has /dev/md3
> mounted on /tmp. That's gcc110 on the compile farm. With/without -pipe
> made a significant difference.
> 
> If TMPDIR is a tmpfs or other kind of ram disk, I suppose the benefits
> would be less (to the point of vanishing). Unfortunately I can't test
> it...
OK, I tried it on my hello world benchmark with tmpfs, and -pipe really seems
like a small loss. I wonder if we can work out better defaults that work for
most people.  I use tmpfs because I am worried about my notebook's SSD still
being alive and well in 3 years, but tmpfs is still far from mainstream.

Honza


Re: Problems building the latest gcc

2014-09-24 Thread George R Goffe
Jonathan,

Thank you for your response.

Since I build from what I believe is the main trunk, I thought that developers 
might be interested in this situation. I WILL try the help path as you suggest.

Thanks again for your time,

George...

svn info
Path: .
Working Copy Root Path: /sdc1/exphome/clipper/export/home/tools/gcc/gcc
URL: svn://gcc.gnu.org/svn/gcc/trunk
Relative URL: ^/trunk
Repository Root: svn://gcc.gnu.org/svn/gcc
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 215540
Node Kind: directory
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 215242
Last Changed Date: 2014-09-13 12:00:28 -0700 (Sat, 13 Sep 2014)





- Original Message -
From: Jonathan Wakely 
To: George R Goffe 
Cc: "gcc@gcc.gnu.org" 
Sent: Wednesday, September 24, 2014 4:36 PM
Subject: Re: Problems building the latest gcc

On 24 September 2014 22:49, George R Goffe wrote:



> Hi,
>
> I'm having trouble building the latest gcc on my fedora 19 x86_64 system.

This mailing list is for discussing development of gcc itself, please
use the gcc-help list for help building or using gcc.

Please send your question there instead, and be sure to include how
you configured GCC and what exactly you mean by "the latest gcc" (the
latest release, the subversion trunk or something else).



Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Janne Blomqvist
On Thu, Sep 25, 2014 at 12:47 AM, Ian Lance Taylor  wrote:
> Is that true even when TMPDIR is on a ram disk?  There's no obvious
> reason that it should be true in a parallel build.  Using -pipe
> effectively constrains communication between the compiler and the
> assembler to work in PIPE_BUF blocks.  Using TMPDIR introduces no such
> constraints, and in a big program a parallel build should obscure the
> fact that the compiler and assembler are serialized for each
> individual compilation unit.

As an aside, I think what matters is the capacity of the pipe rather
than PIPE_BUF. PIPE_BUF is the largest chunk that can be written
atomically, but since we don't have a case of multiple processes
writing to the same pipe(???), it doesn't matter. On a typical
x86(-64) Linux system, PIPE_BUF is 4k while the capacity is by default
64k (can be increased with fcntl(fd, F_SETPIPE_SZ, ...), perhaps worth
trying to see if it makes any difference?).

Still, it seems to me that making -pipe the default would make sense,
if the tradeoff appears to be a small loss in case when /tmp is a
tmpfs vs. a much larger gain when /tmp is a normal fs.


-- 
Janne Blomqvist