gas new (2.20) rebuffer_line() excessive read() syscalls

2013-03-19 Thread Jim Tison

Hi,

I can't get a web browser connection to sourceware.org, so I can't open 
a PR on this (I will as soon as I can unless you advise me not to) but I 
wanted to warn you guys this is coming.


I've got a C language source code TU that is 1.3 MiB large. I didn't 
write it (I know better), but I have to make it coexist in a rather 
lengthy system build, and the leader of the system build team is 
justifably upset with me. Using binutils-2.19, I see this in an strace 
-c summary:


jtison@tstlinux ~ 13:24:40$ strace -c as -o offldr.o2.o 
-alshd=offldr.o2.lst offldr.O2.s

% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 51.750.033194  25  1333   write
 19.960.012801  10  1337   read
 15.860.010175  23   438   134 open
  3.160.002026   7   300   munmap
  1.780.001144   9   129   127 stat
  1.740.001114   3   322   mmap
  1.660.001067   4   304   close
  1.410.000902   2   388   brk
  1.340.000858   3   303   fstat
  0.850.000546   2   312   lseek
  0.270.000170 170 1   execve
  0.170.000108  27 4 fadvise64
  0.050.34   6 6   mprotect
  0.010.05   5 1 1 access
  0.000.01   1 1 getrusage
  0.000.01   1 1   uname
-- --- --- - - 
100.000.064146  5180   262 total

All is still well with as-2.19. But now I run the 'upgrade' (as-2.21), 
and I see this:
jtison@linuxtpf ~ 16:46:36$ time as-2.21 -o offldr.o2.o 
-alshd=offldr.o2.lst offldr.o2.s


real4m18.258s
user4m12.802s
sys 0m3.251s

Note the total assembly time; see how the clear majority of time is 
spent in userspace.For now, I've fallen back to as-2.19 (the assembler 
only, I brought along a copy of libbfd and libopcodes that match) just 
to get past this behavior.


When I build 2.20[1] without trying to fix it, I see this for the same 
C-generated code at -O2:


jtison@linuxtpf ~ 13:27:23$ strace -c as-2.20 -o offldr.o2.o 
-alshd=offldr.o2.lst offldr.o2.s

% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 87.963.508911   3   1020658   read
  3.580.142899  11 13246   munmap
  3.170.126437   9 13386   136 open
  2.040.081426  51  1599   write
  1.350.053725   4 13250   close
  0.920.036537   3 13272   mmap
  0.530.021277   1 26204   lseek
  0.340.013416   1 13250   fstat
  0.060.002233 447 5 fadvise64
  0.020.000840   2   387   brk
  0.020.000824   6   129   126 stat
  0.010.000589 589 1   execve
  0.010.000217 217 1   unlink
  0.000.51   7 7   mprotect
  0.000.08   8 1 1 access
  0.000.02   1 2   fcntl
  0.000.02   2 1   lstat
  0.000.01   1 1 getrusage
-- --- --- - - 
100.003.989395   1115400   263 total

As we see, the compilation time grows by a factor of 15X. I see the 
rebuffer_line() (listing.c:541) routine doing an ftell(FILE *, last_pos, 
SEEK_SET) which sets the file pointer close to and before the line it 
seeks, but then it issues an ftell(FILE*) so we "remember" where we are 
(why?), then we reset all the way back to zero with fseek(FILE *, 0, 
SEEK_SET) and only now it starts reading forward up to the line number 
it seeks. In effect, the C source file is being read over and over again 
(unnecessarily?) all the way from offset 0 every time it executes that 
fseek/ftell/fseek code. This explains the dramatic increase in read 
syscalls, almost a million. The behavior is consistent from version 2.20 
through 2.23 for this particular TU.


It looks like most of listing.c was rewritten for 2.20. I tried removing 
the fseek(FILE *, 0, SEEK_SET) call and made a couple of refactoring 
changes to make sure I'm seeking  the proper source line. The syscall 
increase reduces dramatically; but I'm not sure if this change works for 
all assemblies or not. I have attached the patch anyway in hopes that it 
will be useful. I'm changing a brand new code routine and I'm not sure 
that I'm not breaking something -- this particular use case came out 
okay vis-a-vis the listing, at least that I could tell, but I have n

[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread petechou at gmail dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #11 from pete  2013-03-20 02:04:21 UTC 
---
Hello Ian,

Would you please comment on the issue and patch?

-
Thanks,
Pete

(In reply to comment #10)
> I have no object to what is proposed but the decision is Ian's.
> 
> On Tue, Mar 12, 2013 at 7:03 PM, petechou at gmail dot com
>  wrote:
> > http://sourceware.org/bugzilla/show_bug.cgi?id=15200
> >
> > --- Comment #9 from pete  2013-03-13 02:03:32 
> > UTC ---
> > If there is a libfoo.so, it's possible to implement part of functions in
> > libfoo.so and then link against libfoo.so again to produce a new library. I
> > think the "linker-defined" symbols should also have higher priority than the
> > symbols in DSO.
> >
> > And I also think we should have the same behavior on defining symbol value
> > whether only_if_ref is set or not?
> >
> > The patch will affect 4 ways to define output symbols:
> > 1. do_define_in_output_data
> > 2. do_define_in_output_segment
> > 3. do_define_as_constant
> > 4. add_undefined_symbol_from_command_line
> >
> > For 1, 2, and 3, it makes sense to use the given output section, segment, 
> > and
> > constant to define symbol values if the "linker-defined" symbols have higher
> > priority.
> >
> > As for 4, I think it doesn't matter.
> >
> > -
> > Thanks,
> > Pete
> >
> > (In reply to comment #8)
> >> Hi Pete,
> >>
> >> 1. My concern of your patch is that it affects all targets, not just
> >> ARM.  While it may fix the problem on ARM, I cannot speak for authors
> >> of other backends.
> >> 2. I am don't have power approve a patch, you need to convince Ian
> >> Taylor that this is the right thing to do for all targets.
> >> 3. This bug only happens when DSOs incorrectly export __exidx_start &
> >> __exidx_end.  These DSOs are considered broken and I strongly
> >> recommend you moving to a newer NDK version with DSOs that have proper
> >> visibility.  Both ld & gold no longer export these symbols.  The
> >> reason for not exporting is explained here in the ld patch:
> >>
> >> http://sourceware.org/ml/binutils/2009-11/msg00367.html
> >>
> >> -Doug
> >>
> >>
> >> On Wed, Mar 6, 2013 at 10:43 PM, petechou at gmail dot com
> >>  wrote:
> >> > http://sourceware.org/bugzilla/show_bug.cgi?id=15200
> >> >
> >> > --- Comment #7 from pete  2013-03-07 06:43:08 
> >> > UTC ---
> >> > I can think visibility does matter, but only for exporting the symbol or 
> >> > not.
> >> > In the previous case, gold does define the __bss symbols but not use the
> >> > definition in DSO, so I think we should still define section/segment 
> >> > symbols
> >> > locally.
> >> >
> >> > -Pete
> >> >
> >> > (In reply to comment #6)
> >> >> The symbols do not have same visibility.  gold in general is
> >> >> compatible with GNU ld.  If you look at src/ld/emulparams/, you will
> >> >> set that the __bss symbols are exported but the __exidx symbols are
> >> >> defined hidden.  gold matches the behaviour of ld.  If you look at
> >> >> defstd.cc in gold, you will again see that some of these symbols are
> >> >> hidden but some are not.
> >> >>
> >> >> -Doug
> >> >>
> >> >>
> >> >> On Wed, Mar 6, 2013 at 9:31 PM, petechou at gmail dot com
> >> >>  wrote:
> >> >> > http://sourceware.org/bugzilla/show_bug.cgi?id=15200
> >> >> >
> >> >> > --- Comment #5 from pete  2013-03-07 
> >> >> > 05:31:25 UTC ---
> >> >> > (In reply to comment #4)
> >> >> >> The symbols __exidx_end & __exidx_start were exported incorrectly in
> >> >> >> the past.  This is fixed in both ld & gold in the sense that these
> >> >> >> symbols are defined by not exported.  I think defining these symbols
> >> >> >> when there is a non-weak definition in a DSO is not the right thing 
> >> >> >> to
> >> >> >> do.  This is basically multiple definition and usually is an error.
> >> >> >
> >> >> > If so, all section/segment symbols should be defined (if needed) by 
> >> >> > not
> >> >> > exported?
> >> >> >
> >> >> > But when there is a non-weak definition in a DSO, it's still possible 
> >> >> > to find
> >> >> > ld.gold define and export segment symbols like __bss_start (See
> >> >> > gold.defstd.cc:214. only_if_ref is set to false). And there is no 
> >> >> > multiple
> >> >> > definition error.
> >> >> >
> >> >> > $ readelf -Ds libc.so | grep __bss_start
> >> >> >   268 311: d898 0 NOTYPE  GLOBAL DEFAULT ABS __bss_start__
> >> >> >   619 470: d898 0 NOTYPE  GLOBAL DEFAULT ABS __bss_start
> >> >> >
> >> >> > $ arm-linux-androideabi-ld.gold -shared -o libplasma.so plasma.o 
> >> >> > libc.so
> >> >> >
> >> >> > $ readelf -s libplasma.so | grep __bss_start
> >> >> > 73: 4148 0 NOTYPE  GLOBAL DEFAULT  ABS __bss_start
> >> >> >168: 4148 0 NOTYPE  GLOBAL DEFAULT  ABS __bss_start
> >> >> >
> >> >> >> I am not working on the Android NDK so I can't give you a good
> >> >> >> suggestion here.  Does using the latest NDK fixes your problem?
> >> >> >>
> >> >> >
> >> >> > Yes, there is no this issue with the latest ND

[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread ian at airs dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #12 from Ian Lance Taylor  2013-03-20 03:29:49 
UTC ---
oldsym->in_dyn() will return true if the symbol was seen in a dynamic object. 
I don't see why we should create the symbol if it is seen in a dynamic object.

It seems that the code should be something like

if (oldsym == NULL)
  return NULL;
if (oldsym->is_undefined())
  ;
else if (oldsym->is_from_dynobj())
  ;
else
  return NULL;

but that's not right either.  We should only create the symbol if it is
referenced by a regular object.  So it really needs to be

if (oldsym == NULL)
  return NULL;

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils


[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread ian at airs dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #13 from Ian Lance Taylor  2013-03-20 03:32:47 
UTC ---
oldsym->in_dyn() will return true if the symbol was seen in a dynamic object. 
I don't see why we should create the symbol if it is seen in a dynamic object.

It seems that the code should be something like

if (oldsym == NULL)
  return NULL;
if (oldsym->is_undefined())
  ;
else if (oldsym->is_from_dynobj())
  ;
else
  return NULL;

but that's not right either.  We should only create the symbol if it is
referenced by a regular object.  So perhaps it really needs to be something
like

if (oldsym == NULL)
  return NULL;
if (oldsym->source() == Symbol::IS_UNDEFINED)
  ;
else if (!oldsym->in_reg())
  return NULL;
else if (oldsym->is_from_dynobj())
  ;
else
  return NULL;

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils


[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread dougkwan at google dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #14 from Doug Kwan  2013-03-20 04:41:03 
UTC ---
The special symbols in question __exidx_{start,end} were exported from
DSO's by mistake in both ld & gold and the mistake has been fixed in
both linkers.  We really don't want references of such symbols in the
main executable to be bound to the definitions in a DSO.  So Peter
wants to define those symbols even when they are seen in a DSO.  Doing
so makes it possible to use DSO's generated by older versions of gold
& ld on ARM but I really don't know whether this affects correctness
in general.

-Doug

On Tue, Mar 19, 2013 at 8:29 PM, ian at airs dot com
 wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15200
>
> --- Comment #12 from Ian Lance Taylor  2013-03-20 
> 03:29:49 UTC ---
> oldsym->in_dyn() will return true if the symbol was seen in a dynamic object.
> I don't see why we should create the symbol if it is seen in a dynamic object.
>
> It seems that the code should be something like
>
> if (oldsym == NULL)
>   return NULL;
> if (oldsym->is_undefined())
>   ;
> else if (oldsym->is_from_dynobj())
>   ;
> else
>   return NULL;
>
> but that's not right either.  We should only create the symbol if it is
> referenced by a regular object.  So it really needs to be
>
> if (oldsym == NULL)
>   return NULL;
>
> --
> Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
> --- You are receiving this mail because: ---
> You are on the CC list for the bug.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils


[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread ian at airs dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #15 from Ian Lance Taylor  2013-03-20 04:44:36 
UTC ---
I think that is what my patch does.  Did I get it wrong?

I think it probably makes sense in general to create an only_if_ref
linker-defined symbol if there is a ref in the regular objects and a definition
in a shared object.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils


[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread petechou at gmail dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #16 from pete  2013-03-20 05:38:56 UTC 
---
(In reply to comment #15)
> I think that is what my patch does.  Did I get it wrong?

I'm thinking the patch does the same thing as mine.

I have 2 questions:
1. in_reg() and in_dyn() seem to be mutual exclusive?
2. is it possible to see in_dyn() and is_from_dynobj() return different truth
values for a symbol?
Did I misunderstand something?

> 
> I think it probably makes sense in general to create an only_if_ref
> linker-defined symbol if there is a ref in the regular objects and a 
> definition
> in a shared object.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils


[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread petechou at gmail dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #17 from pete  2013-03-20 05:54:20 UTC 
---
(In reply to comment #16)
> (In reply to comment #15)
> > I think that is what my patch does.  Did I get it wrong?

And we should return NULL if oldsym->in_reg() is true instead?

if (oldsym == NULL)
  return NULL;
if (oldsym->source() == Symbol::IS_UNDEFINED)
  ;
else if (oldsym->in_reg())
  return NULL;
else if (oldsym->is_from_dynobj())
  ;
else
  return NULL;

> 
> I'm thinking the patch does the same thing as mine.
> 
> I have 2 questions:
> 1. in_reg() and in_dyn() seem to be mutual exclusive?
> 2. is it possible to see in_dyn() and is_from_dynobj() return different truth
> values for a symbol?
> Did I misunderstand something?
> 
> > 
> > I think it probably makes sense in general to create an only_if_ref
> > linker-defined symbol if there is a ref in the regular objects and a 
> > definition
> > in a shared object.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils


[Bug gold/15200] Runtime undefined reference to __exidx_start/_end

2013-03-19 Thread petechou at gmail dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=15200

--- Comment #18 from pete  2013-03-20 06:20:57 UTC 
---
Sorry! Yours is correct. Forget about this.

(In reply to comment #17)
> 
> And we should return NULL if oldsym->in_reg() is true instead?
> 
> if (oldsym == NULL)
>   return NULL;
> if (oldsym->source() == Symbol::IS_UNDEFINED)
>   ;
> else if (oldsym->in_reg())
>   return NULL;
> else if (oldsym->is_from_dynobj())
>   ;
> else
>   return NULL;
>

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils