Hi,
I can't get a web browser connection to sourceware.org, so I can't open
a PR on this (I will as soon as I can unless you advise me not to) but I
wanted to warn you guys this is coming.
I've got a C language source code TU that is 1.3 MiB large. I didn't
write it (I know better), but I have to make it coexist in a rather
lengthy system build, and the leader of the system build team is
justifably upset with me. Using binutils-2.19, I see this in an strace
-c summary:
jtison@tstlinux ~ 13:24:40$ strace -c as -o offldr.o2.o
-alshd=offldr.o2.lst offldr.O2.s
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
51.75 0.033194 25 1333 write
19.96 0.012801 10 1337 read
15.86 0.010175 23 438 134 open
3.16 0.002026 7 300 munmap
1.78 0.001144 9 129 127 stat
1.74 0.001114 3 322 mmap
1.66 0.001067 4 304 close
1.41 0.000902 2 388 brk
1.34 0.000858 3 303 fstat
0.85 0.000546 2 312 lseek
0.27 0.000170 170 1 execve
0.17 0.000108 27 4 fadvise64
0.05 0.000034 6 6 mprotect
0.01 0.000005 5 1 1 access
0.00 0.000001 1 1 getrusage
0.00 0.000001 1 1 uname
------ ----------- ----------- --------- --------- ----------------
100.00 0.064146 5180 262 total
All is still well with as-2.19. But now I run the 'upgrade' (as-2.21),
and I see this:
jtison@linuxtpf ~ 16:46:36$ time as-2.21 -o offldr.o2.o
-alshd=offldr.o2.lst offldr.o2.s
real 4m18.258s
user 4m12.802s
sys 0m3.251s
Note the total assembly time; see how the clear majority of time is
spent in userspace.For now, I've fallen back to as-2.19 (the assembler
only, I brought along a copy of libbfd and libopcodes that match) just
to get past this behavior.
When I build 2.20[1] without trying to fix it, I see this for the same
C-generated code at -O2:
jtison@linuxtpf ~ 13:27:23$ strace -c as-2.20 -o offldr.o2.o
-alshd=offldr.o2.lst offldr.o2.s
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
87.96 3.508911 3 1020658 read
3.58 0.142899 11 13246 munmap
3.17 0.126437 9 13386 136 open
2.04 0.081426 51 1599 write
1.35 0.053725 4 13250 close
0.92 0.036537 3 13272 mmap
0.53 0.021277 1 26204 lseek
0.34 0.013416 1 13250 fstat
0.06 0.002233 447 5 fadvise64
0.02 0.000840 2 387 brk
0.02 0.000824 6 129 126 stat
0.01 0.000589 589 1 execve
0.01 0.000217 217 1 unlink
0.00 0.000051 7 7 mprotect
0.00 0.000008 8 1 1 access
0.00 0.000002 1 2 fcntl
0.00 0.000002 2 1 lstat
0.00 0.000001 1 1 getrusage
------ ----------- ----------- --------- --------- ----------------
100.00 3.989395 1115400 263 total
As we see, the compilation time grows by a factor of 15X. I see the
rebuffer_line() (listing.c:541) routine doing an ftell(FILE *, last_pos,
SEEK_SET) which sets the file pointer close to and before the line it
seeks, but then it issues an ftell(FILE*) so we "remember" where we are
(why?), then we reset all the way back to zero with fseek(FILE *, 0,
SEEK_SET) and only now it starts reading forward up to the line number
it seeks. In effect, the C source file is being read over and over again
(unnecessarily?) all the way from offset 0 every time it executes that
fseek/ftell/fseek code. This explains the dramatic increase in read
syscalls, almost a million. The behavior is consistent from version 2.20
through 2.23 for this particular TU.
It looks like most of listing.c was rewritten for 2.20. I tried removing
the fseek(FILE *, 0, SEEK_SET) call and made a couple of refactoring
changes to make sure I'm seeking the proper source line. The syscall
increase reduces dramatically; but I'm not sure if this change works for
all assemblies or not. I have attached the patch anyway in hopes that it
will be useful. I'm changing a brand new code routine and I'm not sure
that I'm not breaking something -- this particular use case came out
okay vis-a-vis the listing, at least that I could tell, but I have no
guarantees that I've generated correct relocatable object. Application
of the attached patch turns in the following strace numbers:
jtison@linuxtpf ~ 14:24:52$ strace -c as-2.20-1 -o offldr.o2.o
-alshd=offldr.o2.lst offldr.o2.s
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
36.98 0.190710 5 39720 read
15.66 0.080763 6 13246 munmap
13.93 0.071843 5 13450 201 open
7.65 0.039442 3 13267 mmap
7.51 0.038721 24 1600 write
6.48 0.033411 2 19731 lseek
6.41 0.033073 2 13249 close
4.72 0.024317 2 13249 fstat
0.23 0.001172 6 192 189 stat
0.22 0.001147 3 388 brk
0.14 0.000737 737 1 unlink
0.04 0.000230 230 1 execve
0.01 0.000046 12 4 fadvise64
0.01 0.000033 6 6 mprotect
0.00 0.000005 5 1 1 access
0.00 0.000005 3 2 fcntl
0.00 0.000002 2 1 getrusage
0.00 0.000002 2 1 lstat
------ ----------- ----------- --------- --------- ----------------
100.00 0.515659 128109 391 total
There is still an increase in read syscalls, but its nowhere as radical
as before; with an attendant reduction in userspace residence time. I've
sent you my very simplistic patch only as a suggestion, I'm sure there's
more work to be done to take care of corner cases, & c. I confess to not
knowing what I'm doing inside this code and I need you guys to determine
what the correct fix is. This particular patch gives me a total assembly
time of:
real 0m3.244s
user 0m2.400s
sys 0m0.363s
I would really love to attach this test case I have, but it's IBM
licensed (i.e. proprietary) code so I can't do that and still keep my
head on the end of my neck. Might I suggest you find a monolithic C
program someplace (there have got to be some you know of) to the order
of 1 MiB+ and use that? This behavior should show itself with a
monolithic program of any size.
I would have opened a PR on this, but I can't seem to get to
sourceware.org today. I'll keep checking until I can get to bugzilla.
Thanks,
--Jim--
work: jti...@us.ibm.com (203) 885-7335
home: jti...@charter.net (860) 946-8088
[1] We actually upgraded to binutils-2.21 with SLES 11 SP2. But I
suspected the change happened in 2.20 thanks to the ChangeLog showing a
lot of activity around its last change date, and it turned out to be
true. This is probably a good candidate for backporting, provided I
understand the problem correctly (of which there are no guarantees).
--- binutils-2.20.1/gas/listing.c.old 2010-01-14 05:48:31.000000000 -0500
+++ binutils-2.20.1/gas/listing.c 2013-03-18 14:35:22.000000000 -0400
@@ -544,7 +544,6 @@ rebuffer_line (file_info_type * file,
unsigned int size)
{
unsigned int count = 0;
- unsigned int current_line = 1;
char * p = buffer;
long pos;
int c;
@@ -577,14 +576,8 @@ rebuffer_line (file_info_type * file,
fseek (last_open_file, file->pos, SEEK_SET);
}
- /* Remember where we are in the current file. */
- pos = ftell (last_open_file);
-
- /* Go back to the beginning. */
- fseek (last_open_file, 0, SEEK_SET);
-
/* Skip lines prior to the one we are interested in. */
- while (current_line < linenum)
+ while (file->linenum < linenum)
{
/* fgets only stops on newlines and has a size limit,
so we read one character at a time instead. */
@@ -594,7 +587,7 @@ rebuffer_line (file_info_type * file,
}
while (c != EOF && c != '\n' && c != '\r');
- ++ current_line;
+ ++ file->linenum;
if (c == '\r' || c == '\n')
{
_______________________________________________
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils