Hi,

I can't get a web browser connection to sourceware.org, so I can't open a PR on this (I will as soon as I can unless you advise me not to) but I wanted to warn you guys this is coming.

I've got a C language source code TU that is 1.3 MiB large. I didn't write it (I know better), but I have to make it coexist in a rather lengthy system build, and the leader of the system build team is justifably upset with me. Using binutils-2.19, I see this in an strace -c summary:

jtison@tstlinux ~ 13:24:40$ strace -c as -o offldr.o2.o -alshd=offldr.o2.lst offldr.O2.s
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 51.75    0.033194          25      1333           write
 19.96    0.012801          10      1337           read
 15.86    0.010175          23       438       134 open
  3.16    0.002026           7       300           munmap
  1.78    0.001144           9       129       127 stat
  1.74    0.001114           3       322           mmap
  1.66    0.001067           4       304           close
  1.41    0.000902           2       388           brk
  1.34    0.000858           3       303           fstat
  0.85    0.000546           2       312           lseek
  0.27    0.000170         170         1           execve
  0.17    0.000108          27         4 fadvise64
  0.05    0.000034           6         6           mprotect
  0.01    0.000005           5         1         1 access
  0.00    0.000001           1         1 getrusage
  0.00    0.000001           1         1           uname
------ ----------- ----------- --------- --------- ----------------
100.00    0.064146                  5180       262 total

All is still well with as-2.19. But now I run the 'upgrade' (as-2.21), and I see this: jtison@linuxtpf ~ 16:46:36$ time as-2.21 -o offldr.o2.o -alshd=offldr.o2.lst offldr.o2.s

real    4m18.258s
user    4m12.802s
sys     0m3.251s

Note the total assembly time; see how the clear majority of time is spent in userspace.For now, I've fallen back to as-2.19 (the assembler only, I brought along a copy of libbfd and libopcodes that match) just to get past this behavior.

When I build 2.20[1] without trying to fix it, I see this for the same C-generated code at -O2:

jtison@linuxtpf ~ 13:27:23$ strace -c as-2.20 -o offldr.o2.o -alshd=offldr.o2.lst offldr.o2.s
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 87.96    3.508911           3   1020658           read
  3.58    0.142899          11     13246           munmap
  3.17    0.126437           9     13386       136 open
  2.04    0.081426          51      1599           write
  1.35    0.053725           4     13250           close
  0.92    0.036537           3     13272           mmap
  0.53    0.021277           1     26204           lseek
  0.34    0.013416           1     13250           fstat
  0.06    0.002233         447         5 fadvise64
  0.02    0.000840           2       387           brk
  0.02    0.000824           6       129       126 stat
  0.01    0.000589         589         1           execve
  0.01    0.000217         217         1           unlink
  0.00    0.000051           7         7           mprotect
  0.00    0.000008           8         1         1 access
  0.00    0.000002           1         2           fcntl
  0.00    0.000002           2         1           lstat
  0.00    0.000001           1         1 getrusage
------ ----------- ----------- --------- --------- ----------------
100.00    3.989395               1115400       263 total

As we see, the compilation time grows by a factor of 15X. I see the rebuffer_line() (listing.c:541) routine doing an ftell(FILE *, last_pos, SEEK_SET) which sets the file pointer close to and before the line it seeks, but then it issues an ftell(FILE*) so we "remember" where we are (why?), then we reset all the way back to zero with fseek(FILE *, 0, SEEK_SET) and only now it starts reading forward up to the line number it seeks. In effect, the C source file is being read over and over again (unnecessarily?) all the way from offset 0 every time it executes that fseek/ftell/fseek code. This explains the dramatic increase in read syscalls, almost a million. The behavior is consistent from version 2.20 through 2.23 for this particular TU.

It looks like most of listing.c was rewritten for 2.20. I tried removing the fseek(FILE *, 0, SEEK_SET) call and made a couple of refactoring changes to make sure I'm seeking the proper source line. The syscall increase reduces dramatically; but I'm not sure if this change works for all assemblies or not. I have attached the patch anyway in hopes that it will be useful. I'm changing a brand new code routine and I'm not sure that I'm not breaking something -- this particular use case came out okay vis-a-vis the listing, at least that I could tell, but I have no guarantees that I've generated correct relocatable object. Application of the attached patch turns in the following strace numbers:

jtison@linuxtpf ~ 14:24:52$ strace -c as-2.20-1 -o offldr.o2.o -alshd=offldr.o2.lst offldr.o2.s
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 36.98    0.190710           5     39720           read
 15.66    0.080763           6     13246           munmap
 13.93    0.071843           5     13450       201 open
  7.65    0.039442           3     13267           mmap
  7.51    0.038721          24      1600           write
  6.48    0.033411           2     19731           lseek
  6.41    0.033073           2     13249           close
  4.72    0.024317           2     13249           fstat
  0.23    0.001172           6       192       189 stat
  0.22    0.001147           3       388           brk
  0.14    0.000737         737         1           unlink
  0.04    0.000230         230         1           execve
  0.01    0.000046          12         4           fadvise64
  0.01    0.000033           6         6           mprotect
  0.00    0.000005           5         1         1 access
  0.00    0.000005           3         2           fcntl
  0.00    0.000002           2         1           getrusage
  0.00    0.000002           2         1           lstat
------ ----------- ----------- --------- --------- ----------------
100.00    0.515659                128109       391 total

There is still an increase in read syscalls, but its nowhere as radical as before; with an attendant reduction in userspace residence time. I've sent you my very simplistic patch only as a suggestion, I'm sure there's more work to be done to take care of corner cases, & c. I confess to not knowing what I'm doing inside this code and I need you guys to determine what the correct fix is. This particular patch gives me a total assembly time of:

real    0m3.244s
user    0m2.400s
sys     0m0.363s

I would really love to attach this test case I have, but it's IBM licensed (i.e. proprietary) code so I can't do that and still keep my head on the end of my neck. Might I suggest you find a monolithic C program someplace (there have got to be some you know of) to the order of 1 MiB+ and use that? This behavior should show itself with a monolithic program of any size.

I would have opened a PR on this, but I can't seem to get to sourceware.org today. I'll keep checking until I can get to bugzilla.

Thanks,
--Jim--
work: jti...@us.ibm.com  (203) 885-7335
home: jti...@charter.net (860) 946-8088

[1] We actually upgraded to binutils-2.21 with SLES 11 SP2. But I suspected the change happened in 2.20 thanks to the ChangeLog showing a lot of activity around its last change date, and it turned out to be true. This is probably a good candidate for backporting, provided I understand the problem correctly (of which there are no guarantees).
--- binutils-2.20.1/gas/listing.c.old   2010-01-14 05:48:31.000000000 -0500
+++ binutils-2.20.1/gas/listing.c       2013-03-18 14:35:22.000000000 -0400
@@ -544,7 +544,6 @@ rebuffer_line (file_info_type *  file,
               unsigned int      size)
 {
   unsigned int count = 0;
-  unsigned int current_line = 1;
   char * p = buffer;
   long pos;
   int c;
@@ -577,14 +576,8 @@ rebuffer_line (file_info_type *  file,
        fseek (last_open_file, file->pos, SEEK_SET);
     }
 
-  /* Remember where we are in the current file.  */
-  pos = ftell (last_open_file);
-
-  /* Go back to the beginning.  */
-  fseek (last_open_file, 0, SEEK_SET);
-
   /* Skip lines prior to the one we are interested in.  */
-   while (current_line < linenum)
+  while (file->linenum < linenum)
     {
       /* fgets only stops on newlines and has a size limit,
         so we read one character at a time instead.  */
@@ -594,7 +587,7 @@ rebuffer_line (file_info_type *  file,
        }
       while (c != EOF && c != '\n' && c != '\r');
 
-      ++ current_line;
+      ++ file->linenum;
 
       if (c == '\r' || c == '\n')
        {
_______________________________________________
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils

Reply via email to