gas new (2.20) rebuffer_line() excessive read() syscalls
Hi, I can't get a web browser connection to sourceware.org, so I can't open a PR on this (I will as soon as I can unless you advise me not to) but I wanted to warn you guys this is coming. I've got a C language source code TU that is 1.3 MiB large. I didn't write it (I know better), but I have to make it coexist in a rather lengthy system build, and the leader of the system build team is justifably upset with me. Using binutils-2.19, I see this in an strace -c summary: jtison@tstlinux ~ 13:24:40$ strace -c as -o offldr.o2.o -alshd=offldr.o2.lst offldr.O2.s % time seconds usecs/call callserrors syscall -- --- --- - - 51.750.033194 25 1333 write 19.960.012801 10 1337 read 15.860.010175 23 438 134 open 3.160.002026 7 300 munmap 1.780.001144 9 129 127 stat 1.740.001114 3 322 mmap 1.660.001067 4 304 close 1.410.000902 2 388 brk 1.340.000858 3 303 fstat 0.850.000546 2 312 lseek 0.270.000170 170 1 execve 0.170.000108 27 4 fadvise64 0.050.34 6 6 mprotect 0.010.05 5 1 1 access 0.000.01 1 1 getrusage 0.000.01 1 1 uname -- --- --- - - 100.000.064146 5180 262 total All is still well with as-2.19. But now I run the 'upgrade' (as-2.21), and I see this: jtison@linuxtpf ~ 16:46:36$ time as-2.21 -o offldr.o2.o -alshd=offldr.o2.lst offldr.o2.s real4m18.258s user4m12.802s sys 0m3.251s Note the total assembly time; see how the clear majority of time is spent in userspace.For now, I've fallen back to as-2.19 (the assembler only, I brought along a copy of libbfd and libopcodes that match) just to get past this behavior. When I build 2.20[1] without trying to fix it, I see this for the same C-generated code at -O2: jtison@linuxtpf ~ 13:27:23$ strace -c as-2.20 -o offldr.o2.o -alshd=offldr.o2.lst offldr.o2.s % time seconds usecs/call callserrors syscall -- --- --- - - 87.963.508911 3 1020658 read 3.580.142899 11 13246 munmap 3.170.126437 9 13386 136 open 2.040.081426 51 1599 write 1.350.053725 4 13250 close 0.920.036537 3 13272 mmap 0.530.021277 1 26204 lseek 0.340.013416 1 13250 fstat 0.060.002233 447 5 fadvise64 0.020.000840 2 387 brk 0.020.000824 6 129 126 stat 0.010.000589 589 1 execve 0.010.000217 217 1 unlink 0.000.51 7 7 mprotect 0.000.08 8 1 1 access 0.000.02 1 2 fcntl 0.000.02 2 1 lstat 0.000.01 1 1 getrusage -- --- --- - - 100.003.989395 1115400 263 total As we see, the compilation time grows by a factor of 15X. I see the rebuffer_line() (listing.c:541) routine doing an ftell(FILE *, last_pos, SEEK_SET) which sets the file pointer close to and before the line it seeks, but then it issues an ftell(FILE*) so we "remember" where we are (why?), then we reset all the way back to zero with fseek(FILE *, 0, SEEK_SET) and only now it starts reading forward up to the line number it seeks. In effect, the C source file is being read over and over again (unnecessarily?) all the way from offset 0 every time it executes that fseek/ftell/fseek code. This explains the dramatic increase in read syscalls, almost a million. The behavior is consistent from version 2.20 through 2.23 for this particular TU. It looks like most of listing.c was rewritten for 2.20. I tried removing the fseek(FILE *, 0, SEEK_SET) call and made a couple of refactoring changes to make sure I'm seeking the proper source line. The syscall increase reduces dramatically; but I'm not sure if this change works for all assemblies or not. I have attached the patch anyway in hopes that it will be useful. I'm changing a brand new code routine and I'm not sure that I'm not breaking something -- this particular use case came out okay vis-a-vis the listing, at least that I could tell, but I have n
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #11 from pete 2013-03-20 02:04:21 UTC --- Hello Ian, Would you please comment on the issue and patch? - Thanks, Pete (In reply to comment #10) > I have no object to what is proposed but the decision is Ian's. > > On Tue, Mar 12, 2013 at 7:03 PM, petechou at gmail dot com > wrote: > > http://sourceware.org/bugzilla/show_bug.cgi?id=15200 > > > > --- Comment #9 from pete 2013-03-13 02:03:32 > > UTC --- > > If there is a libfoo.so, it's possible to implement part of functions in > > libfoo.so and then link against libfoo.so again to produce a new library. I > > think the "linker-defined" symbols should also have higher priority than the > > symbols in DSO. > > > > And I also think we should have the same behavior on defining symbol value > > whether only_if_ref is set or not? > > > > The patch will affect 4 ways to define output symbols: > > 1. do_define_in_output_data > > 2. do_define_in_output_segment > > 3. do_define_as_constant > > 4. add_undefined_symbol_from_command_line > > > > For 1, 2, and 3, it makes sense to use the given output section, segment, > > and > > constant to define symbol values if the "linker-defined" symbols have higher > > priority. > > > > As for 4, I think it doesn't matter. > > > > - > > Thanks, > > Pete > > > > (In reply to comment #8) > >> Hi Pete, > >> > >> 1. My concern of your patch is that it affects all targets, not just > >> ARM. While it may fix the problem on ARM, I cannot speak for authors > >> of other backends. > >> 2. I am don't have power approve a patch, you need to convince Ian > >> Taylor that this is the right thing to do for all targets. > >> 3. This bug only happens when DSOs incorrectly export __exidx_start & > >> __exidx_end. These DSOs are considered broken and I strongly > >> recommend you moving to a newer NDK version with DSOs that have proper > >> visibility. Both ld & gold no longer export these symbols. The > >> reason for not exporting is explained here in the ld patch: > >> > >> http://sourceware.org/ml/binutils/2009-11/msg00367.html > >> > >> -Doug > >> > >> > >> On Wed, Mar 6, 2013 at 10:43 PM, petechou at gmail dot com > >> wrote: > >> > http://sourceware.org/bugzilla/show_bug.cgi?id=15200 > >> > > >> > --- Comment #7 from pete 2013-03-07 06:43:08 > >> > UTC --- > >> > I can think visibility does matter, but only for exporting the symbol or > >> > not. > >> > In the previous case, gold does define the __bss symbols but not use the > >> > definition in DSO, so I think we should still define section/segment > >> > symbols > >> > locally. > >> > > >> > -Pete > >> > > >> > (In reply to comment #6) > >> >> The symbols do not have same visibility. gold in general is > >> >> compatible with GNU ld. If you look at src/ld/emulparams/, you will > >> >> set that the __bss symbols are exported but the __exidx symbols are > >> >> defined hidden. gold matches the behaviour of ld. If you look at > >> >> defstd.cc in gold, you will again see that some of these symbols are > >> >> hidden but some are not. > >> >> > >> >> -Doug > >> >> > >> >> > >> >> On Wed, Mar 6, 2013 at 9:31 PM, petechou at gmail dot com > >> >> wrote: > >> >> > http://sourceware.org/bugzilla/show_bug.cgi?id=15200 > >> >> > > >> >> > --- Comment #5 from pete 2013-03-07 > >> >> > 05:31:25 UTC --- > >> >> > (In reply to comment #4) > >> >> >> The symbols __exidx_end & __exidx_start were exported incorrectly in > >> >> >> the past. This is fixed in both ld & gold in the sense that these > >> >> >> symbols are defined by not exported. I think defining these symbols > >> >> >> when there is a non-weak definition in a DSO is not the right thing > >> >> >> to > >> >> >> do. This is basically multiple definition and usually is an error. > >> >> > > >> >> > If so, all section/segment symbols should be defined (if needed) by > >> >> > not > >> >> > exported? > >> >> > > >> >> > But when there is a non-weak definition in a DSO, it's still possible > >> >> > to find > >> >> > ld.gold define and export segment symbols like __bss_start (See > >> >> > gold.defstd.cc:214. only_if_ref is set to false). And there is no > >> >> > multiple > >> >> > definition error. > >> >> > > >> >> > $ readelf -Ds libc.so | grep __bss_start > >> >> > 268 311: d898 0 NOTYPE GLOBAL DEFAULT ABS __bss_start__ > >> >> > 619 470: d898 0 NOTYPE GLOBAL DEFAULT ABS __bss_start > >> >> > > >> >> > $ arm-linux-androideabi-ld.gold -shared -o libplasma.so plasma.o > >> >> > libc.so > >> >> > > >> >> > $ readelf -s libplasma.so | grep __bss_start > >> >> > 73: 4148 0 NOTYPE GLOBAL DEFAULT ABS __bss_start > >> >> >168: 4148 0 NOTYPE GLOBAL DEFAULT ABS __bss_start > >> >> > > >> >> >> I am not working on the Android NDK so I can't give you a good > >> >> >> suggestion here. Does using the latest NDK fixes your problem? > >> >> >> > >> >> > > >> >> > Yes, there is no this issue with the latest ND
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #12 from Ian Lance Taylor 2013-03-20 03:29:49 UTC --- oldsym->in_dyn() will return true if the symbol was seen in a dynamic object. I don't see why we should create the symbol if it is seen in a dynamic object. It seems that the code should be something like if (oldsym == NULL) return NULL; if (oldsym->is_undefined()) ; else if (oldsym->is_from_dynobj()) ; else return NULL; but that's not right either. We should only create the symbol if it is referenced by a regular object. So it really needs to be if (oldsym == NULL) return NULL; -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #13 from Ian Lance Taylor 2013-03-20 03:32:47 UTC --- oldsym->in_dyn() will return true if the symbol was seen in a dynamic object. I don't see why we should create the symbol if it is seen in a dynamic object. It seems that the code should be something like if (oldsym == NULL) return NULL; if (oldsym->is_undefined()) ; else if (oldsym->is_from_dynobj()) ; else return NULL; but that's not right either. We should only create the symbol if it is referenced by a regular object. So perhaps it really needs to be something like if (oldsym == NULL) return NULL; if (oldsym->source() == Symbol::IS_UNDEFINED) ; else if (!oldsym->in_reg()) return NULL; else if (oldsym->is_from_dynobj()) ; else return NULL; -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #14 from Doug Kwan 2013-03-20 04:41:03 UTC --- The special symbols in question __exidx_{start,end} were exported from DSO's by mistake in both ld & gold and the mistake has been fixed in both linkers. We really don't want references of such symbols in the main executable to be bound to the definitions in a DSO. So Peter wants to define those symbols even when they are seen in a DSO. Doing so makes it possible to use DSO's generated by older versions of gold & ld on ARM but I really don't know whether this affects correctness in general. -Doug On Tue, Mar 19, 2013 at 8:29 PM, ian at airs dot com wrote: > http://sourceware.org/bugzilla/show_bug.cgi?id=15200 > > --- Comment #12 from Ian Lance Taylor 2013-03-20 > 03:29:49 UTC --- > oldsym->in_dyn() will return true if the symbol was seen in a dynamic object. > I don't see why we should create the symbol if it is seen in a dynamic object. > > It seems that the code should be something like > > if (oldsym == NULL) > return NULL; > if (oldsym->is_undefined()) > ; > else if (oldsym->is_from_dynobj()) > ; > else > return NULL; > > but that's not right either. We should only create the symbol if it is > referenced by a regular object. So it really needs to be > > if (oldsym == NULL) > return NULL; > > -- > Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email > --- You are receiving this mail because: --- > You are on the CC list for the bug. -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #15 from Ian Lance Taylor 2013-03-20 04:44:36 UTC --- I think that is what my patch does. Did I get it wrong? I think it probably makes sense in general to create an only_if_ref linker-defined symbol if there is a ref in the regular objects and a definition in a shared object. -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #16 from pete 2013-03-20 05:38:56 UTC --- (In reply to comment #15) > I think that is what my patch does. Did I get it wrong? I'm thinking the patch does the same thing as mine. I have 2 questions: 1. in_reg() and in_dyn() seem to be mutual exclusive? 2. is it possible to see in_dyn() and is_from_dynobj() return different truth values for a symbol? Did I misunderstand something? > > I think it probably makes sense in general to create an only_if_ref > linker-defined symbol if there is a ref in the regular objects and a > definition > in a shared object. -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #17 from pete 2013-03-20 05:54:20 UTC --- (In reply to comment #16) > (In reply to comment #15) > > I think that is what my patch does. Did I get it wrong? And we should return NULL if oldsym->in_reg() is true instead? if (oldsym == NULL) return NULL; if (oldsym->source() == Symbol::IS_UNDEFINED) ; else if (oldsym->in_reg()) return NULL; else if (oldsym->is_from_dynobj()) ; else return NULL; > > I'm thinking the patch does the same thing as mine. > > I have 2 questions: > 1. in_reg() and in_dyn() seem to be mutual exclusive? > 2. is it possible to see in_dyn() and is_from_dynobj() return different truth > values for a symbol? > Did I misunderstand something? > > > > > I think it probably makes sense in general to create an only_if_ref > > linker-defined symbol if there is a ref in the regular objects and a > > definition > > in a shared object. -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug gold/15200] Runtime undefined reference to __exidx_start/_end
http://sourceware.org/bugzilla/show_bug.cgi?id=15200 --- Comment #18 from pete 2013-03-20 06:20:57 UTC --- Sorry! Yours is correct. Forget about this. (In reply to comment #17) > > And we should return NULL if oldsym->in_reg() is true instead? > > if (oldsym == NULL) > return NULL; > if (oldsym->source() == Symbol::IS_UNDEFINED) > ; > else if (oldsym->in_reg()) > return NULL; > else if (oldsym->is_from_dynobj()) > ; > else > return NULL; > -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils