On Wed, Oct 18, 2017 at 07:29:52PM +0900, Mike Hommey wrote:
> Package: diffoscope
> Version: 87
> Severity: normal
>
> Dear Maintainer,
>
> Today I was comparing firefox builds from mozilla CI with diffoscope,
> and it was taking an awful lot of time. Quick inspection revealed that
> this is largely due to objdump --line-numbers. Remove --line-numbers
> from the objdump command line created in ObjdumpDisassembleSection makes
> a huge difference:
>
> $ time diffoscope --html diff.html firefox{1,2}.tar.bz2 # unpatched
> real    58m48.870s
> user    62m41.966s
> sys     3m3.710s
>
> $ time diffoscope --html diff.html firefox{1,2}.tar.bz2 # patched
> real    7m19.159s
> user    7m42.558s
> sys     3m2.730s

Some timings, fwiw:

$ time objdump --disassemble --demangle --section=.text libxul.so > /dev/null
real    0m23.431s
user    0m23.366s
sys     0m0.064s

$ time objdump --line-numbers --disassemble --demangle --section=.text libxul.so > /dev/null
real    25m49.537s
user    25m49.191s
sys     0m0.308s

The irony is that the file doesn't even contain line number debug info,
and all --line-numbers does is add lines with mangled function names:

--- /dev/fd/63  2017-10-19 08:42:00.866097676 +0900
+++ /dev/fd/62  2017-10-19 08:42:00.866097676 +0900
@@ -5,6 +5,7 @@
 セクション .text の逆アセンブル:
 
 0000000000966f20 <InvalidArrayIndex_CRASH(unsigned long, unsigned long)>:
+_Z23InvalidArrayIndex_CRASHmm():
   966f20:      55                      push   %rbp
   966f21:      48 89 f1                mov    %rsi,%rcx
   966f24:      48 8d 35 a5 8f 0c 04    lea    0x40c8fa5(%rip),%rsi        # 4a2fed0 <mozilla::Dafsa::kKeyNotFound+0xc8>
@@ -15,6 +16,7 @@
   966f38:      e8 63 a2 ff ff          callq  9611a0 <MOZ_CrashPrintf@plt>
 
 0000000000966f3d <mozilla::HangMonitor::Crash() [clone .part.25]>:
+_ZN7mozilla11HangMonitor5CrashEv.part.25():
   966f3d:      55                      push   %rbp
   966f3e:      48 89 e5                mov    %rsp,%rbp
   966f41:      48 83 ec 20             sub    $0x20,%rsp
@@ -42,6 +44,7 @@
   966fb3:      0f 0b                   ud2
 
 0000000000966fb5 <isFollowedByCasedLetter(int (*)(void*, signed char), void*, signed char)>:
+_ZL23isFollowedByCasedLetterPFiPvaES_a():
   966fb5:      48 85 ff                test   %rdi,%rdi
   966fb8:      74 34                   je     966fee <isFollowedByCasedLetter(int (*)(void*, signed char), void*, signed char)+0x39>
   966fba:      55                      push   %rbp

etc.

It's also worth noting that the command is run whether the sections
differ or not. So if, like in my case, you have files that only differ
in their build-id (not sure why yet), you still waste that extra
processing time on the .text section, even though there's no difference
there.

FWIW:

$ time objdump -s --section=.text libxul.so > /dev/null
real    0m4.848s
user    0m4.784s
sys     0m0.064s

So, a preliminary check of the raw data would make things much faster
overall for files with few differences.

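To make the idea concrete, here is a minimal sketch of such a preliminary
check in Python. It is not diffoscope code: the function names
(digest_dump, raw_section_digest, section_unchanged) are made up for
illustration, and it assumes that the only file-specific line in the
`objdump -s` output is the header containing "file format" (forcing
LC_ALL=C so that header is not translated).

```python
import hashlib
import os
import subprocess


def digest_dump(dump):
    """Hash `objdump -s` output (bytes), ignoring the header naming the file."""
    digest = hashlib.sha256()
    for line in dump.splitlines():
        # Assumption: the "<path>:     file format ..." header is the only
        # line that differs between two files with identical section data.
        if b"file format" in line:
            continue
        digest.update(line + b"\n")
    return digest.hexdigest()


def raw_section_digest(path, section=".text"):
    """Run `objdump -s` on one section and hash its output (hypothetical helper)."""
    result = subprocess.run(
        ["objdump", "-s", "--section=" + section, path],
        check=True,
        stdout=subprocess.PIPE,
        env=dict(os.environ, LC_ALL="C"),  # untranslated output
    )
    return digest_dump(result.stdout)


def section_unchanged(path_a, path_b, section=".text"):
    """True if the raw dumps match, so the slow disassembly could be skipped."""
    return raw_section_digest(path_a, section) == raw_section_digest(path_b, section)
```

The dump includes addresses, so a section that merely moved would count
as changed and still go through the disassembly, which errs on the safe
side.
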
Even better, the output from readelf -SW could be parsed to get the
section offsets and sizes, and then the actual raw data could be read
directly (without having objdump do a dump of it), because...

$ time cat libxul.so > /dev/null
real    0m0.040s
user    0m0.000s
sys     0m0.041s

Mike
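
A rough sketch of the readelf approach, again in Python and again not
diffoscope code. The helper names are hypothetical, and the regular
expression assumes the usual `readelf -SW` column layout
([Nr] Name Type Address Off Size ...) with section names that contain no
spaces.

```python
import os
import re
import subprocess

# One row of `readelf -SW`: [Nr] Name Type Address Off Size ...
_SECTION_RE = re.compile(
    r"^\s*\[\s*(\d+)\]\s+(\S+)\s+(\S+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s"
)


def parse_section_headers(readelf_output):
    """Map section name -> (type, file offset, size) from `readelf -SW` text."""
    sections = {}
    for line in readelf_output.splitlines():
        match = _SECTION_RE.match(line)
        # Entry 0 is the nameless NULL section; its columns shift, so skip it.
        if not match or int(match.group(1)) == 0:
            continue
        name, sh_type = match.group(2), match.group(3)
        sections[name] = (sh_type, int(match.group(5), 16), int(match.group(6), 16))
    return sections


def section_headers(path):
    """Run `readelf -SW` on a file and parse the section table."""
    result = subprocess.run(
        ["readelf", "-SW", path],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        env=dict(os.environ, LC_ALL="C"),  # untranslated output
    )
    return parse_section_headers(result.stdout)


def read_section(path, offset, size):
    """Read the raw bytes of a section straight from the file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def raw_section_equal(path_a, path_b, name=".text"):
    """True only if both files have the section and its raw bytes match."""
    a = section_headers(path_a).get(name)
    b = section_headers(path_b).get(name)
    if a is None or b is None or "NOBITS" in (a[0], b[0]):
        return False  # missing, or no data in the file: cannot tell cheaply
    if a[2] != b[2]:
        return False  # different sizes, so the contents differ
    return read_section(path_a, a[1], a[2]) == read_section(path_b, b[1], b[2])
```

Only when raw_section_equal() returns False would the section need to go
through objdump at all.
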