On domingo, 27 de novembro de 2016 00:08:53 PST Ch'Gans wrote: > > That explains why _dl_runtime_resolve_avx was used, as opposed to the > > non-AVX version of it. > > > > Note that I can't find that function in glibc's source code. If it really > > is a problem, then your distribution caused it by applying a patch and > > you should complain to them. But I don't think that this the problem. > > Stop looking at _dl_runtime_resolve_avx, that's a red herring. Your > > problem is elsewhere. > It's in linux-ld.so. Anyway I gave up about the AVX2, i think you are > right, it was just a red herring.
BTW, I've found it in it glibc source code meanwhile. I hate their organisation, it was not where I expected (though in hindsight I should have expected it). It's in sysdeps/, not elf/, and the assembly code is in the .h, not in the .S file... > But I gave another go at running valgrind, this time i started with > event logging off, then started the test app, start zooming until the > app slows down, then switched event logging on, zommed in a few times, > switched logging off and closed the test app. > > I think i have now better data to analise. > > If i sort the data by "Incl. Cost: Ir", then the top one is > QEventLoop::exec, called 1.2E9 times, then lot of event processing "Ir" is not call count, it's "instruction read", or more simply the number of instructions executed. That includes "nop" instructions, so it differs from the processor's own accounting of retired instructions. And it's not the count of cycles, since callgrind does not model how many cycles an instruction takes. > related functions, then the first event handler: > QGraphicsView::paintEvent, then scene drawItem, quickly follow by the > drop shadow effect draw(), the drop shadow filter draw(), > qt_blurImage() followed by expblur<>(), after that the cost drop > > drastically, so to resume roughly: > | Cost , function > > 1,212,124,165, QEventLoop::exec() This line means "the problem happens while your application is executing its event loop" or the same as I had said before: "happens inside main". > 1,169,460,682, QGraphicsView::paintEvent() > ... > 1,123,576,248, QPixmapDropShadowFilter::draw() > 0,907,246,806, qt_blurImage() > 0,861,710,958, expblur<>() > 0,181,757,861, fillRect_normalized() > ... These are a lot more interesting. As expected, it happens inside the paint event, whose time is 96% taken by the drop shadow filter. Note that the relative time is not per se a problem: *something* has to take time. But since we know it's slow, that is a good target for the culprit. > What i'm trying to say here, is that almost all the events seem to be > targeted at repainting the drop shadow, and I can tell you that the > test app ran for only a couple of seconds during which i was > wheel-zooming, and the GraphicsView had visually changed only 4 or 5 > times. > > So something tells me that, for some reasons, the drop shadow is > repainted way too many times, yet the app on the screen is updated > only once every second or so. Your interpretation is incorrect because it's based on the wrong assumption that the number is call count. It's not. The number is more closely related to time spent, so the correct interpretation is that most of the application's time is dedicated to repainting the drop shadow. We don't know why that is. The instruction read count is the product of the cost of each instruction every time it is called by the number of times it is called. So it's still possible we have a call count problem, but your evidence doesn't show that or disprove it. It's possible that on your system, the drop shadow needs to be painted way more often than on another system. -- Thiago Macieira - thiago.macieira (AT) intel.com Software Architect - Intel Open Source Technology Center _______________________________________________ Interest mailing list Interest@qt-project.org http://lists.qt-project.org/mailman/listinfo/interest