On Thu, May 29, 2014 at 8:37 AM, Ilia Mirkin <[email protected]> wrote:
> On Thu, May 29, 2014 at 12:59 AM, Kenneth Graunke <[email protected]> wrote:
>> On 05/28/2014 07:17 PM, Ilia Mirkin wrote:
>>> Old files have duplicated entries for each subtest, in addition to a filled subtest dictionary. Detect that the current test name is also a subtest and treat it as though it were a complete test. This may have false negatives, but they're unlikely given the test/subtest naming conventions.
>>>
>>> Signed-off-by: Ilia Mirkin <[email protected]>
>>> ---
>>>
>>> Dylan, I'm sure you hate this, but it does seem to work for me. Not sure where you are with your fix, but this is a tool that lots of people use, so it does need to be addressed. And keep in mind that by now there are both the "old" and "new" formats running around, so just slapping a version number in there won't be enough.
>>
>> Seriously? We fixed a bug. Subtests were *broken* - the old format stored insane amounts of duplicate data in the files, to work around a bug in the tools that processed those data files. This caused huge amounts of wasted space and confusion.
>>
>> I don't understand the whole "let's not re-run Piglit to get a proper baseline unless something breaks" thinking. It only takes 10-15 minutes to do a full Piglit run here. Taking a proper baseline allows you to have confidence that any changes you see were caused by your patches, and not by other people's changes to Mesa or Piglit. It just seems like good practice.
>>
>> Have things gotten to the point where we can't even fix a bug without people requesting reverts or workarounds? It's bad enough that people keep insisting that we have to make this software work on 4-year-old Python versions.
>>
>> Dylan's patches were on the list waiting for over a month, and bumped after two weeks, and AFAICS fix a long-standing bug. All people have to do is re-run Piglit to get data files that aren't *broken*. If the Piglit community won't even let us commit bug fixes, I don't know why I should continue contributing to this project.
>>
>> (Ilia - this isn't complaining about you specifically - it's just the attitude of the community in general I've observed over the last few months that frustrates me. It seems like any time we commit anything, there are very vocal objections and people calling for reverts. And that really frustrates me.)
>
> Hi Ken,
>
> First of all, I'd like to point out that at no point did I complain about something being checked in or call for a revert. I was merely pointing out that certain use-cases should be supported, and had been, but were recently broken. Bugs happen, but I'm surprised that not everyone here agrees that this _is_ a bug. I don't have the bandwidth/time/desire to review and test every piglit change, and this seemed like a particularly nasty one, so I skipped it. I'm very happy that the fix was done; I had noticed the subtests insanity myself and it also annoyed me (although not enough to actually try to fix it... xz is really good at compression).
>
> At Intel, there are 2 relevant chips that anyone cares about (gen7 and gen7.5 from the looks of it), and maybe 3 more that are borderline (gen6, gen5, gen4), but there are a lot more NVIDIA chips out there. You all have easy access to all of these chips (perhaps not at your desk, but if you really wanted to find a gen4 chip, I suspect you could without too big of a hassle). I personally have access to a very limited selection and have to ask others to run the tests, swap in cards, or whatever. There can even be kernel interactions, which adds a whole dimension to the testing matrix. The vast, vast, *vast* majority of piglit tests don't change names/etc, so outside of a few oddities, piglit runs are comparable across different piglit checkouts.
>
> Each piglit run takes upwards of 40-60 minutes and has the potential to crash the machine. This is only counting the tests/gpu.py tests (since tests/quick.py includes tons of tests I don't touch the code for, like compiler/etc). The runs are this slow in large part because the tests are run single-threaded and capture dmesg, but even if I didn't care about dmesg, nouveau definitely can't handle multithreaded runs. You could say "fix your driver!" but it's not quite that easy.
>
> Anyways, if I'm the only one who cares about being able to compare across piglit runs from different times, I'll drop the issue and stop trying to track failures on nouveau. I'm relatively certain that would reverse a recent trend of improving piglit results on nouveau, though.
>
> -ilia
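
To make the patch description quoted at the top a bit more concrete, here is roughly the idea in illustrative Python, not the actual diff; the result-file layout I'm assuming (a dict of full test names, with subtest statuses stored under a 'subtest' key on the parent entry) is my shorthand, not necessarily what the code really looks like:

# Illustrative sketch only, not the actual patch. Assumes 'results' maps
# full test names to per-test dicts, with subtest statuses stored under
# a 'subtest' key on the parent entry.

def is_old_style_duplicate(name, results, sep='/'):
    """Return True if 'name' duplicates a subtest that is already
    recorded in its parent test's subtest dictionary (the old format)."""
    parent, found, sub = name.rpartition(sep)
    if not found or parent not in results:
        return False
    return sub in results[parent].get('subtest', {})

# Entries flagged this way can then simply be treated as complete tests
# instead of being expanded again from the parent's subtest dict, so old
# and new result files compare cleanly.
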
A few additional thoughts:

- I'm waiting for a Tested-by from someone before checking this in. That'll indicate I'm not the only crazy person who wants this.

- Perhaps an additional difference is one of approach. Nouveau fails a lot of tests, and some tests fail on only some chips, which can make it easier to identify what is wrong and why. Having historical results really makes this a lot easier, for example Fermi vs Kepler, or the various revisions within the Tesla family.

- Comparing historical results makes it easier to track bugs in piglit as well (although TBH I can't remember a specific instance).

- Adding versioning to the piglit output would be great: both a format version number and a piglit checkout revision. That would let us have saner logic when we make changes in the future (rough sketch in the P.S. below).

-ilia
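
P.S. On the versioning point, something along these lines is what I have in mind. It is just a sketch; the key names and the idea of shelling out to git for the revision are suggestions, not an actual patch:

import json
import subprocess

RESULTS_FORMAT_VERSION = 1  # bump whenever the on-disk layout changes

def results_metadata():
    # Record both the file-format version and the piglit checkout the
    # run came from, so future tools can pick the right parsing logic.
    try:
        rev = subprocess.check_output(
            ['git', 'rev-parse', 'HEAD']).decode('utf-8').strip()
    except (OSError, subprocess.CalledProcessError):
        rev = 'unknown'
    return {'results_version': RESULTS_FORMAT_VERSION,
            'piglit_revision': rev}

# e.g. json.dump(dict(results_metadata(), tests=test_results), fileobj)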
