Re: [webkit-dev] pixel tests and --tolerance (was Re: Pixel test experiment)

Dirk Pranke Thu, 14 Oct 2010 14:44:29 -0700

On Thu, Oct 14, 2010 at 9:06 AM, Ojan Vafai <[email protected]> wrote:
> Dirk, implementing --tolerance in NRWT isn't that hard, is it? Getting rid of
> --tolerance will be a lot of work to make sure all the pixel results that
> currently pass also pass with --tolerance=0. While I would support someone
> doing that work, I don't think we should block moving to NRWT on it.


Assuming we implement it only for the ports that currently use
tolerance on old-run-webkit-tests, no, I wouldn't expect it to be
hard. Dunno how much work it would be to implement tolerance on the
chromium image_diff implementations (side note: it would be nice if
these binaries weren't port-specific, but that's another topic).
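
For reference, here is a rough sketch of what a tolerance check amounts to.
It assumes tolerance means "maximum percentage of pixels allowed to differ";
the real ImageDiff metric may well weight per-channel differences instead,
and the function names below are made up for illustration, not taken from
either test harness.

```python
from typing import Sequence, Tuple

# Hypothetical representation: one RGBA tuple per pixel, 0-255 per channel.
Pixel = Tuple[int, int, int, int]


def percent_different(actual: Sequence[Pixel], expected: Sequence[Pixel]) -> float:
    """Return the percentage of pixels that differ between two same-sized images."""
    if len(actual) != len(expected):
        raise ValueError("images must have the same number of pixels")
    if not actual:
        return 0.0
    mismatched = sum(1 for a, e in zip(actual, expected) if a != e)
    return 100.0 * mismatched / len(actual)


def pixel_test_passes(actual: Sequence[Pixel], expected: Sequence[Pixel],
                      tolerance: float = 0.0) -> bool:
    """Exact matching is just the tolerance == 0 case of the fuzzy check."""
    return percent_different(actual, expected) <= tolerance
```

With this definition, a run at tolerance 0 fails on any single differing
pixel, which is the behavior the Chromium port relies on today.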

As to how many files we'd have to rebaseline for the base ports, I
don't know how many there are compared to how many fail pixel tests,
period. I'll run a couple tests and find out.

-- Dirk

> Ojan
> On Fri, Oct 8, 2010 at 1:03 PM, Simon Fraser <[email protected]> wrote:
>>
>> I think the best solution to this pixel matching problem is ref tests.
>>
>> How practical would it be to use ref tests for SVG?
>>
>> Simon
>>
>> On Oct 8, 2010, at 12:43 PM, Dirk Pranke wrote:
>>
>> > Jeremy is correct; the Chromium port has seen real regressions that
>> > virtually no concept of a fuzzy match that I can imagine would've
>> > caught.
>> > new-run-webkit-tests doesn't currently support the tolerance concept
>> > at all, and I am inclined to argue that it shouldn't.
>> >
>> > However, I frequently am wrong about things, so it's quite possible
>> > that there are good arguments for supporting it that I'm not aware of.
>> > I'm not particularly interested in working on a tool that doesn't do
>> > what the group wants it to do, and I would like all of the other
>> > WebKit ports to be running pixel tests by default (and
>> > new-run-webkit-tests ;) ) since I think it catches bugs.
>> >
>> > As far as I know, the general sentiment on the list has been that we
>> > should be running pixel tests by default, and the reason that we
>> > aren't is largely due to the work involved in getting them back up to
>> > date and keeping them up to date. I'm sure that fuzzy matching reduces
>> > the work load, especially for the sort of mismatches caused by
>> > differences in the text antialiasing.
>> >
>> > In addition, I have heard concerns that we'd like to keep fuzzy
>> > matching because people might potentially get different results on
>> > machines with different hardware configurations, but I don't know that
>> > we have any confirmed cases of that (except for arguably the case of
>> > different code paths for gpu-accelerated rendering vs. unaccelerated
>> > rendering).
>> >
>> > If we made it easier to maintain the baselines (improved tooling like
>> > Chromium's rebaselining tool, added reftest support, etc.), are there
>> > still compelling reasons for supporting --tolerance-based testing as
>> > opposed to exact matching?
>> >
>> > -- Dirk
>> >
>> > On Fri, Oct 8, 2010 at 11:14 AM, Jeremy Orlow <[email protected]> wrote:
>> >> I'm not an expert on Pixel tests, but my understanding is that in Chromium
>> >> (where we've always run with tolerance 0) we've seen real regressions that
>> >> would have slipped by with something like tolerance 0.1. When you have
>> >> 0 tolerance, it is more maintenance work, but if we can avoid regressions,
>> >> it seems worth it.
>> >> J
>> >>
>> >> On Fri, Oct 8, 2010 at 10:58 AM, Nikolas Zimmermann <[email protected]> wrote:
>> >>>
>> >>> On 08.10.2010 at 19:53, Maciej Stachowiak wrote:
>> >>>
>> >>>>
>> >>>> On Oct 8, 2010, at 12:46 AM, Nikolas Zimmermann wrote:
>> >>>>
>> >>>>>
>> >>>>> On 08.10.2010 at 00:44, Maciej Stachowiak wrote:
>> >>>>>
>> >>>>>>
>> >>>>>> On Oct 7, 2010, at 6:34 AM, Nikolas Zimmermann wrote:
>> >>>>>>
>> >>>>>>> Good evening webkit folks,
>> >>>>>>>
>> >>>>>>> I've finished landing svg/ pixel test baselines, which pass with
>> >>>>>>> --tolerance 0 on my 10.5 & 10.6 machines.
>> >>>>>>> As the pixel testing is very important for the SVG tests, I'd like to
>> >>>>>>> run them on the bots, experimentally, so we can catch regressions
>> >>>>>>> easily.
>> >>>>>>>
>> >>>>>>> Maybe someone with direct access to the leopard & snow leopard bots,
>> >>>>>>> could just run "run-webkit-tests --tolerance 0 -p svg" and mail me the
>> >>>>>>> results?
>> >>>>>>> If it passes, we could maybe run the pixel tests for the svg/
>> >>>>>>> subdirectory on these bots?
>> >>>>>>
>> >>>>>> Running pixel tests would be great, but can we really expect the
>> >>>>>> results to be stable cross-platform with tolerance 0? Perhaps we should
>> >>>>>> start with a higher tolerance level.
>> >>>>>
>> >>>>> Sure, we could do that. But I'd really like to get a feeling for what's
>> >>>>> problematic first. If we see 95% of the SVG tests pass with --tolerance 0,
>> >>>>> and only a few need higher tolerances (64bit vs. 32bit aa differences,
>> >>>>> etc.), I could come up with a per-file pixel test tolerance extension to
>> >>>>> DRT, if it's needed.
>> >>>>>
>> >>>>> How about starting with just one build slave (say, Mac Leopard) that
>> >>>>> runs the pixel tests for SVG, with --tolerance 0 for a while? I'd be happy
>> >>>>> to identify the problems, and see if we can make it work, somehow :-)
>> >>>>
>> >>>> The problem I worry about is that on future Mac OS X releases, rendering
>> >>>> of shapes may change in some tiny way that is not visible but enough to
>> >>>> cause failures at tolerance 0. In the past, such false positives arose from
>> >>>> time to time, which is one reason we added pixel test tolerance in the first
>> >>>> place. I don't think running pixel tests on just one build slave will help
>> >>>> us understand that risk.
>> >>>
>> >>> I think we'd just update the baseline to the newer OS X release, then,
>> >>> like it has been done for the tiger -> leopard, leopard -> snow leopard
>> >>> switch?
>> >>> platform/mac/ should always contain the newest release baseline; when
>> >>> there are differences on leopard, the results go into
>> >>> platform/mac-leopard/
>> >>>
>> >>>> Why not start with some low but non-zero tolerance (0.1?) and see if we
>> >>>> can at least make that work consistently, before we try the bolder step of
>> >>>> tolerance 0?
>> >>>> Also, and as a side note, we probably need to add more build slaves to
>> >>>> run pixel tests at all, since just running the test suite without pixel
>> >>>> tests is already slow enough that the testers are often significantly behind
>> >>>> the builders.
>> >>>
>> >>> Well, I thought about just running the pixel tests for the svg/
>> >>> subdirectory as a separate step, hence my request for tolerance 0, as the
>> >>> baseline passes without problems at least on my & Dirk's machines already.
>> >>> I wouldn't want to argue for running 20,000+ pixel tests with tolerance 0 as
>> >>> a first step :-) But the 1000 SVG tests might be fine with tolerance 0?
>> >>>
>> >>> Even tolerance 0.1 as default for SVG would be fine with me, as long as we
>> >>> can get the bots to run the SVG pixel tests :-)
>> >>>
>> >>> Cheers,
>> >>> Niko
>> >>>
>> >>> _______________________________________________
>> >>> webkit-dev mailing list
>> >>> [email protected]
>> >>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev