On Wed, Aug 15, 2012 at 5:19 PM, Ryosuke Niwa <[email protected]> wrote:
> I have a concern that a lot of people wouldn't know what the "correct"
> output is for a given test.
>
> For a lot of pixel tests, deciding whether a given output is correct or not
> is really hard. e.g. some seemingly insignificant anti-aliasing difference
> may turn out to be the result of a bug in Skia or another graphics library,
> or in the WebCore code that uses it.
>
> As a result,
>
>   - people may check in wrong "correct" results
>   - people may add "failing" results even though new results are more or
>     less correct
>
> This leads me to think that just checking in the current output as
> -expected.png and filing bugs separately is a better approach.
>
I think your observations are correct, but my experience as a gardener/sheriff leads me to a different conclusion. Namely, when I'm looking at a newly failing test, it is difficult, if not impossible, for me to know whether the existing baseline was previously believed to be correct, and thus it's hard for me to tell whether the new baseline should be considered worse, better, or merely different.

In theory I could go look at the changelog for each test, but I'm skeptical that it would have enough useful information (I would expect most comments to be along the lines of "rebaselining after X", with no indication of whether the output is correct or not).

This is just a theory, though, which is why I want to test it :). It seems like if we got experience with this on one (or more) ports for a couple of months, we would have a much better-informed opinion, and I'm not seeing a huge downside to at least trying this idea out.

-- Dirk

