[not sure if this will make it through to the list]

On 06/09/16 21:35, Jack Moffitt wrote:
I haven't quite settled on my dissertation topic, but my top contender at
the moment involves property-based (i.e. QuickCheck style) generation of
random web pages/stylesheets.

A sort of subtask of this which would be extremely useful is taking a
known rendering problem and producing a minimal reproduction of it.
For example, many issues are discovered in existing pages with perhaps
hundreds of kilobytes of extraneous data. It would be nice to reduce
the failing example to a minimal size. One issue is how to make an
oracle here. It would probably be an improvement to have it be only
semi-automated, where it does some shrinking, then asks a human, and repeats.

There is some prior art here, e.g. [1]. I wrote a similar tool that was specialised to reducing JS code whilst at Opera, but that never seems to have been released. In both cases you either had to write a one-off function to determine whether the testcase was a pass or a fail, or have a human judge it. Obviously the latter is impractically slow if your input is large.
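
To make the shape of that concrete, below is a rough sketch of such a reduction loop (purely illustrative, not Lithium's actual algorithm). The oracle is just a callback, so it can be a one-off predicate or a prompt to a human, which is the semi-automated mode described above:

def reduce_testcase(lines, is_interesting):
    """Greedily drop chunks of `lines` while is_interesting() still holds."""
    chunk = len(lines) // 2 or 1
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and is_interesting(candidate):
                lines = candidate   # still reproduces; keep the smaller version
            else:
                i += chunk          # this chunk was needed, move past it
        chunk //= 2
    return lines

def ask_human(candidate):
    """Semi-automated oracle: show the candidate and ask whether it still fails."""
    print("\n".join(candidate))
    return input("Does it still reproduce? [y/n] ").strip().lower() == "y"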

The oracle would be a cluster of browsers
(multiple vendors/variants) driven by WebDriver/Selenium that would render
the test cases and screenshot them. Significant discrepancies between
renderings would be considered a failing test case and then standard
QuickCheck-style shrinking would be used to reduce the test case HTML/CSS
to a minimal-ish reproducer.

Each browser renders things slightly differently, so pixel by pixel
comparison across browsers is probably not going to work well. For our
own testing of this kind we instead produce the same result using two
different techniques, or in a few cases we make reference images.
However making reference images can't account for all rendering
differences (like text) and so we avoid it if possible. I imagine it
would be quite difficult if the reference image was from another
engine, not our own.

Yes, I imagine font rendering specifically will be a problem, along with antialiasing in general and legitimate per-CSS variation in properties such as outline.

However, I think you might make progress with some sort of consensus-based approach, e.g. take a testcase and render it in Gecko/Blink/WebKit/Edge. If the difference by some metric (e.g. number of differing pixels, although more sophisticated approaches are possible) is within some threshold, then check whether Servo is within the same threshold; if it is, consider that a pass, otherwise a fail.
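
As a very rough sketch of that check (Python with Selenium and Pillow; the differing-pixel count, the threshold value, and all the names here are placeholders, and something more perceptual would likely work better):

import io
from itertools import combinations
from PIL import Image, ImageChops   # pip install pillow

def screenshot(driver, test_url):
    # `driver` is any Selenium/WebDriver session (Firefox, Chrome, servodriver, ...);
    # screenshots are assumed to come back at the same size for every engine.
    driver.get(test_url)
    return Image.open(io.BytesIO(driver.get_screenshot_as_png())).convert("RGB")

def differing_pixels(a, b):
    diff = ImageChops.difference(a, b)
    return sum(1 for px in diff.getdata() if px != (0, 0, 0))

def consensus_fail(reference_shots, servo_shot, threshold=500):
    # Only trust the references if they agree with each other within the threshold.
    if any(differing_pixels(a, b) > threshold
           for a, b in combinations(reference_shots, 2)):
        return False   # no consensus among the reference engines; can't judge Servo
    # There is consensus; Servo fails if it falls outside the same threshold.
    return any(differing_pixels(ref, servo_shot) > threshold
               for ref in reference_shots)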

Is this idea of interest to the Servo team? Would it be useful for Servo
development/testing? Or perhaps redundant with existing testing I'm not
aware of?

The main kind of testing we do is reference testing where the
reference is the same content achieved by different means. This is
pretty robust to things like font rendering changing slightly between
versions. We have some JS-level testing where JS APIs are invoked and
the results verified, but it sounds like you are more focused on the
visual testing aspect. As an aside, I think quickchecking JS APIs is
likely to find a ton of bugs and be useful too, plus it probably
doesn't have the same oracle problem.

But this is also a good idea :)
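
For what it's worth, here's a hypothetical sketch of what quickchecking a JS API could look like, using Python's hypothesis library to generate inputs and WebDriver to evaluate them in the browser; the round-trip property and the API under test are just placeholders:

from hypothesis import given, settings, strategies as st

def check_textcontent_roundtrip(driver):
    # `driver` is a Selenium/WebDriver session pointed at any page.
    @settings(deadline=None, max_examples=200)   # browser round-trips are slow
    @given(st.text())
    def prop(s):
        result = driver.execute_script(
            "var el = document.createElement('div');"
            "el.textContent = arguments[0];"
            "return el.textContent;", s)
        # Property: setting and reading back textContent preserves the string.
        assert result == s, "textContent did not round-trip for %r" % s

    prop()   # hypothesis shrinks any failing input to a minimal example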

[1] http://www.squarefree.com/2007/09/15/introducing-lithium-a-testcase-reduction-tool/
