Why did you choose JPEG quality as your independent variable? Wouldn't it make more sense to use the similarity value? When you try to match other formats to the JPEG's similarity value, you can get close but can't match it exactly. This creates an inherent bias.
So for one thing, the data should have included the similarity values for all images, not just the JPEGs (and even the JPEG value wasn't always included). Then we could see the range of values and judge how much the bias matters. But beyond that, matching all formats to the same fixed similarity value would at least give them all the same chance at bias. When you searched for the matching quality values for the other formats, how precise were they? Integers only, or did you use decimal values for formats that support them? It would have been nice to see the quality values for the other formats in the data too.
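To make the precision question concrete, here is a rough Python sketch of the kind of search I have in mind. It is not the method used in the study. `match_quality` and `toy_similarity` are hypothetical names, the linear quality-to-similarity curve is made up, and the search assumes similarity never decreases as quality goes up. A real run would encode the image at each quality setting and score it with the actual similarity metric.

```python
from typing import Callable, Tuple


def match_quality(
    similarity_at: Callable[[float], float],
    target: float,
    lo: float = 0.0,
    hi: float = 100.0,
    step: float = 1.0,
) -> Tuple[float, float]:
    """Return (quality, residual) for the grid point closest to `target`.

    The grid is lo, lo + step, ..., hi. Assumes `similarity_at` is
    non-decreasing in quality. residual = similarity - target.
    """
    n = int(round((hi - lo) / step))
    left, right = 0, n
    # Binary search for the smallest grid index whose similarity >= target.
    while left < right:
        mid = (left + right) // 2
        if similarity_at(lo + mid * step) >= target:
            right = mid
        else:
            left = mid + 1
    # The closest point is either that index or the one just below it.
    candidates = [i for i in (left - 1, left) if 0 <= i <= n]
    best = min(candidates, key=lambda i: abs(similarity_at(lo + i * step) - target))
    quality = lo + best * step
    return quality, similarity_at(quality) - target


def toy_similarity(quality: float) -> float:
    # Hypothetical stand-in for "encode at this quality, then measure similarity".
    return 0.5 + 0.005 * quality


if __name__ == "__main__":
    target = 0.8765  # fixed similarity value every format is matched against
    for step in (1.0, 0.1):
        q, residual = match_quality(toy_similarity, target, step=step)
        print(f"step={step}: quality={q:.1f}, residual={residual:+.4f}")
```

With integer steps the toy curve cannot hit the target, so a residual is left over. With 0.1 steps it can. Recording that residual per image and per format would show how large the bias actually is.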