Re: [Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

Bill Spitzak Tue, 29 Jan 2013 14:05:09 -0800

Siarhei Siamashka wrote:

Going forward, we need to also add support for separable bilinear
scaling (first horizontal interpolation for single scanlines to
temporary buffers in L1 cache, then vertical interpolation of these
buffers to get the final result). Unless I misunderstood something,
Soeren thinks that it's going to be universally better. I think that
both direct and separable scaling methods are going to be useful for
the platforms with wide SIMD. Working with two source scanlines and
providing results directly is good for extreme downscaling. Separable
processing is good for extreme upscaling. There must be a backend
dependent crossover point at a certain scaling factor.

If by "downscaling" you mean making the picture smaller, this is the
harder one, and the one that requires more than two source scanlines.
This should be apparent if you imagine a downscale to less than 1/2:
the resulting number of scanlines is less than 1/2 the original, so if
each of them depends on only 2 source scanlines, some scanlines of the
original do not contribute to the resulting image.

Attempting to do this is why current cairo downscaling produces very
noisy images.
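
The coverage argument above can be checked with a short sketch. The
function name and the pixel-center sampling convention here are
assumptions made for illustration, not pixman or cairo code. It counts
which source scanlines a plain two-tap (bilinear) vertical fetch can
ever read for a given scale:

```python
import math

def rows_touched_by_bilinear(src_rows, dst_rows):
    """Return the set of source scanlines read when every destination
    scanline is built from only its two nearest source scanlines."""
    scale = src_rows / dst_rows
    touched = set()
    for j in range(dst_rows):
        # Center of destination row j mapped into source row coordinates.
        center = (j + 0.5) * scale - 0.5
        top = math.floor(center)
        for r in (top, top + 1):
            # Clamp to the image, as an edge-replicating fetcher would.
            touched.add(min(max(r, 0), src_rows - 1))
    return touched

# Downscaling 16 scanlines to 4 reads only half of the source:
# sorted(rows_touched_by_bilinear(16, 4)) -> [1, 2, 5, 6, 9, 10, 13, 14]
```

At a scale of 1/4, eight of the sixteen source scanlines are never
read, so whatever detail they hold is dropped rather than averaged in.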

Also both upscaling and downscaling can be sped up by using a 2-pass
method. It is far more important for downscaling but helps both. A
monkey wrench in this, however, is that hardware does support 4-input
bilinear interpolation, and so you often get the fastest results by
using this for upscaling even though it is doing some redundant work.
That is no help for downscaling, however, unless you use mipmaps.
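
As a rough illustration of the 2-pass idea, here is a minimal sketch
that resizes an image (a list of rows of numbers) with one horizontal
pass followed by one vertical pass. The box filter and the names
resample_1d and resize_two_pass are assumptions chosen to keep the
example short; they are not what pixman or cairo implement.

```python
def resample_1d(samples, out_len):
    """Box-filter resample: each output value is the area-weighted mean
    of the span of input samples that it covers."""
    n = len(samples)
    scale = n / out_len
    out = []
    for i in range(out_len):
        start, end = i * scale, (i + 1) * scale
        total = 0.0
        k = int(start)
        while k < end and k < n:
            # Length of the overlap between input cell k and [start, end).
            overlap = min(end, k + 1) - max(start, k)
            total += samples[k] * overlap
            k += 1
        out.append(total / scale)
    return out

def resize_two_pass(image, out_w, out_h):
    """Resize by filtering rows first, then columns of the result."""
    rows = [resample_1d(row, out_w) for row in image]             # horizontal
    cols = [resample_1d(list(col), out_h) for col in zip(*rows)]  # vertical
    return [list(row) for row in zip(*cols)]
```

Each pass only ever looks at a 1-D run of samples, so a filter that
covers n source pixels per axis costs about 2n reads per output pixel
instead of n*n. Unlike the two-tap fetch, every source pixel
contributes when downscaling.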

I don't think rectangle sources help affine transforms if you plan to do
2-pass. An affine transform can be split into 3 parts (this can be
worked out so that the resulting matrices multiply back to the
original):

1. Either the identity or a swap of the x and y axes, chosen to make the
determinant of the matrix in step 2 as large as possible


2. A transform that only moves pixels vertically (a is 1 and c is 0)

3. A transform that only moves pixels horizontally (b is 0 and d is 1)

By using step 1 to decide between two versions of step 2 (one of which
samples vertically from the source rather than horizontally), you have
a two-pass algorithm. Each pass needs only a 1xn or nx1 sample of input
pixels to produce a 1xn or nx1 output section.
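
The three-part split can be sketched as follows. This assumes the
cairo-style convention x' = a*x + c*y, y' = b*x + d*y and ignores
translation; the function names are hypothetical. Since the step 2
matrix has a = 1 and c = 0, its determinant is just its d entry, so
step 1 swaps the axes whenever that makes |d| larger.

```python
def decompose_affine(a, b, c, d):
    """Split x' = a*x + c*y, y' = b*x + d*y into
    (swap, vertical pass (b, d), horizontal pass (a, c))."""
    swap = abs(b) > abs(d)
    if swap:
        # Feeding (y, x) into the transform exchanges its two columns.
        a, c = c, a
        b, d = d, b
    if d == 0:
        raise ValueError("b and d are both zero: transform is singular")
    vertical = (b, d)                            # x stays, y' = b*x + d*y
    horizontal = ((a * d - b * c) / d, c / d)    # y stays, x' = ha*x + hc*y
    return swap, vertical, horizontal

def apply_passes(swap, vertical, horizontal, x, y):
    """Apply steps 1, 2 and 3 in order to a single point."""
    if swap:
        x, y = y, x
    vb, vd = vertical
    y = vb * x + vd * y
    ha, hc = horizontal
    x = ha * x + hc * y
    return x, y
```

For a 30 degree rotation no swap is needed; for an 80 degree rotation
the swap is chosen, which keeps the vertical pass from squeezing the
intermediate image down to almost nothing. In both cases applying the
three steps to a point gives the same result as the original matrix.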

There is also a three-pass version (often called Catmull-Rom) that
produces less blurring for a 45 degree rotation because the intermediate
images are larger. This is done by a horizontal, vertical, and then
another horizontal pass. However, I have found the 2-pass version works
fine, and it is what is used by Nuke and nobody has complained.

Note that horizontal/vertical can be swapped in all this discussion,
which is where knowledge of cache lines/etc is going to be more
important.

_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
