On 2002.04.03 13:07 Marcelo E. Magallon wrote:
> [are the mailing lists having hiccups? I sent something yesterday and it
> didn't show up]
>
> >> Jos� Fonseca <[EMAIL PROTECTED]> writes:
>
> > +#if 0
> > #define DIV255(X) (((X) << 8) + (X) + 256) >> 16
> > +#else
> > + const GLint temp;
> > +#define DIV255(X) (temp = (X), ((temp << 8) + temp + 256) >> 16)
> > +#endif
> > +#if 0
>
> This function introduces a slight error (+-1) for about half the cases.
> This instead:
>
> DIV255(X) ((X) + ((X) >> 8) + 0x80) >> 8
>
> introduces a slight error (-1) for 0.2% of the cases in the input range
> [0, 0xFE01]. It uses only 16 bits instead of 24 bits as the original.
>
> If someone is insterested:
>
> x/255
> = x/256*256/255
> = x/256 + x/(256*255)
> = x/256 + x/256**2 + x/(256**2*255)
> approx = x/256 + x/256**2
> = (x + x/256)/256
>
> and the 0x80 is needed to account for rounding up when using interger
> arithmetic. The error introduced by the approximation is at most
> 255/256**2 but becomes more significant when using the last expression.
>
> The error introduced by this approximation occurs only on the upper
> half of the range of interest and is well spread all over it. Like I
> said before, it's always -1. The accumulated error varies between
> unnoticeable to quite annoying. The only way I've found to avoid the
> error is to use a 64 kB lookup table to perform the division. The
> speed difference is not significant and the accumulated error shows up
> much later.
>
> Cheers,
>
> Marcelo
>
Yes, you're right. This problem is discussed in
ftp://ftp.alvyray.com/Acrobat/4_Comp.pdf and the best way to achieve the
exact solution with 16 bits arithmetic was found by Blinn and is to do
#define DIV255(X) (temp = (X) + 256, ((temp << 8) + temp) >> 16)
There is no hit in performance to do this, so I think it should adopted
instead. In fact they are so similar that I didn't noticed the difference
in the formula that Mesa was using.
I attach a small test program that I had made before.
Regards,
Jos� Fonseca
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned a, b, r1, r2;
int i;
for (a = 0; a < 256; ++a)
for (b = 0; b <= a; ++b)
{
unsigned ab = a*b;
unsigned abt = ab + 0x80;
r1 = (int) (((float) ab)/255.0 + 0.5);
//r2 = (ab + (ab >> 8)) >> 8;
//r2 = (ab + (ab >> 8) + 0x80) >> 8;
r2 = (abt + (abt >> 8)) >> 8;
if (r1 != r2)
printf("%3ux%3u:\t%3u\t%3u\n", a,b, r1,r2);
}
return 0;
}