Re: [RFC PATCH v3 23/23] drm/vkms: Add tests for CTM handling

2024-01-04 Thread Harry Wentland



On 2023-12-08 08:32, Pekka Paalanen wrote:
> On Wed, 8 Nov 2023 11:36:42 -0500
> Harry Wentland  wrote:
> 
>> A whole slew of tests for CTM handling that greatly helped in
>> debugging the CTM code. The extent of tests might seem a bit
>> silly but they're fast and might someday help save someone
>> else's day when debugging this.
> 
> To be honest, this looks pretty limited testing. I guess the more
> exhaustive tests are in IGT though, would be nice to mention that in
> the commit message.
> 
> There is not a single case for out of [0.0, 1.0] input nor output.
> 
> The offset part is always zero.
> 

It is very limited testing, mostly intended for sanity checking.
If I find time I might expand on these but it's pretty low
priority at the moment. The core of the testing is intended to
be done via IGT.

Harry

>> Signed-off-by: Harry Wentland 
>> ---
>>  drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 258 ++
>>  drivers/gpu/drm/vkms/vkms_composer.c  |   2 +-
>>  2 files changed, 259 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c 
>> b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
>> index ad4c2f72fd1e..3eaa2233afbb 100644
>> --- a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
>> +++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
>> @@ -3,6 +3,7 @@
>>  #include 
>>  
>>  #include 
>> +#include 
>>  
>>  #define TEST_LUT_SIZE 16
>>  
>> @@ -80,11 +81,268 @@ static void vkms_color_srgb_inv_srgb(struct kunit *test)
>>  }
>>  }
>>  
>> +#define FIXPT_HALF(DRM_FIXED_ONE >> 1)
>> +#define FIXPT_QUARTER (DRM_FIXED_ONE >> 2)
>> +
>> +const struct drm_color_ctm_3x4 test_matrix_3x4_50_desat = { {
>> +FIXPT_HALF, FIXPT_QUARTER, FIXPT_QUARTER, 0,
>> +FIXPT_QUARTER, FIXPT_HALF, FIXPT_QUARTER, 0,
>> +FIXPT_QUARTER, FIXPT_QUARTER, FIXPT_HALF, 0
>> +} };
>> +
>> +static void vkms_color_ctm_3x4_50_desat(struct kunit *test)
>> +{
>> +struct pixel_argb_s32 ref, out;
>> +
>> +/* full white */
>> +ref.a = 0x0;
>> +ref.r = 0x;
>> +ref.g = 0x;
>> +ref.b = 0x;
>> +
>> +memcpy(&out, &ref, sizeof(out));
>> +apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
>> +
>> +KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
>> +
>> +/* full black */
>> +ref.a = 0x0;
> 
> This is transparent, btw.
> 
>> +ref.r = 0x0;
>> +ref.g = 0x0;
>> +ref.b = 0x0;
>> +
>> +memcpy(&out, &ref, sizeof(out));
>> +apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
>> +
>> +KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
>> +
>> +/* 50% grey */
>> +ref.a = 0x0;
>> +ref.r = 0x8000;
>> +ref.g = 0x8000;
>> +ref.b = 0x8000;
>> +
>> +memcpy(&out, &ref, sizeof(out));
>> +apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
>> +
>> +KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
>> +
>> +/* full red to 50% desat */
>> +ref.a = 0x0;
>> +ref.r = 0x7fff;
>> +ref.g = 0x3fff;
>> +ref.b = 0x3fff;
>> +
>> +out.a = 0x0;
>> +out.r = 0x;
>> +out.g = 0x0;
>> +out.b = 0x0;
>> +
>> +apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
>> +
>> +KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
>> +}
>> +
>> +const struct drm_color_ctm_3x4 test_matrix_3x4_bt709_enc = { {
>> +0x366cf400ull, 0xb7175900ull, 0x000127bb300ull, 0,
>> +0x80001993b3a0ull, 0x80005609fe80ull, 0x6f9db200ull, 0,
>> +0x9d70a400ull, 0x80008f011100ull, 0x8e6f9330ull, 0
>> +} };
> 
> Would be nice to have a comment explaining where these values came
> from, like a snippet of Python printing these. The conversion from real
> numbers to this representation is the interesting part.
> 
>> +
>> +static void vkms_color_ctm_3x4_bt709(struct kunit *test)
>> +{
>> +struct pixel_argb_s32 ref, out;
>> +
>> +/* full white to bt709 */
>> +ref.a = 0x0;
>> +ref.r = 0xfffe; /* off by one in 16bpc not a big deal */
>> +ref.g = 0x0;
>> +ref.b = 0x0;
>> +
>> +out.a = 0x0;
>> +out.r = 0x;
>> +out.g = 0x;
>> +out.b = 0x;
>> +
>> +apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc);
>> +
>> +/* red 255 */
>> +KUNIT_EXPECT_GT(test, out.r, 0xfe00);
> 
> This allows result grossly greater than 1.0.
> 
>> +
>> +/* green 0 */
>> +KUNIT_EXPECT_LT(test, out.g, 0x0100);
> 
> This allows grossly negative results.
> 
>> +
>> +/* blue 0 */
>> +KUNIT_EXPECT_LT(test, out.b, 0x0100);
>> +
>> +/* full black to bt709 */
>> +ref.a = 0x0;
>> +ref.r = 0x0; /* off by one in 16bpc not a big deal */
>> +ref.g = 0x0;
>> +ref.b = 0x0;
>> +
>> +out.a = 0x0;
>> +out.r = 0x0;
>> +out.g = 0x0;
>> +out.b = 0x0;
>> +
>> +apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc);
>> +
>> +/* red 0 */
>> +KUNIT_EXPECT_LT(test, out.r, 0x100);
>> +
>> +/* green 0 */
>> +KUNIT_EXPECT_LT(test, out.g, 0x0100);
>> +
>> 

Re: [RFC PATCH v3 21/23] drm/vkms: add 3x4 matrix in color pipeline

2024-01-04 Thread Harry Wentland



On 2023-12-08 08:11, Pekka Paalanen wrote:
> On Wed, 8 Nov 2023 11:36:40 -0500
> Harry Wentland  wrote:
> 
>> We add two 3x4 matrices into the VKMS color pipeline. The reason
>> we're adding matrices is so that we can test that application
>> of a matrix and its inverse yields an output equal to the input
>> image.
> 
> Would it not be better to mimic what a hardware driver might likely
> have? Or maybe that will be just a few more pipelines?
> 
> People testing their compositors would likely expect a more usual
> pipeline arrangement.
> 

True, but my focus with VKMS was to get the interface and core
code off the ground and make sure it's all (mostly) sane. For
more realistic testing we're currently focussing on creating the
AMD color pipeline in amdgpu.

There's definitely value in mimicing real HW with VKMS but it's
also a non-trivial task and somewhat lower in priority for me.
I think at this point there is higher value in getting a real HW
implementation off the ground.

Harry

>> One complication with the matrix implementation has to do with
>> the fact that the matrix entries are in signed-magnitude fixed
>> point, whereas the drm_fixed.h implementation uses 2s-complement.
>> The latter one is the one that we want for easy addition and
>> subtraction, so we convert all entries to 2s-complement.
>>
>> Signed-off-by: Harry Wentland 
>> ---
>>  drivers/gpu/drm/vkms/vkms_colorop.c  | 32 +++-
>>  drivers/gpu/drm/vkms/vkms_composer.c | 27 +++
>>  2 files changed, 58 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/vkms/vkms_colorop.c 
>> b/drivers/gpu/drm/vkms/vkms_colorop.c
>> index 9a26b9fdc4a2..4e37e805c443 100644
>> --- a/drivers/gpu/drm/vkms/vkms_colorop.c
>> +++ b/drivers/gpu/drm/vkms/vkms_colorop.c
>> @@ -31,7 +31,37 @@ const int vkms_initialize_tf_pipeline(struct drm_plane 
>> *plane, struct drm_prop_e
>>  
>>  prev_op = op;
>>  
>> -/* 2nd op: 1d curve */
>> +/* 2nd op: 3x4 matrix */
>> +op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
>> +if (!op) {
>> +DRM_ERROR("KMS: Failed to allocate colorop\n");
>> +return -ENOMEM;
>> +}
>> +
>> +ret = drm_colorop_init(dev, op, plane, DRM_COLOROP_CTM_3X4);
>> +if (ret)
>> +return ret;
>> +
>> +drm_colorop_set_next_property(prev_op, op);
>> +
>> +prev_op = op;
>> +
>> +/* 3rd op: 3x4 matrix */
>> +op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
>> +if (!op) {
>> +DRM_ERROR("KMS: Failed to allocate colorop\n");
>> +return -ENOMEM;
>> +}
>> +
>> +ret = drm_colorop_init(dev, op, plane, DRM_COLOROP_CTM_3X4);
>> +if (ret)
>> +return ret;
>> +
>> +drm_colorop_set_next_property(prev_op, op);
>> +
>> +prev_op = op;
>> +
>> +/* 4th op: 1d curve */
>>  op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
>>  if (!op) {
>>  DRM_ERROR("KMS: Failed to allocate colorop\n");
>> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
>> b/drivers/gpu/drm/vkms/vkms_composer.c
>> index d04a235b9fcd..c278fb223188 100644
>> --- a/drivers/gpu/drm/vkms/vkms_composer.c
>> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
>> @@ -164,6 +164,30 @@ static void apply_lut(const struct vkms_crtc_state 
>> *crtc_state, struct line_buff
>>  }
>>  }
>>  
>> +static void apply_3x4_matrix(struct pixel_argb_s32 *pixel, const struct 
>> drm_color_ctm_3x4 *matrix)
>> +{
>> +s64 rf, gf, bf;
>> +
>> +rf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[0]), 
>> drm_int2fixp(pixel->r)) +
>> + drm_fixp_mul(drm_sm2fixp(matrix->matrix[1]), 
>> drm_int2fixp(pixel->g)) +
>> + drm_fixp_mul(drm_sm2fixp(matrix->matrix[2]), 
>> drm_int2fixp(pixel->b)) +
>> + drm_sm2fixp(matrix->matrix[3]);
> 
> Again, if you went for performance, you'd make a copy of the matrix in
> fixp format in advance, to avoid having to convert the same thing for
> every processed pixel.
> 
>> +
>> +gf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[4]), 
>> drm_int2fixp(pixel->r)) +
>> + drm_fixp_mul(drm_sm2fixp(matrix->matrix[5]), 
>> drm_int2fixp(pixel->g)) +
>> + drm_fixp_mul(drm_sm2fixp(matrix->matrix[6]), 
>> drm_int2fixp(pixel->b)) +
>> + drm_sm2fixp(matrix->matrix[7]);
>> +
>> +bf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[8]), 
>> drm_int2fixp(pixel->r)) +
>> + drm_fixp_mul(drm_sm2fixp(matrix->matrix[9]), 
>> drm_int2fixp(pixel->g)) +
>> + drm_fixp_mul(drm_sm2fixp(matrix->matrix[10]), 
>> drm_int2fixp(pixel->b)) +
>> + drm_sm2fixp(matrix->matrix[11]);
> 
> Likewise the repetition of int2fixp three times for the same value is
> probably hurting unless the compiler knows to eliminate the redundant
> calls.
> 
>> +
>> +pixel->r = drm_fixp2int(rf);
>> +pixel->g = drm_fixp2int(gf);
>> +pixel->b = drm_fixp2int(bf);
> 
> Btw. why pick s32 and not fixp for your intermediate type?
> 
> Using both you get the limitations of