RE: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color pipeline is needed
> -Original Message- > From: Pekka Paalanen > Sent: Thursday, November 9, 2023 5:26 PM > To: Shankar, Uma > Cc: Joshua Ashton ; Harry Wentland > ; dri-de...@lists.freedesktop.org; Sebastian Wick > ; Sasha McIntosh ; > Abhinav Kumar ; Shashank Sharma > ; Xaver Hugl ; Hector > Martin ; Liviu Dudau ; Alexander > Goins ; Michel Dänzer ; wayland- > de...@lists.freedesktop.org; Melissa Wen ; Jonas Ådahl > ; Arthur Grillo ; Victoria > Brekenfeld ; Sima ; Aleix Pol > ; Naseer Ahmed ; Christopher > Braga ; Ville Syrjala > Subject: Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color > pipeline is needed > > On Thu, 9 Nov 2023 10:17:11 + > "Shankar, Uma" wrote: > > > > -Original Message- > > > From: Joshua Ashton > > > Sent: Wednesday, November 8, 2023 7:13 PM > > > To: Shankar, Uma ; Harry Wentland > > > ; dri-de...@lists.freedesktop.org > > ... > > > > Subject: Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why > > > prescriptive color pipeline is needed > > > > > > > > > > > > On 11/8/23 12:18, Shankar, Uma wrote: > > > > > > > > > > > >> -Original Message- > > > >> From: Harry Wentland > > > >> Sent: Friday, October 20, 2023 2:51 AM > > > >> To: dri-de...@lists.freedesktop.org > > ... > > > > >> Subject: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why > > > >> prescriptive color pipeline is needed > > ... > > > > >> +An example of a drm_colorop object might look like one of these:: > > > >> + > > > >> +/* 1D enumerated curve */ > > > >> +Color operation 42 > > > >> +├─ "TYPE": immutable enum {1D enumerated curve, 1D LUT, 3x3 > > > >> + matrix, 3x4 > > > >> matrix, 3D LUT, etc.} = 1D enumerated curve > > > >> +├─ "BYPASS": bool {true, false} > > > >> +├─ "CURVE_1D_TYPE": enum {sRGB EOTF, sRGB inverse EOTF, PQ > > > >> + EOTF, PQ > > > >> inverse EOTF, …} > > > > > > > > Having the fixed function enum for some targeted input/output may > > > > not be scalable for all usecases. There are multiple colorspaces > > > > and transfer functions possible, so it will not be possible to > > > > cover all these by any enum definitions. Also, this will depend on > > > > the capabilities of > > > respective hardware from various vendors. > > > > > > The reason this exists is such that certain HW vendors such as AMD > > > have transfer functions implemented in HW. It is important to take > > > advantage of these for both precision and power reasons. > > > > Issue we see here is that, it will be too usecase and vendor specific. > > There will be BT601, BT709, BT2020, SRGB, HDR EOTF and many more. Not > > to forget we will need linearization and non-linearization enums for each of > these. > > I don't see that as a problem at all. It's not a combinatorial explosion like > input/output combinations in a single enum would be. > It's always a curve and its inverse at most. > > It's KMS properties, not every driver needs to implement every defined enum > value but only those values it can and wants to support. > Userspace also sees the supported list, it does not need trial and error. > > This is the only way to actually use hard-wired curves. The alternative would > be > for userspace to submit a LUT of some type, and the driver needs to start > guessing if it matches one of the hard-wired curves the hardware supports, > which > is just not feasible. > > Hard-wired curves are an addition, not a replacement, to custom curves defined > by parameters or various different LUT representations. > Many of these hard-wired curves will emerge as is from common use cases. Point taken, we can go with this fixed function curve types as long as it represents a single mathematical operation, thereby avoiding the combination nightmare. However, just want to make sure that the same thing can be done with a programmable hardware. In the case above, lut tables for the same need to be hardcoded in driver for various platforms (depending on its capabilities, precision, number, and distribution of luts etc). This is manageable, but driver will get bloated with all kinds of hardcoded lut tables, which could have been easily computed by the compositor runtime. Driver cannot compute the tables runtime due to the complexity of the floating math involved, so hardcoded lut tables will be the only option. So we should just ensure that if these enums are not exposed by a driver, but a programmable lut block is exposed instead, userspace should fall back to the programmable lut. Having the fixed function enum should not become a mandatory norm to implement and expose even for a programmable hardware. With this we will be able to cater to both kinds of hardware with a generic userspace. Hope this expectation is ok. > > Also > > a CTM indication to convert colospace. > > Did someone propose to enumerate matrices? I would not do that, unless you > literally have hard-wired matrices in hardware and cannot do custom matrices. Not currently, but there can be fixed
Re: [RFC PATCH v3 05/23] drm/vkms: Avoid reading beyond LUT array
On 08/11/23 13:36, Harry Wentland wrote: > When the floor LUT index (drm_fixp2int(lut_index) is the last > index of the array the ceil LUT index will point to an entry > beyond the array. Make sure we guard against it and use the > value of the floor LUT index. > > v3: > - Drop bits from commit description that didn't contribute >anything of value > > Signed-off-by: Harry Wentland > Cc: Arthur Grillo LGTM Reviewed-by: Arthur Grillo Best Regards, ~Arthur Grillo > --- > drivers/gpu/drm/vkms/vkms_composer.c | 14 ++ > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c > b/drivers/gpu/drm/vkms/vkms_composer.c > index 6f942896036e..25b6b73bece8 100644 > --- a/drivers/gpu/drm/vkms/vkms_composer.c > +++ b/drivers/gpu/drm/vkms/vkms_composer.c > @@ -123,6 +123,8 @@ static u16 apply_lut_to_channel_value(const struct > vkms_color_lut *lut, u16 chan > enum lut_channel channel) > { > s64 lut_index = get_lut_index(lut, channel_value); > + u16 *floor_lut_value, *ceil_lut_value; > + u16 floor_channel_value, ceil_channel_value; > > /* >* This checks if `struct drm_color_lut` has any gap added by the > compiler > @@ -130,11 +132,15 @@ static u16 apply_lut_to_channel_value(const struct > vkms_color_lut *lut, u16 chan >*/ > static_assert(sizeof(struct drm_color_lut) == sizeof(__u16) * 4); > > - u16 *floor_lut_value = (__u16 *)&lut->base[drm_fixp2int(lut_index)]; > - u16 *ceil_lut_value = (__u16 *)&lut->base[drm_fixp2int_ceil(lut_index)]; > + floor_lut_value = (__u16 *)&lut->base[drm_fixp2int(lut_index)]; > + if (drm_fixp2int(lut_index) == (lut->lut_length - 1)) > + /* We're at the end of the LUT array, use same value for ceil > and floor */ > + ceil_lut_value = floor_lut_value; > + else > + ceil_lut_value = (__u16 > *)&lut->base[drm_fixp2int_ceil(lut_index)]; > > - u16 floor_channel_value = floor_lut_value[channel]; > - u16 ceil_channel_value = ceil_lut_value[channel]; > + floor_channel_value = floor_lut_value[channel]; > + ceil_channel_value = ceil_lut_value[channel]; > > return lerp_u16(floor_channel_value, ceil_channel_value, > lut_index & DRM_FIXED_DECIMAL_MASK);
Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color pipeline is needed
On Fri, 10 Nov 2023 11:27:14 + "Shankar, Uma" wrote: > > -Original Message- > > From: Pekka Paalanen > > Sent: Thursday, November 9, 2023 5:26 PM > > To: Shankar, Uma > > Cc: Joshua Ashton ; Harry Wentland > > ; dri-de...@lists.freedesktop.org; Sebastian Wick > > ; Sasha McIntosh ; > > Abhinav Kumar ; Shashank Sharma > > ; Xaver Hugl ; Hector > > Martin ; Liviu Dudau ; Alexander > > Goins ; Michel Dänzer ; wayland- > > de...@lists.freedesktop.org; Melissa Wen ; Jonas Ådahl > > ; Arthur Grillo ; Victoria > > Brekenfeld ; Sima ; Aleix Pol > > ; Naseer Ahmed ; Christopher > > Braga ; Ville Syrjala > > > > Subject: Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive > > color > > pipeline is needed > > > > On Thu, 9 Nov 2023 10:17:11 + > > "Shankar, Uma" wrote: > > > > > > -Original Message- > > > > From: Joshua Ashton > > > > Sent: Wednesday, November 8, 2023 7:13 PM > > > > To: Shankar, Uma ; Harry Wentland > > > > ; dri-de...@lists.freedesktop.org > > > > ... > > > > > > Subject: Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why > > > > prescriptive color pipeline is needed > > > > > > > > > > > > > > > > On 11/8/23 12:18, Shankar, Uma wrote: > > > > > > > > > > > > > > >> -Original Message- > > > > >> From: Harry Wentland > > > > >> Sent: Friday, October 20, 2023 2:51 AM > > > > >> To: dri-de...@lists.freedesktop.org > > > > ... > > > > > > >> Subject: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why > > > > >> prescriptive color pipeline is needed > > > > ... > > > > > > >> +An example of a drm_colorop object might look like one of these:: > > > > >> + > > > > >> +/* 1D enumerated curve */ > > > > >> +Color operation 42 > > > > >> +├─ "TYPE": immutable enum {1D enumerated curve, 1D LUT, 3x3 > > > > >> + matrix, 3x4 > > > > >> matrix, 3D LUT, etc.} = 1D enumerated curve > > > > >> +├─ "BYPASS": bool {true, false} > > > > >> +├─ "CURVE_1D_TYPE": enum {sRGB EOTF, sRGB inverse EOTF, PQ > > > > >> + EOTF, PQ > > > > >> inverse EOTF, …} > > > > > > > > > > Having the fixed function enum for some targeted input/output may > > > > > not be scalable for all usecases. There are multiple colorspaces > > > > > and transfer functions possible, so it will not be possible to > > > > > cover all these by any enum definitions. Also, this will depend on > > > > > the capabilities of > > > > respective hardware from various vendors. > > > > > > > > The reason this exists is such that certain HW vendors such as AMD > > > > have transfer functions implemented in HW. It is important to take > > > > advantage of these for both precision and power reasons. > > > > > > Issue we see here is that, it will be too usecase and vendor specific. > > > There will be BT601, BT709, BT2020, SRGB, HDR EOTF and many more. Not > > > to forget we will need linearization and non-linearization enums for each > > > of > > these. > > > > I don't see that as a problem at all. It's not a combinatorial explosion > > like > > input/output combinations in a single enum would be. > > It's always a curve and its inverse at most. > > > > It's KMS properties, not every driver needs to implement every defined enum > > value but only those values it can and wants to support. > > Userspace also sees the supported list, it does not need trial and error. > > > > This is the only way to actually use hard-wired curves. The alternative > > would be > > for userspace to submit a LUT of some type, and the driver needs to start > > guessing if it matches one of the hard-wired curves the hardware supports, > > which > > is just not feasible. > > > > Hard-wired curves are an addition, not a replacement, to custom curves > > defined > > by parameters or various different LUT representations. > > Many of these hard-wired curves will emerge as is from common use cases. > > Point taken, we can go with this fixed function curve types as long as it > represents a > single mathematical operation, thereby avoiding the combination nightmare. > > However, just want to make sure that the same thing can be done with a > programmable > hardware. In the case above, lut tables for the same need to be hardcoded in > driver for > various platforms (depending on its capabilities, precision, number, and > distribution of luts etc). Hi Uma, you can do that if you want to. > This is manageable, but driver will get bloated with all kinds of hardcoded > lut tables, > which could have been easily computed by the compositor runtime. Driver > cannot compute > the tables runtime due to the complexity of the floating math involved, so > hardcoded > lut tables will be the only option. You do not have to do that if you don't want to. > So we should just ensure that if these enums are not exposed by a driver, but > a programmable > lut block is exposed instead, userspace should fall back to the programmable > lut. Having the > fixed function enum should not become a m
Re: [RFC PATCH v3 04/23] drm/vkms: Add kunit tests for VKMS LUT handling
On 09/11/23 19:05, Arthur Grillo wrote: > > > On 08/11/23 13:36, Harry Wentland wrote: >> Debugging LUT math is much easier when we can unit test >> it. Add kunit functionality to VKMS and add tests for >> - get_lut_index >> - lerp_u16 >> >> v3: >> - Use include way of testing static functions (Arthur) >> >> Signed-off-by: Harry Wentland >> Cc: Arthur Grillo >> --- >> drivers/gpu/drm/vkms/Kconfig | 5 ++ >> drivers/gpu/drm/vkms/tests/.kunitconfig | 4 ++ >> drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 62 +++ Also, s/tests/test/ to follow the naming of other test files (like on drivers/gpu/drm/tests/) Best Regards, ~Arthur Grillo >> drivers/gpu/drm/vkms/vkms_composer.c | 8 ++- >> 4 files changed, 77 insertions(+), 2 deletions(-) >> create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig >> create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c >> >> diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig >> index b9ecdebecb0b..c1f8b343ff0e 100644 >> --- a/drivers/gpu/drm/vkms/Kconfig >> +++ b/drivers/gpu/drm/vkms/Kconfig >> @@ -13,3 +13,8 @@ config DRM_VKMS >>a VKMS. >> >>If M is selected the module will be called vkms. >> + >> +config DRM_VKMS_KUNIT_TESTS >> +tristate "Tests for VKMS" if !KUNIT_ALL_TESTS >> +depends on DRM_VKMS && KUNIT >> +default KUNIT_ALL_TESTS >> diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig >> b/drivers/gpu/drm/vkms/tests/.kunitconfig >> new file mode 100644 >> index ..70e378228cbd >> --- /dev/null >> +++ b/drivers/gpu/drm/vkms/tests/.kunitconfig >> @@ -0,0 +1,4 @@ >> +CONFIG_KUNIT=y >> +CONFIG_DRM=y >> +CONFIG_DRM_VKMS=y >> +CONFIG_DRM_VKMS_KUNIT_TESTS=y >> diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c >> b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c >> new file mode 100644 >> index ..b995114cf6b8 >> --- /dev/null >> +++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c >> @@ -0,0 +1,62 @@ >> +/* SPDX-License-Identifier: GPL-2.0+ */ >> + >> +#include >> + >> +#include >> + >> +#define TEST_LUT_SIZE 16 >> + >> +static struct drm_color_lut test_linear_array[TEST_LUT_SIZE] = { >> +{ 0x0, 0x0, 0x0, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +{ 0x, 0x, 0x, 0 }, >> +}; >> + >> +const struct vkms_color_lut test_linear_lut = { >> +.base = test_linear_array, >> +.lut_length = TEST_LUT_SIZE, >> +.channel_value2index_ratio = 0xf000fll >> +}; >> + >> + >> +static void vkms_color_test_get_lut_index(struct kunit *test) >> +{ >> +int i; >> + >> +KUNIT_EXPECT_EQ(test, drm_fixp2int(get_lut_index(&test_linear_lut, >> test_linear_array[0].red)), 0); >> + >> +for (i = 0; i < TEST_LUT_SIZE; i++) >> +KUNIT_EXPECT_EQ(test, >> drm_fixp2int_ceil(get_lut_index(&test_linear_lut, >> test_linear_array[i].red)), i); >> +} >> + >> +static void vkms_color_test_lerp(struct kunit *test) >> +{ >> +KUNIT_EXPECT_EQ(test, lerp_u16(0x0, 0x10, 0x8000), 0x8); >> +} >> + >> +static struct kunit_case vkms_color_test_cases[] = { >> +KUNIT_CASE(vkms_color_test_get_lut_index), >> +KUNIT_CASE(vkms_color_test_lerp), >> +{} >> +}; >> + >> +static struct kunit_suite vkms_color_test_suite = { >> +.name = "vkms-color", >> +.test_cases = vkms_color_test_cases, >> +}; >> +kunit_test_suite(vkms_color_test_suite); >> + >> +MODULE_LICENSE("GPL"); >> \ No newline at end of file >> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c >> b/drivers/gpu/drm/vkms/vkms_composer.c >> index 3c99fb8b54e2..6f942896036e 100644 >> --- a/drivers/gpu/drm/vkms/vkms_composer.c >> +++ b/drivers/gpu/drm/vkms/vkms_composer.c >> @@ -91,7 +91,7 @@ static void fill_background(const struct pixel_argb_u16 >> *background_color, >> } >> >> // lerp(a, b, t) = a + (b - a) * t >> -static u16 lerp_u16(u16 a, u16 b, s64 t) >> +u16 lerp_u16(u16 a, u16 b, s64 t) > > Now you don't need to remove the static keyword. > >> { >> s64 a_fp = drm_int2fixp(a); >> s64 b_fp = drm_int2fixp(b); >> @@ -101,7 +101,7 @@ static u16 lerp_u16(u16 a, u16 b, s64 t) >> return drm_fixp2int(a_fp + delta); >> } >> >> -static s64 get_lut_index(const struct vkms_color_lut *lut, u16 >> channel_value) >> +s64 get_lut_index(const struct vkms_color_lut *lut, u16 channel_value) > > DITTO > > Best Regards, > ~Arthur Grillo > >> { >> s64 color_channel_fp = drm_int2fixp(channel_value); >> >> @@ -429,3 +429,7
Re: [RFC PATCH 01/10] drm/doc/rfc: Describe why prescriptive color pipeline is needed
On 2023-11-09 04:20, Pekka Paalanen wrote: On Wed, 8 Nov 2023 11:27:35 -0500 Harry Wentland wrote: On 2023-11-08 11:19, Pekka Paalanen wrote: On Wed, 8 Nov 2023 09:31:17 -0500 Harry Wentland wrote: On 2023-11-08 06:40, Sebastian Wick wrote: On Wed, Nov 8, 2023 at 11:16 AM Pekka Paalanen wrote: ... An incremental UAPI development approach is fine by me, meaning that pipelines might not be complete at first, but I believe that requires telling userspace whether the driver developers consider the pipeline complete (no undescribed operations that would significantly change results from the expected results given the UAPI exposed pipeline). The prime example of what I would like to know is that if a FB contains PQ-encoded image and I use a color pipeline to scale that image up, will the interpolation happen before or after the non-linear colorop that decodes PQ. That is a significant difference as pointed out by Joshua. That's fair and I want to give that to you. My concern stems from the sentiment that I hear that any pipeline that doesn't explicitly advertise this is useless. I don't agree there. Let's not let perfect be the enemy of good. It's up to the use case. The policy of what is sufficient should reside in userspace. What about matching compositor shader composition with KMS? Can we use that as a rough precision threshold? If userspace implements the exact same color pipeline as the KMS UAPI describes, then that and the KMS composited result should be indistinguishable in side-by-side or alternating visual inspection on any monitor in isolation. Did this whole effort not start from wanting to off-load things to display hardware but still maintain visual equivalence to software/GPU composition? I agree with you and I want all that as well. All I'm saying is that every userspace won't have the same policy of what is sufficient. Just because Weston has a very high threshold doesn't mean we can't move forward with a workable solution for other userspace, as long as we don't do something that prevents us from doing the perfect solution eventually. And yes, I do want a solution that works for Weston and hear your comments on what that requires. I totally agree. How will that be reflected in the UAPI? If some pipelines are different from others in correctness/strictness perspective, how will userspace tell them apart? Is the current proposal along the lines of: userspace creates a software pipeline first, and if it cannot map all operations on it to KMS color pipeline colorops, then the KMS pipeline is not sufficient? The gist being, if the software pipeline is scaling the image for example, then it must find a scaling colorop in the KMS pipeline if it cares about the scaling correctness. With a simplified model of an imaginary color pipeline I expect this to look like this: Color Pipeline 1: EOTF Curve - CTM Color Pipeline 2: EOTF Curve - scale - CTM Realistically both would most likely map to the same HW blocks. Assuming userspace A and B do the following: EOTF Curve - scale - CTM Userspace A doesn't care about scaling and would only look for: EOTF Curve - CTM and find a match with Color Pipeline 1. Userspace B cares about scaling and would look for EOTF Curve - scale - CTM and find a match with Color Pipeline 2. If Color Pipeline 2 is not exposed in the first iteration of the driver's implementation userspace A would still be able to make use of it, but userspace B would not, since it requires a defined scale operation. If the driver then exposes Color Pipeline 2 in a second iteration userspace B can find a match for what it needs and make use of it. Realistically userspace B would not attempt to implement a DRM/KMS color pipeline implementation unless it knows that there's a driver that can do what it needs. Another example is YUV pixel format on an FB that magically turns into some kind of RGB when sampled, but there is no colorop to tell what happens. I suppose userspace cannot assume that the lack of colorop there means an identity YUV->RGB matrix either? How to model that? I guess the already mentioned pixel format requirements on a pipeline would help, making it impossible to use a pipeline without a YUV->RGB colorop on a YUV FB unless the lack of colorop does indeed mean an identity matrix. I agree. The same with sub-sampled YUV which probably needs to always(?) be expanded into fully sampled at the beginning of a pipeline? Chroma siting. This is in addition to the previously discussed userspace policy that if a KMS color pipeline contains colorops the userspace does not recognise, userspace probably should not pick that pipeline under any conditions, because it might do something completely unexpected. Unless those colorops can be put into bypass. I think the above could work, but I feel it requires documenting several examples like scaling that might not exist in the colorop definitions at first. Otherwise particula