On Thu, 15 Jun 2023 17:44:33 -0400
Christopher Braga wrote:
> On 6/14/2023 5:00 AM, Pekka Paalanen wrote:
> > On Tue, 13 Jun 2023 12:29:55 -0400
> > Christopher Braga wrote:
> >
> >> On 6/13/2023 4:23 AM, Pekka Paalanen wrote:
> >>> On Mon, 12 Jun 2023 12:56:57 -0400
> >>> Christopher Braga wrote:
> >>>
> On 6/12/2023 5:21 AM, Pekka Paalanen wrote:
> > On Fri, 9 Jun 2023 19:11:25 -0400
> > Christopher Braga wrote:
> >
> >> On 6/9/2023 12:30 PM, Simon Ser wrote:
> >>> Hi Christopher,
> >>>
> >>> On Friday, June 9th, 2023 at 17:52, Christopher Braga
> >>> wrote:
> >>>
> > The new COLOROP objects also expose a number of KMS properties. Each has a
> > type, a reference to the next COLOROP object in the linked list, and other
> > type-specific properties. Here is an example for a 1D LUT operation:
> >
> > Color operation 42
> > ├─ "type": enum {Bypass, 1D curve} = 1D curve
> > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT
> The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves?
> Will different hardware be allowed to expose a subset of these enum
> values?
> >>>
> >>> Yes. Only hardcoded LUTs supported by the HW are exposed as enum
> >>> entries.
> >>>
> > ├─ "lut_size": immutable range = 4096
> > ├─ "lut_data": blob
> > └─ "next": immutable color operation ID = 43
> >
> Some hardware has per channel 1D LUT values, while others use the same
> LUT for all channels. We will definitely need to expose this in the
> UAPI in some form.
> >>>
> >>> Hm, I was assuming per-channel 1D LUTs here, just like the existing
> >>> GAMMA_LUT/DEGAMMA_LUT properties work. If some hardware can't support
> >>> that, it'll need to get exposed as another color operation block.
> >>>
> > To configure this hardware block, user-space can fill a KMS blob with
> > 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation
> > types might have different properties.
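
As a rough illustration of the "fill a blob with 4096 u32 entries" step, here
is a minimal user-space sketch that builds an identity ramp. The entry layout
(one full-range u32 per entry, single channel) and the helper name
fill_identity_lut() are assumptions for illustration only; the proposal quoted
above does not pin down the blob format. Creating the blob and setting
"lut_data" to its ID would go through the usual KMS blob and property calls
and is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Matches "lut_size" in the example above. */
#define LUT_SIZE 4096

/*
 * Fill an identity ramp: entry i holds i / (n - 1) scaled to the full
 * u32 range. The layout is an assumption, not part of the proposal.
 * Returns 0 on success, -1 if the table is too small to form a ramp.
 */
static int fill_identity_lut(uint32_t *lut, size_t n)
{
	size_t i;

	if (!lut || n < 2)
		return -1;

	for (i = 0; i < n; i++)
		lut[i] = (uint32_t)(((uint64_t)i * UINT32_MAX) / (n - 1));

	return 0;
}
```

The resulting array of LUT_SIZE * sizeof(uint32_t) bytes is what would be
handed to the kernel as blob data.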
> >
> The bit-depth of the LUT is an important piece of information we should
> include by default. Are we assuming that the DRM driver will always
> reduce the input values to the resolution supported by the pipeline?
> This could result in differences between the hardware behavior
> and the shader behavior.
>
> Additionally, some pipelines are floating point while others are fixed.
> How would user space know if it needs to pack 32 bit integer values vs
> 32 bit float values?
> >>>
> >>> Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use
> >>> a common definition of LUT blob (u16 elements) and it's up to the
> >>> driver to convert.
> >>>
> >>> Using a very precise format for the uAPI has the nice property of
> >>> making the uAPI much simpler to use. User-space sends high precision
> >>> data and it's up to drivers to map that to whatever the hardware
> >>> accepts.
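
To make the "drivers map that to whatever the hardware accepts" step concrete,
here is a minimal sketch of reducing a full-range u32 value to an N-bit
fixed-point hardware entry with round-to-nearest. The function name
lut_u32_to_hw() and the full-range scaling convention are assumptions; a real
driver might just shift, and in-kernel code would typically use the 64-bit
division helpers rather than a plain '/'.

```c
#include <stdint.h>

/*
 * Scale a full-range u32 sample to a full-range 'bits'-wide value:
 *   round(v * (2^bits - 1) / UINT32_MAX)
 * 'bits' must be in 1..32. Hypothetical helper, for illustration only.
 */
static uint32_t lut_u32_to_hw(uint32_t v, unsigned int bits)
{
	uint64_t max = (1ULL << bits) - 1;

	return (uint32_t)(((uint64_t)v * max + UINT32_MAX / 2) / UINT32_MAX);
}
```

Any such reduction discards low-order bits, which is where the hardware result
can start to differ from a shader working at higher precision.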
> >>>
> >> Conversion from a larger uint type to a smaller type sounds low effort,
> >> however if a block works in a floating point space things are going to
> >> get messy really quickly. If the block operates in FP16 space and the
> >> interface is 16 bits we are good, but going from 32 bits to FP16 (such
> >> as in the matrix case or 3DLUT) is less than ideal.
> >
> > Hi Christopher,
> >
> > are you thinking of precision loss, or the overhead of conversion?
> >
> > Conversion from N-bit fixed point to N-bit floating-point is generally
> > lossy, too, and the other direction as well.
> >
> > What exactly would be messy?
> >
> Overhead of conversion is the primary concern here. Having to extract
> and / or calculate the significand + exponent components in the kernel
> is burdensome and imo a task better suited for user space. This also has
> to be done every blob set, meaning that if user space is re-using
> pre-calculated blobs we would be repeating the same conversion
> operations in kernel space unnecessarily.
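
For reference, a rough sketch of what the significand + exponent extraction
looks like when done with integer operations only. It converts an unsigned
0.16 fixed-point sample (value = v / 65536) to an IEEE 754 binary16 bit
pattern, truncating the low bits instead of rounding. The input format and
the function names are assumptions for illustration, not anything from the
proposal; in the kernel the bit search would more likely use a helper such as
fls() than a loop.

```c
#include <stdint.h>

/* Index of the highest set bit, i.e. floor(log2(v)); v must be non-zero. */
static unsigned int highest_bit(uint16_t v)
{
	unsigned int msb = 0;

	while (v >>= 1)
		msb++;
	return msb;
}

/*
 * Unsigned 0.16 fixed point (v / 65536) to binary16, no FP instructions.
 * Truncates the significand to 10 bits (round toward zero).
 */
static uint16_t fixed16_to_half(uint16_t v)
{
	unsigned int msb, frac, mant;

	/* Below 2^-14 the result is zero or subnormal: mant * 2^-24. */
	if (v < 4)
		return (uint16_t)(v << 8);

	msb = highest_bit(v);       /* value = 1.frac * 2^(msb - 16) */
	frac = v ^ (1u << msb);     /* drop the implicit leading 1 */

	/* Align the 'msb' fraction bits to the 10-bit significand field. */
	mant = msb >= 10 ? frac >> (msb - 10) : frac << (10 - msb);

	/* Biased exponent: (msb - 16) + 15 = msb - 1. */
	return (uint16_t)(((msb - 1) << 10) | mant);
}
```

For example, 0x8000 (0.5) becomes 0x3800 and 0xC000 (0.75) becomes 0x3A00.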
> >>>
> >>> What is burdensome in that calculation? I don't think you would need to
> >>> use any actual floating-point instructions. Logarithm for finding the
> >>> exponent is about finding the highest bit set in an integer and
> >>> everything is conveniently expressed in base-2. Finding significand is
> >>> jus