On Mon, Jun 15, 2026 at 11:02:22AM +0530, Gaurav Kohli wrote:
> 
> 
> On 6/9/2026 4:27 PM, Dmitry Baryshkov wrote:
> > On Tue, Jun 09, 2026 at 03:52:59PM +0530, Gaurav Kohli wrote:
> > > Unlike the CPU, the CDSP/Modem does not throttle its speed automatically
> > > when it reaches high temperatures in kodiak.
> > > 
> > > Set up CDSP cooling by throttling the cdsp when it reaches 100°C and
> > > for modem when it reaches to 95°C.
> > > 
> > > Remove inherited mdmss cooling-map nodes for Non Modem kodiak variant.
> > 
> > Why? If it is a GNSS-only MPSS, does it not provide any thermal
> > mitigation mechanisms? Does ADSP provide those? WPSS?
> > 
> 
> Hi Dmitry,
> 
> Thank you for the review.
> 
> Since the remoteproc_mpss node doesn't exist on these boards, the
> cooling-maps that reference it cause DT compilation errors. That's why
> we need to remove the inherited cooling-maps from the SoC DTSI.
>  /delete-node/ &remoteproc_mpss;

Ok. Explain that in the commit message. And maybe we need to fix those
boards to provide mpss instead.

> 
> Regarding thermal mitigation for other subsystems:
> ->CDSP and Modem are the primary heat sources based on our internal
>   thermal testing and evaluation.
> ->ADSP and WPSS have lower power consumption and don't typically reach
>   thermal thresholds that require active mitigation
> ->For this, I'm checking with our internal team to confirm if ADSP/WPSS
>   provide any TMD mechanism across all targets.

They do. I've posted the dump of qrtr-lookup somewhere.

> 
> > > 
> > > Signed-off-by: Gaurav Kohli <[email protected]>
> > > ---
> > >   arch/arm64/boot/dts/qcom/kodiak.dtsi               | 127 
> > > ++++++++++++++++++++-
> > >   .../boot/dts/qcom/qcs6490-radxa-dragon-q6a.dts     |  17 +++
> > 
> > So, you removed those for Radxa Q6A, but not forRB3 Gen2. Why?
> > 
> 
> Ack, this is a miss. will fix this.
> 
> > >   .../dts/qcom/qcs6490-thundercomm-minipc-g1iot.dts  |  17 +++
> > >   .../boot/dts/qcom/qcs6490-thundercomm-rubikpi3.dts |  17 +++
> > >   .../boot/dts/qcom/sc7280-herobrine-lte-sku.dtsi    |  18 +++
> > >   .../boot/dts/qcom/sc7280-herobrine-wifi-sku.dtsi   |  16 +++
> > >   6 files changed, 208 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/arch/arm64/boot/dts/qcom/kodiak.dtsi 
> > > b/arch/arm64/boot/dts/qcom/kodiak.dtsi
> > > index fa540d8c2615..d345add2d8c8 100644
> > > --- a/arch/arm64/boot/dts/qcom/kodiak.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/kodiak.dtsi
> > > @@ -3427,6 +3427,9 @@ remoteproc_mpss: remoteproc@4080000 {
> > >                           qcom,smem-states = <&modem_smp2p_out 0>;
> > >                           qcom,smem-state-names = "stop";
> > > +                 #cooling-cells = <3>;
> > > +                 tmd-names = "pa", "modem";
> > > +
> > >                           status = "disabled";
> > >                           glink-edge {
> > > @@ -4787,6 +4790,9 @@ remoteproc_cdsp: remoteproc@a300000 {
> > >                           qcom,smem-states = <&cdsp_smp2p_out 0>;
> > >                           qcom,smem-state-names = "stop";
> > > +                 #cooling-cells = <2>;
> > > +                 tmd-names = "cdsp_sw";
> > 
> > I'm going to review only this DT, the comments apply to the rest of
> > them.
> > 
> > So, we have two cases, CDSP and MPSS. Why does CDSP have only 2 cells?
> > Just because tmd-names has only one name? What if we add another
> > mitigation (which can be added in the firmware), do we suddenly have to
> > change number of cells and all the cooling devices to reflect it?
> > 
> 
> As Cdsp has only one relevant tmd to mitigate, so we have used cooling cells
> as 2. But i will change this to 3 as this is backward compatible.
> 
> > Finally. If I understand correctly, these mitigtion mechanisms are
> > provided by the firmware. Firmware differs between the boards. Vendors
> > (in theory) can change them. Why do we list these names here, in the SoC
> > DT?
> > 
> 
> Below are the main reason for this, replied in other thread also.
> Please guide, if i can use qcom_pas_data to define names.
> 
> Following Daniel's series [1], the thermal framework supports
> mapping multiple cooling devices per remoteproc/device via indexed
> cooling-cells.
> 
> 1) The thermal framework's cooling-maps reference
> cooling devices by index (for #cooling-cells = <3>). Without tmd-names,
> there's no way to know which index corresponds to which TMD, as firmware
> may return tmd-names in any order.

You can #define the indices to the well known names, turning those into
bindings.

> 
> below are the changes post new thermal mapping changes:
> DT: tmd-names = "cdsp_sw", "xyz";
> Firmware: ["cdsp_sw", "xyz1", "xyz2",]
> Driver registers: Only "cdsp_sw" (index 0) and "xyz" (index 1)

What if the DT has "cdsp_sw", but the firmware doesn't report it? That's
an error, as it seems.

> 
> This allows cooling-maps like below:
> cooling-device = <&remoteproc 0 ...>  // "cdsp_sw"
> cooling-device = <&remoteproc 1 ...>  // "xyz"
> 
> 2) Not all firmware-provided TMDs should be
> exposed as cooling devices. The tmd-names property acts as a filter,
> allowing board-specific DT to select only the relevant TMDs for that
> platform.

But then the properties should be defined in the board DT rather than
the platform DT.

> 
> [1]
> https://lore.kernel.org/all/[email protected]/
> 
> Shall i use pas data to define tmd-names instead of dt ?

Let's settle in the thread with Daniel.

> 
> > > +
> > >                           status = "disabled";
> > >                           glink-edge {
> > > +                 cooling-maps {
> > > +                         map0 {
> > > +                                 trip = <&mdmss0_alert1>;
> > > +                                 cooling-device = <&remoteproc_mpss 0 0 
> > > 2>;
> > 
> > What does this mean? I assume that the first cell is one of the
> > mechanisms. What is the difference between them? Do we really need to
> > list them one by one here?
> > 
> 
> Let me check, if i can document different tmd's somewhere:
> 
> -> modem tmd used for Modem Processor mitigation.
> -> pa is used for Power Amplifier mitigation.

What does that mean?

> 
> And we need to list them for binding purpose mainly.
> 
> > What do other cells mean? Why are they 0 and 2 rather than
> > THERMAL_NO_LIMIT? How does one come with those values? This should all
> > be documented and explained somewhere.
> > 
> 
> Will change to THERMAL_NO_LIMIT. Let me check, if i can use
> qli doc for documentation.
> 
> > > +                         };
> > > +
> > > +                         map1 {
> > > +                                 trip = <&mdmss0_alert1>;
> > > +                                 cooling-device = <&remoteproc_mpss 1 0 
> > > 2>;
> > > +                         };
> > > +                 };
> > >                   };
> > 
> 

-- 
With best wishes
Dmitry

Reply via email to