Hi Andrew,
>> When the PHY's temparature is back to normal (low threshold, it is
>> 85 degrees) it restores user/default link speed settings.
>
> Hi Igor
>
> Please could you also export the temperature using HWMON. The Marvell
> PHY drivers are good examples.
>
> I also have a patch for driver/net/phy/aquantia.c which adds HWMON
> support to that PHY.
We actually have almost ready patches to submit hwmon support separately
for both AQC USB and AQC PCI MAC drivers.
Will do that in near time.
>> + if (aqc111_data->thermal_throttling)
>> + speed = SPEED_100;
>> +
>
> That is a big jump. Do you need to cool is down quickly, or would
> 1Gbps be sufficient? I think as a user, i would prefer it ping-pongs
> between 5G and 1G, not 5G and 100M.
In case overheat happens that really means something bad is happening outside,
or heatsink is bad/detached. I'll consult our HW team once again, but 100M was
the original request from our phy/hw team. It seems it really much less on
power and
1G may not be enough.
>
>> if (autoneg == AUTONEG_ENABLE) {
>> switch (speed) {
>> case SPEED_5000:
>
> It looks like this only works when auto-neg is enabled. If i've fixed
> configured it i don't think this works?
Looks you are right, will check this.
>> + if (aqc111_data->thermal_throttling &&
>> + temperature <= AQ_NORMAL_TEMP_THRESHOLD) {
>> + netdev_info(dev->net, "The temperature is back to normal(%u)",
>> + AQ_NORMAL_TEMP_THRESHOLD / 100);
>
> How often do you see these messages? I'm wondering if they need to be
> rate limited?
In worst case that will be at least limited by link down/up internal,
which in case of 5G normally takes 3-5secs.
>> + if (!aqc111_data->thermal_throttling &&
>> + temperature >= AQ_CRITICAL_TEMP_THRESHOLD) {
>
> Should there be some hysteresis in here? In fact, if temperature is
> AQ_CRITICAL_TEMP_THRESHOLD it is both back to normal, and throttled at
> the same time!
NORMAL_TEMP and CRITICAL_TEMP values are actually a hysteresis.
In cool down case only after TEMP_NORMAL temperature link will be flipped back
again.
>
>> + netdev_warn(dev->net, "Critical temperature(%u) is reached",
>> + AQ_CRITICAL_TEMP_THRESHOLD / 100);
>> + aqc111_data->thermal_throttling = 1;
>> + reset_speed = 1;
>
> update_speed might be a better name, since you are not always
> resetting it.
Agreed.
Regards,
Igor