On 2022-01-27 5:30 a.m., hdv@gmail wrote:

Sadly I do not have access to this machine remotely. I do have my own VPN server, but that does not help when the machine in question is turned off. ;-)

I'll check the temperature when I am back, and when it happens again.

I played around a tiny bit.

My GPU temperature when logged in remotely via VNC was 29C.

After logging in locally, the GPU went up to the high 30s just sitting on the desktop.

I ran a game and initially the temperature went up.

pwm1 (the fan PWM value, 0 to 255) was 0 (off) until 53C, when it got set to 43.

Temperature went up to 62C, and then dropped to 45C, and then pwm1 went back to 0.

Temp went back up to 53C before the fan kicked in again.
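
For anyone who wants to watch the same two numbers, here is a minimal sketch. The helper name gpu_status is made up, and it assumes you pass in the card's hwmon directory (the exact path varies per system) and that the temperature file is temp1_input, which hwmon reports in millidegrees Celsius.

```shell
# Hypothetical helper: print the temperature (whole degrees C) and pwm1.
# $1 = hwmon directory holding temp1_input and pwm1 (path varies per system).
gpu_status() {
    dir=$1
    temp_milli=$(cat "$dir/temp1_input")   # hwmon reports millidegrees C
    pwm=$(cat "$dir/pwm1")                 # fan PWM value, 0 to 255
    echo "temp=$((temp_milli / 1000))C pwm1=$pwm"
}
```

Calling it in a loop with a short sleep should show pwm1 switching on and off around the thresholds described above.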

I did:
echo 0 > pwm1_enable

This apparently maxes out the fan, since:
pwm1 was now 255

Then I did:
echo 1 > pwm1_enable

This seemed more aggressive than the default setting of 2 on my system.

Looking at the kernel docs:
https://www.kernel.org/doc/Documentation/hwmon/g762

It seems a setting of 1 is open-loop mode and 2 is closed-loop mode.

In closed-loop mode it seems there is some feedback mechanism involving the fan speed.

It is undocumented there, but setting pwm1_enable to 0 seems to just max the fan out.

Turns out it is documented in the kernel source comments:
 *  0 : no fan speed control (i.e. fan at full speed)
 *  1 : manual fan speed control enabled (use pwm[1-*]) (open-loop)
 *  2+: automatic fan speed control enabled (use fan[1-*]_target) (closed-loop)
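
Going by that comment, manual mode means writing a value to pwm1 yourself. A rough sketch of that, which I have not tested beyond the basics: the helper name set_manual_pwm is made up, the hwmon directory is passed in as an argument, and it needs root on a real system.

```shell
# Hypothetical helper: switch to manual mode and set a fixed PWM value.
# $1 = hwmon directory, $2 = PWM value (integer 0 to 255).
set_manual_pwm() {
    dir=$1
    value=$2
    # Reject anything that is not a plain non-negative integer.
    case $value in
        ''|*[!0-9]*) echo "pwm value must be an integer 0-255" >&2; return 1 ;;
    esac
    if [ "$value" -gt 255 ]; then
        echo "pwm value must be an integer 0-255" >&2
        return 1
    fi
    echo 1 > "$dir/pwm1_enable"    # 1 = manual (open-loop) control
    echo "$value" > "$dir/pwm1"    # fixed fan PWM value
}
```

Writing 2 back to pwm1_enable should return to the automatic mode that was the default on my system.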

Anyway, with all this playing around I got my temp down to 21C when logged in locally but not running the game.

tl;dr

If this happens quickly enough for anyone to test, you could try setting the fan speed to max (echo 0 > pwm1_enable, then check that pwm1 goes to 255) and see if that fixes it.
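
The same test as a small sketch, in case it saves someone typing. The helper name fan_full_speed is made up, the hwmon directory is passed in as an argument (the path varies per system), and it needs root on a real system.

```shell
# Hypothetical helper: disable fan speed control and confirm pwm1 reads 255.
# $1 = hwmon directory holding pwm1 and pwm1_enable.
fan_full_speed() {
    dir=$1
    echo 0 > "$dir/pwm1_enable"    # 0 = no fan speed control (full speed)
    pwm=$(cat "$dir/pwm1")
    if [ "$pwm" -eq 255 ]; then
        echo "fan at max (pwm1=$pwm)"
    else
        echo "pwm1 is $pwm, expected 255" >&2
        return 1
    fi
}
```

Afterwards, echo 2 > pwm1_enable should put things back to the automatic setting that was the default on my system.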

Bijan
