On Tue, 21 Nov 2017 15:26:17 +0200 Eero Tamminen <[email protected]> wrote:
> 1. Temperature is affected by the temperature of the surrounding
> media, power usage less so
> 2. Temperature sensor might not be exactly where current load is
> using most power (e.g. ALUs vs. memory)

I would imagine the reason to monitor temperature is to look for
situations where the temperature is exceeding a limit. It is not
necessary that the maximum temperature on the GPU is equal to the
value of the sensor, only that the sensor value is representative of
that maximum. We then rephrase the maximum temperature of the GPU in
terms of the maximum observed at the temperature probe.

In any event, using umr I collected a bit over 900 data points when my
computer (FX-8320e and RX-460) was doing an Einstein@Home BOINC job
(Gamma Ray pulsar binary search, which requires 1 GPU and 1 CPU core
working together). These jobs typically take about 30 minutes on my
computer, and 900 sample points cover much less than this. (It is
possible that a run ended in the middle of my sampling; I didn't look
into that at the time.)

There is a single temperature reading in the log (in integer degrees
C), with a minimum of 58C and a maximum of 65C. The average was
62.3C +/- 1.2C. The median temperature was 62C.

There are 3 sensor values (reported as integer centiWatts) that seem
to be "power": AvgGPU, MaxGPU and VDCC.

VDCC tends to be the lowest value. The minimum seen was 2.63W, the
maximum was 64.27W. The median was 30.985W. The mean was 27 +/- 17W
(mean and standard deviation properly rounded). The distribution has
two modes.

Average GPU Power had a minimum of 15.31W and a maximum of 60.54W. The
median was 38.965W. The mean was 39 +/- 14W (properly rounded). The
distribution again has 2 modes.

Maximum GPU Power tended to be the highest of the three. The minimum
seen was 8.72W, the maximum was 76.86W. The median was 41.335W. The
mean was 38 +/- 18W (properly rounded). Two modes were seen.

I also looked at the ratio of VDCC over Maximum GPU Power. The minimum
was 0.05 and the maximum was 1.67.
The median was 0.69. The mean ratio was 0.66 +/- 0.35 (not properly
rounded). Most of the values are in a mode at about 0.8, but there is
also a narrow peak at a ratio "close" to 0.

For the ratio of Average GPU Power over Maximum GPU Power, the minimum
was 0.27 and the maximum was 2.34. The median was 1.00. The mean was
1.00 +/- 0.34 (not properly rounded). This appears to be just a single
distribution, with a tail extending off to high values of the ratio.

The temperature at time t should be a function of power, but it is not
limited to the power at time t. There will be some interval of time
over which power correlates most strongly with the temperature. There
will also be some minimum time gap, related to the time it takes for
the heat from a power spike (delta function) at some location in the
GPU to reach the temperature probe. This is probably easier to analyse
by doing a step change in power, instead of trying to approximate a
delta function.

Gord
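
A minimal sketch of the lag analysis described above, in case it helps
make the idea concrete. It assumes the umr log has already been parsed
into two equal-length lists sampled at a fixed interval; the names
power_w, temp_c, lagged_correlation and best_lag are hypothetical and
not anything umr provides. It correlates power at time t with
temperature at time t + lag and reports the lag with the strongest
correlation:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; this sketch
        # reports 0.0 so that such lags are never picked as "best".
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlation(power_w, temp_c, max_lag):
    """Correlate power[t] with temp[t + lag] for lag = 0..max_lag.

    Lags are measured in samples, so the inputs are assumed to share
    one fixed sampling interval.
    """
    n = len(power_w)
    if n != len(temp_c):
        raise ValueError("series must be the same length")
    if not 0 <= max_lag <= n - 2:
        raise ValueError("max_lag must leave at least two overlapping samples")
    return [pearson(power_w[: n - lag], temp_c[lag:])
            for lag in range(max_lag + 1)]


def best_lag(power_w, temp_c, max_lag):
    """Return (lag, correlations), where lag maximises the correlation."""
    corrs = lagged_correlation(power_w, temp_c, max_lag)
    lag = max(range(len(corrs)), key=lambda k: corrs[k])
    return lag, corrs
```

Multiplying the returned lag by the sampling interval gives a rough
estimate of the delay between a change in power and its appearance at
the temperature probe. With bimodal power data like the above, a
deliberate step change in load would likely give a cleaner estimate
than correlating an ordinary run.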
