On Sun, Oct 11, 2015 at 08:38:17PM +0200, Thomas Gleixner wrote: > On Mon, 28 Sep 2015, Yunhong Jiang wrote: > > Currently the get_cpu_idle/iowait_time_us() updates the idle_entrytime. > > When it's invoked from another CPU and the target CPU has been on idle > > already, it will update the idle_entrytime to now, which is incorrect. > > > > However, the get_cpu_idle/iowait_time_us() is not guranteed to be called > > on the target CPU. For example, the get_cpu_idle_time_us() seems is > > invoked remotely on drivers/cpufreq/cpufreq_governor.c through > > get_cpu_idle_time(). > > > > There is a check that the update happens only if a valid last_update_time > > parameter passed. IMHO, this is more a hack because there is no guarantee > > that it's invoked on the target CPU when last_update_time is valid. > > Looking at the call sites, this last_update_time parameter is > silly. Why is the calling code not taking the timestamp? There is > hardly a requirement that this needs to be the same timestamp as the > one which is used to calculate idle time. That cpufreq calculations > are speculative anyway. > > So we better get rid of that parameter completely.
Sure, will change the patch accordingly. > > > In fact, we don't need update the idle stats from > > get_cpu_idle/iowait_time_us(). Now the policy is, we record the > > entrytime when tick_nohz_start_idle() and update the stats > > when tick_nohz_stop_idle(). We calculate the stats on other situations. > > > > Please notice: > > 1) There is a bug currently that the tick_nohz_stop_idle() calls the > > update_ts_time_stats() and update the idle_entrytime, which is sure > > to be wrong. Removing the idle_entrytime update resolves the bug also. > > Care to explain the actual bug and what wreckage it causes? If it's a > real bug then removing "ts->idle_entrytime = now" needs to be a > separate patch. AFAICT, it's a cosmetic issue. Thanks for review and sorry for slow response. Yes, it's an exaggeration to call it a bug. I just thought that updating the idle_entrytime when exiting the idle state should be something wrong. But since the idle_entrytime is not used meaningful, it has no impact. > > > 2) There is a small widows in tick_nohz_stop_idle() between > > update_ts_time_stats() and clear ts->idle_active, that > > get_cpu_idle/iowait_time_us(), when invoked remotely, may double > > count last idle period. This window exists w/o this change and this > > change does not fix it. > > Calling any of those functions from a remote cpu is broken to begin > with, especially on 32bit machines. And that does not change with your > patch at all. Yes, as stated, this change does not fix it. Arjan also said that get_cpu_idle/iowait_time_us() should not be called from remote CPUs, but as stated, drivers/cpufreq/cpufreq_governor.c seems trying to invoke them remotely. I have no idea of the impact, Rafael or Viresh, can you give me some hints? > > What we really need here is protecting the idle stats fields with a > raw_spinlock. If no calling from remote CPU, then we don't need the spinlock protection, right? Thanks --jyh > > Thanks, > > tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

