On Thu, Apr 21, 2005 at 02:43:37PM +0200, Dag Wieers <[EMAIL PROTECTED]> wrote:
> On Thu, 21 Apr 2005, Marc Lehmann wrote:
> > Fast system with no activity :) On my dual-P3 1Ghz dstat usually needs
> > around 4 seconds to start up. On my dual 600mhz p3 machine it can take up
> > to 10 seconds.
>
> Ouch, blame python :) I'm wondering if your system might slow down dstat
> this badly that the job scheduling can't guarantee a near 1 second
> interval.

I guess it's more that dstat does 430 open() calls (and reads 82 files),
while, say, vmstat only opens 5. Reading 82 separate files is always going
to be slow.

That's the only reason I still use vmstat sometimes: when a system starts
to thrash, it will usually not be able to start dstat in any reasonable
timeframe :) (But that's ok for me, and won't deter me from using dstat,
really :)

> And then use dstat -t <options>. On my system it results in:

I get this:

   1114091418.016
   1114091419.017
   1114091420.017
   1114091421.018
   1114091422.019
   1114091423.020
   1114091424.020
   1114091425.021
   1114091426.022
   1114091427.023
   1114091428.025
   1114091429.026
   1114091430.026
   1114091431.027
   1114091432.028

So the deviation increases. This happens on my completely idle
dual-opteron, too:

   1114091500.907
   1114091501.908
   1114091502.908
   1114091503.909
   1114091504.910
   1114091505.911
   1114091506.912
   1114091507.912
   1114091508.913
   1114091509.914
   1114091510.915
   1114091511.916
   1114091512.916

Even on my rather busy and slow freenet node (less busy during daytime, but
dstat still takes a few seconds to start), I get a deviation that increases
by about 1ms per second:

   1114091890.894
   1114091891.895
   1114091892.896
   1114091893.897
   1114091894.898
   1114091895.899
   1114091896.900
   1114091897.901
   1114091898.901
   1114091899.902
   1114091900.903
   1114091901.904
   1114091902.905
   1114091903.906
   1114091904.907
   1114091905.908

> 1114086819.071| 1 3 96 0 0 0| 0 0 | 0 0 | 0 0
> |1082 844 |0.11 0.06 0.08
> 1114086820.072| 1 4 95 0 0 0| 0 0 | 0 0 | 0 0
> |1061 885 |0.11 0.06 0.08
> 1114086821.073| 1 3 96 0 0 0| 0 0 | 0 0 | 0 0
> |1082 814 |0.11 0.06 0.08
>
> So you see a 1ms deviation per second. (about 1sec deviation after 17mins)

Obviously a minor bug in dstat then. However, it should not affect
calculations much.

The more I think of it (and the more I test), I think the non-averaging was
probably just an artifact. dstat clearly does averaging here. It's possible
that the intermediate updates confused me, too.

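To put a number on the drift in the first run above, here is a small
sketch in Python. It is not part of dstat; `drift_per_tick` is a made-up
helper name, and the nominal interval of one second is an assumption taken
from the default dstat delay.

```python
# Timestamps copied from the first "dstat -t" run quoted above
# (seconds since the epoch).
stamps = [
    1114091418.016, 1114091419.017, 1114091420.017, 1114091421.018,
    1114091422.019, 1114091423.020, 1114091424.020, 1114091425.021,
    1114091426.022, 1114091427.023, 1114091428.025, 1114091429.026,
    1114091430.026, 1114091431.027, 1114091432.028,
]


def drift_per_tick(stamps, interval=1.0):
    """Average time per tick in excess of the nominal interval (seconds).

    Hypothetical helper: compares the elapsed wall-clock time between the
    first and last sample with the time an ideal clock would have needed.
    """
    ticks = len(stamps) - 1
    elapsed = stamps[-1] - stamps[0]
    return (elapsed - ticks * interval) / ticks


drift = drift_per_tick(stamps)
print("drift: %.2f ms per tick" % (drift * 1000))
print("ticks until the output is 1 s late: %.0f" % (1.0 / drift))
```

The same function can be fed the other two runs; only the list of
timestamps changes.
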
Sorry for making a bogus bug report, I'll close it :(

One thing, though: the longer I think about it, the more I come to the
conclusion that the intermediate updates are useless, as by looking at
dstat output you cannot know what the numbers actually show, because they
are averaged over an unknown number of seconds.

Personally, I would prefer either a real n-second average, with
intermediate updates and one "scroll" every n seconds, or no averages at
all for the intermediate reports.

Given that I don't really rely on the intermediate updates (it's just one
extra feature over vmstat), and this is my personal preference, you might
simply ignore my thoughts :->

> Now try the same when enabling the following lines:
>
> ### Increase precision if we're root (does not seem to have effect)
> #    if os.geteuid() == 0:
> #        os.nice(-20)
> #    sys.setcheckinterval(op.delay / 10000)
>
> And let me know if this makes a difference for you. On all occasions it
> never made a real difference for me. (ie. you may want to try this both as
> root as well as a user and maybe disable the if statement)

Your time dumps and mine show a problem not with precision, but with clock
stability. I tried to find out how dstat does its time scheduling but
could only find references to ALARM, which has no stability guarantees.

Not that deviations of a few ms/second matter to me, but if you want to
make one update per second, on average, over an extended time, then you
need to wait "till the next update" and not "one second between updates",
as the latter doesn't take into account the time of the update itself (or,
in this case, delays in ALARM handling that are out of your control).

> I understand, but the code becomes more ugly :/ (ie. I have to indent a

Welcome to the real world :) Nice algorithms often become ugly because of
complicated corner cases, too :)

> complete block + subblocks of code...). But if it can take up to 10secs,

Well, vmstat can easily take 10 seconds for startup, too, if your system is
thrashing.
It might even never start :) I don't count that as a big problem in my
book. Pressing INT in what might be my last remaining shell for that
machine to get a shell prompt back is vital for me, though :) Whether with
trace or not.

> you're absolutely right. I'm going to look into speeding up dstat (or
> slowing down my system and profiling statements, I bet some modules take
> a long time and may not be necessary all the time).

I have similar problems with any of my perl programs. Perl simply has to
read so many files that it is impossible to make it faster, except by using
no modules, which is unacceptable.

I don't think it's dstat's fault at all. It's a mere artifact of dstat
being written in a language that does all linking at runtime.

> > I guess the best thing is to only catching INT before initscr, as before
> > there is no reason even to catch the signal, because catching it only adds
> > time before the user gets back at his/her prompt (there is nothing to
> > cleanup), although I suspect that it can't be done with python.
>
> It can be done. But it still requires me to import the signal module :)

Hm.. you mean to say that python gives a backtrace by default within a
"short" timeframe after starting up? Frankly, I'd report this as a bug
immediately; after all, that precludes being able to control signal
handling completely within python :->

> > However, despite me bashing on this, it really is a very very minor issue
> > :)
>
> I understand. But details matter to me too

Obviously! And that's becoming increasingly rare! Thanks for investing time.

If vmstat had bugged me enough I would have written my own vmstat
replacement, but it would have been some small unpublished hack that would
only work for me. Thanks for taking the time to do it properly and release
it, I know that's quite an amount of extra work.

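PS: a rough sketch of the "wait till the next update" scheduling mentioned
further up, in Python. This is not how dstat does it and not a patch
against it; `run_periodic`, `update`, `clock` and `sleep` are names made up
for the illustration. The point is only that each deadline is computed from
the start time, so the time an update takes (or a late wakeup) does not add
up from one tick to the next.

```python
import time


def run_periodic(update, interval=1.0, count=5,
                 clock=time.time, sleep=time.sleep):
    """Call update(now) `count` times, aiming at start + n * interval.

    Hypothetical helper, not dstat code. A late tick stays late by the
    same amount at most; the lateness is not carried into later ticks,
    because every deadline is derived from `start`, not from the end of
    the previous update.
    """
    start = clock()
    for n in range(count):
        deadline = start + n * interval
        remaining = deadline - clock()
        if remaining > 0:
            # Sleep only for what is left until the absolute deadline.
            sleep(remaining)
        update(clock())


def show(now):
    """Example update callback: print the current timestamp."""
    print("%.3f" % now)
```

Calling `run_periodic(show, interval=1.0, count=15)` should print
timestamps whose fractional part stays roughly constant instead of creeping
upwards. The `clock` and `sleep` parameters are only there so the loop can
be exercised with a fake clock.
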
-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      [EMAIL PROTECTED]
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE