On Fri, Jan 20, 2012 at 06:06:08PM +0100, Ariane van der Steldt wrote: > [...] > > * Timeslices are 20ms long instead of 10ms. This solves the issue of 0ms > > long timeslices on machines with hz < 100. > > I'm not really happy having timeslices get larger. They're considerably > large already. Can you think of a way to derive a sane value based on > the hz? > [...]
Hmm, maybe my writing is not as clear as I thought it to be. The 20ms vs 10ms was the second version of my patch vs. the first one. Mainline has timeslices of 100ms. They are already derived from hz in kern/sched_bsd.c:scheduler_start(). Originally, the relevant line read rrticks_init = hz / 10; whereas now it is rrticks_init = hz / 50; The first version of the patch had rrticks_init = hz / 100; which lead to rrticks_init being zero on machines with hz < 100. From the comments to the first version I gathered that one can rely on 50 <= hz <= 1024. I'm not too sure whether changing timeslices has any real benefit though. I noticed a bit less A/V lag in mplayer with smaller timeslices so I gave it a shot. > [...] > You'll want to test this on a multi-socket machine as well (preferably a > numa machine). You're currently migrating processes using your on-die L3 > cache, which is considerably faster than memory. > [...] Yes, right. Christiano also mentioned that in an off-list mail. Maybe I should abuse one of the machines at my university for that... > [...] > The above comparison will fail if: > a != b, but a->p_deadline_set == b->p_deadline_set. > Your code hides this from view, because of the a==b test before it, > turning it into a rare occurence. However when it hits during insert or > remove, you may find your tree getting corrupted and your invariants > getting violated. > [...] Right. Thanks for spotting that. I'll change that in the next iteration of the patch. > [...] > Note: the above breaks RB_FIND and RB_NFIND, since a search clause > can never compare equal. > [...] Since neither RB_FIND nor RB_NFIND is used (as far as I can see), I'd call that a good tradeoff. > [...] > > + > > +RB_GENERATE_STATIC(runqueue, proc, p_runq, sched_cmp_proc); > > +RB_GENERATE_STATIC(expqueue, proc, p_expq, sched_cmp_proc_exp); > > That's gonna add a lot of code. > [...] Hmm, is there any better way to do it? I couldn't think of any way to have two queues with a disjoint set of processes in each and their only difference being the comparison function. > [...] > Btw, even though the implementation works fine without it, I'd really > like if you could put matching RB_PROTOTYPE_STATIC before the generator > macro. Using GENERATE without PROTOTYPE is abuse of an implementation > detail. > [...] Will be fixed. > [...] > > + if (!(RB_REMOVE(runqueue, &sched_runqueue, p))) > > Is it ok to call this function for a process not on the run queue? > Your code implies it is. If you don't intend that to happen, put a > KASSERT in. > > > + RB_REMOVE(expqueue, &sched_expqueue, p); > > Please check return value. I understand that if a process is on the > runqueue, it must also be on the expqueue? > [...] No, runqueue and expqueue are disjoint. A process is either in the runqueue if its deadline has not yet been reached and in the expqueue if its deadline is expired. Maybe I should think of a better name than runqueue because essentially both RB-trees together are the runqueue. > [...] > tv_usec is measured in microseconds, msec suggests milliseconds. > I'd change that to: > .tv_usec = offset_msec % 1000 * 1000 > [...] D'oh! I'll fix that. Must've been my brain being a bit sleepy :) > [...] Thanks a lot for your time and suggestions :) -- Gregor Best