I have a configuration in which a decrementer interrupt appears to have
been missed, leaving the system frozen in the idle loop:

      Dec:  0           (interrupts on 1->0 transition)
      MSR[EE]: 1        (interrupt enabled)
      TSR[DIS]: 0       (no interrupt pending)

Since the decrementer interrupt occurs ONLY on the 1->0 transition
I'm not too sure how the system can get out of this state -- in my
particular case the answer is it gets a Watchdog timeout and reboots.

This occurs very rarely running a modified version of 2.4.28_pre3
on a custom board very similar to the Ebony.

I am very curious about the behavior of the code in arch/ppc/kernel/time.c
in the case where I manage to miss exactly 1 jiffy minus the IRQ service
time, so that next_dec works out to exactly 0:

int timer_interrupt(struct pt_regs * regs)
{
        int next_dec;
        unsigned long cpu = smp_processor_id();
        unsigned jiffy_stamp = last_jiffy_stamp(cpu);
        extern void do_IRQ(struct pt_regs *);

        if (atomic_read(&ppc_n_lost_interrupts) != 0)
                do_IRQ(regs);

        hardirq_enter(cpu);
        while ((next_dec = tb_ticks_per_jiffy - tb_delta(&jiffy_stamp)) < 0) {
                jiffy_stamp += tb_ticks_per_jiffy;
<snip>
        }

        if ( !disarm_decr[smp_processor_id()] )
                set_dec(next_dec);
        last_jiffy_stamp(cpu) = jiffy_stamp;

}

It seems the result is to go through the loop once and then program
the decrementer to 0. Since the interrupt fires only on the 1->0
transition, that effectively disables timer interrupts.
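
To make the arithmetic concrete, here is a minimal user-space model of
that loop. It is only a sketch, not the kernel code: compute_next_dec(),
the TICKS_PER_JIFFY value, and the "inclusive" flag are names I made up
for illustration, and "elapsed" stands in for what tb_delta(&jiffy_stamp)
would return on entry. The "<= 0" variant is an untested idea, not a
known fix.

```c
/* Hypothetical value; the real tb_ticks_per_jiffy depends on the board. */
#define TICKS_PER_JIFFY 1000L

/*
 * Model of the catch-up loop in timer_interrupt().
 * elapsed:   timebase ticks since the last jiffy stamp (assumed input).
 * inclusive: 0 keeps looping while next_dec < 0, as in the quoted code;
 *            non-zero keeps looping while next_dec <= 0 (untested variant).
 * loops:     receives the number of passes through the loop body.
 * Returns the value that would be handed to set_dec().
 */
static long compute_next_dec(long elapsed, int inclusive, int *loops)
{
    long stamp = 0;   /* models jiffy_stamp, relative to the old stamp */
    long next_dec;

    *loops = 0;
    for (;;) {
        next_dec = TICKS_PER_JIFFY - (elapsed - stamp);
        /* Leave the loop once next_dec is acceptable for this variant. */
        if (inclusive ? (next_dec > 0) : (next_dec >= 0))
            break;
        stamp += TICKS_PER_JIFFY;   /* account for one missed jiffy */
        (*loops)++;
    }
    return next_dec;
}
```

In this model, compute_next_dec(2 * TICKS_PER_JIFFY, 0, &loops) makes one
pass and returns 0, which is the case I am worried about. With the
inclusive flag set, the same input makes two passes and returns a full
TICKS_PER_JIFFY, so the decrementer would never be loaded with 0.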

I am also curious about exactly where the disarm_decr[] array gets
initialized.

I will be adjusting this and testing but it seems really strange
that I have the only setup that manages to hit just the right
timing -- is there someplace else where we would normally recover
from this situation?

Thanks
David