2011/11/17 Andriy Gapon <a...@freebsd.org>:
> on 17/11/2011 21:09 John Baldwin said the following:
>> On Thursday, November 17, 2011 11:58:03 am Andriy Gapon wrote:
>>> on 17/11/2011 18:37 John Baldwin said the following:
>>>> On Thursday, November 17, 2011 4:47:42 am Andriy Gapon wrote:
>>>>> on 17/11/2011 10:34 Andriy Gapon said the following:
>>>>>> on 17/11/2011 10:15 Kostik Belousov said the following:
>>>>>>> I have the following change for eons on my test boxes. Without it,
>>>>>>> I simply cannot get _any_ dump.
>>>>>>>
>>>>>>> diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c
>>>>>>> index 10b89c7..a38e42f 100644
>>>>>>> --- a/sys/cam/cam_xpt.c
>>>>>>> +++ b/sys/cam/cam_xpt.c
>>>>>>> @@ -4230,7 +4230,7 @@ xpt_done(union ccb *done_ccb)
>>>>>>>                          TAILQ_INSERT_TAIL(&cam_simq, sim, links);
>>>>>>>                          mtx_unlock(&cam_simq_lock);
>>>>>>>                          sim->flags |= CAM_SIM_ON_DONEQ;
>>>>>>> -                        if (first)
>>>>>>> +                        if (first && panicstr == NULL)
>>>>>>>                                  swi_sched(cambio_ih, 0);
>>>>>>>                  }
>>>>>>>          }
>>>>>>
>>>>>> I think that this (or similar) change should go into the patch and the 
>>>>>> tree.
>>>>>>
>>>>>
>>>>> And, BTW, I still would like to do something like the following (perhaps 
>>>>> with
>>>>> td_oncpu = NOCPU and td_flags &= ~TDF_NEEDRESCHED also moved to the 
>>>>> common code):
>>>>>
>>>>> Index: sys/kern/sched_ule.c
>>>>> ===================================================================
>>>>> --- sys/kern/sched_ule.c   (revision 227608)
>>>>> +++ sys/kern/sched_ule.c   (working copy)
>>>>> @@ -1790,7 +1790,6 @@ sched_switch(struct thread *td, struct thread *new
>>>>>    td->td_oncpu = NOCPU;
>>>>>    if (!(flags & SW_PREEMPT))
>>>>>            td->td_flags &= ~TDF_NEEDRESCHED;
>>>>> -  td->td_owepreempt = 0;
>>>>>    tdq->tdq_switchcnt++;
>>>>>    /*
>>>>>     * The lock pointer in an idle thread should never change.  Reset it
>>>>> Index: sys/kern/kern_synch.c
>>>>> ===================================================================
>>>>> --- sys/kern/kern_synch.c  (revision 227608)
>>>>> +++ sys/kern/kern_synch.c  (working copy)
>>>>> @@ -406,6 +406,8 @@ mi_switch(int flags, struct thread *newtd)
>>>>>        ("mi_switch: switch must be voluntary or involuntary"));
>>>>>    KASSERT(newtd != curthread, ("mi_switch: preempting back to ourself"));
>>>>>
>>>>> +  td->td_owepreempt = 0;
>>>>> +
>>>>>    /*
>>>>>     * Don't perform context switches from the debugger.
>>>>>     */
>>>>> Index: sys/kern/sched_4bsd.c
>>>>> ===================================================================
>>>>> --- sys/kern/sched_4bsd.c  (revision 227608)
>>>>> +++ sys/kern/sched_4bsd.c  (working copy)
>>>>> @@ -940,7 +940,6 @@ sched_switch(struct thread *td, struct thread *new
>>>>>    td->td_lastcpu = td->td_oncpu;
>>>>>    if (!(flags & SW_PREEMPT))
>>>>>            td->td_flags &= ~TDF_NEEDRESCHED;
>>>>> -  td->td_owepreempt = 0;
>>>>>    td->td_oncpu = NOCPU;
>>>>>
>>>>>    /*
>>>>>
>>>>> Does anybody see any potential problems with such a change?
>>>>
>>>> Hmm, does this mean the preemption will be lost if you break into the
>>>> debugger and continue in the non-panic case?
>>>
>>> Not sure which exact scenario you have in mind.
>>> Please note that the above diff just moves resetting of td_owepreempt to an
>>> earlier place.  As far as I can see there are no checks of td_owepreempt 
>>> value
>>> between the new place and the old places.
>>
>> I'm worried that you are clearing td_owepreempt even in cases where a context
>> switch is not performed.  So say you enter DDB with td_owepreempt set and 
>> that
>> DDB bails on a context switch.  With your change it will now clear 
>> td_owepreempt
>> and "lose" the preemption.
>>
>
> And without the change we get the recursion and double-fault because of
> kdb_switch -> thread_unlock ->  spinlock_exit ->  critical_exit ->
> mi_switch in this case ?
>
> BTW, it is my opinion that we really should not let the debugger code call
> mi_switch for any reason.

Yes, I agree with this, this is why the sched_bind() in boot() is
broken (immagine calling things like doadump from KDB. KDB right now
can be thought as a first cut of this patch because it does disable
the CPUs when entering the context, thus, the bug here is that if you
stop all CPUs including CPU0 and later on you want bind on it you are
death).

We need to discuss this and the patch more extensively, I'm taking my
time and hopefully will do a full review during the weekend.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to