Garrett D'Amore wrote:
> You need to synchronize with a mutex that is initialized with an 
> appropriately high level interrupt block cookie.
> 
> Anytime is_running is checked, or is changed, that mutex should be 
> held.  The mutex basically is implemented in a way that prevents the 
> high level interrupt from running while you hold it.  Obviously, you 
> don't want to hold any longer than you absolutely have to, and you 
> shouldn't do anything that requires sleeping, etc. while holding it.  
> (Such locks should always, I believe, be leaf locks...)

Sure, this is all well described in manuals such as 
http://docs.sun.com/app/docs/doc/816-4854/interrupt-18?a=view, and I am 
confident that I understand it; I am in fact using it successfully. But 
I suspect that the race I'm asking about is present in exactly this 
example code.
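
To make the question concrete, here is a minimal userland model of the 
is_running handoff. It is a sketch, not driver code: a pthread mutex 
stands in for the hi-level mutex, and struct handoff, hard_intr() and 
soft_intr() are hypothetical names, not DDI calls. In this model the 
hard side sets is_running itself under the lock, and the soft side 
clears it only after it has seen the queue empty under the same lock. 
It does not model what the real trigger call returns while the soft 
handler is still returning, which is exactly the open question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are DDI calls. */
struct handoff {
    pthread_mutex_t lock;  /* stands in for the hi-level mutex */
    bool is_running;       /* a soft handler run is outstanding */
    unsigned pending;      /* queued work units */
    unsigned handled;      /* work units processed so far */
};

void handoff_init(struct handoff *h)
{
    pthread_mutex_init(&h->lock, NULL);
    h->is_running = false;
    h->pending = 0;
    h->handled = 0;
}

/*
 * Hard interrupt side: queue one unit of work.
 * Returns true if the caller has to trigger the soft handler.
 */
bool hard_intr(struct handoff *h)
{
    bool need_trigger;

    pthread_mutex_lock(&h->lock);
    h->pending++;
    need_trigger = !h->is_running;
    if (need_trigger)
        h->is_running = true;  /* claimed under the lock */
    pthread_mutex_unlock(&h->lock);
    return need_trigger;
}

/* Soft interrupt side: drain the queue, then clear the flag. */
void soft_intr(struct handoff *h)
{
    pthread_mutex_lock(&h->lock);
    while (h->pending > 0) {
        h->pending--;
        pthread_mutex_unlock(&h->lock);
        /* ... process one unit without holding the lock ... */
        pthread_mutex_lock(&h->lock);
        h->handled++;
    }
    /* The queue was seen empty under the lock, so nothing is stranded. */
    h->is_running = false;
    pthread_mutex_unlock(&h->lock);
}
```

In this model any work queued after the flag is cleared causes a new 
trigger request, so whether that request can be lost depends entirely 
on the behaviour of the real trigger call.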

> A fundamental question arises though: why do you need hi-level 
> interrupts?  You can have "hard interrupts" that are not high level -- 
> by default most PCI devices do *not* use high level interrupt handling.  
> If you don't use high level interrupts, then you don't need any of this 
> ugly hand off code to a soft interrupt.

Our PCIe device (Dolphin DXH510, a PCI-Express Interconnect HBA) is 
assigned MSI interrupt priority 12, and it cannot be lowered. Don't ask 
me why. I just searched the web a bit (again) and found your related 
posting 
http://mail.opensolaris.org/pipermail/opensolaris-arc/2008-May/008874.html 
where you describe the "interrupt-priorities" property in driver.conf. 
I will try this out!

[BTW: the interrupt-priorities property is at least documented in 
http://developers.sun.com/solaris/developer/support/driver/wps/pci/html/Interrupts.doc.html, 
as I just found out.]

> As far as synchronization issues go with work queue management, there 
> are several solutions:
> 
>    * kmem_alloc()/zalloc() can be used with KM_NOSLEEP -- in which case 
> you need to be prepared to deal with errors.

I read in some Sun documentation (not "Writing Device Drivers"; I have 
to look it up in the office) that only mutex_enter/_exit (on a hi-level 
mutex), ddi_read/_write and soft_intr_trigger may be called in 
high-level interrupt context. And we got a kernel panic when we tried 
to use a kmem_cache allocator (with KM_NOSLEEP).

>    * you can preallocate what you need, so you can't run out

I can probably preallocate a sufficient number of "work units", each 
only a few bytes, but I cannot know for sure in advance how many 
(thousand) interrupts will come in, or how fast the softintr handler 
(or taskq) will process them. I know that allocating "enough" up front 
and allocating more afterwards, as we do, usually works (it does for 
us), but I would like to know whether there is an optimal way to 
achieve this.
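
For illustration, the preallocate-and-refill scheme could be sketched 
roughly as follows. This is a userland model under stated assumptions, 
not our driver code: malloc() stands in for a kernel allocation made 
from a context where allocating is allowed, all names (work_pool, 
pool_get, pool_maintain, ...) are hypothetical, and locking is left 
out for brevity. The hard side only takes units from the free list and 
counts a drop when the list is empty; the soft side tops the list up 
when it falls below the low-water mark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical work unit and pool; names are illustrative only. */
struct work_unit {
    struct work_unit *next;
    int payload;
};

struct work_pool {
    struct work_unit *free_list;  /* preallocated, unused units */
    size_t nfree;                 /* length of free_list */
    size_t low_water;             /* refill when nfree drops below this */
    size_t refill;                /* units to add per refill */
    unsigned long dropped;        /* requests that found the pool empty */
};

/* Return a unit to the free list. */
void pool_put(struct work_pool *p, struct work_unit *u)
{
    u->next = p->free_list;
    p->free_list = u;
    p->nfree++;
}

/*
 * Add up to n units. Only to be called where allocating is allowed,
 * that is, not from the hard interrupt side. Returns the number added.
 */
size_t pool_grow(struct work_pool *p, size_t n)
{
    size_t added = 0;

    while (added < n) {
        struct work_unit *u = malloc(sizeof(*u));
        if (u == NULL)
            break;
        u->payload = 0;
        pool_put(p, u);
        added++;
    }
    return added;
}

/* Hard side: never allocates. NULL means the pool is exhausted. */
struct work_unit *pool_get(struct work_pool *p)
{
    struct work_unit *u = p->free_list;

    if (u == NULL) {
        p->dropped++;  /* the loss is counted, not prevented */
        return NULL;
    }
    p->free_list = u->next;
    p->nfree--;
    u->next = NULL;
    return u;
}

/* Soft side: top the pool up once it is below the low-water mark. */
size_t pool_maintain(struct work_pool *p)
{
    if (p->nfree < p->low_water)
        return pool_grow(p, p->refill);
    return 0;
}
```

The dropped counter only makes an overrun visible; a burst that 
outruns the refill still loses work, so this is not bulletproof 
either.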

>    * if your work queue is relatively fixed, and you don't need to pass 
> data, you can use taskq's to have a different thread.  You can avoid 
> having to worry about ddi_taskq_dispatch() by using a master loop and a 
> cv synchronization.  (You can use bitfields to indicate which tasks to 
> perform, in such a case.) The sdcard code does this: see 
> "sda_slot_thread()" in usr/src/uts/common/io/sdcard/impl/sda_slot.c for 
> example code.

I'll have a look for inspiration, thanks, but as we are porting 
(generalizing, to be exact) code from Linux with a given 
"schedule_workqueue" functionality, I don't want to change it more than 
really necessary (time matters).
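
For reference, the "master loop plus cv synchronization plus bitfield" 
idea can be sketched in userland as below. This is not the sdcard 
code, which I have not looked at yet; it is a pthread model with 
hypothetical names (struct worker, worker_post, TASK_RX, ...). In the 
kernel the mutex and condition variable would presumably be a kmutex_t 
and a kcondvar_t instead. Posting a task only sets a bit and signals 
the cv, so repeated posts of the same task can coalesce into one run.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical task bits; one bit per kind of work. */
#define TASK_RX   (1u << 0)
#define TASK_TX   (1u << 1)
#define TASK_STOP (1u << 2)

struct worker {
    pthread_mutex_t lock;
    pthread_cond_t cv;
    unsigned pending;   /* bitfield of tasks waiting to run */
    unsigned rx_runs;   /* how often the RX task ran */
    unsigned tx_runs;   /* how often the TX task ran */
    pthread_t tid;
};

/* Master loop: sleep until bits are posted, then run each posted task. */
static void *worker_loop(void *arg)
{
    struct worker *w = arg;

    for (;;) {
        unsigned todo;

        pthread_mutex_lock(&w->lock);
        while (w->pending == 0)
            pthread_cond_wait(&w->cv, &w->lock);
        todo = w->pending;  /* take a snapshot and clear it */
        w->pending = 0;
        pthread_mutex_unlock(&w->lock);

        /* The tasks themselves run without the lock held. */
        if (todo & TASK_RX)
            w->rx_runs++;
        if (todo & TASK_TX)
            w->tx_runs++;
        if (todo & TASK_STOP)
            return NULL;
    }
}

/* Request one or more tasks; no allocation is needed to do so. */
void worker_post(struct worker *w, unsigned bits)
{
    pthread_mutex_lock(&w->lock);
    w->pending |= bits;
    pthread_cond_signal(&w->cv);
    pthread_mutex_unlock(&w->lock);
}

/* Returns 0 on success, like pthread_create(). */
int worker_start(struct worker *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cv, NULL);
    w->pending = 0;
    w->rx_runs = 0;
    w->tx_runs = 0;
    return pthread_create(&w->tid, NULL, worker_loop, w);
}

/* Ask the loop to exit and wait until it has done so. */
void worker_stop(struct worker *w)
{
    worker_post(w, TASK_STOP);
    pthread_join(w->tid, NULL);
}
```

Because only a bit is set, the number of posts is lost. That suits a 
fixed set of tasks, but not a queue that has to carry data for each 
interrupt.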

Thanks for your input! Joachim

>    -- Garrett
> 
> Joachim Worringen wrote:
>> Greetings,
>>
>> a more generic follow-up question on dispatching soft-interrupts from 
>> hi-level (hard) interrupts: as described in the Sun documentation at 
>> several places (device driver tutorial, or 
>> http://developers.sun.com/solaris/articles/interrupt_handlers.html), 
>> the softintr and hardintr handler synchronize via a "is_running" flag, 
>> protected by a mutex.
>>
>> If 'is_running' is true, the softintr handler will not be triggered 
>> (again). However, what about the race between the softintr handler 
>> setting 'is_running' to false (and releasing the lock), and the time 
>> that it is actually returning from the handler function and is 
>> declared "non-running", so that it can be triggered again?
>>
>> The hardintr handler may come in at this very moment, assume the 
>> softintr handler is not running, trigger it, get EPENDING and quit => 
>> an interrupt would be lost.
>>
>> Admittedly, the race is small - or am I wrong?
>>
>> Additionally, what is best practice for the work queue maintained 
>> between these two handlers? As we cannot allocate memory within the 
>> hardirq handler, we need to use some statically allocated queue. 
>> Assuming the worst, this queue could be full when the hardintr handler 
>> wants to queue something => interrupt lost.
>>
>> We are trying to prevent this by letting the softintr handler increase 
>> the number of available queue entries depending on a low-water mark, 
>> but that is still not bulletproof, or is it?
>>
>>   Joachim
>>
>>
>>   
> 


-- 
Joachim Worringen, Software Architect, Dolphin Interconnect Solutions
phone ++49/(0)228/324 08 17 - http://www.dolphinics.com
_______________________________________________
opensolaris-code mailing list
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code
