This project proposal is a subtask of SMP networking.
The goal of this project is to implement interrupt handling at the
granularity of a network interface. When a network device gets an
interrupt, it could call <iftype>_defer(ifp) to schedule a kernel
continuation (see kernel continuations) for that interface, which could
then invoke <iftype>_poll. Whether the interrupt source should be masked
depends on whether the device is a DMA device or a PIO device. The
<iftype>_poll routine should then call (*ifp->if_poll)(ifp) to service
the interrupt.
During servicing, any received packets should be passed up via
(*ifp->if_input)(ifp, m), which would be responsible for ALTQ or any other
optional processing as well as protocol dispatch. Protocol dispatch in
<iftype>_input decodes the datalink headers, if needed, via a table
lookup and calls the matching protocol's pr_input to process the packet.
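The table-driven dispatch described above might look roughly like the following user-space sketch for Ethernet. Everything here is an assumption made for illustration: the flat `struct mbuf`, the `pr_input_t` signature, the `*_sketch` function names, and the counters are hypothetical and are not the kernel's real definitions. Only the EtherType values (0x0800 for IPv4, 0x0806 for ARP) and the 14-byte Ethernet header layout are standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf: one flat buffer holding one frame. */
struct mbuf {
	const uint8_t *m_data;
	size_t         m_len;
};

struct ifnet;	/* opaque in this sketch */

/* Assumed signature for a protocol input routine. */
typedef void (*pr_input_t)(struct ifnet *, struct mbuf *);

/* Counters so the sketch has observable behaviour. */
static int ip_packets, arp_packets, dropped_packets;

static void
ip_input_sketch(struct ifnet *ifp, struct mbuf *m)
{
	(void)ifp; (void)m;
	ip_packets++;
}

static void
arp_input_sketch(struct ifnet *ifp, struct mbuf *m)
{
	(void)ifp; (void)m;
	arp_packets++;
}

/* Dispatch table keyed by EtherType. */
static const struct {
	uint16_t   ethertype;
	pr_input_t pr_input;
} ether_protos[] = {
	{ 0x0800, ip_input_sketch },	/* IPv4 */
	{ 0x0806, arp_input_sketch },	/* ARP */
};

#define ETHER_HDR_LEN 14

/* Decode the datalink header and call the matching pr_input directly. */
static void
ether_input_sketch(struct ifnet *ifp, struct mbuf *m)
{
	assert(m != NULL);
	if (m->m_len < ETHER_HDR_LEN) {
		dropped_packets++;	/* runt frame */
		return;
	}
	/* The EtherType is stored big-endian at bytes 12-13 of the header. */
	uint16_t type = (uint16_t)((m->m_data[12] << 8) | m->m_data[13]);
	for (size_t i = 0; i < sizeof(ether_protos) / sizeof(ether_protos[0]); i++) {
		if (ether_protos[i].ethertype == type) {
			(*ether_protos[i].pr_input)(ifp, m);
			return;
		}
	}
	dropped_packets++;	/* no protocol registered for this type */
}
```

Because the packet is handed to the protocol in the same call chain, nothing is queued between the driver and the protocol in this sketch.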
As such, interrupt queues (e.g. ipintrq) would no longer be needed.
Transmitted packets can be processed during servicing as well, as can MII
events. if_poll should return true if another invocation of <iftype>_poll
for this interface should be scheduled immediately, and false otherwise.
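The defer/poll cycle and the meaning of the if_poll return value can be modelled in user space as below. This is a sketch under stated assumptions, not the proposed kernel code: the trimmed-down `struct ifnet`, the one-slot `runq` standing in for kernel continuations, the `demo_*` routines, and the `POLL_BUDGET` work limit are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct ifnet. */
struct ifnet {
	bool (*if_poll)(struct ifnet *);	/* true: poll again; false: done */
	bool if_scheduled;			/* continuation currently queued? */
	int  if_backlog;			/* simulated packets awaiting service */
	int  if_polls;				/* if_poll invocations so far */
};

/* One-slot run queue standing in for the kernel continuation machinery. */
static struct ifnet *runq;

/* Schedule the interface's continuation; a no-op if already scheduled. */
static void
ether_defer(struct ifnet *ifp)
{
	assert(ifp != NULL);
	if (!ifp->if_scheduled) {
		assert(runq == NULL);	/* this sketch handles one interface */
		ifp->if_scheduled = true;
		runq = ifp;
	}
}

/* Continuation body: service the device, reschedule if it asks for it. */
static void
ether_poll(struct ifnet *ifp)
{
	ifp->if_scheduled = false;
	if ((*ifp->if_poll)(ifp))
		ether_defer(ifp);
}

/* Drain the run queue, as a continuation dispatcher would. */
static void
run_continuations(void)
{
	while (runq != NULL) {
		struct ifnet *ifp = runq;
		runq = NULL;
		ether_poll(ifp);
	}
}

#define POLL_BUDGET 4	/* hypothetical per-call work limit */

/* Driver service routine: handle up to POLL_BUDGET packets per call. */
static bool
demo_if_poll(struct ifnet *ifp)
{
	int n = ifp->if_backlog < POLL_BUDGET ? ifp->if_backlog : POLL_BUDGET;
	ifp->if_backlog -= n;
	ifp->if_polls++;
	return ifp->if_backlog > 0;	/* more work left: ask to be re-run */
}

/* Interrupt handler: does no servicing itself, only defers. */
static void
demo_intr(struct ifnet *ifp)
{
	ether_defer(ifp);
}
```

With a backlog of 10 packets and a budget of 4, a single interrupt leads to three if_poll calls: the first two return true and the last returns false. Bounding the work per call is one possible reason for a driver to return true; the proposal itself does not prescribe a policy.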
Memory allocation has to be prohibited in the interrupt routines. The
device's if_poll routine should pre-allocate enough mbufs to do any
required buffering. For devices doing DMA, the buffers are placed into
receive descriptors to be filled via DMA.
For devices doing PIO, pre-allocated mbufs are enqueued onto the softc of
the device, so when the interrupt routine needs one it simply dequeues one,
fills it in, enqueues it onto a completed queue, and finally calls
<iftype>_defer. If the number of pre-allocated mbufs drops below a
threshold, the driver may decide to increase the number of mbufs that
if_poll pre-allocates. If there are no mbufs left to receive the packet,
the packet is dropped and the number of mbufs for if_poll to pre-allocate
should be increased.
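The PIO scheme above can be sketched in user space as follows. The names (`struct pio_softc`, `mbufq_*`, `pio_rx_intr`, `pio_refill`), the fixed-size `struct mbuf`, the use of malloc, and the growth policy (one more mbuf when below the threshold, `GROW_ON_DROP` more after a drop) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MBUF_DATA	2048
#define GROW_ON_DROP	4	/* hypothetical growth step after a drop */

/* Hypothetical fixed-size stand-in for an mbuf. */
struct mbuf {
	struct mbuf  *m_next;
	size_t        m_len;
	unsigned char m_data[MBUF_DATA];
};

/* Minimal FIFO of mbufs. */
struct mbufq {
	struct mbuf *head, **tail;
	size_t len;
};

static void
mbufq_init(struct mbufq *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
}

static void
mbufq_enqueue(struct mbufq *q, struct mbuf *m)
{
	m->m_next = NULL;
	*q->tail = m;
	q->tail = &m->m_next;
	q->len++;
}

static struct mbuf *
mbufq_dequeue(struct mbufq *q)
{
	struct mbuf *m = q->head;
	if (m != NULL) {
		q->head = m->m_next;
		if (q->head == NULL)
			q->tail = &q->head;
		q->len--;
	}
	return m;
}

struct pio_softc {
	struct mbufq sc_free;	/* pre-allocated, empty mbufs */
	struct mbufq sc_done;	/* filled by the interrupt routine */
	size_t   sc_target;	/* how many mbufs if_poll keeps pre-allocated */
	size_t   sc_lowat;	/* threshold that triggers growth */
	unsigned sc_drops;
};

static void
pio_init(struct pio_softc *sc, size_t target, size_t lowat)
{
	mbufq_init(&sc->sc_free);
	mbufq_init(&sc->sc_done);
	sc->sc_target = target;
	sc->sc_lowat = lowat;
	sc->sc_drops = 0;
}

/* Interrupt path: never allocates. Returns false if the packet was dropped. */
static bool
pio_rx_intr(struct pio_softc *sc, const void *data, size_t len)
{
	assert(len <= MBUF_DATA);
	struct mbuf *m = mbufq_dequeue(&sc->sc_free);
	if (m == NULL) {
		sc->sc_drops++;
		sc->sc_target += GROW_ON_DROP;	/* pool ran dry */
		return false;
	}
	memcpy(m->m_data, data, len);		/* "PIO" copy from the device */
	m->m_len = len;
	mbufq_enqueue(&sc->sc_done, m);
	if (sc->sc_free.len < sc->sc_lowat)
		sc->sc_target++;		/* running low: grow gently */
	return true;	/* a real driver would now call <iftype>_defer */
}

/* Continuation path (if_poll): allocation is allowed here. */
static void
pio_refill(struct pio_softc *sc)
{
	while (sc->sc_free.len < sc->sc_target) {
		struct mbuf *m = malloc(sizeof(*m));
		if (m == NULL)
			break;
		mbufq_enqueue(&sc->sc_free, m);
	}
}
```

The interrupt routine only moves mbufs between the two queues and adjusts the target. All allocation happens in `pio_refill`, which the if_poll routine would call after passing the completed queue up the stack.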
When interrupts are unmasked depends on a few things. If the device is interrupting "too often", it might make sense for the device's interrupts to remain masked and just schedule the device's continuation for the next clock tick. This assumes the system has a high enough value set for HZ.
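One way to decide between unmasking and falling back to tick-driven polling is to count interrupts per clock tick, as in the sketch below. The `struct intr_throttle` bookkeeping, the function names, and the `INTR_PER_TICK_MAX` threshold are hypothetical; the proposal does not define what "too often" means, so the limit here is only a placeholder.

```c
#include <assert.h>
#include <stdbool.h>

#define INTR_PER_TICK_MAX 8	/* hypothetical threshold for "too often" */

struct intr_throttle {
	unsigned it_tick;	/* clock tick the count belongs to */
	unsigned it_count;	/* interrupts seen during it_tick */
	bool     it_masked;	/* interrupt source currently masked? */
	bool     it_tick_sched;	/* continuation scheduled for the next tick? */
};

/* Interrupt fired: count it against the current tick and mask the source. */
static void
throttle_intr(struct intr_throttle *it, unsigned now_tick)
{
	if (it->it_tick != now_tick) {
		it->it_tick = now_tick;
		it->it_count = 0;
	}
	it->it_count++;
	it->it_masked = true;
}

/*
 * Servicing finished. Returns true if the source was unmasked, or false
 * if it stays masked and the continuation is left to the next clock tick.
 */
static bool
throttle_done(struct intr_throttle *it)
{
	assert(it->it_masked);
	if (it->it_count > INTR_PER_TICK_MAX) {
		it->it_tick_sched = true;
		return false;
	}
	it->it_masked = false;
	return true;
}

/* Clock tick: returns true if the caller should run the continuation now. */
static bool
throttle_tick(struct intr_throttle *it, unsigned now_tick)
{
	bool run = it->it_tick_sched;
	it->it_tick_sched = false;
	it->it_tick = now_tick;
	it->it_count = 0;
	return run;
}
```

With HZ set to 100 a tick is 10 ms, so a masked device waits up to that long before being polled again, which is why the scheme depends on a sufficiently high HZ.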