File:  [NetBSD Developer Wiki] / wikisrc / projects / project / kernel_continuations.mdwn
Thu Nov 10 03:06:51 2011 UTC by jmmv
Branches: MAIN
CVS tags: HEAD
Add a specific proposal for the SMP networking project.

This proposal is built on top of several individual, smaller projects, all
of which are related to achieve the goals of SMP support and modularity on
the network stack.  Keep in mind that this is just that: a proposal.
Applicants could still come up with their own ideas.

The text of all these new pages is mostly a copy/paste of the original
document written by matt@ (see
I have done some minor edits (hopefully not changing any of the technical
details) and added some preliminary texts to the pages.  (I was unable to
parse some of the sentences though, so they remain "as is"...)

    1: [[!template id=project
    3: title="Kernel continuations"
    5: contact="""
    6: [tech-kern](,
    7: [board](,
    8: [core](
    9: """
   11: category="kernel"
   12: difficulty="hard"
   13: funded="The NetBSD Foundation"
   15: description="""
   16: This project proposal is a subtask of [[smp_networking]] and is eligible
   17: for funding independently.
   19: The goal of this project is to implement continuations at the kernel level.
   20: Most of the pieces are already available in the kernel, so this can be
   21: reworded as: combine *callouts*, *softints*, and *workqueues* into a single
   22: framework.  Continuations are meant to be cheap; very cheap.
   24: Please note that the main goal of this project is to simplify the
   25: implementation of [[SMP networking|smp_networking]], so care must be taken
   26: in the design of the interface to support all the features required for
   27: this other project.
   29: The proposed interface looks like the following.  This interface is mostly
   30: derived from the `callout(9)` API and is a superset of the `softint(9)` API.
   31: The most significant change is that workqueue items are not tied to a
   32: specific kernel thread.
   34: * `kcont_t *kcont_create(kcont_wq_t *wq, kmutex_t *lock, void
   35:   (*func)(void *, kcont_t *), void *arg, int flags);`
   37:   A `wq` must be supplied.  It may be one returned by
   38:   `kcont_workqueue_acquire` or a predefined workqueue such as (sorted from
   39:   highest priority to lowest):
   41:   * `wq_softserial`, `wq_softnet`, `wq_softbio`, `wq_softclock`
   42:   * `wq_prihigh`, `wq_primedhigh`, `wq_primedlow`, `wq_prilow`
   44:   `lock`, if non-NULL, should be locked before calling `func(arg)` and
   45:   released afterwards.  However, if the lock is released and/or destroyed
   46:   before the called function returns, then, before returning,
   47:   `kcont_setmutex` must be called with either a new mutex to be released
   48:   or `NULL`.  If acquiring `lock` would block, other pending kernel
   49:   continuations which depend on other locks may be dispatched in the
   50:   meantime.  However, all continuations sharing the same set of `{ wq, lock,
   51:   [ci] }` need to be processed in the order they were scheduled.
   53:   `flags` must be 0.  This field is just provided for extensibility.
   55: * `int kcont_schedule(kcont_t *kc, struct cpu_info *ci, int nticks);`
   57:   If the continuation is marked as *INVOKING*, an error of `EBUSY` should
   58:   be returned.  If `nticks` is 0, the continuation is marked as *INVOKING*
   59:   while *EXPIRED* and *PENDING* are cleared, and the continuation is
   60:   scheduled to be invoked without delay.  Otherwise, the continuation is
   61:   marked as *PENDING* while *EXPIRED* status is cleared, and the timer
   62:   reset to `nticks`.  Once the timer expires, the continuation is marked as
   63:   *EXPIRED* and *INVOKING*, and the *PENDING* status is cleared.  If `ci`
   64:   is non-NULL, the continuation is invoked on the specified CPU if the
   65:   continuation's workqueue has per-cpu queues.  If that workqueue does not
   66:   provide per-cpu queues, an error of `ENOENT` is returned.  Otherwise when
   67:   `ci` is `NULL`, the continuation is invoked on either the current CPU or
   68:   the next available CPU depending on whether the continuation's workqueue
   69:   has per-cpu queues or not, respectively.
   71: * `void kcont_destroy(kcont_t *kc);`
   73: * `kmutex_t *kcont_getmutex(kcont_t *kc);`
   75:   Returns the lock currently associated with the continuation `kc`.
   77: * `void kcont_setarg(kcont_t *kc, void *arg);`
   79:   Updates `arg` in the continuation `kc`.  If no lock is associated with
   80:   the continuation, then `arg` may be changed at any time; however, if the
   81:   continuation is being invoked, it may not pick up the change.  Otherwise,
   82:   `kcont_setarg` must only be called when the associated lock is locked.
   84: * `kmutex_t *kcont_setmutex(kcont_t *kc, kmutex_t *lock);`
   86:   Updates the lock associated with the continuation `kc` and returns the
   87:   previous lock.  If no lock is currently associated with the continuation,
   88:   then calling this function with a lock other than `NULL` will trigger an
   89:   assertion failure.  Otherwise, `kcont_setmutex` must be called only when
   90:   the existing lock (which will be replaced) is locked.  If
   91:   `kcont_setmutex` is called as a result of the invocation of `func`, then
   92:   after `kcont_setmutex` has been called but before `func` returns, the
   93:   replaced lock must have been released, and the replacement lock, if
   94:   non-NULL, must be locked upon return.
   96: * `void kcont_setfunc(kcont_t *kc, void (*func)(void *), void *arg);`
   98:   Updates `func` and `arg` in the continuation `kc`.  If no lock is
   99:   associated with the continuation, then only `arg` may be changed.
  100:   Otherwise, `kcont_setfunc` must be called only when the associated lock
  101:   is locked.
  103: * `bool kcont_stop(kcont_t *kc);`
  105:   The `kcont_stop` function stops the timer associated with the
  106:   continuation handle `kc`.  The *PENDING* and *EXPIRED* statuses for the
  107:   continuation handle are cleared.  It is safe to call `kcont_stop` on a
  108:   continuation handle that is not pending, so long as it is initialized.
  109:   `kcont_stop` will return a non-zero value if the continuation was *EXPIRED*.
  111: * `bool kcont_pending(kcont_t *kc);`
  113:   The `kcont_pending` function tests the *PENDING* status of the
  114:   continuation handle `kc`.  A *PENDING* continuation is one whose timer
  115:   has been started and has not expired.  Note that it is possible for a
  116:   continuation's timer to have expired without being invoked if the
  117:   continuation's lock could not be acquired or there are higher priority
  118:   threads preventing its invocation.  Note that it is only safe to test
  119:   *PENDING* status when holding the continuation's lock.
  121: * `bool kcont_expired(kcont_t *kc);`
  123:   Tests to see if the continuation's function has been invoked since the
  124:   last `kcont_schedule`.
  126: * `bool kcont_active(kcont_t *kc);`
  128: * `bool kcont_invoking(kcont_t *kc);`
  130:   Tests the *INVOKING* status of the handle `kc`.  This flag is set just
  131:   before a continuation's function is being called.  Since the scheduling
  132:   of the worker threads may induce delays, other pending higher-priority
  133:   code may run before the continuation function is allowed to run.  This
  134:   may create a race condition if this higher-priority code deallocates
  135:   storage containing one or more continuation structures whose continuation
  136:   functions are about to be run.  In such cases, one technique to prevent
  137:   references to deallocated storage would be to test whether any
  138:   continuation functions are in the *INVOKING* state using
  139:   `kcont_invoking`, and if so, to mark the data structure and defer storage
  140:   deallocation until the continuation function is allowed to run.  For this
  141:   handshake protocol to work, the continuation function will have to use
  142:   the `kcont_ack` function to clear this flag.
  144: * `bool kcont_ack(kcont_t *kc);`
  146:   Clears the *INVOKING* state in the continuation handle `kc`.  This is
  147:   used in situations where it is necessary to protect against the race
  148:   condition described under `kcont_invoking`.
  150: * `kcont_wq_t *kcont_workqueue_acquire(pri_t pri, int flags);`
  152:   Returns a workqueue that matches the specified criteria.  Thus if
  153:   multiple requesters ask for the same criteria, they are all returned the
  154:   same workqueue.  `pri` specifies the priority at which the kernel thread
  155:   which empties the workqueue should run.
  157:   If `flags` is 0 then the standard operation is required.  However, the
  158:   following flag(s) may be bitwise ORed together:
  160:   * `WQ_PERCPU` specifies that the workqueue should have a separate queue
  161:     for each CPU, thus allowing continuations to be invoked on specific CPUs.
  163: * `int kcont_workqueue_release(kcont_wq_t *wq);`
  165:   Releases an acquired workqueue.  On the last release, the workqueue's
  166:   resources are freed and the workqueue is destroyed.
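
The status transitions described above for `kcont_schedule`, `kcont_stop`
and `kcont_ack` can be sketched as a small userspace model.  This is only an
illustration of the proposed semantics, not kernel code: the `struct
kc_model` type, the `model_*` functions, the `KC_*` flag encodings, the
per-tick helper, and the order in which `EBUSY` and `ENOENT` are checked are
all assumptions made for the example.  The proposal also does not say what
`kcont_ack` returns, so the model returns nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical encodings; the proposal names the states, not their values. */
#define KC_PENDING	(1u << 0)
#define KC_EXPIRED	(1u << 1)
#define KC_INVOKING	(1u << 2)

/* Toy stand-in for kcont_t holding only what the status model needs. */
struct kc_model {
	unsigned status;	/* bitwise OR of KC_* flags */
	int ticks_left;		/* remaining timer ticks while PENDING */
	bool wq_percpu;		/* workqueue has per-CPU queues? */
};

/* Models kcont_schedule(); `want_cpu` stands for a non-NULL `ci`. */
int
model_schedule(struct kc_model *kc, bool want_cpu, int nticks)
{
	assert(nticks >= 0);
	if (kc->status & KC_INVOKING)
		return EBUSY;		/* already being invoked */
	if (want_cpu && !kc->wq_percpu)
		return ENOENT;		/* no per-CPU queue to target */
	if (nticks == 0) {
		/* Run without delay: INVOKING set, the rest cleared. */
		kc->status &= ~(KC_EXPIRED | KC_PENDING);
		kc->status |= KC_INVOKING;
		kc->ticks_left = 0;
	} else {
		/* Arm the timer: PENDING set, EXPIRED cleared. */
		kc->status &= ~KC_EXPIRED;
		kc->status |= KC_PENDING;
		kc->ticks_left = nticks;
	}
	return 0;
}

/* Models one clock tick; on expiry PENDING becomes EXPIRED + INVOKING. */
void
model_tick(struct kc_model *kc)
{
	if (!(kc->status & KC_PENDING))
		return;
	if (--kc->ticks_left > 0)
		return;
	kc->status &= ~KC_PENDING;
	kc->status |= KC_EXPIRED | KC_INVOKING;
}

/* Models kcont_stop(): clears PENDING and EXPIRED, reports prior EXPIRED. */
bool
model_stop(struct kc_model *kc)
{
	bool was_expired = (kc->status & KC_EXPIRED) != 0;

	kc->status &= ~(KC_PENDING | KC_EXPIRED);
	kc->ticks_left = 0;
	return was_expired;
}

/* Models kcont_ack(): clears INVOKING so the handle can be rescheduled. */
void
model_ack(struct kc_model *kc)
{
	kc->status &= ~KC_INVOKING;
}
```

In this model, a continuation scheduled with `nticks` of 2 stays *PENDING*
for one tick and becomes *EXPIRED* and *INVOKING* on the second; a further
`model_schedule` call then fails with `EBUSY` until `model_ack` clears
*INVOKING*.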
  167: """
  168: ]]
  170: [[!tag smp_networking]]
