title="Coordinated caching and scheduling"
duration="4 months and up"
There are many caches in the kernel. Most of these have knobs and
adjustments, some exposed and some not, for sizing and writeback rate
and flush behavior and assorted other voodoo, and most of the ones
that aren't adjustable probably should be.
Currently all or nearly all of these caches operate on autopilot
independent of the others, which does not necessarily produce good
results, especially if the system is operating in a performance regime
different from the one the implementors tuned the behavior for.
It would be nice if all these caches were instead coordinated, so that
they don't end up fighting with one another. Integrated control of
sizing, for example, would allow explicitly maintaining a sensible
balance between different memory uses based on current conditions;
right now you might get that, depending on whether the available
voodoo happens to work adequately under the workload you have, or you
might not. Also, it is probably possible to define some simple rules
about eviction, like not evicting vnodes that have UVM pages still to
be written out, that can help avoid unnecessary thrashing and other
adverse dynamic behavior. And similarly, it is probably possible to
prefetch some caches based on activity in others. It might even be
possible to come up with one glorious unified cache management algorithm.
Also note that cache eviction and prefetching are fundamentally a form
of scheduling, so all of this material should also be integrated with
the process scheduler to allow *it* to make more informed decisions.
This is a nontrivial undertaking.
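As a rough illustration of the kind of simple eviction rule mentioned above (do not evict vnodes that still have pages waiting to be written out), here is a minimal userland sketch. The `cached_vnode` structure, its fields, and both function names are hypothetical stand-ins invented for this example; they are not the kernel's real `struct vnode` or any existing interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cached vnode (not struct vnode). */
struct cached_vnode {
	unsigned dirty_pages;    /* pages still waiting to be written out */
	unsigned use_count;      /* active references */
	unsigned long last_used; /* logical timestamp of the last access */
};

/* Rule: only unreferenced vnodes with no pending writeback may be evicted. */
static bool
vnode_evictable(const struct cached_vnode *vp)
{
	return vp->use_count == 0 && vp->dirty_pages == 0;
}

/* Return the least recently used evictable vnode, or NULL if there is none. */
static struct cached_vnode *
pick_victim(struct cached_vnode *v, size_t n)
{
	struct cached_vnode *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!vnode_evictable(&v[i]))
			continue;
		if (best == NULL || v[i].last_used < best->last_used)
			best = &v[i];
	}
	return best;
}
```

The point of the sketch is that the vnode cache needs information owned by another subsystem (the dirty page count) to make a sensible choice, which is exactly the kind of cross-cache coordination this project is about.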
Step 1 is to just find all the things in the kernel that ought to
participate in a coordinated caching and scheduling scheme. This
should not take all that long. Some examples include:

* UVM pages
* file system metadata buffers
* VFS name cache
* vnode cache
* size of the mbuf pool
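One way the participants listed above might be made visible to a coordinator is a small registration interface through which each cache reports its size and accepts reclaim requests. The following is only a userland sketch under that assumption: `cache_ops`, `cache_register`, `cache_total_size`, `cache_reclaim`, and the `toy_cache` example are all hypothetical names, and a real kernel version would also have to deal with locking and sleeping, which are ignored here.

```c
#include <stddef.h>

#define MAX_CACHES 16

/* Hypothetical hooks that each participating cache would provide. */
struct cache_ops {
	const char *name;
	size_t (*cur_size)(void *cookie);             /* bytes currently held */
	size_t (*reclaim)(void *cookie, size_t want); /* free up to want bytes */
	void *cookie;                                 /* per-cache private data */
};

static const struct cache_ops *registry[MAX_CACHES];
static size_t nregistered;

/* Add a cache to the registry; returns -1 if the table is full. */
static int
cache_register(const struct cache_ops *ops)
{
	if (nregistered == MAX_CACHES)
		return -1;
	registry[nregistered++] = ops;
	return 0;
}

/* Sum the memory held by all registered caches. */
static size_t
cache_total_size(void)
{
	size_t total = 0;

	for (size_t i = 0; i < nregistered; i++)
		total += registry[i]->cur_size(registry[i]->cookie);
	return total;
}

/* Ask caches in registration order to free memory until want is met. */
static size_t
cache_reclaim(size_t want)
{
	size_t freed = 0;

	for (size_t i = 0; i < nregistered && freed < want; i++)
		freed += registry[i]->reclaim(registry[i]->cookie, want - freed);
	return freed;
}

/* Toy cache used only to exercise the interface. */
struct toy_cache {
	size_t bytes;
};

static size_t
toy_size(void *cookie)
{
	const struct toy_cache *tc = cookie;

	return tc->bytes;
}

static size_t
toy_reclaim(void *cookie, size_t want)
{
	struct toy_cache *tc = cookie;
	size_t freed = want < tc->bytes ? want : tc->bytes;

	tc->bytes -= freed;
	return freed;
}
```

Reclaiming in registration order is a deliberately naive policy; choosing which cache should give up memory first is the actual research question.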
Step 2 is to restructure and connect things up so that it is readily
possible to get the necessary information from all the random places
in the kernel that these things occupy, without making a horrible mess
and without trashing system performance in the process or deadlocking
out the wazoo. This is not going to be particularly easy or fast.
Step 3 is to take some simple steps, like those suggested above, to do
something useful with the coordinated information, and hopefully to
show via benchmarks that it has some benefit.
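A simple first experiment with integrated sizing could look something like the sketch below: every cache keeps a guaranteed floor, and the remaining memory is divided in proportion to a demand signal. Using recent miss counts as that signal is purely an assumption made for this example, as are the `cache_stat` structure and the `rebalance` function; nothing here is an existing kernel interface.

```c
#include <stddef.h>

/* Hypothetical per-cache statistics gathered by a coordinator. */
struct cache_stat {
	size_t min_pages;     /* floor this cache must always keep */
	unsigned long misses; /* recent misses, used as the demand signal */
	size_t target_pages;  /* output: size the cache should move toward */
};

/*
 * Give each cache its floor, then split the rest of the budget in
 * proportion to recent misses (evenly if nobody missed at all).
 * Returns -1 if the floors alone do not fit in the budget.
 */
static int
rebalance(struct cache_stat *c, size_t n, size_t budget)
{
	size_t floors = 0;
	unsigned long total_misses = 0;

	for (size_t i = 0; i < n; i++) {
		floors += c[i].min_pages;
		total_misses += c[i].misses;
	}
	if (floors > budget)
		return -1;

	size_t spare = budget - floors;
	for (size_t i = 0; i < n; i++) {
		size_t share;

		if (total_misses != 0)
			share = (size_t)((unsigned long long)spare *
			    c[i].misses / total_misses);
		else
			share = spare / n;
		c[i].target_pages = c[i].min_pages + share;
	}
	return 0;
}
```

With floors of 100 and 50 pages, 300 and 100 recent misses, and a budget of 950 pages, this assigns targets of 700 and 250 pages. Whether such a rule helps or hurts on real workloads is what the benchmarks would have to show.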
Step 4 is to look into more elaborate algorithms for unified control
of everything. The previous version of this project cited IBM's ARC
("Adaptive Replacement Cache") as one thing to look at. (But note that
ARC may be encumbered -- someone please check on that and update this
page.) Another possibility is to deploy machine learning algorithms to
look for and exploit patterns.
Note: this is a serious research project. Step 3 will yield a
publishable minor paper; step 4 will yield a publishable major paper
if you manage to come up with something that works, and it quite
possibly contains enough material for a PhD thesis.