Annotation of wikisrc/projects/project/improve-caching.mdwn, revision 1.5

1.1       jmmv        1: [[!template id=project
1.3       dholland    3: title="Coordinated caching and scheduling"
1.1       jmmv        4: 
                      5: contact="""
                      6: [tech-kern](
                      7: """
                      9: category="kernel"
1.2       jmmv       10: difficulty="hard"
1.3       dholland   11: duration="4 months and up"
1.1       jmmv       12: 
                     13: description="""
1.3       dholland   14: 
                     15: There are many caches in the kernel. Most of these have knobs and
                     16: adjustments, some exposed and some not, for sizing and writeback rate
                     17: and flush behavior and assorted other voodoo, and most of the ones
                     18: that aren't adjustable probably should be.
                     19:
                     20: Currently all or nearly all of these caches operate on autopilot
                     21: independent of the others, which does not necessarily produce good
                     22: results, especially if the system is operating in a performance regime
                     23: different from when the behavior was tuned by the implementors.
                     24:
                     25: It would be nice if all these caches were instead coordinated, so that
                     26: they don't end up fighting with one another. Integrated control of
                     27: sizing, for example, would allow explicitly maintaining a sensible
                     28: balance between different memory uses based on current conditions;
                     29: right now you might get that, depending on whether the available
                     30: voodoo happens to work adequately under the workload you have, or you
                     31: might not. Also, it is probably possible to define some simple rules
                     32: about eviction, like not evicting vnodes that have UVM pages still to
                     33: be written out, that can help avoid unnecessary thrashing and other
                     34: adverse dynamic behavior. And similarly, it is probably possible to
                     35: prefetch some caches based on activity in others. It might even be
                     36: possible to come up with one glorious unified cache management
                     37: algorithm.
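
As a rough illustration of the eviction rule mentioned above (do not evict vnodes that still have UVM pages to be written out), here is a minimal userland sketch. It is a toy model, not kernel code: the `Vnode` class, its `dirty_pages` field, and `pick_victim` are hypothetical names invented for this example, and a plain LRU order is assumed.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class Vnode:
    # Toy stand-in for a cached vnode (hypothetical, not the kernel struct).
    name: str
    dirty_pages: int = 0  # pages of this vnode still waiting for writeback


def pick_victim(lru):
    # `lru` is an OrderedDict ordered from least to most recently used.
    # Return the oldest vnode with no dirty pages, or None if all are dirty.
    for vn in lru.values():
        if vn.dirty_pages == 0:
            return vn
    return None  # caller would have to wait for writeback instead


lru = OrderedDict()
for vn in (Vnode("a", dirty_pages=3), Vnode("b"), Vnode("c", dirty_pages=1)):
    lru[vn.name] = vn

print(pick_victim(lru).name)  # prints "b": "a" is older but still dirty
```

The point of the sketch is only that the vnode cache needs to see state owned by another cache (the page cache) to make this decision, which is exactly the kind of coordination this project is about.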
                     38:
                     39: Also note that cache eviction and prefetching are fundamentally forms
                     40: of scheduling, so all of this material should also be integrated with
                     41: the process scheduler to allow *it* to make more informed decisions.
                     42:
                     43: This is a nontrivial undertaking.
                     44:
                     45: Step 1 is to just find all the things in the kernel that ought to
                     46: participate in a coordinated caching and scheduling scheme. This
                     47: should not take all that long. Some examples include:
1.5     ! dholland   48: 
        !            49: * UVM pages
        !            50: * file system metadata buffers
        !            51: * VFS name cache
        !            52: * vnode cache
        !            53: * size of the mbuf pool
1.3       dholland   54: 
                     55: Step 2 is to restructure and connect things up so that it is readily
                     56: possible to get the necessary information from all the random places
                     57: in the kernel that these things occupy, without making a horrible mess
                     58: and without trashing system performance in the process or deadlocking
                     59: out the wazoo. This is not going to be particularly easy or fast.
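
One possible shape for the "connect things up" part of step 2 is a central registry that each cache reports into. The sketch below is a userland model under that assumption; `CacheStats`, `CacheRegistry`, `register`, and `snapshot` are hypothetical names, and the numbers are made up. It deliberately ignores the hard parts named above (locking, deadlock avoidance, and keeping the reporting path cheap).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheStats:
    # Hypothetical per-cache report; a real one would carry more fields.
    size_bytes: int
    hits: int
    misses: int


class CacheRegistry:
    # Maps a cache name to a zero-argument callable returning CacheStats.
    def __init__(self):
        self._reporters = {}

    def register(self, name, reporter):
        if name in self._reporters:
            raise ValueError("cache already registered: " + name)
        self._reporters[name] = reporter

    def snapshot(self):
        # Poll every registered cache once and return name -> CacheStats.
        return {name: report() for name, report in self._reporters.items()}


registry = CacheRegistry()
registry.register("vnode cache", lambda: CacheStats(4096, hits=90, misses=10))
registry.register("name cache", lambda: CacheStats(1024, hits=50, misses=50))

for name, st in registry.snapshot().items():
    print(name, st.size_bytes, st.hits / (st.hits + st.misses))
```

A pull model like this keeps the caches ignorant of each other; whether polling or pushing events is the better fit for the kernel is an open design question for the project.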
                     60:
                     61: Step 3 is to take some simple steps, as suggested above, to do
                     62: something useful with the coordinated information, and hopefully to
                     63: show via benchmarks that it has some benefit.
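
To make "something useful with the coordinated information" concrete, here is a minimal sketch of integrated sizing: one memory budget is split among caches, with a guaranteed floor each and the remainder shared in proportion to recent miss counts. Using misses as the demand signal is an assumption made for this example, not a policy this page proposes, and `rebalance` and the cache names are hypothetical.

```python
def rebalance(budget, floor, misses):
    # budget: total bytes to hand out; floor: minimum bytes per cache;
    # misses: dict of cache name -> recent miss count (the demand signal).
    names = list(misses)
    spare = budget - floor * len(names)
    if spare < 0:
        raise ValueError("budget too small for the per-cache floor")
    total = sum(misses.values())
    if total == 0:
        # No demand signal at all: split the spare memory evenly.
        return {n: floor + spare // len(names) for n in names}
    # Integer division may leave a few bytes unassigned; that is fine here.
    return {n: floor + spare * misses[n] // total for n in names}


sizes = rebalance(1000, 100, {"vnode": 30, "namecache": 10, "buffers": 0})
print(sizes)  # {'vnode': 625, 'namecache': 275, 'buffers': 100}
```

Even a rule this crude gives a benchmark something to compare against the current independent tuning, which is the goal of step 3.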
                     64:
                     65: Step 4 is to look into more elaborate algorithms for unified control
                     66: of everything. The previous version of this project cited IBM's ARC
                     67: ("Adaptive Replacement Cache") as one thing to look at. (But note that
                     68: ARC may be encumbered -- someone please check on that and update this
                     69: page.) Another possibility is to deploy machine learning algorithms to
                     70: look for and exploit patterns.
                     71:
                     72: Note: this is a serious research project. Step 3 will yield a
                     73: publishable minor paper; step 4 will yield a publishable major paper
                     74: if you manage to come up with something that works, and it quite
                     75: possibly contains enough material for a PhD thesis.
1.1       jmmv       77: """
                     78: ]]
