Annotation of wikisrc/projects/project/improve-caching.mdwn, revision 1.4

1.1       jmmv        1: [[!template id=project
                      2: 
1.3       dholland    3: title="Coordinated caching and scheduling"
1.1       jmmv        4: 
                      5: contact="""
                      6: [tech-kern](mailto:tech-kern@NetBSD.org)
                      7: """
                      8: 
                      9: category="kernel"
1.2       jmmv       10: difficulty="hard"
1.3       dholland   11: duration="4 months and up"
1.1       jmmv       12: 
                     13: description="""
1.3       dholland   14: 
                     15: There are many caches in the kernel. Most of these have knobs and
                     16: adjustments, some exposed and some not, for sizing and writeback rate
                     17: and flush behavior and assorted other voodoo, and most of the ones
                     18: that aren't adjustable probably should be.
                     19: 
                     20: Currently all or nearly all of these caches operate on autopilot
                     21: independent of the others, which does not necessarily produce good
                     22: results, especially if the system is operating in a performance regime
                     23: different from when the behavior was tuned by the implementors.
                     24: 
                     25: It would be nice if all these caches were instead coordinated, so that
                     26: they don't end up fighting with one another. Integrated control of
                     27: sizing, for example, would allow explicitly maintaining a sensible
                     28: balance between different memory uses based on current conditions;
                     29: right now you might get that, depending on whether the available
                     30: voodoo happens to work adequately under the workload you have, or you
                     31: might not. Also, it is probably possible to define some simple rules
                     32: about eviction, like not evicting vnodes that have UVM pages still to
                     33: be written out, that can help avoid unnecessary thrashing and other
                     34: adverse dynamic behavior. And similarly, it is probably possible to
                     35: prefetch some caches based on activity in others. It might even be
                     36: possible to come up with one glorious unified cache management
                     37: algorithm.
                     38: 
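As a rough illustration of the kind of simple eviction rule mentioned
above (do not evict vnodes that still have pages to be written out),
here is a minimal userland sketch in C. It is not kernel code: the
`toy_vnode` structure, its fields, and `toy_pick_victim` are all
hypothetical stand-ins invented for this example, and a real
implementation would have to consult UVM and deal with locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cached vnode (not struct vnode). */
struct toy_vnode {
	int id;
	unsigned dirty_pages;	/* pages still waiting to be written out */
	unsigned long last_use;	/* smaller value = used longer ago */
};

/*
 * Pick the least recently used vnode that has no dirty pages.
 * Returns NULL if every candidate still has pages to write out.
 */
static struct toy_vnode *
toy_pick_victim(struct toy_vnode *v, size_t n)
{
	struct toy_vnode *best = NULL;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (v[i].dirty_pages != 0)
			continue;	/* the rule: skip vnodes with pending writes */
		if (best == NULL || v[i].last_use < best->last_use)
			best = &v[i];
	}
	return best;
}
```

When every candidate is dirty the function returns NULL, so the caller
would presumably have to push some writeback first rather than evict;
that decision is exactly the sort of thing a coordinated scheme would
need to make.
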
                     39: Also note that cache eviction and prefetching are fundamentally forms
                     40: of scheduling, so all of this material should also be integrated with
                     41: the process scheduler to allow *it* to make more informed decisions.
                     42: 
                     43: This is a nontrivial undertaking.
                     44: 
                     45: Step 1 is to just find all the things in the kernel that ought to
                     46: participate in a coordinated caching and scheduling scheme. This
                     47: should not take all that long. Some examples include:
1.4     ! dholland   48:  * UVM pages
        !            49:  * file system metadata buffers
        !            50:  * VFS name cache
        !            51:  * vnode cache
        !            52:  * size of the mbuf pool
1.3       dholland   53: 
                     54: Step 2 is to restructure and connect things up so that it is readily
                     55: possible to get the necessary information from all the random places
                     56: in the kernel that these things occupy, without making a horrible mess
                     57: and without trashing system performance in the process or deadlocking
                     58: out the wazoo. This is not going to be particularly easy or fast.
                     59: 
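One conceivable shape for the plumbing in step 2 is a small registry
where each cache supplies a callback that reports its current state,
so that a coordinator can collect everything in one place. The sketch
below is a userland model under that assumption only; every name in it
(`toy_cache_ops`, `toy_cache_register`, `toy_cache_totals`, and so on)
is hypothetical, nothing like it is known to exist in the tree, and it
ignores locking entirely, which is the hard part.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CACHES 8

/* Hypothetical snapshot that a cache reports to the coordinator. */
struct toy_cache_stats {
	size_t pages_used;
	size_t pages_dirty;
};

/* Hypothetical per-cache hooks; cookie is passed back to get_stats. */
struct toy_cache_ops {
	const char *name;
	void (*get_stats)(void *cookie, struct toy_cache_stats *out);
	void *cookie;
};

static struct toy_cache_ops toy_registry[TOY_MAX_CACHES];
static size_t toy_ncaches;

/* Add a cache to the registry; returns 0 on success, -1 if it is full. */
static int
toy_cache_register(const struct toy_cache_ops *ops)
{
	assert(ops != NULL && ops->get_stats != NULL);
	if (toy_ncaches == TOY_MAX_CACHES)
		return -1;
	toy_registry[toy_ncaches++] = *ops;
	return 0;
}

/* Ask every registered cache for its stats and sum them up. */
static struct toy_cache_stats
toy_cache_totals(void)
{
	struct toy_cache_stats total = { 0, 0 };

	for (size_t i = 0; i < toy_ncaches; i++) {
		struct toy_cache_stats s = { 0, 0 };

		toy_registry[i].get_stats(toy_registry[i].cookie, &s);
		total.pages_used += s.pages_used;
		total.pages_dirty += s.pages_dirty;
	}
	return total;
}

/* Example hook: the cookie points at a stats struct that is copied out. */
static void
toy_fixed_stats(void *cookie, struct toy_cache_stats *out)
{
	*out = *(const struct toy_cache_stats *)cookie;
}
```

A pull-style callback is only one option; caches could equally push
updates to the coordinator, and which of the two avoids lock-order
trouble in practice is part of what this step has to work out.
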
                     60: Step 3 is to take some simple steps, like those suggested above, to do
                     61: something useful with the coordinated information, and hopefully to
                     62: show via benchmarks that it has some benefit.
                     63: 
                     64: Step 4 is to look into more elaborate algorithms for unified control
                     65: of everything. The previous version of this project cited IBM's ARC
                     66: ("Adaptive Replacement Cache") as one thing to look at. (But note that
                     67: ARC may be encumbered -- someone please check on that and update this
                     68: page.) Another possibility is to deploy machine learning algorithms to
                     69: look for and exploit patterns.
                     70: 
                     71: Note: this is a serious research project. Step 3 will yield a
                     72: publishable minor paper; step 4 will yield a publishable major paper
                     73: if you manage to come up with something that works, and it quite
                     74: possibly contains enough material for a PhD thesis.
                     75: 
1.1       jmmv       76: """
                     77: ]]
