Annotation of wikisrc/projects/project/improve-caching.mdwn, revision 1.5
[[!template id=project

title="Coordinated caching and scheduling"

difficulty="hard"
duration="4 months and up"

description="""
There are many caches in the kernel. Most of these have knobs and
adjustments, some exposed and some not, for sizing, writeback rate,
flush behavior, and assorted other voodoo, and most of the ones that
aren't adjustable probably should be.

Currently all or nearly all of these caches operate on autopilot,
independent of the others, which does not necessarily produce good
results, especially if the system is operating in a performance regime
different from the one the implementors tuned for.
It would be nice if all these caches were instead coordinated, so that
they don't end up fighting with one another. Integrated control of
sizing, for example, would allow explicitly maintaining a sensible
balance between different memory uses based on current conditions;
right now you might get that, depending on whether the available
voodoo happens to work adequately under the workload you have, or you
might not. Also, it is probably possible to define some simple rules
about eviction, like not evicting vnodes that have UVM pages still to
be written out, that can help avoid unnecessary thrashing and other
adverse dynamic behavior. Similarly, it is probably possible to
prefetch some caches based on activity in others. It might even be
possible to come up with one glorious unified cache management scheme.
Also note that cache eviction and prefetching are fundamentally forms
of scheduling, so all of this material should also be integrated with
the process scheduler to allow *it* to make more informed decisions.
This is a nontrivial undertaking.
Step 1 is to just find all the things in the kernel that ought to
participate in a coordinated caching and scheduling scheme. This
should not take all that long. Some examples include:

* UVM pages
* file system metadata buffers
* VFS name cache
* vnode cache
* size of the mbuf pool

Step 2 is to restructure and connect things up so that it is readily
possible to get the necessary information from all the random places
in the kernel that these things occupy, without making a horrible
mess, without trashing system performance in the process, and without
deadlocking out the wazoo. This is not going to be particularly easy
or fast.
Step 3 is to take some simple steps, like those suggested above, to do
something useful with the coordinated information, and hopefully to
show via benchmarks that it has some benefit.
Step 4 is to look into more elaborate algorithms for unified control
of everything. The previous version of this project cited IBM's ARC
("Adaptive Replacement Cache") as one thing to look at. (But note that
ARC may be encumbered -- someone please check on that and update this
page.) Another possibility is to deploy machine learning algorithms to
look for and exploit patterns.
Note: this is a serious research project. Step 3 will yield a
publishable minor paper; step 4 will yield a publishable major paper
if you manage to come up with something that works, and it quite
possibly contains enough material for a PhD thesis.
"""
]]