- Contact: tech-toolchain
- Duration estimate: 3-4 months
Currently BSD make emits tons and tons of stat(2) calls in the course of figuring out what to do, most notably (but not only) when matching suffix rules. This causes measurable amounts of overhead, especially when there are a lot of files like in e.g. libc. First step is to quantify this overhead so you can tell what you're accomplishing.
Fixing this convincingly requires a multi-step approach: first, give make an abstraction layer for directories it's working with. This alone is a nontrivial undertaking because make is such a mess inside. This step should also involve sorting out various issues where files in other directories (other than the current directory) sometimes don't get checked for changes properly.
Then, implement a cache under that abstraction layer so make can remember what it's learned instead of making the same stat calls over and over again. Also, teach make to scan a directory with readdir() instead of populating the cache by making thousands of scattershot stat() calls, and implement a simple heuristic to decide when to use the readdir method.
Unfortunately, in general after running a program the cache has to be flushed because we don't know what the program did. The final step in this project is to make use of kqueue or inotify or similar when running programs to synchronize make's directory cache so it doesn't have to be flushed.
As an additional step one might also have make warn when a recipe touches files it hasn't been declared to touch... but note that while this is desirable it is also somewhat problematic.