Currently the buffer handling logic only sorts the buffer queue (aka disksort). In an ideal world it would be able to coalesce adjacent small requests, as this can produce considerable speedups. It might also be worthwhile to split large requests into smaller chunks on the fly as needed by hardware or lower-level software.

Note that the latter interacts nontrivially with the ongoing dynamic MAXPHYS project and might not be worthwhile. Coalescing adjacent small requests (up to some potentially arbitrary MAXPHYS limit) is worthwhile regardless, though.