Parallelize page queues

Contact: tech-kern
Mentors: Taylor R Campbell, Matthew R. Green, Chuck Silvers

IMPORTANT: This project was completed by Andrew Doran. You may still contact the people above for details, but please do not submit an application for this project.

For many resource-intensive applications on NetBSD, the biggest bottlenecks in the operating system are physical page allocation and mapping -- specifically, contention over the centralized queues of free pages and of cached pages that can be freed, and over acquiring references to pages that are already allocated.

This can be broken into three independent milestones:

1. The queues of pages that are currently free. Although there are per-CPU queues, access to them is serialized under a single global lock, uvm_fpageqlock. Instead, each CPU should access its own queue locally, without talking to the other CPUs unless its own queue runs out.
2. The queues of cached pages that can be freed. When memory is short, the part of the system called the page daemon must choose some cached pages to free. The page daemon currently maintains queues of active and inactive pages so that it can prefer freeing inactive pages. Access to these queues is serialized by another global lock, uvm_pageqlock. Instead, the page daemon should be modified so that adjusting page activity is not globally serialized.
3. References to pages that are already allocated. Acquiring references to physical pages from a frequently used virtual memory object, such as the libc.so file, is serialized by a per-file lock, the vmobjlock. This lock is a bottleneck for, e.g., executing new processes. The nature of the contention needs to be analyzed: is the bottleneck in acquiring different physical pages from the VM object, or the same ones? If it's mostly different physical pages, breaking vmobjlock into multiple locks may suffice; if it's mostly the same physical pages, a new strategy is needed, e.g. perhaps a lockless radix tree.
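
To make the first milestone concrete, here is a minimal userland sketch of per-CPU free queues where the common path touches only the local queue and other CPUs' queues are consulted only when the local one is empty. It is not UVM code and not necessarily how the completed work was implemented: NCPU, struct cpu_freeq, page_alloc, page_free and the spinlock helpers are all hypothetical names invented for illustration, and a C11 atomic_flag stands in for a kernel lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPU 4                      /* hypothetical CPU count for this sketch */

struct page {
    struct page *next;              /* free-list linkage */
};

/* One free queue per CPU, each with its own lock; there is no global lock. */
struct cpu_freeq {
    atomic_flag lock;               /* per-queue spinlock (stand-in for a kernel lock) */
    struct page *head;              /* LIFO list of free pages */
    size_t count;                   /* number of pages on the list */
};

static struct cpu_freeq freeq[NCPU];

static void freeq_init(void)
{
    for (unsigned i = 0; i < NCPU; i++) {
        atomic_flag_clear(&freeq[i].lock);
        freeq[i].head = NULL;
        freeq[i].count = 0;
    }
}

static void freeq_lock(struct cpu_freeq *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        continue;                   /* spin until the flag was previously clear */
}

static void freeq_unlock(struct cpu_freeq *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Remove one page from a single queue, or return NULL if it is empty. */
static struct page *freeq_pop(struct cpu_freeq *q)
{
    freeq_lock(q);
    struct page *pg = q->head;
    if (pg != NULL) {
        q->head = pg->next;
        q->count--;
        pg->next = NULL;
    }
    freeq_unlock(q);
    return pg;
}

/* Return a page to the queue of the CPU that is freeing it. */
static void page_free(unsigned cpu, struct page *pg)
{
    assert(cpu < NCPU);
    struct cpu_freeq *q = &freeq[cpu];
    freeq_lock(q);
    pg->next = q->head;
    q->head = pg;
    q->count++;
    freeq_unlock(q);
}

/*
 * Allocate from the local queue first; only when it is empty, visit the
 * other CPUs' queues in turn and take a page from the first non-empty one.
 */
static struct page *page_alloc(unsigned cpu)
{
    assert(cpu < NCPU);
    struct page *pg = freeq_pop(&freeq[cpu]);
    for (unsigned i = 1; pg == NULL && i < NCPU; i++)
        pg = freeq_pop(&freeq[(cpu + i) % NCPU]);
    return pg;
}
```

In this sketch two CPUs contend only when one of them has run dry and reaches into the other's queue. A real kernel would also have to deal with preemption, page colouring and memory ordering between CPUs, none of which are modelled here.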

Last edited late Saturday afternoon, March 28th, 2020