This is due to many people spending significant amounts of time fixing bugs, submitting problem reports, improving tests, and (not unimportant in the cases at hand) improving the in-tree toolchain.
We collect the results of various regular test runs on the releng test results page, but that page does not give a good overview of the failures found by the different runs. I proposed a GSoC project to help with this; the idea is to collect all test results in a database and build query and statistics pages on top of it. If you are a student and interested in this, please check the project description.
For the time being, here is an overview of the top performers (fewer than 10 failures) on this week's tests against -current:
And the same table for the netbsd-7 branch:
I will work on getting the shark results down to zero next (it is the straggler in my own runs), and then hppa (hi Nick!) and sparc. The ultimate target is my VAX, but I am fighting with various libm and toolchain issues there, so this will take more effort.