This page collects all the available project proposals in a single document to permit easy searching within the page.

Note that this page also contains an Atom/RSS feed. If you are interested in being notified about changes to the projects whenever these pages get updated, you should subscribe to this feed.

If one embarks on a set of package updates and it doesn't work out too well, it is nice to be able to roll back to a previous state. This entails two things: first, reverting the set of installed packages to what it was at some chosen prior time, and second, reverting changes to configuration, saved application state, and other such material that might be needed to restore a fully working setup.

This project is about the first part, which is relatively tractable. The second part is Hard(TM), but it's pointless to even speculate about it until we can handle the first part, which we currently can't. Also, in many cases the first part is perfectly sufficient to recover from a problem.

Ultimately the goal would be to be able to do something like pkgin --revert yesterday, but a number of intermediate steps are needed first to provide the infrastructure; after that, adding the support to pkgin (or whatever else) should be straightforward.

The first thing we need is a machine-readable log somewhere of package installs and deinstalls. Any time a package is installed or removed, pkg_install adds a record to this log. It also has to be possible to trim the log so it doesn't grow without bound -- it might be interesting to keep the full history of all package manipulations over ten years, but for many people that's just a waste of disk space. Adding code to pkg_install to do this should be straightforward; the chief issues are

  • choosing the format
  • deciding where to keep the data

both of which require some measure of community consensus.

Preferably, the file should be text-based so it can be manipulated by hand if needed. If not, there ought to be some way to recover it if it gets corrupted. Either way the file format needs to be versioned and extensible to allow for future changes.

The file format should probably have a way to enter snapshots of the package state in addition to change records; otherwise computing the state at a particular point requires scanning the entire file. Note that this is very similar to deltas in version control systems and there's a fair amount of prior art.
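
Purely for illustration, the pkg_install side could be quite small; the path, the "v1" version tag, and the one-line record layout below are all invented here, since choosing them is exactly what this project has to decide:

#include <stdio.h>
#include <time.h>

/* Hypothetical append-only logger; format and location are not decided. */
static int
log_pkg_event(const char *event, const char *pkg)
{
    char stamp[32];
    time_t now = time(NULL);
    FILE *f;

    strftime(stamp, sizeof(stamp), "%Y-%m-%dT%H:%M:%SZ", gmtime(&now));
    if ((f = fopen("/var/db/pkg/pkglog", "a")) == NULL)
        return -1;
    /* e.g. "v1 2014-08-30T04:12:00Z + vim-7.4.034" ("+" install, "-" removal) */
    fprintf(f, "v1 %s %s %s\n", stamp, event, pkg);
    return fclose(f);
}

A real format would additionally need snapshot records and a versioning/extension strategy, which is part of the design work described above.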

Note that we can already almost do this using /var/backups/work/pkgs.current,v; but it only updates once a day and the format has assorted drawbacks for automated use.

The next thing needed is a tool (maybe part of pkg_info, maybe not) to read this log and both (a) report the installed package state as of a particular point in time, and (b) print the differences between then and now, or between then and some other point in time.

Given these two things it becomes possible to manually revert your installed packages to an older state by replacing newer packages with older packages. There are then two further things to do:

  1. Arrange a mechanism to keep the .tgz files for old packages on file.

With packages one builds oneself, this can already be done by having them accumulate in /usr/pkgsrc/packages; however, that has certain disadvantages, most notably that old packages have to be pruned by hand. Also, for downloaded binary packages no comparable scheme exists yet.

  2. Provide a way to compute the set of packages to alter, install, or remove to switch to a different state. This is closely related to, though not quite the same as, the update computations that tools like pkgin and pkg_rolling-replace do, so it probably makes sense to implement this in one or more of those tools rather than in pkg_install; but perhaps not.

There are some remaining issues, some of which aren't easily solved without strengthening other things in pkgsrc. The most notable one is: what about package options? If I rebuild a package with different options, it's still the "same" package (same name, same version) and even if I record the options in the log, there's no good way to distinguish the before and after binary packages.

Posted terribly early Saturday morning, August 30th, 2014 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Some of the games in base would be much more playable with even a simple graphic interface. The card games (e.g. canfield, cribbage) are particularly pointed examples; but quite a few others, e.g. atc, gomoku, hunt, and sail if anyone ever fixes its backend, would benefit as well.

There are two parts to this project: the first and mostly easy part is to pick a game and write an alternate user interface for it, using SDL or gtk2 or tk or whatever seems appropriate.

The hard part is to arrange the system-level infrastructure so that the new user interface appears as a plugin for the game that can be installed from pkgsrc but that gets found and run if appropriate when the game is invoked from /usr/games. The infrastructure should be sufficiently general that lots of programs in base can have plugins.

Some things this entails:

  • possibly setting up a support library in the base system for program plugins, if it appears warranted;

  • setting up infrastructure in the build system for programs with plugins, if needed;

  • preferably also setting up infrastructure in the build system for building plugins;

  • choosing a place to put the header files needed to build external plugins;

  • choosing a place to put plugin libraries, too, as there isn't a common model out there yet;

  • establishing a canonical way for programs in base to find things in pkgsrc, which is not technically difficult but will require a lot of wrangling to reach a community consensus;

  • setting up any pkgsrc infrastructure needed to build plugin packages for base (this should take little or no work).

It is possible that plugins warrant a toplevel directory under each prefix rather than being stuffed in lib/ directories; e.g. /usr/plugins, /usr/pkg/plugins, /usr/local/plugins, so the pkgsrc-installed plugins for e.g. rogue would go by default in /usr/pkg/plugins/rogue.
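
To make the shape of the problem concrete, here is a minimal sketch of how a game might probe for such a plugin at startup; the directory layout and the ui_plugin_init symbol are hypothetical, not an existing or proposed interface:

#include <dlfcn.h>
#include <stdio.h>

static int
load_ui_plugin(const char *game)
{
    char path[128];
    void *handle;
    int (*init)(void);

    snprintf(path, sizeof(path), "/usr/pkg/plugins/%s/ui.so", game);
    if ((handle = dlopen(path, RTLD_NOW | RTLD_LOCAL)) == NULL)
        return -1;                      /* no plugin: fall back to curses UI */
    if ((init = (int (*)(void))dlsym(handle, "ui_plugin_init")) == NULL) {
        dlclose(handle);
        return -1;
    }
    return (*init)();
}

The interesting part of the project is everything around this: where the headers live, how the plugin ABI is versioned, and how base programs agree on the search path.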

Note that while this project is disguised as a frivolous project on games, the infrastructure created will certainly be wanted in the medium to long term for other more meaty things. Doing it on games first is a way to get it done without depending on or conflicting with much of anything else.

Posted Wednesday afternoon, April 2nd, 2014 Tags: ?category:desktop ?difficulty:medium ?project ?status:active

Right now if you install NetBSD with native X11 you get the same default X session that X has shipped forever, complete with TWM. Just like it's 1989.

Obviously this behavior can be improved after installation by adding suitable packages and choosing a different window manager, or installing one of the full desktop environments. Veteran NetBSD users know how to do this. Veteran Unix and X users who aren't veteran NetBSD users can figure it out pretty easily. Most such people have an X setup they've been using for years that they can and do just plop into place on a new machine.

This project does not affect those users, even the ones who for some crazy reason want to use twm. It also does not affect users who just want to install GNOME, KDE, or whatever other desktop.

The goal of this project is to provide a less knobby default X environment for users who are willing to run plain X rather than a full desktop but do not have their own X setup.

The project is: choose a window manager suitable for replacing twm in base; import it into base; and update the default X config to provide a decent default session using it.

Note that it needs to be a mainstream window manager; tiling window managers and other exotica are not really suitable for the purpose. It should also not depend on anything that isn't in base (that is, base with native X), it should be smallish and not a bloatmonster, and ideally it should be BSD-licensed.

Based on the window managers in pkgsrc, there are two chief candidates: fluxbox and golem. Each has pluses and minuses. Some of these are:

  • golem is prettier and fully themeable
  • golem is written in C, fluxbox in C++
  • golem is a lot smaller (less than 1/10th the code size)
  • golem currently needs some work, chiefly on the config system; fluxbox appears to be ready to go
  • golem is just about dead upstream; fluxbox is not very active but is still getting occasional releases

Currently it looks like golem is a better choice, especially since ideally we'd like to pick something that will run decently on older/slower ports. But it requires more work.

Note that we could (AFAIK) become upstream for golem; this is not necessarily a bad thing.

There might be other possible choices; these two are the ones in pkgsrc that seem viable.

Note: this project is not listed as a GSoC project (although doing the needed work on golem maybe could be one) because much of it has to do with build system integration and ancient X things and other stuff that requires experience a student probably won't have.

Posted Wednesday afternoon, April 2nd, 2014 Tags: ?category:desktop ?difficulty:medium ?project ?status:active

In MacOS X, while logged in on the desktop, you can switch to another user (leaving yourself logged in) via a system menu.

The best approximation to this we have right now is to hit ctrl-alt-Fn, switch to a currently unused console, log in, and run startx with an alternate screen number. This has a number of shortcomings, both from the point of view of general polish (logging in on a console and running startx is very untidy, and you have to know how) and of technical underpinnings (starting multiple X servers uses buckets of memory, may cause driver or drmgem issues, acceleration may not work except in the first X server, etc.)

Ideally we'd have a better scheme for this. We don't necessarily need something as slick as OS X provides, and we probably don't care about Apple's compositing effects when switching, but it's useful to be able to switch users as a way of managing least privilege and it would be nice to make it easy to do.

The nature of X servers makes this difficult; for example, it isn't in any way safe to use the same X server for more than one user. It also isn't safe to connect a secure process to a user's X server to display things.

It seems that the way this would have to work is akin to job control: you have a switching supervisor process, which is akin to a shell, that runs as root in order to do authentication and switches (or starts) X servers for one or more users. The X servers would all be attached, I guess, to the same graphics backend device (wsfb, maybe? wsdrmfb?) and be switched in and out of the foreground in much the way console processes get switched in and out of the foreground on a tty.

You have to be able to get back to the switching supervisor from whatever user and X server you're currently running in. This is akin to ^Z to get back to your shell in job control. However, unlike in job control there are security concerns: the key combination has to be something that a malicious application, or even a malicious X server, can't intercept. This is the "secure attention key". Currently even the ctrl-alt-Fn sequences are handled by the X server; supporting this will take quite a bit of hacking.

Note that the secure attention key will also be wanted for other desktop things: any scheme that provides desktop-level access to system configuration needs to be able to authenticate a user as root. The only safe way to do this is to notify the switching supervisor and then have the user invoke the secure attention key; then the user authenticates to the switching supervisor, and that in turn provides some secured channel back to the application. This avoids a bunch of undesirable security plumbing as is currently found in (I think) GNOME; it also avoids the habit GNOME has of popping up unauthenticated security dialogs asking for the root password.

Note that while the switching supervisor could adequately run in text mode, making a nice-looking graphical one won't be difficult. Also that would allow the owner of the machine to configure the appearance (magic words, a custom image, etc.) to make it harder for an attacker to counterfeit the thing.

It is probably not even possible to think about starting this project until DRM/GEM/KMS stuff is more fully deployed, as the entire scheme presupposes being able to switch between X servers without the X servers' help.

Posted Wednesday afternoon, April 2nd, 2014 Tags: ?category:desktop ?difficulty:medium ?project ?status:active

Support for "real" desktops on NetBSD is somewhat limited and also scattershot. Part of the reason for this is that most of the desktops are not especially well engineered, and they contain poorly designed infrastructure components that interact incestuously with system components that are specific to and/or make sense only on Linux. Providing the interfaces these infrastructure components wish to talk to is a large job, and Linux churn being what it is, such interfaces are usually moving targets anyway. Reimplementing them for NetBSD is also a large job and often presupposes things that aren't true or would require that NetBSD adopt system-level design decisions that we don't want.

Rather than chase after Linux compatibility, we are better off designing the system layer and infrastructure components ourselves. It is easier to place poorly conceived interfaces on top of a solid design than to reimplement a poorly conceived design. Furthermore, if we can manage to do a few things right we should be able to get at least some uptake for them.

The purpose of this project, per se, is not to implement any particular piece of infrastructure. It is to identify pieces of infrastructure that the desktop software stacks need from the operating system, or provide themselves but ought to get from the operating system, figure out solid designs that are compatible with Unix and traditional Unix values, and generate project descriptions for them in order to get them written and deployed.

For example, GNOME and KDE rely heavily on dbus for sending notifications around. While dbus itself is a more or less terrible piece of software that completely fails to match the traditional Unix model (rather than doing one thing well, it does several things at once, pretty much all badly) it is used by GNOME and KDE because there are things a desktop needs to do that require sending messages around. We need a transport for such messages; not a reimplementation of dbus, but a solid and well-conceived way of sending notices around among running processes.
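
None of the following is a proposed design; it only illustrates that the raw primitive (a datagram on an AF_LOCAL socket) already exists in base, so the real work is in the naming, framing, and security model layered on top. The socket path and message text are invented:

#include <sys/socket.h>
#include <sys/un.h>
#include <err.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    struct sockaddr_un sun;
    const char msg[] = "display.brightness changed";
    int s;

    if ((s = socket(AF_LOCAL, SOCK_DGRAM, 0)) == -1)
        err(1, "socket");
    memset(&sun, 0, sizeof(sun));
    sun.sun_family = AF_LOCAL;
    strlcpy(sun.sun_path, "/var/run/notifyd.sock", sizeof(sun.sun_path));
    if (sendto(s, msg, sizeof(msg), 0,
        (const struct sockaddr *)&sun, sizeof(sun)) == -1)
        err(1, "sendto");
    close(s);
    return 0;
}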

Note that one of the other things dbus does is start services up; we already have inetd for that, but it's possible that inetd could use some strengthening in order to be suitable for the purpose. It will probably also need to be taught about whatever message scheme we come up with. And we might want to e.g. set up a method for starting per-user inetd at login time.

(A third thing dbus does is serve as an RPC library. For this we already have XDR, but XDR sucks; it may be that in order to support existing applications we want a new RPC library that's more or less compatible with the RPC library parts of dbus. Or that may not be feasible.)

This is, however, just one of the more obvious items that arises.

Your goal working on this project is to find more. Please edit this page and make a list, and add some discussion/exposition of each issue and possible solutions. Ones that become fleshed out enough can be moved to their own project pages.

Posted Wednesday afternoon, April 2nd, 2014 Tags: ?category:desktop ?difficulty:medium ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 3-9 months?

The ZFS port to NetBSD is half-done, or maybe more than half. Finish it and get it really running.

(This page is a stub. Someone who knows what the current state is, please expand it.)

Posted early Thursday morning, February 27th, 2014 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active

Port or implement NFSv4.

As of this writing the FreeBSD NFS code, including NFSv4, has been imported into the tree but no work has yet been done on it. The intended plan is to port it and eventually replace the existing NFS code, which nobody is particularly attached to.

It is unlikely that writing new NFSv4 code is a good idea, but circumstances might change.

Posted terribly early Thursday morning, February 27th, 2014 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 3-6 months

Port Dragonfly's Hammer file system to NetBSD.

Note that because Dragonfly has made substantial changes (different from NetBSD's substantial changes) to the 4.4BSD vfs layer this port may be more difficult than it looks up front.

Posted terribly early Thursday morning, February 27th, 2014 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

The U-Boot bootloader is being used by an increasing number of devices, including lots which run NetBSD. The NetBSD ${TOOLDIR} infrastructure even includes the mkubootimage program, which is used to wrap binaries (e.g. a kernel) into an image format understood by U-Boot.

While the ${TOOLDIR} provides all toolchain bits needed (cross-compiler, assembler, linker, ...) and can be created by the build.sh script automatically for any supported architecture, U-Boot itself needs to be compiled natively on a Linux machine.

The purpose of this project is to fix this, and feed as much as possible of the resulting changes upstream to the U-Boot developers.

If possible, the result should be a pkgsrc package that only needs to be pointed at a pre-populated tooldir for building. But a simple BSD-style makefile would be good enough as well.

An optional extension (but unlikely to be doable within the GSoC timescale): add support for loading from an FFS file system to U-Boot. This would allow devices with SATA or USB disks to load the kernel directly from the root file system (while currently a wrapped .ub copy of the kernel has to be put into flash or on an SD card).

Posted early Wednesday morning, February 19th, 2014 Tags: ?category:userland ?difficulty:medium ?project ?status:active

An important part of a binary-only environment is to support something similar to the existing options framework, but producing different combinations of binary packages.

There is an undefined amount of work required to accomplish this, and also a set of standards and naming conventions to be applied to the output of these packages.

This project should also save on having to rebuild "base" components of software to support different options: see amanda-base for an example.

OpenBSD already supports this and should be borrowed from heavily.

Posted Thursday evening, October 3rd, 2013 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Improve support for NetBSD in various GNOME packages.

The easy targets are adding more NetBSD support to the various system and network monitoring components (which already support some things).

A harder step is to port the devicekit and libudev parts of GNOME.

Posted Tuesday evening, July 30th, 2013 Tags: ?category:pkgsrc ?difficulty:high ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 3 months

Lua is obviously a great choice for an embedded scripting language in both kernel space and user space.

However, there is an issue: Lua needs bindings to NetBSD interfaces before it can do anything useful there.

Someone needs to take on the grunt-work task of simply creating Lua interfaces into every possible NetBSD component.

That same someone needs to produce meaningful documentation.

As this enhances Lua's usefulness, more Lua bindings will then be created, and so on.

This project's goal is to start that momentum by defining and creating a solid baseline of Lua bindings.

Here is a list of things to possibly target:

  • filesystem hooks
  • sysctl hooks
  • proplib
  • envsys
  • MI system bus code

WIP
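
For a flavour of what a single binding might look like, here is a minimal userland sketch wrapping sysctlbyname(3); the module layout and function names are purely illustrative, and it assumes Lua 5.2 or newer for luaL_newlib:

#include <sys/param.h>
#include <sys/sysctl.h>

#include <lua.h>
#include <lauxlib.h>

/* sysctl.get("kern.ostype") -> raw bytes of the sysctl value */
static int
l_sysctl_get(lua_State *L)
{
    const char *name = luaL_checkstring(L, 1);
    char buf[256];
    size_t len = sizeof(buf);

    if (sysctlbyname(name, buf, &len, NULL, 0) == -1)
        return luaL_error(L, "sysctlbyname(%s) failed", name);
    lua_pushlstring(L, buf, len);
    return 1;
}

int
luaopen_sysctl(lua_State *L)
{
    static const luaL_Reg fns[] = {
        { "get", l_sysctl_get },
        { NULL, NULL }
    };

    luaL_newlib(L, fns);
    return 1;
}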

Posted late Sunday afternoon, June 2nd, 2013 Tags: ?category:misc ?difficulty:medium ?project ?status:active

Right now kqueue, the kernel event mechanism, only attaches to individual files. This works great for sockets and the like but doesn't help for a directory full of files.
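
For reference, this is roughly what the existing per-file interface looks like when pointed at a directory: it tells you that something in /tmp changed, but not which entry, which is the gap this project is about. A minimal sketch using only documented kqueue(2) calls:

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <fcntl.h>
#include <err.h>

int
main(void)
{
    struct kevent ev;
    int kq, fd;

    if ((kq = kqueue()) == -1)
        err(1, "kqueue");
    if ((fd = open("/tmp", O_RDONLY)) == -1)
        err(1, "open");
    EV_SET(&ev, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
        NOTE_WRITE | NOTE_EXTEND | NOTE_DELETE | NOTE_RENAME, 0, 0);
    if (kevent(kq, &ev, 1, NULL, 0, NULL) == -1)
        err(1, "kevent");
    /* Blocks until the directory changes; we only learn *that* it changed. */
    if (kevent(kq, NULL, 0, &ev, 1, NULL) == -1)
        err(1, "kevent");
    return 0;
}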

The end result should be feature parity with Linux's inotify (a single directory's worth of notifications, not necessarily sub-directories), and the design must be good enough to be accepted by other users of kqueue: OS X, FreeBSD, etc.

For an overview of inotify, start with Wikipedia.
Another example API is the Windows event API.

I believe most of the work will be around genfs, namei, and vfs.

Posted at lunch time on Monday, April 15th, 2013 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

pkgin is aimed at being an apt-/yum-like tool for managing pkgsrc binary packages. It relies on pkg_summary(5) for installation, removal and upgrade of packages and associated dependencies, using a remote repository.

While pkgin is now widely used and seems to handle package installation, upgrade and removal correctly, there's room for improvement. In particular:

Main quest

  • Support for multi-repository
  • Speed-up installed packages database calculation
  • Speed-up local vs remote packages matching
  • Implement an automated-test system
  • Handle conflicting situations (MySQL conflicting with MySQL...)
  • Better logging for installed / removed / warnings / errors
  • Make pkgin independent from pkg_install binaries, use pkg_install libraries or abstract them

Bonus I

In order to ease pkgin's integration with third party tools, it would be useful to split it into a library (libpkgin) providing all methods needed to manipulate packages, i.e., all pkgin's runtime capabilities and flags.
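
A purely hypothetical flavour of what such a library entry point could look like (no such header exists; every name below is invented for illustration):

#ifndef _PKGIN_H_
#define _PKGIN_H_

typedef struct pkgin_ctx pkgin_ctx_t;       /* opaque library handle */

pkgin_ctx_t *pkgin_open(const char *repo_url, const char *dbdir);
int          pkgin_update_summary(pkgin_ctx_t *);
int          pkgin_install(pkgin_ctx_t *, const char *pkgname);
int          pkgin_remove(pkgin_ctx_t *, const char *pkgname);
int          pkgin_upgrade(pkgin_ctx_t *);
void         pkgin_close(pkgin_ctx_t *);

#endif /* _PKGIN_H_ */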

Bonus II (operation "burn the troll")

It would be a very nice addition to abstract the SQLite backend so pkgin could be plugged into any database system. A proof of concept using bdb or cdb would fulfill the task.

Useful steps:

  • Understand pkgsrc
  • Understand pkg_summary(5)'s logic
  • Understand SQLite integration
  • Understand pkgin's `impact.c' logic
Posted early Tuesday morning, April 9th, 2013 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active
  • Contact: tech-pkg
  • Duration estimate: 1 month

Mancoosi is a project to measure distribution quality by the number of conflicts in a software distribution environment.

The main work of this project would be to analyse possible incompatibilities between pkgsrc and Mancoosi (i.e., incompatibilities between Debian and pkgsrc) and to write a converter from pkg_summary(5) to CUDF (the format used by Mancoosi). You will need OCaml for doing so, as the libraries used by Mancoosi are written in OCaml.

If you are a French student in the third year of your Bachelor's or the first year of your Master's, you could also do this project as an internship for your studies.

Posted at lunch time on Wednesday, March 27th, 2013 Tags: ?category:pkgsrc ?difficulty:easy ?project ?status:active

NOTE: this project has been partly completed during GSoC 2013, the remaining cleanup work and tidying is likely not enough work for another GSoC!

Oracle VM Virtualbox is a virtualization platform that already runs NetBSD guest machines fine.

However, some fine-tuning is missing: mouse and graphics support, better integration via client tools (clipboard sharing, time synchronization), and maybe some possible optimizations.

FreeBSD is supported, and the sources are available. Note that the FreeBSD kernel code might have license problems.

As a start there is a pkg of the build tools in pkgsrc/wip already (an ancient version), and Valeriy Ushakov has patches for current VirtualBox sources and NetBSD 5, to get the build tools going.

Overall this project is a mixture of various kernel and userland tasks. For GSoC, some of these may be defined as optional - e.g. working X11 should be a target, but could be dropped if other things turn out to be more difficult than expected. A list of things to check and improve could include:

  • time synchronization
  • proper mouse integration within X (absolute coordinates)
  • shared clipboard support
  • higher video resolutions
  • hardware acceleration (XVideo, DRI...)

You need a machine with hardware virtualization support (VT-x/Intel or AMD-V) for reasonable guest machine speed. Making NetBSD guests fast without this would be possible with a few tweaks, but that is not part of this project (and maybe not even a reasonable goal beyond GSoC).

Technical documentation about VirtualBox

Build environment documentation

Build tool sources

Posted mid-morning Thursday, March 21st, 2013 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Several ports support a /boot.cfg file. Examples are i386, amd64 (which share a bootloader) and sparc64.

However, they do not share code, nor even the basic command parser. Obviously this is not the NetBSD way to do things and the situation needs to be improved.

The goal of this project is to split the machine dependent parts out and provide generic, machine independent support for most of the /boot.cfg handling, leaving configuration (like what commands are allowed) to the architecture specific code, as well as provide means for overriding command handlers (i.e. implement a common command differently).

Due to the organization of the bootstrapping code this is not as easy as it sounds at first sight, but it does not require deep kernel hacking skills either. Having various hardware available for testing is a bonus, but not required.

Posted Sunday evening, March 17th, 2013 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

WARNING: This project is very preliminary and incomplete.

NPF is a packet filter for the NetBSD system. The goal of this project is to create a web-based user interface for NetBSD + NPF as a firewall system.

High level deliverables are the following:

  • Web interface using HTML, CSS, Python or other languages. No PHP, sorry!
  • The interface should be able to manage the following: set the interface addresses and route, configure DHCP, manage firewall rules, NAT (including port redirection), wireless AP.
  • Ideally, it would be packaged and integrated with NetBSD's startup system.

Comparable products: http://www.polarcloud.com/tomato http://openwrt.org http://www.pfsense.org http://m0n0.ch/wall http://www.dd-wrt.com

Posted Saturday evening, March 16th, 2013 Tags: ?category:userland ?difficulty:hard ?project ?status:active

NPF is a packet filter for the NetBSD system. For the packet classification engine it uses rules which are compiled into byte-code. Recently, support for BPF byte-code has been added (see the bpf(4) and pcap(3) manual pages). NPF has a syntax parser which generates intermediate structures. Combined with the pcap(3) library, generating BPF byte-code is almost trivial. However, there is no code to perform the inverse process, i.e. generate npf.conf(5) syntax from the BPF byte-code.

The goal of this project is to write an unparser which would generate NPF configuration in npf.conf(5) syntax from the BPF byte-code and which would use the techniques researched in academia. It is important to note that the key deliverable of this project is a good design, which would be structured and easy to maintain. Therefore, academic experience with parsers and unparsers is a prerequisite.

High level deliverables are the following:

  • Academic reasoning on the design and the structure of the unparser, taking into account the flexibility of the configuration syntax.
  • Implementation of the BPF byte-code unparser into npf.conf(5).
  • Integration into the npfctl(8) utility.
Posted late Wednesday evening, March 6th, 2013 Tags: ?category:userland ?difficulty:hard ?project ?status:active

pygrub, which is part of the xentools package, allows booting a Linux domU without having to keep a copy of the kernel and ramdisk in the dom0 file system.

What it does is extract the grub.conf from the domU disk, process it, and present a grub-like menu. It then extracts the appropriate kernel and ramdisk image from the domU disk and stores temporary copies of them. Finally, it creates a file containing:

  • kernel=
  • ramdisk=
  • extras=

This file gets processed by xend as if it were part of the domU config.

Most of the code required to do the NetBSD equivalent should be available as part of /boot or the libraries it depends on. The path to the source code for /boot is: sys/arch/i386/stand/boot.

Posted at midnight, December 27th, 2012 Tags: ?category:xen ?difficulty:medium ?project ?status:active

Recent versions of rpmbuild for Red Hat create a second RPM, for later installation, containing debug symbols.

This is a very convenient system for having symbols available for debugging but not clogging up the system until crashes happen and debugging is required.

The end result of this project should be a mk.conf flag or build target (debug-build) which would result in a set of symbol files gdb could use, and which could be distributed with the package.

Posted at lunch time on Sunday, December 2nd, 2012 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active
  • Contact: port-xen ???
  • Duration estimate: 3 months

Microsoft's cloud platform Windows Azure has EC2-like features for creating custom virtual disks.

Would it be possible to push NetBSD onto this platform?

Wikipedia article describing the virtualization platform

Marketing material about azure virtual machines

FreeBSD drivers for the Hyper-V virtual machine interface

Posted in the wee hours of Saturday night, October 21st, 2012 Tags: ?category:kernel ?difficulty:medium ?project ?status:active
  • Contact: tech-pkg
  • Mentors: unknown
  • Duration estimate: unknown

Currently, bulk build results are sent to the pkgsrc-bulk mailing list. To figure out if a package is or has been building successfully, or when it broke, one must wade through the list archives and search the report e-mails by hand. Furthermore, to figure out what commits, if any, were correlated with a breakage, one must wade through the pkgsrc-changes archive and cross-reference manually.

The project is to produce a web/database application, run from the pkgsrc releng website on NetBSD.org, that tracks bulk build successes and failures and provides search and cross-referencing facilities.

The application should subscribe to the pkgsrc-bulk and pkgsrc-changes mailing lists and ingest the data it finds into a SQL database. It should track commits to each package (and possibly infrastructure changes that might affect all packages) on both HEAD and the current stable branch, and also all successful and failed build reports on a per-platform (OS and/or machine type) basis.

The web part of the application should be able to retrieve summaries of currently broken packages, in general or for a specific platform and/or specific branch. It should also be able to generate a more detailed report about a single package, containing for example which platforms it has been built on recently and whether it succeeded or not; also, if it is broken, how long it has been broken, and the history of package commits and version bumps vs. build results. There will likely be other searches/reports wanted as well.

The application should also have an interface for people who do partial or individual-package check builds; that is, it should be able to generate a list of packages that have not been built since they were last committed, on a given platform or possibly on a per-user basis, and accept results from attempting to build these or subsets of these packages. It is not entirely clear what this interface should be (and e.g. whether it should be command-line-based, web-based, or what, and whether it should be limited to developers) and it's reasonable to expect that some refinements or rearrangements to it will be needed after the initial deployment.

The application should also be able to record cross-references to the bug database. To begin with at least it's reasonable for this to be handled manually.

This project should be a routine web/database application; there is nothing particularly unusual about it from that standpoint. The part that becomes somewhat less trivial is making all the data flows work: for example, it is probably necessary to coordinate an improvement in the way bulk build results are tagged by platform. It is also necessary to avoid importing the reports that appear occasionally on pkgsrc-bulk from misconfigured pbulk installs.

Note also that "OS" and "machine type" are not the only variables that can affect build outcome. There are also multiple compilers on some platforms, for which the results should be tracked separately, plus other factors such as non-default installation paths. Part of the planning phase for this project should be to identify all the variables of this type that should be tracked.

Also remember that what constitutes a "package" is somewhat slippery as well. The pkgsrc directory for a package is not a unique key; multiversion packages, such as Python and Ruby extensions, generate multiple results from a single package directory. There are also a few packages where for whatever reason the package name does not match the pkgsrc directory. The best key seems to be the pkgsrc directory paired with the package-name-without-version.

Posted late Sunday night, April 9th, 2012 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Font handling in Unix has long been a disgrace. Every program that needs to deal with fonts has had to do its own thing, and there has never been any system-level support, not even to the point of having a standard place to install fonts in the system. (While there was/is a place to put X11 fonts, X is only one of the many programs doing their own thing.)

Font management should be a system service. There cannot be one type of font file -- it is far too late for that -- but there should be one standardized place for fonts, one unified naming scheme for fonts and styles, and one standard way to look up and open/retrieve/use fonts. (Note: "one standardized place" means relative to an installation prefix, e.g. share/fonts.)

Nowadays fontconfig is capable of providing much of this.

The project is:

  1. Figure out for certain if fontconfig is the right solution. Note that even if the code isn't, the model may be. If fontconfig is not the right solution, figure out what the right solution is. Also ascertain whether existing applications that use the fontconfig API can be supported easily/directly or if some kind of possibly messy wrapper layer is needed and doing things right requires changing applications. Convince the developer community that your conclusions are correct so that you can go on to step 2.

1a. Part of this is identifying exactly what is involved in "managing" and "handling" fonts and what applications require.

  2. Implement the infrastructure. If fontconfig is the right solution, this entails moving it from the X sources to the base sources. Also, some of the functionality/configurability of fontconfig is probably not needed in a canonicalized environment. All of this should be disabled if possible. If fontconfig is not the right solution, implement something else in base.

  3. Integrate the new solution in base. Things in base that use fonts should use fontconfig or the replacement for fontconfig. This includes console font handling, groff, mandoc, and X11. Also, the existing fonts that are currently available only to subsets of things in base should be made available to all software through fontconfig or its replacement.

3a. It would also be useful to kill off the old xfontsel and replace it with something that interoperates fully with the new system and also has a halfway decent user interface.

  4. Deploy support in pkgsrc. If the new system does not itself provide the fontconfig API, provide support for that via a wrapper package. Teach applications in pkgsrc that use fontconfig to recognize and use the new system. (This should be fairly straightforward and not require patching anything.) Make pkgsrc font installation interoperate with the new system. (This should be fairly straightforward too.) Take an inventory of applications that use fonts but don't use fontconfig. Patch one or two of these to use the new system to show that it can be done easily. If the new system is not fontconfig and has its own better API, take an inventory of applications that would benefit from being patched to use the new API. Patch one or two of these to demonstrate that it can be done easily. (If the answers from step 1 are correct, it should be fairly easy for most ordinary applications.)

  5. Persuade the rest of the world that we've done this right and try to get them to adopt the solution. This is particularly important if the solution is not fontconfig. Also, if the solution is not fontconfig, provide a (possibly limited) version of the implementation as a pkgsrc package that can be used on other platforms by packages and applications that support it.

Note that step 1 is the hard part. This project requires a good bit of experience and familiarity with Unix and operating system design to allow coming up with a good plan. If you think it's obvious that fontconfig is the right answer and/or you can't see why there might be any reason to change anything about fontconfig, then this may not be the right project for you.

Because of this, this project is really not suitable for GSoC.

Posted at midnight, March 26th, 2012 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Dependency handling in pkgsrc is a rather complex task. There are cases (TeX packages, Perl packages) where it is hard to determine build dependencies precisely, so the whole thing is handled conservatively. E.g. the whole TeXLive meta-package is declared a build dependency even when only a rather small fraction of it is actually used. Another case is a stale heavy dependency which is no longer required but is still listed as a prerequisite.

It would be nice to have a tool (or a set of them, if necessary) to detect which installed packages, libraries or tools were actually used to build a new package. Ideally, the tool should report the files used during the configure, build, and test stages, and the packages these files are provided by.

Posted late Saturday evening, March 17th, 2012 Tags: ?category:pkgsrc ?difficulty:hard ?project ?status:active

There are two ways of launching processes: one is forking with fork or vfork and replacing the clone's image with an exec-family function; the other is spawning the process directly with routines like posix_spawn. Not all platforms implement the fork model, and the spawn model has its own merits.

pkgsrc relies heavily on launching subprocesses when building software. NetBSD posix_spawn support was implemented in GSoC 2011 and is included in NetBSD 6.0. Now that NetBSD supports both ways, it would be nice to compare the efficiency of both ways of launching subprocesses and measure the effect when using pkgsrc (both in user and developer mode). In order to accomplish that, the following tools should support posix_spawn:

  • devel/bmake
  • shells/pdksh
  • NetBSD base make
  • NetBSD sh
  • NetBSD base ksh
  • potentially some other tools (e.g. lang/nawk, shells/bash, lang/perl5)

Optionally, MinGW spawn support can be added as well.

The main goal of this project is to support starting processes and subprocesses by posix_spawn in devel/bmake and in shells/pdksh, measure its efficiency and compare it to traditional fork+exec.
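
For comparison, the two launch styles side by side; this is a minimal userland sketch using only standard interfaces, not code taken from any of the tools listed above:

#include <sys/wait.h>
#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char **environ;

int
main(void)
{
    char *argv[] = { "echo", "hello", NULL };
    pid_t pid;
    int status;

    /* Traditional: duplicate the process, then replace the child's image. */
    if ((pid = fork()) == 0) {
        execvp("echo", argv);
        _exit(127);
    }
    waitpid(pid, &status, 0);

    /* posix_spawn: one call, no intermediate copy of the parent. */
    if (posix_spawnp(&pid, "echo", NULL, NULL, argv, environ) != 0) {
        perror("posix_spawn");
        exit(1);
    }
    waitpid(pid, &status, 0);
    return 0;
}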

Posted late Saturday evening, March 17th, 2012 Tags: ?category:pkgsrc ?difficulty:easy ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to implement interrupt handling at the granularity of a networking interface. When a network device gets an interrupt, it could call <iftype>_defer(ifp) to schedule a kernel continuation (see kernel continuations) for that interface which could then invoke <iftype>_poll. Whether the interrupting source should be masked depends on whether the device is a DMA device or a PIO device. This routine should then call (*ifp->if_poll)(ifp) to deal with the interrupt's servicing.

During servicing, any received packets should be passed up via (*ifp->if_input)(ifp, m), which would be responsible for ALTQ or any other optional processing as well as protocol dispatch. Protocol dispatch in <iftype>_input decodes the datalink headers, if needed, via a table lookup and calls the matching protocol's pr_input to process the packet. As such, interrupt queues (e.g. ipintrq) would no longer be needed. Any transmitted packets can be processed as can MII events. Either true or false should be returned by if_poll depending on whether another invocation of <iftype>_poll for this interface should be immediately scheduled or not, respectively.

Memory allocation has to be prohibited in the interrupt routines. The device's if_poll routine should pre-allocate enough mbufs to do any required buffering. For devices doing DMA, the buffers are placed into receive descriptors to be filled via DMA.

For devices doing PIO, pre-allocated mbufs are enqueued onto the softc of the device so when the interrupt routine needs one it simply dequeues one, fills it in, enqueues it onto a completed queue, and finally calls <iftype>_defer. If the number of pre-allocated mbufs drops below a threshold, the driver may decide to increase the number of mbufs that if_poll pre-allocates. If there are no mbufs left to receive the packet, the packet is dropped and the number of mbufs for if_poll to pre-allocate should be increased.

When interrupts should be unmasked depends on a few things. If the device is interrupting "too often", it might make sense for the device's interrupts to remain masked and just schedule the device's continuation for the next clock tick. This assumes the system has a high enough value set for HZ.
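
Putting the above together, the proposed flow for a hypothetical DMA-capable driver might look roughly like this; the _defer/_poll names come from the description above and do not exist yet, and the xge_* helpers are invented:

/* Hardware interrupt handler: no memory allocation, no packet processing. */
int
xge_intr(void *arg)
{
    struct ifnet *ifp = arg;

    xge_mask_interrupts(ifp);       /* DMA device: stay masked until polled */
    ether_defer(ifp);               /* schedule the kernel continuation */
    return 1;
}

/* Called from the continuation; drains completed receive descriptors. */
bool
xge_if_poll(struct ifnet *ifp)
{
    struct mbuf *m;

    while ((m = xge_next_completed_rx(ifp)) != NULL)
        (*ifp->if_input)(ifp, m);   /* ALTQ, protocol dispatch, ... */
    xge_unmask_interrupts(ifp);
    return false;                   /* no immediate re-poll needed */
}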

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to implement lockless and atomic FIFO/LIFO queues in the kernel. The routines to be implemented allow commonly typed items to be locklessly inserted at either the head or tail of a queue for last-in, first-out (LIFO) or first-in, first-out (FIFO) behavior, respectively. However, a queue is not intrinsically LIFO or FIFO. Its behavior is determined solely by which method each item was pushed onto the queue.

It is only possible for an item to be removed from the head of the queue. This removal is also performed in a lockless manner.

All items in the queue must share an atomic_queue_link_t member at the same offset from the beginning of the item. This offset is passed to atomic_qinit.

The proposed interface looks like this:

  • void atomic_qinit(atomic_queue_t *q, size_t offset);

    Initializes the atomic_queue_t queue at q. offset is the offset to the atomic_queue_link_t inside the data structure where the pointer to the next item in this queue will be placed. It should be obtained using offsetof.

  • void *atomic_qpeek(atomic_queue_t *q);

    Returns a pointer to the item at the head of the supplied queue q. If there was no item because the queue was empty, NULL is returned. No item is removed from the queue. Given this is an unlocked operation, it should only be used as a hint as to whether the queue is empty or not.

  • void *atomic_qpop(atomic_queue_t *q);

    Removes the item (if present) at the head of the supplied queue q and returns a pointer to it. If there was no item to remove because the queue was empty, NULL is returned. Because this routine uses atomic Compare-And-Store operations, the returned item should stay accessible for some indeterminate time so that other interrupted or concurrent callers to this function with this q can continue to dereference it without trapping.

  • void atomic_qpush_fifo(atomic_queue_t *q, void *item);

    Places item at the tail of the atomic_queue_t queue at q.

  • void atomic_qpush_lifo(atomic_queue_t *q, void *item);

    Places item at the head of the atomic_queue_t queue at q.
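
A usage sketch against the proposed interface above; none of these types or functions exist yet, so this only shows the intended shape:

struct pkt {
    struct mbuf        *p_mbuf;
    atomic_queue_link_t p_link;     /* link member at the same offset in
                                       every item placed on this queue */
};

static atomic_queue_t rxq;

void
rxq_init(void)
{
    atomic_qinit(&rxq, offsetof(struct pkt, p_link));
}

void
rxq_enqueue(struct pkt *p)
{
    atomic_qpush_fifo(&rxq, p);     /* insert at the tail: FIFO behaviour */
}

struct pkt *
rxq_dequeue(void)
{
    return atomic_qpop(&rxq);       /* removal is always from the head */
}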

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to make the SYN cache optional. For small systems, this is complete overkill and should be made optional.

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to remove the ARP, AARP, ISO SNPA, and IPv6 Neighbors from the routing table. Instead, the ifnet structure should have a set of nexthop caches (usually implemented using patricia trees), one per address family. Each nexthop entry should contain the datalink header needed to reach the neighbor.

This will remove cloneable routes from the routing table and remove the need to maintain protocol-specific code in the common Ethernet, FDDI, PPP, etc. code and put it back where it belongs, in the protocol itself.
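
One possible shape for such a cache entry, purely as illustration (all names below are invented):

struct nexthop_entry {
    struct sockaddr_storage ne_addr;        /* neighbor's L3 address */
    uint8_t                 ne_dlhdr[64];   /* prebuilt datalink header */
    uint8_t                 ne_dlhdrlen;
    time_t                  ne_expire;      /* ARP/ND expiry time */
};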

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to split out the obvious PR*_xxx requests that should never have been dispatched through pr_usrreq/pr_ctloutput. Note that pr_ctloutput should be replaced by pr_getopt/pr_setopt:

  • PRU_CONTROL -> pr_ioctl
  • PRU_PURGEIF -> pr_purgeif
  • PRCO_GETOPT -> pr_getopt
  • PRCO_SETOPT -> pr_setopt

It's expected that pr_drain will just schedule a kernel continuation such as:

  • pr_init -> int pr_init(void *dsc);
  • pr_fini -> int pr_fini(void *dsc);
Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to implement full virtual network stacks. A virtual network stack collects all the global data for an instance of a network stack (excluding AF_LOCAL). This includes the routing table, data for multiple domains and their protocols, and the mutexes needed for regulating access to it all. Such an instance of the networking stack is called a brane.

An interface belongs to a brane, as do processes. This can be considered a chroot(2) for networking, e.g. chbrane(2).

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to implement continuations at the kernel level. Most of the pieces are already available in the kernel, so this can be reworded as: combine callouts, softints, and workqueues into a single framework. Continuations are meant to be cheap; very cheap.

Please note that the main goal of this project is to simplify the implementation of SMP networking, so care must be taken in the design of the interface to support all the features required for this other project.

The proposed interface looks like the following. This interface is mostly derived from the callout(9) API and is a superset of the softint(9) API. The most significant change is that workqueue items are not tied to a specific kernel thread.

  • kcont_t *kcont_create(kcont_wq_t *wq, kmutex_t *lock, void (*func)(void *, kcont_t *), void *arg, int flags);

    A wq must be supplied. It may be one returned by kcont_workqueue_acquire or a predefined workqueue such as (sorted from highest priority to lowest):

    • wq_softserial, wq_softnet, wq_softbio, wq_softclock
    • wq_prihigh, wq_primedhigh, wq_primedlow, wq_prilow

    lock, if non-NULL, should be locked before calling func(arg) and released afterwards. However, if the lock is released and/or destroyed before the called function returns, then, before returning, kcont_set_mutex must be called with either a new mutex to be released or NULL. If acquiring lock would block, other pending kernel continuations which depend on other locks may be dispatched in the meantime. However, all continuations sharing the same set of { wq, lock, [ci] } need to be processed in the order they were scheduled.

    flags must be 0. This field is just provided for extensibility.

  • int kcont_schedule(kcont_t *kc, struct cpu_info *ci, int nticks);

    If the continuation is marked as INVOKING, an error of EBUSY should be returned. If nticks is 0, the continuation is marked as INVOKING while EXPIRED and PENDING are cleared, and the continuation is scheduled to be invoked without delay. Otherwise, the continuation is marked as PENDING while EXPIRED status is cleared, and the timer reset to nticks. Once the timer expires, the continuation is marked as EXPIRED and INVOKING, and the PENDING status is cleared. If ci is non-NULL, the continuation is invoked on the specified CPU if the continuation's workqueue has per-cpu queues. If that workqueue does not provide per-cpu queues, an error of ENOENT is returned. Otherwise when ci is NULL, the continuation is invoked on either the current CPU or the next available CPU depending on whether the continuation's workqueue has per-cpu queues or not, respectively.

  • void kcont_destroy(kcont_t *kc);

  • kmutex_t *kcont_getmutex(kcont_t *kc);

    Returns the lock currently associated with the continuation kc.

  • void kcont_setarg(kcont_t *kc, void *arg);

    Updates arg in the continuation kc. If no lock is associated with the continuation, then arg may be changed at any time; however, if the continuation is being invoked, it may not pick up the change. Otherwise, kcont_setarg must only be called when the associated lock is locked.

  • kmutex_t *kcont_setmutex(kcont_t *kc, kmutex_t *lock);

    Updates the lock associated with the continuation kc and returns the previous lock. If no lock is currently associated with the continuation, then calling this function with a lock other than NULL will trigger an assertion failure. Otherwise, kcont_setmutex must be called only when the existing lock (which will be replaced) is locked. If kcont_setmutex is called as a result of the invocation of func, then after kcont_setmutex has been called but before func returns, the replaced lock must have been released, and the replacement lock, if non-NULL, must be locked upon return.

  • void kcont_setfunc(kcont_t *kc, void (*func)(void *), void *arg);

    Updates func and arg in the continuation kc. If no lock is associated with the continuation, then only arg may be changed. Otherwise, kcont_setfunc must be called only when the associated lock is locked.

  • bool kcont_stop(kcont_t *kc);

    The kcont_stop function stops the timer associated with the continuation handle kc. The PENDING and EXPIRED status for the continuation handle is cleared. It is safe to call kcont_stop on a continuation handle that is not pending, so long as it is initialized. kcont_stop will return a non-zero value if the continuation was EXPIRED.

  • bool kcont_pending(kcont_t *kc);

    The kcont_pending function tests the PENDING status of the continuation handle kc. A PENDING continuation is one whose timer has been started and has not expired. Note that it is possible for a continuation's timer to have expired without being invoked if the continuation's lock could not be acquired or there are higher priority threads preventing its invocation. Note that it is only safe to test PENDING status when holding the continuation's lock.

  • bool kcont_expired(kcont_t *kc);

    Tests to see if the continuation's function has been invoked since the last kcont_schedule.

  • bool kcont_active(kcont_t *kc);

  • bool kcont_invoking(kcont_t *kc);

    Tests the INVOKING status of the handle kc. This flag is set just before a continuation's function is being called. Since the scheduling of the worker threads may induce delays, other pending higher-priority code may run before the continuation function is allowed to run. This may create a race condition if this higher-priority code deallocates storage containing one or more continuation structures whose continuation functions are about to be run. In such cases, one technique to prevent references to deallocated storage would be to test whether any continuation functions are in the INVOKING state using kcont_invoking, and if so, to mark the data structure and defer storage deallocation until the continuation function is allowed to run. For this handshake protocol to work, the continuation function will have to use the kcont_ack function to clear this flag.

  • bool kcont_ack(kcont_t *kc);

    Clears the INVOKING state in the continuation handle kc. This is used in situations where it is necessary to protect against the race condition described under kcont_invoking.

  • kcont_wq_t *kcont_workqueue_acquire(pri_t pri, int flags);

    Returns a workqueue that matches the specified criteria. Thus if multiple requesters ask for the same criteria, they are all returned the same workqueue. pri specifies the priority at which the kernel thread which empties the workqueue should run.

    If flags is 0 then the standard operation is required. However, the following flag(s) may be bitwise ORed together:

    • WQ_PERCPU specifies that the workqueue should have a separate queue for each CPU, thus allowing continuations to be invoked on specific CPUs.
  • int kcont_workqueue_release(kcont_wq_t *wq);

    Releases an acquired workqueue. On the last release, the workqueue's resources are freed and the workqueue is destroyed.
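
A usage sketch against the proposed interface (kcont does not exist yet; the softc layout and helper names below are invented):

struct softc {
    kmutex_t  sc_lock;
    kcont_t  *sc_kc;
    int       sc_work_pending;
};

/* Invoked with sc_lock held, as requested via kcont_create(). */
static void
sc_worker(void *arg, kcont_t *kc)
{
    struct softc *sc = arg;

    sc->sc_work_pending = 0;
    /* ... drain whatever work was queued for this softc ... */
}

static void
sc_attach_kcont(struct softc *sc)
{
    sc->sc_kc = kcont_create(wq_softnet, &sc->sc_lock, sc_worker, sc, 0);
}

static void
sc_kick(struct softc *sc)
{
    /* nticks == 0: invoke without delay, on any available CPU. */
    kcont_schedule(sc->sc_kc, NULL, 0);
}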

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to enhance the networking protocols to process incoming packets more efficiently. The basic idea is the following: when a packet is received and it is destined for a socket, simply place the packet in the socket's receive PCQ (see atomic pcq) and wake the blocking socket. Then, the protocol is able to process the next packet.

The typical packet flow from ip_input is to {rip,tcp,udp}_input which:

  • Does the lookup to locate the socket which takes a reader lock on the appropriate pcbtable's hash bucket.
  • If found and in the proper state:
    • Do not lock the socket since that might block and therefore stop packet demultiplexing.
    • pcq_put the packet to the pcb's pcq.
    • kcont_schedule the worker continuation with a small delay (~100ms). See kernel continuations.
    • Lock the socket's cvmutex.
    • Release the pcbtable lock.
    • If TCP and in sequence, then if we need to send an immediate ACK:
      • Try to lock the socket.
      • If successful, send an ACK.
    • Set a flag to process the PCQ.
    • cv_signal the socket's cv.
    • Release the cvmutex.
  • If not found or not in the proper state:
    • Release the pcb hash table lock.
Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to implement lockless, atomic and generic Radix and Patricia trees. BSD systems have always used a radix tree for their routing tables. However, the radix tree implementation is showing its age. Its lack of flexibility (it is only suitable for use in a routing table) and overhead of use (requires memory allocation/deallocation for insertions and removals) make replacing it with something better tuned to today's processors a necessity.

Since a radix tree branches on bit differences, finding these bit differences efficiently is crucial to the speed of tree operations. This is most quickly done by XORing the key and the tree node's value together and then counting the number of leading zeroes in the result of the XOR. Many processors today (ARM, PowerPC) have instructions that can count the number of leading zeroes in a 32 bit word (and even a 64 bit word). Even those that do not can use a simple constant time routine to count them:

/*
 * Count leading zeroes without a hardware instruction.  Each mask keeps
 * the upper half of every group at the current width; all remaining set
 * bits always lie in the group containing the highest set bit, so each
 * test asks whether that bit is in the upper or lower half of its group.
 */
int
clz(unsigned int bits)
{
    int zeroes = 0;
    if (bits == 0)
        return 32;
    if (bits & 0xffff0000) bits &= 0xffff0000; else zeroes += 16;
    if (bits & 0xff00ff00) bits &= 0xff00ff00; else zeroes += 8;
    if (bits & 0xf0f0f0f0) bits &= 0xf0f0f0f0; else zeroes += 4;
    if (bits & 0xcccccccc) bits &= 0xcccccccc; else zeroes += 2;
    if (bits & 0xaaaaaaaa) bits &= 0xaaaaaaaa; else zeroes += 1;
    return zeroes;
}
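
For example, the bit at which a radix tree branches falls directly out of an XOR followed by clz(); the small helper below, built on the routine above, is the core comparison step described in the previous paragraph:

int
first_differing_bit(unsigned int a, unsigned int b)
{
    /* clz(0) returns 32, so identical keys yield 32 here; otherwise the
       result is the index of the first differing bit, counted from the
       most significant bit. */
    return clz(a ^ b);
}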

The existing BSD radix tree implementation does not use this method but instead uses a far more expensive method of comparison. Adapting the existing implementation to do the above is actually more expensive than writing a new implementation.

The primary requirements for the new radix tree are:

  • Be self-contained. It cannot require additional memory other than what is used in its data structures.

  • Be generic. A radix tree has uses outside networking.

To make the radix tree flexible, all knowledge of how keys are represented has to be encapsulated into a pt_tree_ops_t structure with these functions:

  • bool ptto_matchnode(const void *foo, const void *bar, pt_bitoff_t max_bitoff, pt_bitoff_t *bitoffp, pt_slot_t *slotp);

    Returns true if both foo and bar objects have the identical string of bits starting at *bitoffp and ending before max_bitoff. In addition to returning true, *bitoffp should be set to the smaller of max_bitoff or the length, in bits, of the compared bit strings. Any bits before *bitoffp are to be ignored. If the strings of bits are not identical, *bitoffp is set to where the bit difference occurred, *slotp is set to the value of that bit in foo, and false is returned. The foo and bar (if not NULL) arguments are pointers to a key member inside a tree object. If bar is NULL, then assume it points to a key consisting entirely of zero bits.

  • bool ptto_matchkey(const void *key, const void *node_key, pt_bitoff_t bitoff, pt_bitlen_t bitlen);

    Returns true if both key and node_key objects have identical strings of bitlen bits starting at bitoff. The key argument is the same key argument supplied to ptree_find_filtered_node.

  • pt_slot_t ptto_testnode(const void *node_key, pt_bitoff_t bitoff, pt_bitlen_t bitlen);

    Returns bitlen bits starting at bitoff from node_key. The node_key argument is a pointer to the key member inside a tree object.

  • pt_slot_t ptto_testkey(const void *key, pt_bitoff_t bitoff, pt_bitlen_t bitlen);

    Returns bitlen bits starting at bitoff from key. The key argument is the same key argument supplied to ptree_find_filtered_node.

All bit offsets are relative to the most significant bit of the key.
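
As an illustration, ptto_testkey for keys that are plain 32-bit unsigned integers might look like the sketch below. The pt_slot_t, pt_bitoff_t and pt_bitlen_t typedefs shown are stand-ins for whatever the new implementation defines, and the code assumes bitoff + bitlen <= 32 with bitlen small (slots are normally only one or two bits wide):

#include <stdint.h>

/* Stand-in typedefs; the real ones would come from the ptree header. */
typedef unsigned int pt_slot_t;
typedef unsigned int pt_bitoff_t;
typedef unsigned int pt_bitlen_t;

static pt_slot_t
example_ptto_testkey(const void *key, pt_bitoff_t bitoff, pt_bitlen_t bitlen)
{
    const uint32_t k = *(const uint32_t *)key;

    /* Extract bitlen bits starting bitoff bits below the most significant bit. */
    return (k >> (32 - bitoff - bitlen)) & ((1u << bitlen) - 1);
}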

The ptree programming interface should contain these routines:

  • void ptree_init(pt_tree_t *pt, const pt_tree_ops_t *ops, size_t ptnode_offset, size_t key_offset);

    Initializes a ptree. If pt points at an existing ptree, all knowledge of that ptree is lost. The pt argument is a pointer to the pt_tree_t to be initialized. The ops argument is a pointer to the pt_tree_ops_t used by the ptree; this is the structure containing the four functions described above. The ptnode_offset argument contains the offset from the beginning of an item to its pt_node_t member. The key_offset argument contains the offset from the beginning of an item to its key data; if 0 is used, a pointer to the beginning of the item will be generated instead.

  • void *ptree_find_filtered_node(pt_tree_t *pt, const void *key, pt_filter_t filter, void *filter_ctx);

    The filter argument is either NULL or a function bool (*)(const void *, void *, int);

  • bool ptree_insert_mask_node(pt_tree_t *pt, void *item, pt_bitlen_t masklen);

  • bool ptree_insert_node(pt_tree_t *pt, void *item);

  • void *ptree_iterate(pt_tree_t *pt, const void *node, pt_direction_t direction);

  • void ptree_remove_node(pt_tree_t *pt, const pt_tree_ops_t *ops, void *item);
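
Tying the pieces together, a caller might embed a pt_node_t in each item and set up the tree as in the following sketch. The item layout, example_ops, and the assumption that ptree_insert_node returns false on a duplicate key are illustrative only and would need to be checked against the final interface:

#include <stddef.h>
#include <stdint.h>
/* plus the proposed ptree header for pt_node_t, pt_tree_t, pt_tree_ops_t */

struct example_item {
    pt_node_t ei_node;              /* embedded tree linkage */
    uint32_t  ei_key;               /* 32-bit key, bit offsets from the MSB */
};

extern const pt_tree_ops_t example_ops;   /* the four ptto_* callbacks */

void
example_setup(pt_tree_t *pt, struct example_item *item)
{
    ptree_init(pt, &example_ops,
        offsetof(struct example_item, ei_node),
        offsetof(struct example_item, ei_key));

    if (!ptree_insert_node(pt, item)) {
        /* assumed: an item with an equal key is already in the tree */
    }
}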

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

This project proposal is a subtask of smp networking and is eligible for funding independently.

The goal of this project is to improve the way the processing of incoming packets is handled.

Instead of having a set of active workqueue lwps waiting to service sockets, the kernel should use the lwp that is blocked on the socket to service the work item. An lwp that is blocked is not doing anything productive, it has a direct interest in getting that work item done, and it may even be possible to copy the data straight to the user's address space and avoid queueing it in the socket at all.

Posted in the wee hours of Wednesday night, November 10th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

While NetBSD has had LiveCDs for a while, there has not yet been a LiveCD that allows users to install NetBSD after test-driving it. A LiveCD that contains a GUI based installer and reliably detects the platform's features would be very useful.

Posted Sunday evening, November 6th, 2011 Tags: ?category:misc ?difficulty:easy ?project ?status:active
Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 4 months and up

There are many caches in the kernel. Most of these have knobs and adjustments, some exposed and some not, for sizing and writeback rate and flush behavior and assorted other voodoo, and most of the ones that aren't adjustable probably should be.

Currently all or nearly all of these caches operate on autopilot independent of the others, which does not necessarily produce good results, especially if the system is operating in a performance regime different from when the behavior was tuned by the implementors.

It would be nice if all these caches were instead coordinated, so that they don't end up fighting with one another. Integrated control of sizing, for example, would allow explicitly maintaining a sensible balance between different memory uses based on current conditions; right now you might get that, depending on whether the available voodoo happens to work adequately under the workload you have, or you might not. Also, it is probably possible to define some simple rules about eviction, like not evicting vnodes that have UVM pages still to be written out, that can help avoid unnecessary thrashing and other adverse dynamic behavior. And similarly, it is probably possible to prefetch some caches based on activity in others. It might even be possible to come up with one glorious unified cache management algorithm.

Also note that cache eviction and prefetching is fundamentally a form of scheduling, so all of this material should also be integrated with the process scheduler to allow it to make more informed decisions.

This is a nontrivial undertaking.

Step 1 is to just find all the things in the kernel that ought to participate in a coordinated caching and scheduling scheme. This should not take all that long. Some examples include:

  • UVM pages
  • file system metadata buffers
  • VFS name cache
  • vnode cache
  • size of the mbuf pool

Step 2 is to restructure and connect things up so that it is readily possible to get the necessary information from all the random places in the kernel that these things occupy, without making a horrible mess and without trashing system performance in the process or deadlocking out the wazoo. This is not going to be particularly easy or fast.

Step 3 is to take some simple steps, like suggested above, to do something useful with the coordinated information, and hopefully to show via benchmarks that it has some benefit.

Step 4 is to look into more elaborate algorithms for unified control of everything. The previous version of this project cited IBM's ARC ("Adaptive Replacement Cache") as one thing to look at. (But note that ARC may be encumbered -- someone please check on that and update this page.) Another possibility is to deploy machine learning algorithms to look for and exploit patterns.

Note: this is a serious research project. Step 3 will yield a publishable minor paper; step 4 will yield a publishable major paper if you manage to come up with something that works, and it quite possibly contains enough material for a PhD thesis.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

NetBSD currently requires a system with an MMU. This obviously limits the portability. We'd be interested in an implementation/port of NetBSD on/to an MMU-less system.

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:hard ?project ?status:active
  • Contact: tech-net
  • Duration estimate: 3 months

Implement the ability to route based on properties like QoS label, source address, etc.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active
  • Contact: tech-net
  • Duration estimate: 2 months

Write tests for the routing code and re-factor. Use more and better-named variables.

PCBs and other structures are sprinkled with route caches (struct route). Sometimes they do not get updated when they should. It's necessary to modify rtalloc(), at least. Fix that. Note XXX in rtalloc(); this may mark a potential memory leak!

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active

Improve on the Kismet design and implementation in a Kismet replacement for BSD.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:easy ?project ?status:active

Audit NetBSD for misuse of shared/read-only mbuf storage, and fix bugs as you find them. Improve the mbufs API to help developers protect against misuse in the future.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 2 months

Apply TCP-like flow control to processes writing to the filesystem (put them to sleep when there is "congestion"), to avoid enormous backlogs and provide more fair allocation of disk bandwidth.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

Today a number of operating systems provide some form of kernel-level virtualization that offers better isolation mechanisms than the traditional (yet more portable) chroot(2). Currently, NetBSD lacks functionality in this field; there have been multiple attempts (gaols, mult) to implement a jails-like system, but none so far has been integrated in base.

The purpose of this project is to study the various implementations found elsewhere (FreeBSD Jails, Solaris Zones, Linux Containers/VServers, ...), and eventually weigh their plus/minus points. An additional step would be to see how this can be implemented using the various architectural improvements NetBSD has gained, especially rump(3) and kauth(9).

Caution: this is a research project.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

To create packages that are usable by anyone, pkgsrc currently requires that packages be built with superuser privileges. It is already possible to use pkgsrc in large part without such privileges, but there haven't been many thoughts about how the resulting binary packages should be specified. For example, many packages don't care at all about the owner/group of their files, as long as they are not publicly overwritable. In the end, the binary packages should be as independent from the build environment as possible.

For more information about the current state, see the How to use pkgsrc as non-root section in the pkgsrc guide, Jörg's mail on DESTDIR support as well as pkgsrc/mk/unprivileged.mk.

Posted Sunday evening, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

NetBSD version of compressed cache system (for low-memory devices): http://linuxcompressed.sourceforge.net/.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

kernfs is a virtual file system that reports information about the running system, and in some cases allows adjusting this information. procfs is a virtual file system that provides information about currently running processes. Both of these file systems work by exposing virtual files containing textual data.

The current implementations of these file systems are redundant and both are non-extensible. For example, kernfs is a hardcoded table that always exposes the same set of files; there is no way to add or remove entries on the fly, and even adding new static entries is a nuisance. procfs is similarly limited; there is no way to add additional per-process data on the fly. Furthermore, the current code is not modular, not well designed, and has been a source of security bugs in the past.

We would like to have a new implementation for both of these file systems that rectifies these problems and others, as outlined below:

  • kernfs and procfs should share most of their code, and in particular they should share all the code for managing lists of virtual files. They should remain separate entities, however, at least from the user perspective: community consensus is that mixing system and per-process data, as Linux always has, is ugly.

  • It should be possible to add and remove entries on the fly, e.g. as modules are loaded and unloaded.

  • Because userlevel programs can become dependent on the format of the virtual files (Linux has historically had compatibility problems because of this) they should if possible not have complex formats at all, and if they do the format should be clearly specifiable in some way that isn't procedural code. (This makes it easier to reason about, and harder for it to get changed by accident.)

  • There is an additional interface in the kernel for retrieving and adjusting arbitrary kernel information: sysctl. Currently the sysctl code is a third completely separate mechanism, on many points redundant with kernfs and/or procfs. It is somewhat less primitive, but the current implementation is cumbersome and not especially liked. Integrating kernfs and procfs with sysctl (perhaps somewhat like the Linux sysfs) is not automatically the right design choice, but it is likely to be a good idea. At a minimum we would like to be able to have one way to handle reportable/adjustable data within the kernel, so that kernfs, procfs, and/or sysctl can be attached to any particular data element as desired.

  • While most of the implementations of things like procfs and sysctl found in the wild (including the ones we currently have) work by attaching callbacks, and then writing code all over the kernel to implement the callback API, it is possible instead to design the interface to attach data, that is, pointers to variables within the kernel, so that the kernfs/procfs or sysctl code itself takes responsibility for fetching that data. Please consider such a design strongly and pursue it if feasible, as it is much tidier. (Note that attaching data also in general requires specifying a locking model and, for writeable data, possibly a condition variable to signal on when the value changes.)
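
A minimal sketch of what such a data-attachment descriptor could look like; every name here is illustrative and none of this is an existing NetBSD interface:

struct kvfs_entry {
    const char *kv_name;       /* name of the virtual file */
    void       *kv_data;       /* pointer to the kernel variable itself */
    size_t      kv_size;       /* size of that variable */
    kmutex_t   *kv_lock;       /* lock to hold while reading/writing it */
    kcondvar_t *kv_changed;    /* optional: signalled when a writable value changes */
};

The kernfs/procfs (or sysctl) code would then take responsibility for taking kv_lock and fetching or storing the value itself, instead of calling out to per-entry callback code scattered around the kernel.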

It is possible that using tmpfs as a backend for kernfs and procfs, or sharing some code with tmpfs, would simplify the implementation. It also might not. Consider this possibility, and assess the tradeoffs; do not treat it as a requirement.

Alternatively, investigate FreeBSD's pseudofs and see if this could be a useful platform for this project and base for all the file systems mentioned above.

When working on this project, it is very important to write a complete regression test suite for procfs and kernfs beforehand to ensure that the rewrites do not create incompatibilities.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active

Libvirt is a project that aims at bringing yet-another-level of abstraction to the management of different virtualization technologies; it supports a wide range of virtualization technologies, like Xen, VMWare, KVM and containers.

A package for libvirt was added to pkgsrc under sysutils/libvirt; however, it requires more testing before all platforms supported by pkgsrc can also seamlessly support libvirt.

The purpose of this project is to investigate what is missing in libvirt (in terms of patches or system integration) so it can work out-of-the-box for platforms that can benefit from it. GNU/Linux, NetBSD, FreeBSD and Solaris are the main targets.

Posted Sunday evening, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:easy ?project ?status:active

BSD make (aka bmake) uses traditional suffix rules (.c.o: ...) instead of pattern rules like gmake's (%.o: %.c ...) which are more general and flexible.

The suffix module should be re-written to work from a general match-and-transform function on targets, which is sufficient to implement not only traditional suffix rules and gmake pattern rules, but also whatever other more general transformation rule syntax comes down the pike next. Then the suffix rule syntax and pattern rule syntax can both be fed into this code.

Note that it is already possible to write rules where the source is computed explicitly based on the target, simply by using $(.TARGET) in the right hand side of the rule. Investigate whether this logic should be rewritten using the match-and-transform code, or if the match-and-transform code should use the logic that makes these rules possible instead.

Implementing pattern rules is widely desired in order to be able to read more makefiles written for gmake, even though gmake's pattern rules aren't well designed or particularly principled.

Posted Sunday evening, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Add memory-efficient snapshots to tmpfs. A snapshot is a view of the filesystem, frozen at a particular point in time. The snapshotted filesystem is not frozen, only the view is. That is, you can continue to read/write/create/delete files in the snapshotted filesystem.

The interface to snapshots may resemble the interface to null mounts, e.g., 'mount -t snapshot /var/db /db-snapshot' makes a snapshot of /var/db/ at /db-snapshot/.

You should exploit features of the virtual memory system like copy-on-write memory pages to lazily make copies of files that appear both in a live tmpfs and a snapshot. This will help conserve memory.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 1 year for port; 3 years for rewrite by one developer

Implement a BSD licensed JFS. A GPL licensed implementation of JFS is available at http://jfs.sourceforge.net/.

Alternatively, or additionally, it might be worthwhile to do a port of the GPL code and allow it to run as a kernel module.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

With the push towards virtualization, the x86 world has recently started to pay more widespread attention to supporting IOMMUs; similar to MMUs that translate virtual addresses into physical ones, an IOMMU translates device/bus addresses into physical addresses. The purpose of this project is to add AMD and Intel IOMMU support in NetBSD's machine-independent bus abstraction layers bus.space(9) and bus.dma(9).

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:hard ?project ?status:active

Use puffs or refuse to write an imapfs that you can mount on /var/mail, either by writing a new one or porting the old existing Plan 9 code that does this.

Note: there might be existing solutions, please check upfront and let us know.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

Apply statistical AI techniques to the problem of monitoring the logs of a busy system. Can one identify events of interest to a sysadmin, or events that merit closer inspection? Failing that, can one at least identify some events as routine and provide a filtered log that excludes them? Also, can one group a collection of related messages together into a single event?

Posted Sunday evening, November 6th, 2011 Tags: ?category:misc ?difficulty:easy ?project ?status:active

It would be nice to support these newer highly SMP processors from Sun. A Linux port already exists, and Sun has contributed code to the FOSS community.

(Some work has already been done and committed.)

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:hard ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 8-12 months

Remove the residual geometry code and datastructures from FFS (keep some kind of allocation groups but without most of what cylinder groups now have) and replace blocks and fragments with extents, yielding a much simpler filesystem well suited for modern disks.

Note that this results in a different on-disk format and will need to be a different file system type.

The result would be potentially useful to other operating systems beyond just NetBSD, since UFS/FFS is used in so many different kernels.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

Create an easy to use wifi setup widget for NetBSD: browse and select networks in the vicinity by SSID, BSSID, channel, etc.

Posted Sunday evening, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active

Design and program a scheme for extending the operating range of 802.11 networks by using techniques like frame combining and error-correcting codes to cope with low S/(N+I) ratio. Implement your scheme in one or two WLAN device drivers -- Atheros & Realtek, say.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active

Heavily used file systems tend to spread the blocks all over the disk, especially if free space is scarce. The traditional fix for this is to use dump, newfs, and restore to rebuild a volume; however, this is not acceptable for current disk sizes.

The resize_ffs code already contains logic for relocating blocks, as it needs to be able to do that to shrink a volume; it is likely that it would serve nicely as a starting point for a defragmenting tool.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active

The recent overhaul of the X server infrastructure led to the appearance of Kernel Mode Setting (KMS) and the Graphics Execution Manager (GEM) to rejuvenate the X world. This has a number of benefits, from improved 3D GPU support and a deprivileged X server to in-kernel video mode management, which can be very helpful when it is necessary to debug a live system using ddb(4).

The goal of this project is to add the missing bits inside NetBSD, most notably KMS and GEM. Testing the port using a recent driver like nouveau would be worth the effort.

Work on this is underway on the riastradh-drm2 branch.

Posted Sunday evening, November 6th, 2011 Tags: ?category:misc ?difficulty:hard ?project ?status:active

Add configurable memory-usage policies to tmpfs, per discussions on NetBSD's tech-kern mailing list: % total memory, min/max kilobytes, etc.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 2-3 months

NetBSD has a fixed, kernel-wide upper limit on transfer size, called MAXPHYS, which is currently set to 64k on most ports. This is too small to perform well on modern IDE and SCSI hardware; on the other hand some devices can't do more than 64k, and in some cases are even limited to less (the Xen virtual block device for example). Software RAID will also cause requests to be split into multiple smaller requests.

There is currently work in progress to make MAXPHYS configured at runtime based on driver capability.

This project originally suggested instead to make the buffer queue management logic (which currently only sorts the queue, aka disksort) capable of splitting too-large buffers or aggregating small contiguous buffers in order to conform to device-level requirements.

Once the MAXPHYS changes are finalized and committed, this project may be simply outdated. However, it may also be worthwhile to pursue this idea as well, particularly the aggregation part.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Create a BSD licensed drop-in replacement for rsync that can handle large numbers of files/directories and large files efficiently. The idea would be to not scan the filesystem for every client to detect changes that need transfer, but rather maintain some database that can be queried (and that also needs updating when files are changed on the server). See supservers(8) for some ideas of how this could work. Compatibility with existing rsync clients should be retained.

Posted Sunday evening, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active

Modern 802.11 NICs provide two or more transmit descriptor rings, one for each priority level or 802.11e access category. Add to NetBSD a generic facility for placing a packet onto a different hardware transmit queue according to its classification by pf or IP Filter. Demonstrate this facility on more than one 802.11 chipset.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active

NetBSD/sgimips currently runs on O2s with R10k (or similar) CPUs, but for example speculative loads are not handled correctly. It is unclear if this is pure kernel work or the toolchain needs to be changed too.

See also NetBSD/sgimips.

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:medium ?project ?status:active

syspkgs is the concept of using pkgsrc's pkg_* tools to maintain the base system. That is, allow users to register and update components of the base system with more ease.

There has been a lot of work in this area already, but it has not yet been finalized. Which is a diplomatic way of saying that this project has been attempted repeatedly and failed every time.

Posted Sunday evening, November 6th, 2011 Tags: ?category:misc ?difficulty:hard ?project ?status:active

NetBSD/sgimips currently runs on a range of SGI hardware, but support for IP27 (Origin) and IP30 (Octane) is not yet available. An Octane for development is available for pickup in Hoboken, NJ (contact Jan Schaumann jschauma AT NetBSD.org).

See also NetBSD/sgimips.

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:hard ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 2-3 months

The ext2 file system is the lowest common denominator Unix-like file system in the Linux world, as ffs is in the BSD world. NetBSD has had kernel support for ext2 for quite some time.

However, the Linux world has moved on, with ext3 and now to some extent also ext4 superseding ext2 as the baseline. NetBSD has no support for ext3; the goal of this project is to implement that support.

Since ext3 is a backward-compatible extension that adds journaling to ext2, NetBSD can mount clean ext3 volumes as ext2 volumes. However, NetBSD cannot mount ext3 volumes with journaling and it cannot handle recovery for crashed volumes. As ext2 by itself provides no crash recovery guarantees whatsoever, this journaling support is highly desirable.

The ext3 support should be implemented by extending the existing ext2 support (which is in src/sys/ufs/ext2fs), not by rewriting the whole thing over from scratch. It is possible that some of the ostensibly filesystem-independent code that was added along with the ffs WAPBL journaling extensions might be also useable as part of an ext3 implementation; but it also might not be.

The full requirements for this project include complete support for ext3 in both the kernel and the userland tools. It is possible that a reduced version of this project with a clearly defined subset of these requirements could still be a viable GSOC project; if this appeals to you please coordinate with a prospective mentor.

An additional useful add-on goal would be to audit the locking in the existing ext2 code; the ext2 code is not tagged MPSAFE, meaning it uses a biglock on multiprocessor machines, but it is likely that either it is in fact already safe and just needs to be tagged, or can be easily fixed. (Note that while this is not itself directly related to implementing ext3, auditing the existing ext2 code is a good way to become familiar with it.)

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

While we now have mandoc for handling man pages, we currently still need groff in the tree to handle miscellaneous docs that are not man pages.

This is itself an inadequate solution as the groff we have does not support PDF output (which in this day and age is highly desirable) ... and while newer groff does support PDF output it does so via a Perl script. Also, importing a newer groff is problematic for assorted other reasons.

We need a way to typeset miscellaneous articles that we can import into base and that ideally is BSD licensed. (And that can produce PDFs.) Currently it looks like there are three decent ways forward:

  • Design a new roff macro package that's comparable to mdoc (e.g. supports semantic markup) but is for miscellaneous articles rather than man pages, then teach mandoc to handle it.

  • Design a new set of markup tags comparable to mdoc (e.g. supports semantic markup) but for miscellaneous articles, and a different less ratty syntax for it, then teach mandoc to handle this.

  • Design a new set of markup tags comparable to mdoc (e.g. supports semantic markup) but for miscellaneous articles, and a different less ratty syntax for it, and write a new program akin to mandoc to handle it.

These are all difficult and a lot of work, and in the case of new syntax are bound to cause a lot of shouting and stamping. Also, many of the miscellaneous documents use various roff preprocessors and it isn't clear how much of this mandoc can handle.

None of these options is particularly appealing.

There are also some less decent ways forward:

  • Pick one of the existing roff macro packages for miscellaneous articles (ms, me, ...) and teach mandoc to handle it. Unfortunately all of these macro packages are pretty ratty, they're underpowered compared to mdoc, and none of them support semantic markup.

  • Track down one of the other older roff implementations, that are now probably more or less free (e.g. ditroff), then stick to the existing roff macro packages as above. In addition to the drawbacks cited above, any of these programs are likely to be old nasty code that needs a lot of work.

  • Teach the groff we have how to emit PDFs, then stick to the existing roff macro packages as above. In addition to the drawbacks cited above, this will likely be pretty nasty work and it's still got the wrong license.

  • Rewrite groff as BSD-licensed code and provide support for generating PDFs, then stick to the existing roff macro packages as above. In addition to the drawbacks cited above, this is a horrific amount of work.

  • Try to make something else do what we want. Unfortunately, TeX is a nonstarter and the only other halfway realistic candidate is lout... which is GPLv3 and at least at casual inspection looks like a horrible mess of its own.

These options are even less appealing.

Maybe someone can think of a better idea. There are lots of choices if we give up on typeset output, but that doesn't seem like a good plan either.

Posted Sunday evening, November 6th, 2011 Tags: ?category:userland ?difficulty:hard ?project ?status:active

Add support for Apple's extensions to ISO9660 to makefs, especially the ability to label files with Type & Creator IDs. See http://developer.apple.com/technotes/fl/fl_36.html.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active

The latest Linux code in madwifi-ng includes a major code overhaul and support for advanced features (SuperAG @ 108Mbps, eXtended Range) available in these parts. It would be nice to have these features in NetBSD, under a BSD license.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

The latest Xen versions come with a number of features that are currently not supported by NetBSD: USB/VGA passthrough, RAS (Reliability, Availability and Serviceability) options such as CPU and memory hotplugging, fault tolerance with Remus, and debugging with gdbx (a lightweight debugger included with Xen).

The purpose of this project is to add the missing parts inside NetBSD. Most of the work is composed of smaller components that can be worked on independently from others.

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:hard ?project ?status:active

Qtopia Core (formerly QT/Embedded) is a framework for building single-application devices on embedded systems. Currently this only runs on Linux, but many current and future NetBSD systems would benefit from having a light-weight replacement for X11 provided by Qtopia Core.

See also the Qtopia Core documentation and this mail.

Posted Sunday evening, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:hard ?project ?status:active

In a file system with symlinks, the file system can be seen as a graph rather than a tree. The meaning of .. potentially becomes complicated in this environment.

There is a fairly substantial group of people, some of them big famous names, who think that the usual behavior (where crossing a symlink is different from entering a subdirectory) is a bug, and have made various efforts from time to time to "fix" it. One such fix can be seen in the -L and -P options to ksh's pwd.

Rob Pike implemented a neat hack for this in Plan 9. It is described in http://cm.bell-labs.com/sys/doc/lexnames.html. This project is to implement that logic for NetBSD.

Note however that there's another fairly substantial group of people, some of them also big famous names, who think that all of this is a load of dingo's kidneys, the existing behavior is correct, and changing it would be a bug. So it needs to be possible to switch the implementation on and off as per-process state.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active

Implement a flash translation layer.

A flash translation layer does block remapping, translating from visible block addresses used by a file system to physical cells on one or more flash chips. This provides wear leveling, which is essential for effective use of flash, and also typically some amount of read caching and write buffering. (And it takes care of excluding cells that have gone bad.)

This allows FFS, LFS, msdosfs, or whatever other conventional file system to be used on raw flash chips. (Note that SSDs and USB flash drives and so forth contain their own FTLs.)

FTLs involve quite a bit of voodoo and there is a lot of prior art and research; do not just sit down and start coding.

There are also some research FTLs that we might be able to get the code for; it is probably worth looking into this.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Create/modify a 802.11 link-adaptation module that works at least as well as SampleRate, but is generic enough to be re-used by device drivers for ADMtek, Atheros, Intel, Ralink, and Realtek 802.11 chips. Make the module export a link-quality metric (such as ETT) suitable for use in linkstate routing. The SampleRate module in the Atheros device driver sources may be a suitable starting point.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 2-4 Weeks

Most of the NetBSD kernel tree still uses printf(9) to send messages to the console, primarily during boot and device configuration. Each printf during device configuration should be audited, and replaced with the appropriate aprint_* function, to make the verbose boot option work properly.

Additionally, printfs in drivers that report status (or errors) during normal kernel operation should be converted to use the log(9) function instead.
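
A before/after illustration, using a hypothetical driver softc (sc->sc_dev) for the device argument; aprint_normal_dev(9), aprint_verbose_dev(9) and log(9) are the intended replacements:

/* Before: unconditional console output during autoconfiguration. */
printf("%s: 64KB buffer memory\n", device_xname(sc->sc_dev));

/* After: normal attach output, folded into the aprint framework. */
aprint_normal_dev(sc->sc_dev, "64KB buffer memory\n");

/* Detail that should only appear when booting with the verbose option. */
aprint_verbose_dev(sc->sc_dev, "using MSI interrupts\n");

/* Status or error reporting during normal operation: use log(9). */
log(LOG_ERR, "%s: DMA error, resetting\n", device_xname(sc->sc_dev));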

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:easy ?project ?status:active

Due to the multitude of supported machine architectures NetBSD has to deal with many different partitioning schemes. To deal with them in a uniform way (without imposing artificial restrictions that are not enforced by the underlying firmware or bootloader partitioning scheme) wedges have been designed.

While the kernel part of wedges is mostly done (and missing parts are easy to add), a userland tool to edit wedges and to synthesize defaults from (machine/arch dependent) on-disk content is needed.

Posted Sunday evening, November 6th, 2011 Tags: ?category:misc ?difficulty:easy ?project ?status:active

NetBSD currently supports the MIPS32 ISA, but does not include support for the MIPS16e extension, which would be very useful for reducing the size of binaries for some kinds of embedded systems. This is very much like the ARM thumb instructions.

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:hard ?project ?status:active

Add socket options to NetBSD for controlling WLAN transmit parameters like transmit power, fragmentation threshold, RTS/CTS threshold, bitrate, 802.11e access category, on a per-socket and per-packet basis. To set transmit parameters, pass radiotap headers using sendmsg(2) and setsockopt(2).

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active

Currently booting a sgimips machine requires different boot commands depending on the architecture. It is not possible to use the firmware menu to boot from CD.

An improved primary bootstrap should ask the firmware for architecture detail, and automatically boot the correct kernel for the current architecture by default.

A secondary objective of this project would be to rearrange the generation of a bootable CD image so that it could just be loaded from the firmware menu without going through the command monitor.

Posted Sunday evening, November 6th, 2011 Tags: ?category:ports ?difficulty:medium ?project ?status:active

The NetBSD website building infrastructure is rather complex and requires significant resources. We need to make it easier for anybody to contribute without having to install a large number of complex applications from pkgsrc or without having to learn the intricacies of the build process.

A more detailed description of the problem is described in this and this email and the following discussion on the netbsd-docs.

This work requires knowledge of XML, XSLT and make. This is not a request for visual redesign of the website.

Posted Sunday evening, November 6th, 2011 Tags: ?category:misc ?difficulty:medium ?project ?status:active
Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active

Enhance zeroconfd, the Multicast DNS daemon, that was begun in NetBSD's Google Summer of Code 2005 (see work in progress: http://netbsd-soc.sourceforge.net/projects/zeroconf/). Develop a client library that lets a process publish mDNS records and receive asynchronous notification of new mDNS records. Adapt zeroconfd to use event(3) and queue(3). Demonstrate comparable functionality to the GPL or APSL alternatives (Avahi, Howl, ...), but in a smaller storage footprint, with no dependencies outside of the NetBSD base system.

Posted Sunday evening, November 6th, 2011 Tags: ?category:networking ?difficulty:medium ?project ?status:active
  • Contact: tech-kern
  • Duration estimate: 1 year for port; 3 years for rewrite by one developer

Implement a BSD licensed XFS. A GPL licensed implementation of XFS is available at http://oss.sgi.com/projects/xfs/.

Alternatively, or additionally, it might be worthwhile to do a port of the GPL code and allow it to run as a kernel module.

See also FreeBSD's port.

Posted Sunday evening, November 6th, 2011 Tags: ?category:filesystems ?difficulty:hard ?project ?status:active

Certain real-time clock chips, and other related power hardware, have a facility that allows the kernel to set a specific time and date at which the machine will power itself on. One such chip is the DS1685 RTC. A kernel API should be developed to allow such devices to have a power-on time set from userland. Additionally, the API should be made available through a userland program, or added to an existing utility, such as halt(8).

It may also be useful to make this a more generic interface, to allow for configuring other devices, such as Wake-On-Lan ethernet chips, to turn them on/off, set them up, etc.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

The policy code in the kernel that controls file caching and readahead behavior is necessarily one-size-fits-all, and the knobs available to applications to tune it, like madvise() and posix_fadvise(), are fairly blunt hammers. Furthermore, it has been shown that the overhead from user<->kernel domain crossings makes syscall-driven fine-grained policy control ineffective.

Is it possible to create a BPF-like tool (that is, a small code generator with very simple and very clear safety properties) to allow safe in-kernel fine-grained policy control?

Caution: this is a research project.

Posted Sunday evening, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Traditionally, the NetBSD kernel code had been protected by a single, global lock. This lock ensured that, on a multiprocessor system, two different threads of execution did not access the kernel concurrently and thus simplified the internal design of the kernel. However, such a design does not scale to multiprocessor machines because, effectively, the kernel is restricted to running on a single processor at any given time.

The NetBSD kernel has been modified to use fine grained locks in many of its different subsystems, achieving good performance on today's multiprocessor machines. Unfortunately, these changes have not yet been applied to the networking code, which remains protected by the single lock. In other words: NetBSD networking has evolved to work in a uniprocessor environment; switching it to use fine-grained locking is a hard and complex problem.

This project is currently claimed

Funding

At this time, The NetBSD Foundation is accepting project specifications to remove the single networking lock. If you want to apply for this project, please send your proposal to the contact addresses listed above.

Due to the size of this project, your proposal does not need to cover everything to qualify for funding. We have attempted to split the work into smaller units, and you can submit funding applications for these smaller subtasks independently as long as the work you deliver fits in the grand order of this project. For example, you could send an application to make the network interfaces alone MP-friendly (see the work plan below).

What follows is a particular design proposal, extracted from an original text written by Matt Thomas. You may choose to work on this particular proposal or come up with your own.

Tentative specification

The future of NetBSD network infrastructure has to efficiently embrace two major design criteria: Symmetric Multi-Processing (SMP) and modularity. Other design considerations include not only supporting but taking advantage of the capability of newer network devices to do packet classification, payload splitting, and even full connection offload.

You can divide the network infrastructure into 5 major components:

  • Interfaces (both real devices and pseudo-devices)
  • Socket code
  • Protocols
  • Routing code
  • mbuf code.

Part of the complexity is that, due to the monolithic nature of the kernel, each layer currently feels free to call any other layer. This makes designing a lock hierarchy difficult and likely to fail.

Part of the problem is asynchronous upcalls, which include:

  • ifa->ifa_rtrequest for route changes.
  • pr_ctlinput for interface events.

Another source of complexity is the large number of global variables scattered throughout the source files. This makes putting locks around them difficult.

Subtasks

The proposed solution presented here includes the following tasks (in no particular order) to achieve the desired goals of SMP support and modularity:

Work plan

Aside from the list of tasks above, the work to be done for this project can be achieved by following these steps:

  1. Move ARP out of the routing table. See the nexthop cache project.

  2. Make the network interfaces MP, which are one of the few users of the big kernel lock left. This needs to support multiple receive and transmit queues to help reduce locking contention. This also includes changing more of the common interfaces to do what the tsec driver does (basically do everything with softints). This also needs to change the *_input routines to use a table to do dispatch instead of the current switch code so domains can be dynamically loaded.

  3. Collect global variables in the IP/UDP/TCP protocols into structures. This helps the following items.

  4. Make IPV4/ICMP/IGMP/REASS MP-friendly.

  5. Make IPV6/ICMP/IGMP/ND MP-friendly.

  6. Make TCP MP-friendly.

  7. Make UDP MP-friendly.

Radical thoughts

You should also consider the following ideas:

LWPs in user space do not need a kernel stack

Those pages are only used in case an exception happens. Interrupts probably go to their own dedicated stack. One could just keep a set of kernel stacks around. Each CPU has one; when a user exception happens, that stack is assigned to the current LWP and removed as the CPU's active one. When that CPU next returns to user space, the kernel stack it was using is saved to be used for the next user exception. The idle lwp would just use the current kernel stack.

LWPs waiting for kernel condition shouldn't need a kernel stack

If an LWP is waiting on a kernel condition variable, it is expecting to be inactive for some time, possibly a long time. During this inactivity, it does not really need a kernel stack.

When the exception handler gets a usermode exception, it sets an LWP restartable flag that indicates that the exception is restartable, and then services the exception as normal. As routines are called, they can clear the LWP's restartable flag as needed. When an LWP needs to block for a long time, instead of calling cv_wait, it could call cv_restart. If cv_restart returned false, the LWP's restartable flag was clear, so cv_restart acted just like cv_wait. Otherwise, the LWP and CV would have been tied together (big hand wave), the lock would have been released, and the routine should return ERESTART. cv_restart could also wait for a small amount of time (say 0.5 seconds), and only take the restart path if that timeout expires.

As the stack unwinds, it would eventually return to the exception handler. The handler would see the LWP has a bound CV, save the LWP's user state into the PCB, set the LWP to sleeping, mark the LWP's stack as idle, and call the scheduler to find more work. When called, cpu_switchto would notice the stack is marked idle and detach it from the LWP.

When the condition times out or is signalled, the first LWP attached to the condition variable is marked runnable and detached from the CV. When the cpu_switchto routine is called, it would notice the lack of a stack, so it would grab one, restore the trapframe, and reinvoke the exception handler.

Posted late Sunday afternoon, November 6th, 2011 Tags: ?category:networking ?difficulty:hard ?project ?status:active

NetBSD supports a number of platforms where both 32bit and 64bit execution is possible. The best known example is the i386/AMD64 pair and the other important one is SPARC/SPARC64. On these platforms it is highly desirable to allow running all 32bit applications with a 64bit kernel. This is the purpose of the netbsd32 compatibility layer.

At the moment, the netbsd32 layer consists of a number of system call stubs and structure definitions written and maintained by hand. It is hard to ensure that the stubs and definitions are up-to-date and correct. One complication is the difference in alignment rules. On i386 uint64_t has a 32bit alignment, but on AMD64 it uses natural (64bit) alignment. This and the resulting padding introduced by the compiler can create hard to find bugs.

The goal of this project is to replace the manual labour with an automatic tool. This tool should allow both verification / generation of structure definitions for use in netbsd32 code as well as allow generation of system call stubs and conversion functions. For this purpose, the Clang C parser or the libclang frontend can be used to analyse the C code. Generated stubs should also ensure that no kernel stack data is leaked in hidden padding without having to resort to unnecessary large memset calls.
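
A small illustration of the alignment problem described above (the structure is made up for the example): on i386 the uint64_t member is 4-byte aligned, so the structure occupies 12 bytes with no padding, while on AMD64 it is 8-byte aligned, leaving 4 bytes of hidden padding after flags and a total size of 16 bytes. A generated netbsd32 stub has to convert between the two layouts and must ensure the padding never carries stale kernel stack data:

#include <stdint.h>

struct example_args {
    uint32_t flags;
    /* 4 bytes of hidden padding here on AMD64; none on i386 */
    uint64_t offset;
};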

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:hard ?project ?status:active

pkgsrc duplicates NetBSD efforts in maintaining X.org packages. pkgsrc X11, being able to cross-build, could replace the base X11 distribution.

The latter decoupling should simplify maintenance of software; updating X11 and associated software becomes easier because of pkgsrc's shorter release cycle and greater volatility.

Cross-buildable and cross-installable pkgsrc tools could also simplify maintenance of slower systems by utilising the power of faster ones.

The goal of this project is to make it possible to bootstrap pkgsrc using available cross-build tools (e.g. NetBSD's).

This project requires good understanding of cross-development, some knowledge of NetBSD build process or ability to create cross-development toolchain, and familiarity with pkgsrc bootstrapping.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Instead of including install scripts from the infrastructure into every binary package, just include the necessary information and split the scripts off into a separate package that is installed first (right after bootstrap, as soon as the first package needs it). This affects user creation, installation of tex packages, ...

Also add support for actions that happen once after a big upgrade session, instead of once per package (e.g. ls-lR rebuild for tex).

An intermediate step would be to replace various remaining INSTALL scripts by declarative statements and install script snippets using them.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Add pkgsrc support to packagekit so the graphical packaging software "just works" for pkgsrc.

The pkgsrc/pkgtools/packagekit package currently contains a minimal implementation of pkgsrc support to get the package to compile. This should be extended.

Useful steps:

  • Show all installed packages and let the user delete them
  • Show list of available packages (local/remote) and let the user add them
  • Show lists of available updates
  • Include security information, i.e. show recommended updates based on audit-packages and list of installed / available packages

Additional goals:

  • Show progress and / or console while installing / upgrading
  • pkgsrc's PackageKit backend is to be integrated upstream (PackageKit people are open to contributions, that won't be an issue)
  • It must not be mandatory to install the pkgsrc source tree (binary packages-only needs to work)
  • If installation / upgrade hits a package not available as a binary (typically packages where the license doesn't allow distribution in binary form, e.g. flash, mplayer-share, ...), try to fallback to the pkgsrc source tree if it is available and explain causes and consequences to the user. That could be part of pkgin instead of the PackageKit backend.
  • In case of failure, the full log shall be displayed so that the user has a chance to fix the problem "by hand". Some explanations about the failure would be nice. A useful log is mandatory in order to be able to request help from the pkgsrc-users mailing list or to open a PR.
  • Using a convenient programming language, the student shall write an abstraction layer / API, permitting to easily manipulate pkgsrc.

    This layer shall permit to :

    • Return pkgsrc version
    • Return an installed package list/map
    • Return a "to-upgrade" package list/map
    • Return a full "package map" (non-automatic + dependencies)
    • Return an available package list
    • Install/upgrade/remove a new package and its dependencies
    • Return information about a package (DESCR, version, PLIST, Makefile, options, ...)
    • Display security information
Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

The goal of this project is to provide an alternative version of the NetBSD system installer with a simple, X based graphical user interface.

The installer currently uses a "homegrown" (similar to CUA) text based interface, thus being easy to use over serial console as well as on a framebuffer console.

The current installer code is partly written in plain C, but in large part uses C fragments embedded in its own definition language, preprocessed by the "menuc" tool into plain C code and linked against libterminfo.

During this project, the "menuc" tool is modified to optionally generate a different version of the C code, which then is linked against standard X libraries. The C stub fragments sprinkled throughout the menu definitions need to be modified to be reusable for both (text and X) versions. Where needed the fragments can just call C functions, which have different implementations (selected via a new ifdef).

Since the end result should be able to run off an enhanced install CD, the selection of widgets used for the GUI is limited. Only base X libraries are available. A look & feel similar to current xdm would be a good start.

Development can be done on an existing system; testing does not require actual installation on real hardware.

An optional extension of the project is to modify the creation of one or more port's install CD to make use of the new xsysinst.

The candidate must have:

  • familiarity with the system installer. You should have used sysinst to install the system.
  • familiarity with C and X programming.

The following would also be useful:

  • familiarity with NetBSD.
  • familiarity with user interface programming using in-tree X widgets.

References:

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Refactor utilities in the base system, such as netstat, top, and vmstat, that format & display tables of statistics.

One possible refactoring divides each program into three:

  • one program reads the statistics from the kernel and writes them in machine-readable form to its standard output
  • a second program reads machine-readable tables from its standard input and writes them in a human-readable format to the standard output
  • and a third program supervises a pipeline of the first two programs, browses and refreshes a table.

Several utilities will share the second and third program.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active

Write a library of routines that estimate the expected transmission duration of a queue of 802.3 or 802.11 frames based on the current bit rate, applicable overhead, and the backoff level. The kernel will use this library to help it decide when packet queues have grown too long in order to mitigate "buffer bloat."
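
A rough sketch of the kind of estimate such a library could provide, assuming a fixed per-frame overhead and a simple slot-based backoff model; the names and parameters are illustrative, not a proposed API:

#include <stdint.h>

/* Estimated airtime of a single frame, in microseconds. */
static uint32_t
frame_tx_duration_us(uint32_t len_bytes, uint32_t bitrate_kbps,
    uint32_t overhead_us, uint32_t backoff_slots, uint32_t slot_us)
{
    uint64_t payload_us = ((uint64_t)len_bytes * 8 * 1000) / bitrate_kbps;

    return overhead_us + (uint32_t)payload_us + backoff_slots * slot_us;
}

Summing this estimate over the frames already queued gives the expected drain time that the kernel can compare against a latency target when deciding whether a queue has grown too long.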

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Put config files (etc/) installed by pkgsrc into some version control system to help keeping track of changes and updating them.

The basic setup might look like this:

  • There is a repository containing the config files installed by pkgsrc, starting out empty.
  • During package installation, pkgsrc imports the package's config files into the repository onto a branch tagged with the name and version of the package (if available, on a vendor branch). (e.g.: digest-20080510)
  • After installation, there are two cases:

    1. the package was not installed before: the package's config files get installed into the live configuration directory and committed to the head of the config repository
    2. the package was installed before: a configuration update tool should display changes between the new and the previous original version as well as changes between the previous original and installed config file, for each config file the package uses, and support merging the changes made necessary by the package update into the installed config file. Commit the changes to head when the merge is done.
  • Regular automated check-ins of the entire live pkgsrc configuration should be easy to set up, as should manual check-ins of individual files, so the local admin can use meaningful commit messages when changing their config even if they are not an experienced user of version control systems.

The actual commands to the version control system should be hidden behind an abstraction layer, and the vcs operations should be kept simple, so that other compatibility layers can be written, and eventually the user can pick their vcs of choice (and also a vcs location of choice, in case e.g. the enterprise configuration repository is on a central subversion server).
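
One plausible shape for that abstraction layer is a per-backend table of operations, so that nothing outside the layer ever runs a VCS command directly; every name in this sketch is hypothetical, and the "git" backend is a stub that only prints what it would do.

/*
 * Hypothetical sketch of the VCS abstraction layer: each backend fills
 * in the same small operations table, and the rest of the tool only
 * ever calls through it.
 */
#include <stdio.h>
#include <string.h>

struct vcs_ops {
    const char *name;
    int (*import)(const char *repo, const char *pkg, const char *srcdir);
    int (*commit)(const char *repo, const char *path, const char *msg);
};

static int
git_import(const char *repo, const char *pkg, const char *srcdir)
{
    printf("git: import %s from %s into %s\n", pkg, srcdir, repo);
    return 0;
}

static int
git_commit(const char *repo, const char *path, const char *msg)
{
    printf("git: commit %s in %s: %s\n", path, repo, msg);
    return 0;
}

static const struct vcs_ops backends[] = {
    { "git", git_import, git_commit },
    /* { "svn", svn_import, svn_commit }, and so on */
};

/* Pick the backend named in the user's configuration, or NULL. */
const struct vcs_ops *
vcs_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
        if (strcmp(backends[i].name, name) == 0)
            return &backends[i];
    return NULL;
}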

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:easy ?project ?status:active

Using kernel virtualization offered by rump it is possible to start a virtually unlimited amount of TCP/IP stacks on one host and interlink the networking stacks in an arbitrary fashion. The goal of this project is to create a visual GUI tool for creating and managing the networks of rump kernel instances. The main goal is to support testing and development activities.

The implementation should be split into a GUI frontend and a command-line network-control backend. The former can be used to design and edit the network setup, while the latter can be used to start and monitor the virtual network from e.g. automated tests.

The GUI frontend can be implemented in any language that is convenient. The backend should be implemented in a language and fashion which makes it acceptable to import into the NetBSD base system.

The suggested milestones for the project are:

  1. get familiar and comfortable with starting and stopping rump kernels and configuring virtual network topologies
  2. come up with an initial backend command language. It can be something as simple as Bourne shell functions.
  3. come up with an initial specification and implementation of the GUI tool.
  4. make the GUI tool able to save and load network specifications.
  5. create a number of automated tests to evaluate the usefulness of the tool in test creation.

For a capable student, the project goals can be set at a more demanding level. Examples of additional features include:

  • defining network characteristics such as jitter and packet loss for links
  • real-time monitoring of network parameters during a test run
  • UI gimmicks such as zooming and node grouping
Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

This project has two subprojects: add support for the NT (network) side of ISDN to the NetBSD ISDN stack and interface ISDN (in NT mode) to the Asterisk PBX, which would allow using existing ISDN PBXes as SIP/VoIP phones, as well as easier testing of new ISDN card drivers.

Previous work in this area can be found at the alternative ISDN driver site.

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Currently kernels with options PAX_MPROTECT cannot execute dynamically linked binaries on most RISC architectures, because the PLT format defined by the ABI of these architectures uses self-modifying code.

New binutils versions have introduced a different PLT format (enabled with --secureplt) for alpha and powerpc.

This project (for alpha) is to add support for the new PLT formats introduced in binutils 2.17 and gcc 4.1. This will require changes to the dynamic loader (ld.elf_so), various assembly headers, and library files.

Support for both the old and new formats in the same invocation will be required.

For all architectures we can improve security by implementing relro.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Use multiple cores efficiently (that already works reasonably well for multiple request streams); actually use faster versions of complex transforms (CBC, counter modes) from our in-tree OpenSSL or elsewhere (e.g. libtomcrypt); add support for asymmetric operations (public key). Tying public-key operations into veriexec somehow would be extra credit (and probably a very good start towards an undergraduate thesis project).

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Updating an operating system image can be fraught with danger: an error could leave the system unbootable and require significant work to restore it to operation. The aim of this project is to permit a system to be updated while it is running, requiring only a reboot to activate the updated system, and to provide the facility to roll back to a "known good" system in the event that the update causes regressions. The project should do the following:

  • Make a copy of the currently running system
  • Either apply patches, install binary sets, or run a source build with the copy as the install target
  • Update critical system files to reference the copy (things like fstab)
  • Update the boot menu to make the copy the default boot target, the current running system should be left as an option to boot to

The following link shows how live upgrade works on Solaris: http://docs.oracle.com/cd/E19455-01/806-7933/overview-3/index.html
The aim is to implement something functionally similar, which can be used not only for upgrading but for any risky operation where a reliable back-out is required.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active
  • Contact: tech-pkg
  • Duration estimate: 3 months

The goal of this project is to generate a package or packages that will set up a cross-compiling environment for one (slow) NetBSD architecture on another (fast) NetBSD architecture, starting with (and using) the NetBSD toolchain in src.

The package will require a checked out NetBSD src tree (or as a refinement, parts thereof) and is supposed to generate the necessary cross-building tools using src/build.sh with appropriate flags; for the necessary /usr/include and libraries of the slow architecture, build these to fit or use the slow architecture's base.tgz and comp.tgz (or parts thereof). As an end result, you should e.g. be able to install the binary cross/pkgsrc-NetBSD-amd64-to-atari package on your fast amd64 and start building packages for your slow atari system.

Use available packages, like eg pkgtools/pkg_comp to build the cross-compiling environment where feasible.

As test target for the cross-compiling environment, use pkgtools/pkg_install, which is readily cross-compilable by itself.

If time permits, test and fix cross-compiling pkgsrc X.org, which was made cross-compilable in an earlier GSoC project, but may have suffered from feature rot since actually cross-compiling it has been too cumbersome for it to see sufficient use.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

NetBSD provides a BSD-licensed implementation of libintl. This implementation is based on the specifications from GNU gettext. It has not kept up with the development of gettext in the last few years, though, lacking e.g. support for context-specific translations. NetBSD currently also has to depend on GNU gettext to recreate the message catalogs.

Goals for this project include:

  • Restore full API compatibility with current gettext. At the time of writing, this is gettext 0.18.1.1.
  • Implement support for the extensions in the message catalog format. libintl should be able to process all .mo files from current gettext and return the same results via the API.
  • Provide a clean implementation of msgfmt.
  • Other components of gettext, like msgmerge and the gettext frontend, should be evaluated case by case to determine whether they are useful for the base system and whether third-party software in pkgsrc depends on them.
Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active

Currently, the puffs(3) interface between the kernel and userspace uses various system structures for passing information. Examples are struct stat and struct uucred. If these change in layout (such as with the time_t size change in NetBSD 6.0), old puffs servers must be recompiled.

The purpose of the project is to:

  • define a binary-independent protocol
  • implement support
  • measure the performance difference with direct kernel struct passing
  • if there is a huge difference, investigate the possibility for having both an internal and external protocol. The actual decision to include support will be made on the relative complexity of the code for dual support.
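
For the first of these subtasks, a binary-independent protocol would typically carry fixed-width, explicitly byte-ordered fields instead of shipping structures such as struct stat verbatim. A minimal sketch of what encoding a few attributes might look like (the field selection, names and record size are assumptions for illustration, not the actual puffs protocol):

/*
 * Hypothetical sketch of a binary-independent attribute record: every
 * field has a fixed width and is stored in network byte order, so the
 * same bytes mean the same thing regardless of how the host's
 * struct stat or time_t are laid out.
 */
#include <sys/stat.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* htonl() */

/* Append a 32-bit value in network byte order. */
static uint8_t *
enc32(uint8_t *p, uint32_t v)
{
    v = htonl(v);
    memcpy(p, &v, sizeof(v));
    return p + sizeof(v);
}

/* Append a 64-bit value as two big-endian 32-bit halves. */
static uint8_t *
enc64(uint8_t *p, uint64_t v)
{
    p = enc32(p, (uint32_t)(v >> 32));
    return enc32(p, (uint32_t)v);
}

/*
 * Encode the few attributes shown here into a fixed 28-byte record.
 * A real protocol would cover every field and carry a version number.
 */
size_t
encode_attr(uint8_t *buf, const struct stat *st)
{
    uint8_t *p = buf;

    p = enc32(p, (uint32_t)st->st_mode);
    p = enc32(p, (uint32_t)st->st_uid);
    p = enc32(p, (uint32_t)st->st_gid);
    p = enc64(p, (uint64_t)st->st_size);
    p = enc64(p, (uint64_t)st->st_mtime); /* 64 bits even if time_t is 32 */
    return (size_t)(p - buf);
}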

While this project will be partially implemented in the kernel, it is fairly well-contained and prior kernel experience is not necessary.

If there is time and interest, a suggested subproject is making sure that p2k(3) does not suffer from similar issues. This is a required subproject if dual support as mentioned above is not necessary.

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

pkg_comp is a shell script that allows building packages inside a sandbox in an automated manner. The goal is to allow the user to build packages in a controlled manner and to provide a simple mechanism to build packages for a host running a specific NetBSD release (e.g. building packages for NetBSD 5.x while running NetBSD-current).

The current version of pkg_comp is written as a pretty big shell script that has grown out of control. It is hard to maintain and hard to extend. And, more importantly, it is not portable at all.

The goal of this project is to rewrite pkg_comp from scratch with portability being the main goal: it has to be possible to use the script in different Unix variants (including Mac OS X).

The language of choice for the reimplementation will be Python. The code will be hosted outside of pkgsrc to be able to distribute it as a standalone tool.

The following items should be accomplished:

  • Implement a "sandbox" library with an interface to programmatically create and manage a sandboxed file system. The structure of the sandbox has to be configurable through a template, and templates have to be provided for a few systems. The sandbox itself will be constructed either by unpacking tarballs with the system contents (such as the NetBSD distfiles), by null-mounting directories of the host file system, or both.
  • Implement a command-line frontend utility for the "sandbox" library. This tool is completely unaware of pkgsrc but will be really nice to have (and trivial to write, as all the juicy bits are in the library). It will be useful as part of server administration, as it allows isolating services in a fairly easy manner.
  • Implement the "pkg_comp" application, building it on top of the "sandbox" module. It has to provide most of the current functionality, if not all. More specifically, these are a must: unattended builds of a predefined set of packages and use of libkver to override the system version.
  • Implement detailed unit tests for the modules and integration tests for the front-end tools. At this point, the former will most likely be implemented using a Python unit testing framework (such as PyUnit) and the latter will be implemented using ATF.

As part of this project, the student will get familiar with pkgsrc, construction of sandboxes for different operating systems, unit testing, ATF and Python.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Write a program that can read an email, infer quoting conventions, discern bottom-posted emails from top-posted emails, and rewrite the email using the conventions that the reader prefers. Then, take it a step further: write a program that can distill an entire email discussion thread into one document where every writer's contribution is properly attributed and appears in its proper place with respect to interleaved quotations.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:hard ?project ?status:active

pkgsrc is a very flexible package management system. It provides a comprehensible framework to build, test, deploy, and maintain software in its original form (with porter/packager modifications where applicable) as well as with site local modifications and customizations. All this makes pkgsrc suitable to use in diverse environments ranging from small companies up to large enterprises.

While pkgsrc already contains most of the elements needed to build an authentication server (or an authentication server failover pair), installing one requires considerable knowledge about the necessary elements, and the configuration, while in most cases pretty much identical, is tedious and not without pitfalls.

The goal of this project is to create a meta-package that will deploy and pre-configure an authentication server suitable for a single sign-on infrastructure.

Necessary tasks: provide missing packages, provide packages for initial configuration, package or create corresponding tools to manage user accounts, and write documentation.

The following topics should be covered:

  • PAM integration with OpenLDAP and DBMS;
  • Samba with PAM, DBMS and directory integration;
  • Kerberos setup;
  • OpenLDAP replication;
  • DBMS (PostgreSQL is a must, MySQL optional, if time permits), replication (master-master, if possible);
  • DNS server with a sane basic dynamic DNS update config using directory and database backend;
  • user account management tools (web interface, command line interface, see user(8) manual page, perhaps some scripting interface);
  • configuration examples for integration of services (web services, mail, instant messaging, PAM is a must, DBMS and directory optional).

All covered services should be documented; in particular, the documentation should include:

  • initial deployment instructions;
  • sample configuration for reasonably simple case;
  • instructions on how to test whether the software works;
  • references to full documentation and troubleshooting instructions (if any); both should be provided on-site (i.e. it should be possible to have everything available given a pkgsrc snapshot and the corresponding distfiles and/or packages on physical media).

In a nutshell, the goal of the project is to make it possible for a qualified system administrator to deploy a basic service (without accounts) with a single pkg_add.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:easy ?project ?status:active

NetBSD currently uses amd for automatically mounting (network) file systems. This software package implements an automounter file system as a userland NFS daemon. While this generally works it has major drawbacks:

  • File systems are not mounted directly on the desired mount point. As a result, applications frequently use incorrect pathnames (e.g. /amd/server/home/user instead of /home/user) for automatically mounted directories or files beneath them. This is especially problematic in heterogeneous environments where not all machines use the same automounter.
  • The automounter daemon cannot handle high I/O load very well; file access occasionally fails with intermittent errors.

The goal of this project is to implement a new automounter solution which addresses the above issues. This could either be done via a Solaris/Linux compatible autofs(4) full in-kernel file system or via a userland daemon using puffs.

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

The curses library is an important part of the NetBSD operating system; many applications rely on its correct functioning. Modifying the curses library can be difficult because the effects of a change may be subtle and can introduce bugs that go undetected for a long time.

The aim of this project is to produce a suite of high quality tests for the curses library. These tests should exercise every aspect of the library functionality.

The testing framework has been written to run under the atf framework but has not been committed to the tree yet. The student undertaking this project will be provided with the testing framework and will use this to generate test cases for curses library calls. Most of the work will require analytical skills to verify the output of the test is actually correct before encapsulating that output into a validation file.

This project will need a good understanding of the curses library and will provide the student with a much deeper understanding of the operation of curses.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active

The following features should be added to inetd:

  • Prefork: Support pre-forking multiple children and keeping them alive for multiple invocations (a minimal sketch of this model follows the list).
  • Per-service configuration file: Add a per-service configuration file similar to xinetd.
  • Ability for non-privileged users to add/remove/change their own services using a separate utility.
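
As a rough illustration of the prefork item above, the core of such a model is just a set of children accepting on the shared listening socket. serve_connection() and the child count are placeholders, and the inetd bookkeeping (configuration, re-forking, privilege handling) is omitted.

/*
 * Minimal sketch of the prefork model for one service: fork N children
 * up front, each of which loops accepting connections on the shared
 * listening socket.  Error handling is omitted for brevity.
 */
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

static void serve_connection(int fd);   /* the actual service, elsewhere */

void
prefork_loop(int listenfd, int nchildren)
{
    int i;

    for (i = 0; i < nchildren; i++) {
        if (fork() == 0) {
            /* child: handle connections forever */
            for (;;) {
                int fd = accept(listenfd, NULL, NULL);
                if (fd == -1)
                    continue;
                serve_connection(fd);
                close(fd);
            }
        }
    }
    /* parent: reap children and re-fork as needed (re-forking omitted) */
    for (;;)
        wait(NULL);
}
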
Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:easy ?project ?status:active

Make NetBSD behave gracefully when a "live" USB/FireWire disk drive is accidentally detached and re-attached by, for example, creating a virtual block device that receives block-read/write commands on behalf of the underlying disk driver. This device will delegate reads and writes to the disk driver, but it will keep a list of commands that are "outstanding," that is, reads that the disk driver has not completed, and writes that have not "hit the platter," so to speak. Following disk re-attachment, the virtual block device replays its list of outstanding commands. A correct solution will not replay commands to the wrong disk if the removable drive was replaced instead of re-attached. Provide a character device for userland to read indications that a disk in use was abruptly detached.
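
A sketch of the bookkeeping the virtual block device would need, assuming a simple queue of outstanding requests; names and fields are hypothetical, and the real structure would live in the kernel alongside the existing buffer machinery.

/*
 * Hypothetical sketch of the "outstanding commands" bookkeeping: each
 * request handed to the underlying driver is remembered until it
 * completes, and the list is replayed after re-attach.
 */
#include <sys/types.h>
#include <sys/queue.h>

struct pending_io {
    TAILQ_ENTRY(pending_io) pi_entries;
    daddr_t  pi_blkno;      /* starting block of the request */
    size_t   pi_length;     /* length in bytes */
    int      pi_iswrite;    /* write not yet on the platter? */
    void    *pi_data;       /* buffer to replay writes from */
};

TAILQ_HEAD(pending_list, pending_io);

/*
 * On completion an entry is removed; on surprise detach the list is
 * kept; on re-attach of the *same* disk (identity must be verified,
 * e.g. via the disklabel or a serial number) each entry is re-issued.
 */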

Open questions: Prior art? Isn't this how the Amiga worked? How will this interact with mount/unmount—is there a use-count on devices? Can you leverage "wedges" in your solution? Does any/most/all removable storage indicate reliably when a block written has actually reached the medium?

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

Produce lessons and a list of affordable parts and free software that NetBSD hobbyists can use to teach themselves JTAG. Write your lessons for a low-cost embedded system or expansion board that NetBSD already supports.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:hard ?project ?status:active

Change the pkgsrc infrastructure so that dependency information (currently in buildlink3.mk files) is installed with a binary package and is used from there.

This is not an easy project.

pkgsrc currently handles dependencies by including buildlink3.mk files spread over pkgsrc. The problem is that these files describe the current state of pkgsrc, not the current state of the installed packages.

For this reason, and because most of the information in these files is templated, the buildlink3.mk files should be replaced by the relevant information in an easily parsable format (not necessarily make syntax) that can be installed with the package. The task here is to come up with a definitive list of the information necessary to replace everything that is currently done in buildlink3.mk files (including: dependency information, lists of headers to pass through or replace by buildlink magic, conditional dependencies, ...).

The next step is to come up with a proposal for how to store this information with installed packages and how to make pkgsrc use it.

Then the coding starts: adapt the pkgsrc infrastructure accordingly and show, with a number of trivial and non-trivial packages, that the proposal works.

It would be good to provide scripts that convert from the current state to the new one, and test it with a bulk build.

Of course it's not expected that all packages be converted to the new framework in the course of this project, but the further steps should be made clear.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:hard ?project ?status:active

Anita is a tool for automated testing of NetBSD. Anita automates the process of downloading a NetBSD distribution, installing it into a fresh virtual machine and running the ATF tests in the distribution in a fully-automated manner. Originally, the main focus of Anita was on testing the sysinst installation procedure and on finding regressions that cause the system to fail to install or boot, but Anita is now used as the canonical platform for running the ATF test suite distributed with NetBSD. (You can see the results of such tests in the Test Run Logs page.)

At the moment, Anita only supports qemu as the system with which to create the virtual machine. qemu gives us the ability to test several ports of NetBSD (because qemu emulates many different architectures), but it is very slow because it cannot use hardware virtualization support on NetBSD hosts. The goal of this project is to extend Anita to support other virtual machine systems.

There are many virtual machine systems out there, but this project focuses on adding support to, at least, the following two: Xen and VirtualBox. Xen because NetBSD has native support to run as a dom0 and a domU so Anita could be used out of the box. VirtualBox because it is the canonical free virtual machine system for workstation setups.

This project has the following milestones, in this order:

  1. Abstract the VM-specific code in Anita to provide a modular interface that supports different virtual machines at run time. This will result in one single module implementation for qemu.
  2. Create a module to provide support for Xen dom0.
  3. Create a module to provide support for VirtualBox.
  4. Update the pkgsrc package misc/py-anita to support the different virtual machine systems. This must be done by providing new packages, not by using package options.
  5. If time permits: add extra modules.

The obvious deliverable is a new version of Anita that can use any of the three mentioned virtual machines at run time, without having to be rebuilt, and the updated pkgsrc packages to install such updated version.

The candidate is supposed to know Python and be familiar with at least one virtual machine system.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

launchd is a Mac OS X utility used to start and control daemons, similar to init(8) but much more powerful. There was an effort to port launchd to FreeBSD, but it appears to have been abandoned. We should first investigate what happened to the FreeBSD effort to avoid duplicating work. The port is not trivial because launchd uses a lot of Mach features.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

In the past it was possible to deploy portable software on Microsoft Windows systems through the use of Interix. However, deploying Interix is tricky and sometimes impossible. Third-party software usually does not support running under Interix either, which reduces its utility.

MinGW presents an easier way to deploy a familiar Unix-like environment. It is supported much better by upstream developers than Interix, and thus software is easier to get running correctly. It is also easier than Interix to set up.

Microsoft Windows is perhaps the only widespread platform that pkgsrc does not yet support well enough. To change that, pkgsrc needs to be adapted to run on MinGW. Note that obache has already ported pkgsrc to Cygwin.

The goal of this project is to port pkgsrc tools to MinGW to the extent of building at least infrastructural packages.

Posted late Saturday night, November 6th, 2011 Tags: ?category:pkgsrc ?difficulty:medium ?project ?status:active

Programmatic interfaces in the system (at various levels) are currently specified in C: functions are declared with C prototypes, constants with C preprocessor macros, etc. While this is a reasonable choice for a project written in C, and good enough for most general programming, it falls down on some points, especially for published/supported interfaces. The system call interface is the most prominent such interface, and the one where these issues are most visible; however, other more internal interfaces are also candidates.

Declaring these interfaces in some other form, presumably some kind of interface description language intended for the purpose, and then generating the corresponding C declarations from that form, solves some of these problems. The goal of this project is to investigate the costs and benefits of this scheme, and various possible ways to pursue it sanely, and produce a proof-of-concept implementation for a selected interface that solves a real problem.

Note that as of this writing many candidate internal interfaces do not really exist in practice or are not structured well enough to support this treatment.

Problems that have been observed include:

  • Using system calls from languages that aren't C. While many compilers and interpreters have runtimes written in C, or C-oriented foreign function interfaces, many do not, and rely on munging system headers in (typically) ad hoc ways to extract necessary declarations.

  • Generating test or validation code. For example, there is some rather bodgy code attached to the vnode interface that allows crosschecking the locking behavior observed at runtime against a specification. This code is currently generated from the existing vnode interface specification in vnode_if.src; however, the specification, the generator, and the output code all leave a good deal to be desired. Other interfaces where this kind of validation might be helpful are not handled at all.

  • Generating marshalling code. The code for (un)marshalling system call arguments within the kernel is all handwritten; this is laborious and error-prone, especially for the various compat modules. A code generator for this would save a lot of work. Note that there is already an interface specification of sorts in syscalls.master; this would need to be extended or converted to a better description language. Similarly, the code used by puffs and refuse to ship file system operations off to userland is largely or totally handwritten and in a better world would be automatically generated.

  • Generating trace or debug code. The code for tracing system calls and decoding the trace results (ktrace and kdump) is partly hand-written and partly generated with some fairly ugly shell scripts. This could be systematized; and also, given a complete specification to work from, a number of things that kdump doesn't capture very well could be improved. (For example, kdump doesn't decode the binary values of flags arguments to most system calls.) A hand-written sketch of what such a generated flags decoder could look like follows this list.

  • Add your own here. (If proposing this project for GSOC, your proposal should be specific about what problem you're trying to solve and how the use of an interface definition language will help solve it. Please consult ahead of time to make sure the problem you're trying to solve is considered interesting/worthwhile.)
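
As an illustration of the flags-decoding problem mentioned above, here is roughly what the generated output for open(2)'s flags argument might look like; in the envisioned scheme the table would be emitted by the IDL code generator rather than maintained by hand, and the function name is hypothetical.

/*
 * Hand-written sketch of a generated flags decoder for open(2).
 * Prints output like "O_RDWR|O_CREAT|O_TRUNC".
 */
#include <fcntl.h>
#include <stdio.h>

static const struct {
    int flag;
    const char *name;
} open_flags[] = {
    { O_WRONLY,   "O_WRONLY" },
    { O_RDWR,     "O_RDWR" },
    { O_NONBLOCK, "O_NONBLOCK" },
    { O_APPEND,   "O_APPEND" },
    { O_CREAT,    "O_CREAT" },
    { O_TRUNC,    "O_TRUNC" },
    { O_EXCL,     "O_EXCL" },
};

void
print_open_flags(int flags)
{
    const char *sep = "";
    size_t i;

    if ((flags & O_ACCMODE) == O_RDONLY) {
        printf("O_RDONLY");
        sep = "|";
    }
    for (i = 0; i < sizeof(open_flags) / sizeof(open_flags[0]); i++) {
        if (flags & open_flags[i].flag) {
            printf("%s%s", sep, open_flags[i].name);
            sep = "|";
        }
    }
    putchar('\n');
}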

The project requires: (1) choose a target interface and problem; (2) evaluate existing IDLs (there are many) for suitability; (3) choose one (rolling your own should be only a last resort); (4) prepare a definition for the target interface in your selected IDL; (5) implement the code generator to produce the corresponding C declarations; (6) integrate the code generator and its output into the system (this should require minimal changes); (7) implement the code generator and other material to solve your target problem.

Note that while points 5, 6, and 7 are a reasonably simple matter of programming, parts 1-3 and possibly 4 constitute research -- these parts are as important as the code is, maybe more so, and doing them well is critical to success.

The IDL chosen should if at all possible be suitable for repeating steps 4-7 for additional choices of target interface and problem, as there are a lot of such choices and many of them are probably worth pursuing in the long run.

Some things to be aware of:

  • syscalls.master contains a partial specification of the functions in the system call interface. (But not the types or constants.)
  • vnode_if.src contains a partial specification of the vnode interface.
  • There is no specification, partial or otherwise, of the other parts of the VFS interface, other than the code and the (frequently outdated) section 9 man pages.
  • There are very few other interfaces within the kernel that are (as of this writing) structured enough or stable enough to permit using an IDL without regretting it. The bus interface might be one.
  • Another possible candidate is to pursue a specification language for handling all the conditional visibility and namespace control material in the userspace header files. This requires a good deal of familiarity with the kinds of requirements imposed by standards and the way these requirements are generally understood and may be too large a project for the GSOC time frame.

Note: the original form of this project description was written assuming that the interface description language would be XML. The belief that plain XML would be an adequate IDL probably arose from too much koolaid. It is unlikely that an XML-based IDL will be accepted by the community as the general consensus is that XML is not human readable and is not suitable as a source format.

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:hard ?project ?status:active

Amazon Web Services offers virtual hosts on demand via its Amazon Elastic Compute Cloud (or EC2). Users can run a number of operating systems via so-called "Amazon Machine Images" (AMIs), currently including various Linux flavors and OpenSolaris, but not NetBSD.

This project would entail developing a NetBSD "AMI" and providing an easy step-by-step guide to create custom NetBSD AMIs. The difficulty of this is estimated to not be very high, as prior work in this direction exists (instructions for creating a FreeBSD AMI, for example, are available). Depending on the progress, we may adjust the final goals -- possibilities include adding necessary (though optional) glue to the release process to create official AMIs for each release or to provide images with common applications pre-installed.

See also: this thread on the port-xen mailing list as well as this page.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Chrooted environments that need ptys cannot currently use ptyfs because only one instance of ptyfs at a time is supported. The task is to enhance ptyfs so that it can be mounted multiple times. The problems that need to be solved are:

  • The ptyfs code needs to be modified so that the correct path is returned depending on the mount point in the TIOCPTMGET ioctl. There was code there to do this, but it was removed because it was not working properly.
  • Since there can be only one instance of each major/minor pty in the system, each mount point should only display the ptys that were created on that mount point and not others.
  • ptyfs replaces the handlers for the traditional BSD style ptys when it is mounted and puts them back once it is unmounted. This handler hijacking will not work with multiple mounts. Perhaps a solution is to do it only for the first mount?

This project is implemented all in the kernel and has no documentation requirements.

For extra credit you can investigate if providing continuous numbers on each mount point (without gaps) is feasible without too much recoding, and implement it.

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:easy ?project ?status:active

File systems are historically mounted by specifying which type of file system is mounted from where (e.g. mount -t ffs /dev/wd0a /mnt). However, sometimes a user is only interested in making the data available and not particularly interested in how or from where it is made available.

This project investigates the first steps in turning file systems into network-transparent services by making it possible to mount any kernel file system type from any location on the network. The file system components to be used are puffs and rump. puffs is used to forward local file system requests from the kernel to userspace and rump is used to facilitate running the kernel file system in userspace as a service daemon.

The subtasks are the following:

  • Write the necessary code to be able to forward requests from one source to another. This most likely involves reworking a bit of the libpuffs option parsing code and creating a puffs client, say mount_puffs, to forward requests from one location to another. The puffs protocol should be extended to include the necessary new features, or another protocol should be defined.

    Proof-of-concept code for this has already been written.

  • Currently the puffs protocol used for communication between the kernel and userland is machine dependent. To facilitate forwarding the protocol to remote hosts, a machine independent version must be specified.

  • To be able to handle multiple clients, the file systems must be converted to daemons instead of being utilities. This will also, in the case of kernel file system servers, include adding locking to the communication protocol.

The end result will look something like this:

# start serving ffs from /dev/wd0a on port 12675
onehost> ffs_serv -p 12675 /dev/wd0a
# start serving cd9660 from /dev/cd0a on port 12676
onehost> cd9660_serv -p 12676 /dev/cd0a

# meanwhile in outer space, mount anotherhost from port 12675
anotherhost> mount_puffs -t tcp onehost:12675 /mnt
anotherhost> mount
...
onehost:12675 on /mnt type <negotiated>
...
anotherhost> cd /mnt
anotherhost> ls
  ... etc

The student should have some familiarity with file systems and network services. The application should include an answer to the following question: "how is this different from nfs?"

Posted late Saturday night, November 6th, 2011 Tags: ?category:filesystems ?difficulty:medium ?project ?status:active

The current lpr/lpd system in NetBSD is ancient, and doesn't support modern printer systems very well. Interested parties would do a from scratch rewrite of a new, modular lpr/lpd system that would support both the old lpd protocol and newer, more modern protocols like IPP, and would be able to handle modern printers easily.

This project is not intrinsically difficult, but will involve a rather large chunk of work to complete.

Note that the goal of this exercise is not to reimplement cups -- cups already exists, and one is enough.

Some notes:

It seems that a useful way to do this would be to divide the printing system in two: a client-side system, which is user-facing and allows submitting print jobs to arbitrary print servers, and a server-side system, which implements queues and knows how to talk to actual printer devices. In the common case where you don't have a local printer but use printers that are out on the network somewhere, the server-side system wouldn't be needed at all. When you do have a local printer, the client-side system would submit jobs to the local server-side system using the lpr protocol (or IPP or something else) over a local socket but otherwise treat it no differently from any other print server.

The other important thing moving forward: lpr needs to learn about MIME types and accept an argument to tell it the MIME types of its input files. The current family of legacy options lpr accepts for file types is so old as to be almost completely useless; meanwhile, the standard scheme of guessing file types inside the print system is just a bad design overall. (MIME types aren't great, but they're what we have.)

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Design and implement a mechanism that allows fast user-level access to kernel time data structures on NetBSD. For certain types of small data structures the system call overhead is significant. This is especially true for frequently invoked system calls like clock_gettime(2) and gettimeofday(2). With the availability of user-level readable high-frequency counters it is possible to create fast implementations for precision time reading. Optimizing clock_gettime(2) and the like will reduce the strain from applications calling these system calls frequently and will improve the quality of timing information for applications like NTP. The implementation would be based on a modified version of the timecounters implementation in NetBSD.
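
One widely used approach (comparable to vDSO-style clock_gettime implementations on other systems) is a kernel-exported page read under a generation counter; the sketch below shows the userlevel side under that assumption. The structure layout, field names and rdtsc() are illustrative, not existing NetBSD interfaces, and memory barriers and overflow handling are omitted.

/*
 * Sketch of a lock-free userlevel time read: the kernel exports a page
 * with a generation counter plus the latest timecounter snapshot, and
 * userland retries whenever an update races with its read.
 */
#include <stdint.h>
#include <time.h>

struct timekeep_page {
    volatile uint32_t tk_gen;   /* even = stable, odd = update in progress */
    uint64_t tk_base_cycles;    /* cycle counter value at tk_base_time */
    uint64_t tk_cycles_per_sec;
    struct timespec tk_base_time;
};

extern uint64_t rdtsc(void);    /* assumed userlevel cycle counter read */

int
fast_clock_gettime(const struct timekeep_page *tk, struct timespec *ts)
{
    uint32_t gen;
    uint64_t delta;

    for (;;) {
        gen = tk->tk_gen;
        if (gen & 1)
            continue;           /* kernel is updating the page, retry */
        delta = rdtsc() - tk->tk_base_cycles;
        ts->tv_sec = tk->tk_base_time.tv_sec;
        ts->tv_nsec = tk->tk_base_time.tv_nsec +
            (long)(delta * 1000000000ULL / tk->tk_cycles_per_sec);
        if (gen == tk->tk_gen)  /* no update raced with us */
            break;
    }
    while (ts->tv_nsec >= 1000000000L) {
        ts->tv_nsec -= 1000000000L;
        ts->tv_sec++;
    }
    return 0;
}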

See also the Paper on Timecounters by Poul-Henning Kamp.

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Write a script that aids refactoring C programs by extracting subroutines from fragments of C code.

Do not reinvent the wheel: wherever possible, use existing technology for the parsing and comprehension of C code. Look at projects such as sparse and Coccinelle.

Your script should work as a filter that puts the free variables at the top, encloses the rest in curly braces, and does something helpful with break, continue, and return statements.

That's just a start.
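
To make the intended transformation concrete, here is a hand-made illustration of input and output; the function name and the choice of return value are things the script would have to decide.

/*
 * Given this fragment, whose free variables are buf, len and n:
 *
 *      for (n = 0; n < len; n++)
 *              if (buf[n] == '\0')
 *                      break;
 *
 * the filter would hoist the free variables into parameters and wrap
 * the body in a function, something like:
 */
static int
find_nul(const char *buf, int len)
{
    int n;

    for (n = 0; n < len; n++)
        if (buf[n] == '\0')
            break;
    return n;   /* the fragment's result, surfaced as a return value */
}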

Posted late Saturday night, November 6th, 2011 Tags: ?category:userland ?difficulty:medium ?project ?status:active

Design and implement a general API for control of LED and LCD type devices on NetBSD. The API would allow drivers to register individual LED and LCD devices, along with a set of capabilities for each one. Devices that wish to display status via an LED/LCD would also register themselves as an event provider. A userland program would control the wiring of each event provider to each output indicator. The API would need to encompass all types of LCD displays, such as 3-segment LCDs, 7-segment alphanumerics, and simple on/off state LEDs. A simple example is a keyboard LED, which is an output indicator, and the caps-lock key being pressed, which is an event provider.

There is prior art in OpenBSD; it should be checked for suitability, and any resulting API should not differ from theirs without reason.
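
For discussion purposes, the registration side of such an API might look roughly like the sketch below; every name and capability flag here is invented and would need to be reconciled with the OpenBSD prior art mentioned above.

/*
 * Hypothetical sketch of the kernel-side registration API.  Only
 * declarations are shown; the wiring of providers to indicators would
 * be driven from userland.
 */
#include <sys/types.h>

#define INDICATOR_ONOFF     0x01    /* simple on/off LED */
#define INDICATOR_7SEG      0x02    /* 7-segment alphanumeric */
#define INDICATOR_TEXT      0x04    /* free-form LCD text */

struct indicator {
    const char *name;               /* e.g. "kbd0.capslock" */
    int caps;                       /* INDICATOR_* capability mask */
    void *cookie;                   /* driver private data */
    int (*set)(void *cookie, int value);
    int (*settext)(void *cookie, const char *text); /* NULL if unsupported */
};

struct event_provider {
    const char *name;               /* e.g. "kbd0.capslock_state" */
    void *cookie;
};

/* Drivers would call these at attach time; implementations omitted. */
int indicator_register(struct indicator *);
int event_provider_register(struct event_provider *);

/*
 * Userland would then wire a provider to an indicator, for example via
 * a hypothetical control utility:
 *     indicatorctl map kbd0.capslock_state kbd0.capslock
 */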

Posted late Saturday night, November 6th, 2011 Tags: ?category:kernel ?difficulty:medium ?project ?status:active

Port valgrind to NetBSD for pkgsrc, then use it to do an audit of any memory leakage.

See also http://valgrind.org and http://vg4nbsd.berlios.de for work in progress.

Posted late Saturday night, November 6th, 2011 Tags: ?category:misc ?difficulty:medium ?project ?status:active