title="Lockless, atomic and generic Radix/Patricia trees"
This project proposal is a subtask of [[smp_networking]].
The goal of this project is to implement lockless, atomic and generic Radix
and Patricia trees. BSD systems have always used a radix tree for their
routing tables. However, the radix tree implementation is showing its age.
Its lack of flexibility (it is only suitable for use in a routing table)
and overhead of use (requires memory allocation/deallocation for insertions
and removals) make replacing it with something better tuned to today's
processors a necessity.
Since a radix tree branches on bit differences, finding these bit
differences efficiently is crucial to the speed of tree operations. This
is most quickly done by XORing the key and the tree node's value together
and then counting the number of leading zeroes in the result of the XOR.
Many processors today (ARM, PowerPC) have instructions that can count the
number of leading zeroes in a 32 bit word (and even a 64 bit word). Even
those that do not can use a simple constant time routine to count them:
clz(unsigned int bits)
int zeroes = 0;
if (bits == 0)
if (bits & 0xffff0000) bits &= 0xffff0000; else zeroes += 16;
if (bits & 0xff00ff00) bits &= 0xff00ff00; else zeroes += 8;
if (bits & 0xf0f0f0f0) bits &= 0xf0f0f0f0; else zeroes += 4;
if (bits & 0xcccccccc) bits &= 0xcccccccc; else zeroes += 2;
if (bits & 0xaaaaaaaa) bits &= 0xaaaaaaaa; else zeroes += 1;
The existing BSD radix tree implementation does not use this method but
instead uses a far more expensive method of comparision. Adapting the
existing implementation to do the above is actually more expensive than
writing a new implementation.
The primary requirements for the new radix tree are:
* Be self-contained. It cannot require additional memory other than what
is used in its data structures.
* Be generic. A radix tree has uses outside networking.
To make the radix tree flexible, all knowledge of how keys are represented
has to be encapsulated into a `pt_tree_ops_t` structure with these
* `bool ptto_matchnode(const void *foo, const void *bar, pt_bitoff_t
max_bitoff, pt_bitoff_t *bitoffp, pt_slot_t *slotp);`
Returns true if both `foo` and `bar` objects have the identical string of
bits starting at `*bitoffp` and ending before `max_bitoff`. In addition
to returning true, `*bitoffp` should be set to the smaller of
`max_bitoff` or the length, in bits, of the compared bit strings. Any
bits before `*bitoffp` are to be ignored. If the string of bits are not
identical, `*bitoffp` is set to the where the bit difference occured,
`*slotp` is the value of that bit in `foo`, and false is returned. The
`foo` and `bar` (if not `NULL`) arguments are pointers to a key member
inside a tree object. If bar is `NULL`, then assume it points to a key
consisting of entirely of zero bits.
* `bool ptto_matchkey(const void *key, const void *node_key,
pt_bitoff_t bitoff, pt_bitlen_t bitlen);`
Returns true if both `key` and `node_key` objects have identical strings
of `bitlen` bits starting at `bitoff`. The `key` argument is the same
key argument supplied to `ptree_find_filtered_node`.
* `pt_slot_t ptto_testnode(const void *node_key, pt_bitoff_t bitoff,
Returns `bitlen` bits starting at `bitoff` from `node_key`. The
`node_key` argument is a pointer to the key members inside a tree object.
* `pt_slot_t ptto_testkey(const void *key, pt_bitoff_t bitoff,
Returns `bitlen` bits starting at `bitoff` from key. The `key` argument
is the same key argument supplied to `ptree_find_filtered_node`.
All bit offsets are relative to the most significant bit of the key,
The ptree programming interface should contains these routines:
* `void ptree_init(pt_tree_t *pt, const pt_tree_ops_t *ops, size_t
ptnode_offset, size_t key_offset);`
Initializes a ptree. If `pt` points at an existing ptree, all knowledge
of that ptree is lost. The `pt` argument is a pointer to the `pt_tree_t`
to be initialized. The `ops` argument is a pointer to the
`pt_tree_ops_t` used by the ptree. This has four members: The
`ptnode_offset` argument contains the offset from the beginning of an
item to its `pt_node_t` member. The `key_offset` argument contains the
offset from the beginning of an item to its key data. This is used if 0
is used, a pointer to the beginning of the item will be generated.
* `void *ptree_find_filtered_node(pt_tree_t *pt, const void *key,
pt_filter_t filter, void *filter_ctx);`
The filter argument is either `NULL` or a function `bool (*)(const
void *, void *, int);`
* `bool ptree_insert_mask_node(pt_tree_t *pt, void *item, pt_bitlen_t
* `bool ptree_insert_node(pt_tree_t *pt, void *item);`
* `void *ptree_iterate(pt_tree_t *pt, const void *node, pt_direction_t
* `void ptree_remove_node(pt_tree_t *pt, const pt_tree_ops_t *ops,
CVSweb for NetBSD wikisrc <wikimaster@NetBSD.org> software: FreeBSD-CVSweb