Annotation of wikisrc/projects/project/atomic_radix_patricia_trees.mdwn, revision 1.2

1.1       jmmv        1: [[!template id=project
                      3: title="Lockless, atomic and generic Radix/Patricia trees"
                      5: contact="""
                      6: [tech-kern](,
                      7: [board](,
                      8: [core](
                      9: """
                     11: category="kernel"
                     12: difficulty="hard"
                     14: description="""
1.2     ! riastrad   15: This project proposal is a subtask of [[smp_networking]].
1.1       jmmv       16: 
                     17: The goal of this project is to implement lockless, atomic and generic Radix
                     18: and Patricia trees.  BSD systems have always used a radix tree for their
                     19: routing tables.  However, the radix tree implementation is showing its age.
                     20: Its lack of flexibility (it is only suitable for use in a routing table)
                     21: and overhead of use (requires memory allocation/deallocation for insertions
                     22: and removals) make replacing it with something better tuned to today's
                     23: processors a necessity.
                     25: Since a radix tree branches on bit differences, finding these bit
                     26: differences efficiently is crucial to the speed of tree operations.  This
                     27: is most quickly done by XORing the key and the tree node's value together
                     28: and then counting the number of leading zeroes in the result of the XOR.
                     29: Many processors today (ARM, PowerPC) have instructions that can count the
                     30: number of leading zeroes in a 32 bit word (and even a 64 bit word).  Even
                     31: those that do not can use a simple constant time routine to count them:
                     33:     int
                     34:     clz(unsigned int bits)
                     35:     {
                     36:         int zeroes = 0;
                     37:         if (bits == 0)
                     38:             return 32;
                     39:         if (bits & 0xffff0000) bits &= 0xffff0000; else zeroes += 16;
                     40:         if (bits & 0xff00ff00) bits &= 0xff00ff00; else zeroes += 8;
                     41:         if (bits & 0xf0f0f0f0) bits &= 0xf0f0f0f0; else zeroes += 4;
                     42:         if (bits & 0xcccccccc) bits &= 0xcccccccc; else zeroes += 2;
                     43:         if (bits & 0xaaaaaaaa) bits &= 0xaaaaaaaa; else zeroes += 1;
                     44:         return zeroes;
                     45:     }
                     47: The existing BSD radix tree implementation does not use this method but
                     48: instead uses a far more expensive method of comparision.  Adapting the
                     49: existing implementation to do the above is actually more expensive than
                     50: writing a new implementation.
                     52: The primary requirements for the new radix tree are:
                     54: * Be self-contained.  It cannot require additional memory other than what
                     55:   is used in its data structures.
                     57: * Be generic.  A radix tree has uses outside networking.
                     59: To make the radix tree flexible, all knowledge of how keys are represented
                     60: has to be encapsulated into a `pt_tree_ops_t` structure with these
                     61: functions:
                     63: * `bool ptto_matchnode(const void *foo, const void *bar, pt_bitoff_t
                     64:   max_bitoff, pt_bitoff_t *bitoffp, pt_slot_t *slotp);`
                     66:   Returns true if both `foo` and `bar` objects have the identical string of
                     67:   bits starting at `*bitoffp` and ending before `max_bitoff`.  In addition
                     68:   to returning true, `*bitoffp` should be set to the smaller of
                     69:   `max_bitoff` or the length, in bits, of the compared bit strings.  Any
                     70:   bits before `*bitoffp` are to be ignored.  If the string of bits are not
                     71:   identical, `*bitoffp` is set to the where the bit difference occured,
                     72:   `*slotp` is the value of that bit in `foo`, and false is returned.  The
                     73:   `foo` and `bar` (if not `NULL`) arguments are pointers to a key member
                     74:   inside a tree object.  If bar is `NULL`, then assume it points to a key
                     75:   consisting of entirely of zero bits.
                     77: * `bool ptto_matchkey(const void *key, const void *node_key,
                     78:   pt_bitoff_t bitoff, pt_bitlen_t bitlen);`
                     80:   Returns true if both `key` and `node_key` objects have identical strings
                     81:   of `bitlen` bits starting at `bitoff`.  The `key` argument is the same
                     82:   key argument supplied to `ptree_find_filtered_node`.
                     84: * `pt_slot_t ptto_testnode(const void *node_key, pt_bitoff_t bitoff,
                     85:   pt_bitlen_t bitlen);`
                     87:   Returns `bitlen` bits starting at `bitoff` from `node_key`.  The
                     88:   `node_key` argument is a pointer to the key members inside a tree object.
                     90: * `pt_slot_t ptto_testkey(const void *key, pt_bitoff_t bitoff,
                     91:   pt_bitlen_t bitlen);`
                     93:   Returns `bitlen` bits starting at `bitoff` from key.  The `key` argument
                     94:   is the same key argument supplied to `ptree_find_filtered_node`.
                     96: All bit offsets are relative to the most significant bit of the key,
                     98: The ptree programming interface should contains these routines:
                    100: * `void ptree_init(pt_tree_t *pt, const pt_tree_ops_t *ops, size_t
                    101:   ptnode_offset, size_t key_offset);`
                    103:   Initializes a ptree.  If `pt` points at an existing ptree, all knowledge
                    104:   of that ptree is lost.  The `pt` argument is a pointer to the `pt_tree_t`
                    105:   to be initialized.  The `ops` argument is a pointer to the
                    106:   `pt_tree_ops_t` used by the ptree.  This has four members: The
                    107:   `ptnode_offset` argument contains the offset from the beginning of an
                    108:   item to its `pt_node_t` member.  The `key_offset` argument contains the
                    109:   offset from the beginning of an item to its key data.  This is used if 0
                    110:   is used, a pointer to the beginning of the item will be generated.
                    112: * `void *ptree_find_filtered_node(pt_tree_t *pt, const void *key,
                    113:   pt_filter_t filter, void *filter_ctx);`
                    115:   The filter argument is either `NULL` or a function `bool (*)(const
                    116:   void *, void *, int);`
                    118: * `bool ptree_insert_mask_node(pt_tree_t *pt, void *item, pt_bitlen_t
                    119:   masklen);`
                    121: * `bool ptree_insert_node(pt_tree_t *pt, void *item);`
                    123: * `void *ptree_iterate(pt_tree_t *pt, const void *node, pt_direction_t
                    124:   direction);`
                    126: * `void ptree_remove_node(pt_tree_t *pt, const pt_tree_ops_t *ops,
                    127:   void *item);`
                    128: """
                    129: ]]
                    131: [[!tag smp_networking]]

CVSweb for NetBSD wikisrc <> software: FreeBSD-CVSweb