Annotation of wikisrc/projects/project/findoptimizer.mdwn, revision 1.2

1.1       dholland    1: [[!template id=project
                      3: title="Query optimizer for find(1)"
                      5: contact="""
                      6: [tech-userlevel](
                      7: """
                      9: mentors="""
                     10: [David Holland](
                     11: """
1.2     ! dholland   13: category="userland"
1.1       dholland   14: difficulty="medium"
                     15: duration="2-8 months depending on ambition"
                     17: description="""
                     18: Add a query optimizer to find(1).
                     20: Currently find builds a query plan for its search, and then executes
                     21: it with little or no optimization. Add an optimizer pass on the plan
                     22: that makes it run faster.
                     24: Things to concentrate on are transforms that allow skipping I/O: not
                     25: calling stat(2) on files that will not be matched, for example, or not
                     26: recursing into subdirectories whose contents cannot ever match.
                     27: Combining successive string matches into a single match pattern might
                     28: also be a win; so might precompiling match patterns into an executable
                     29: match form (as with regcomp(3)).
                     31: To benefit from many of the possible optimizations it may be necessary
                     32: to extend the fts(3) interface and/or extend the query plan schema or
                     33: the plan execution logic. For example, currently it doesn't appear to
                     34: be possible for an fts(3) client to take advantage of file type
                     35: information returned by readdir(3) to avoid an otherwise unnecessary
                     36: call to stat(2).
                     38: Step 1 of the project is to choose a number of candidate
                     39: optimizations, and for each identify the internal changes needed and
                     40: the expected benefits to be gained.
                     42: Step 2 is to implement a selected subset of these based on available
                     43: time and cost/benefit analysis.
                     45: It is preferable to concentrate on opportunities that can be found in
                     46: find invocations likely to actually be typed by users or issued by
                     47: programs or infrastructure (e.g. in pkgsrc), vs. theoretical
                     48: opportunities unlikely to appear in practice.
                     49: """
                     50: ]]
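As a rough illustration of the pattern-precompilation idea in the description, the sketch below compiles one pattern before the walk and reuses it for every name. It assumes that two name tests such as `*.c` and `*.h` have already been merged into a single POSIX extended regular expression; that translation from fnmatch(3)-style globs is not shown and is only an assumption about how an optimizer might work. The helper names `compile_name_pattern` and `count_matches` are hypothetical and do not come from find(1).

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile the combined pattern once, before the
 * directory walk starts. REG_NOSUB is set because only a yes/no answer
 * is needed, not the match offsets.
 */
static int
compile_name_pattern(regex_t *re, const char *pattern)
{
	return regcomp(re, pattern, REG_EXTENDED | REG_NOSUB);
}

/*
 * Apply the precompiled pattern to each name and count the matches.
 * No pattern parsing happens inside the loop.
 */
static size_t
count_matches(const regex_t *re, const char *const *names, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++) {
		if (regexec(re, names[i], 0, NULL, 0) == 0)
			hits++;
	}
	return hits;
}
```

A caller would compile a pattern such as `"\\.(c|h)$"` once, call `count_matches` (or `regexec` directly) per directory entry, and release the pattern with `regfree` when the walk finishes. Whether this actually beats repeated fnmatch(3) calls would need to be measured.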
                     52: [[!tag gsoc]]
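To make the readdir(3) example from the description concrete, here is a minimal sketch of answering a `-type` test from the `d_type` field alone and falling back to lstat(2) only when the file system reports `DT_UNKNOWN`. It calls readdir(3) directly rather than going through fts(3), so it shows only the potential saving, not a proposed fts(3) interface. `d_type`, the `DT_*` constants and `IFTODT` are BSD extensions (also available on Linux), not POSIX; the names `type_from_dirent` and `count_type` are hypothetical.

```c
#define _DEFAULT_SOURCE	/* exposes d_type, DT_* and IFTODT on glibc/musl */
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

enum verdict { NO_MATCH, MATCH, NEED_STAT };

/* Decide a "-type" test from readdir's d_type alone, when possible. */
static enum verdict
type_from_dirent(unsigned char d_type, unsigned char wanted)
{
	if (d_type == DT_UNKNOWN)
		return NEED_STAT;	/* file system did not report a type */
	return d_type == wanted ? MATCH : NO_MATCH;
}

/*
 * Count the entries of the wanted type in a single directory (no
 * recursion; "." and ".." are counted if readdir returns them).
 * *nstat receives the number of lstat(2) calls that were needed.
 * Returns -1 if the directory cannot be opened.
 */
static long
count_type(const char *dir, unsigned char wanted, long *nstat)
{
	DIR *dp = opendir(dir);
	struct dirent *de;
	long n = 0;

	*nstat = 0;
	if (dp == NULL)
		return -1;
	while ((de = readdir(dp)) != NULL) {
		enum verdict v = type_from_dirent(de->d_type, wanted);

		if (v == NEED_STAT) {
			char path[4096];
			struct stat st;

			snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
			(*nstat)++;
			if (lstat(path, &st) == -1)
				continue;	/* entry vanished or is unreadable */
			v = IFTODT(st.st_mode) == wanted ? MATCH : NO_MATCH;
		}
		if (v == MATCH)
			n++;
	}
	closedir(dp);
	return n;
}
```

On a file system that fills in `d_type`, `count_type(dir, DT_DIR, &nstat)` should leave `nstat` at zero, which is the kind of saving the project is after. Tests that need more than the file type (size, times, ownership) would still require the stat call.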
