Annotation of wikisrc/projects/project/findoptimizer.mdwn, revision 1.2

1.1       dholland    1: [[!template id=project
                      2: 
                      3: title="Query optimizer for find(1)"
                      4: 
                      5: contact="""
                      6: [tech-userlevel](mailto:tech-userlevel@NetBSD.org)
                      7: """
                      8: 
                      9: mentors="""
                     10: [David Holland](mailto:dholland@NetBSD.org)
                     11: """
                     12: 
1.2     ! dholland   13: category="userland"
1.1       dholland   14: difficulty="medium"
                     15: duration="2-8 months depending on ambition"
                     16: 
                     17: description="""
                     18: Add a query optimizer to find(1).
                     19: 
                     20: Currently find builds a query plan for its search, and then executes
                     21: it with little or no optimization. Add an optimizer pass on the plan
                     22: that makes it run faster.
                     23: 
                     24: Things to concentrate on are transforms that allow skipping I/O: not
                     25: calling stat(2) on files that will not be matched, for example, or not
                     26: recursing into subdirectories whose contents cannot ever match.
                     27: Combining successive string matches into a single match pattern might
                     28: also be a win; so might precompiling match patterns into an executable
                     29: match form (as with regcomp(3)).
                     30: 
                     31: To benefit from many of the possible optimizations it may be necessary
                     32: to extend the fts(3) interface and/or extend the query plan schema or
                     33: the plan execution logic. For example, currently it doesn't appear to
                     34: be possible for an fts(3) client to take advantage of file type
                     35: information returned by readdir(3) to avoid an otherwise unnecessary
                     36: call to stat(2).
                     37: 
                     38: Step 1 of the project is to choose a number of candidate
                     39: optimizations, and for each identify the internal changes needed and
                     40: the expected benefits to be gained.
                     41: 
                     42: Step 2 is to implement a selected subset of these based on available
                     43: time and cost/benefit analysis.
                     44: 
                     45: It is preferable to concentrate on opportunities that can be found in
                     46: find invocations likely to actually be typed by users or issued by
                     47: programs or infrastructure (e.g. in pkgsrc), rather than on theoretical
                     48: opportunities unlikely to appear in practice.
                     49: """
                     50: ]]
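
The description's point about readdir(3) file type information can be illustrated outside of find(1) itself. The following is only a sketch, written in Python rather than against fts(3): `os.scandir` exposes the directory entry type where the platform provides it, so `is_dir()` and `is_file()` usually need no extra stat(2) call. The function names `find_files_by_name` and `make_demo_tree` are hypothetical, and Python's `fnmatch` is merely an approximation of the matching that `-name` performs.

```python
import fnmatch
import os
import tempfile


def find_files_by_name(root, pattern):
    # Rough equivalent of: find root -type f -name pattern
    # (hypothetical helper, not part of find(1) or fts(3)).
    matches = []
    pending = [root]
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                # The name test needs no I/O at all, so evaluate it first.
                name_matches = fnmatch.fnmatchcase(entry.name, pattern)
                # is_dir()/is_file() use the type reported by the directory
                # read when available and fall back to a stat call otherwise.
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif name_matches and entry.is_file(follow_symlinks=False):
                    matches.append(entry.path)
    return sorted(matches)


def make_demo_tree(base):
    # Small tree for experimenting; "notes.c" is a directory on purpose,
    # so a correct -type f test must reject it.
    os.makedirs(os.path.join(base, "src", "lib"))
    os.makedirs(os.path.join(base, "notes.c"))
    for rel in ("src/main.c", "src/lib/util.c", "src/README"):
        with open(os.path.join(base, *rel.split("/")), "w") as handle:
            handle.write("")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        make_demo_tree(base)
        for path in find_files_by_name(base, "*.c"):
            print(os.path.relpath(path, base))
```

In find(1) the equivalent saving would have to come from fts(3) or the plan executor, which is why the description suggests that an interface extension may be needed.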
                     51: 
                     52: [[!tag gsoc]]
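
The idea of combining successive string matches and precompiling them, mentioned in the description, might look roughly like the sketch below. It assumes a plan fragment equivalent to `-name '*.c' -o -name '*.h' -o -name Makefile` and folds the three patterns into one compiled matcher. It is written in Python for brevity; `compile_name_alternatives` is a hypothetical helper, and a real implementation in find(1) would be in C and would have to preserve the exact fnmatch(3) semantics.

```python
import fnmatch
import re


def compile_name_alternatives(patterns):
    # Translate each shell-style pattern to a regular expression, wrap it
    # in a non-capturing group, and join the groups with alternation so
    # that a single match call replaces one call per pattern.
    combined = "|".join("(?:%s)" % fnmatch.translate(p) for p in patterns)
    return re.compile(combined)


# Built once, when the plan is optimized, not once per file visited.
matcher = compile_name_alternatives(["*.c", "*.h", "Makefile"])


def name_matches(name):
    # Each translated alternative is anchored at the end of the string,
    # so match() succeeds only when the whole name fits some pattern.
    return matcher.match(name) is not None
```

Whether this is actually a win depends on how many patterns typical invocations contain, which is the kind of question Step 1 of the project is meant to answer.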
