File:  [NetBSD Developer Wiki] / wikisrc / projects / project / apropos.mdwn
Revision 1.3
Mon Feb 27 07:44:02 2012 UTC (2 years, 1 month ago) by wiki
Branches: MAIN
CVS tags: HEAD
web commit by spz

[[!template id=project

title="Apropos replacement based on mandoc and SQLite's FTS"

contact="""
[tech-userlevel](mailto:tech-userlevel@NetBSD.org)
"""

mentors="""
[Jörg Sonnenberger](mailto:joerg@NetBSD.org)
"""

category="userland"
difficulty="easy"
duration="3 months"
done_by="Abhinav Upadhyay"

description="""
NetBSD ships a lot of useful documentation in the form of manual pages.
Finding the right manual page can be difficult though.
A lookup for a library function sometimes fails because the function is documented in a larger manual page and has no MLINKS entry.
If you look for a program, but don't know the exact name, it can be hard to find as well.

Historically, the content of the NAME section of each manual page has been extracted and put into a special file.
The apropos command has been used to search this file based on keywords.
This is a severe limitation: only the NAME section can match, so the set of potential matches is small, and every manual page needs a very good description of its content, typically within a single line.

The goal of this project is to provide a modern replacement based on the [Full Text Search of SQLite](http://sqlite.org/fts3.html).
The most basic version of the new apropos builds an index from the text output of [mandoc](http://mdocml.bsd.lv/mandoc.1.html) and queries it using appropriate SQL syntax.
It should also report a basic position for each match (e.g. a line number).
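
The basic version described above could be sketched roughly as follows.
This is only an illustration under assumptions: it uses Python's `sqlite3` module with an FTS3-enabled SQLite, it expects the rendered text of each page (for example from `mandoc -T ascii`) to be passed in as a string, and the table and function names (`line_meta`, `line_text`, `build_index`, `search`) are made up for this example.
Indexing one row per line is one simple way to get line numbers back with each match.

```python
import sqlite3

def build_index(pages):
    # pages: iterable of (name, section, rendered_text) tuples.
    conn = sqlite3.connect(":memory:")
    # Plain table for metadata, FTS3 table for the searchable text.
    conn.execute(
        "CREATE TABLE line_meta("
        "id INTEGER PRIMARY KEY, name TEXT, section TEXT, lineno INTEGER)"
    )
    conn.execute("CREATE VIRTUAL TABLE line_text USING fts3(line)")
    for name, section, text in pages:
        for lineno, line in enumerate(text.splitlines(), start=1):
            if not line.strip():
                continue  # blank lines carry nothing worth indexing
            cur = conn.execute(
                "INSERT INTO line_meta(name, section, lineno) VALUES (?, ?, ?)",
                (name, section, lineno),
            )
            # Reuse the metadata row id as the FTS docid so both can be joined.
            conn.execute(
                "INSERT INTO line_text(docid, line) VALUES (?, ?)",
                (cur.lastrowid, line),
            )
    return conn

def search(conn, query):
    # Returns (name, section, lineno, line) for every matching line.
    return conn.execute(
        "SELECT line_meta.name, line_meta.section, line_meta.lineno, "
        "line_text.line "
        "FROM line_text JOIN line_meta ON line_meta.id = line_text.docid "
        "WHERE line_text MATCH ? "
        "ORDER BY line_meta.name, line_meta.lineno",
        (query,),
    ).fetchall()
```

A call such as `search(conn, "directory")` would then return the page name, section and line number of each line containing that word.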

A more advanced version could use the mandoc parser directly instead of its text output.
This would easily allow relatively precise position marks for the HTML version of manual pages.
It would also allow weighting the context of a word.
Consider, as an example, Google's preference for URLs that contain the keywords or for documents that contain them in their headings.
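
One possible way to weight the context of a word is sketched below.
It assumes an FTS3 table with one column per context (page name, one-line description, body) and uses the `matchinfo()` function of SQLite's FTS3/FTS4 to count hits per column.
The column layout, the weights and the names `rank`, `open_index` and `search` are assumptions for illustration, not part of the project description.

```python
import sqlite3
import struct

# Hypothetical weights: a hit in the page name counts most, then the
# one-line description, then the body text.
WEIGHTS = (10.0, 5.0, 1.0)

def rank(matchinfo_blob):
    # matchinfo(..., 'pcx') yields 32-bit unsigned integers: phrase count,
    # column count, then three values per (phrase, column) pair, the first
    # of which is the number of hits in the current row.
    count = len(matchinfo_blob) // 4
    values = struct.unpack("=%dI" % count, matchinfo_blob)
    phrases, columns = values[0], values[1]
    score = 0.0
    for p in range(phrases):
        for c in range(columns):
            hits = values[2 + 3 * (p * columns + c)]
            score += WEIGHTS[c] * hits
    return score

def open_index(pages):
    # pages: iterable of (name, summary, body) tuples.
    conn = sqlite3.connect(":memory:")
    conn.create_function("rank", 1, rank)
    conn.execute("CREATE VIRTUAL TABLE pages USING fts3(name, summary, body)")
    conn.executemany(
        "INSERT INTO pages(name, summary, body) VALUES (?, ?, ?)", pages
    )
    return conn

def search(conn, query):
    # Best-scoring pages first.
    return conn.execute(
        "SELECT name, rank(matchinfo(pages, 'pcx')) AS score "
        "FROM pages WHERE pages MATCH ? ORDER BY score DESC",
        (query,),
    ).fetchall()
```

With weights like these, a page whose name or description contains the keyword is listed before a page that only mentions it in passing.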

Another idea is to use the index for directly handling manual page aliases.
This could replace the symbolic links currently used via the MLINKS mechanism.
The aliases can be derived automatically from the .Nm macros in the manual page.
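
Deriving aliases from the .Nm macros could look roughly like this.
The sketch scans mdoc source line by line, which is a deliberate simplification: a real implementation would presumably rely on the mandoc parser.
The helper names `extract_names`, `store_aliases` and `resolve`, and the `aliases` table, are hypothetical.

```python
import sqlite3

def extract_names(mdoc_source):
    # Collect the arguments of .Nm macros found in the NAME section.
    names = []
    in_name = False
    for raw in mdoc_source.splitlines():
        fields = raw.split()
        if not fields:
            continue
        macro = fields[0]
        if macro == ".Sh":
            if in_name:
                break  # the NAME section has ended
            in_name = fields[1:] == ["NAME"]
        elif in_name and macro == ".Nm":
            for arg in fields[1:]:
                arg = arg.strip(",")  # drop the list separator
                if arg:
                    names.append(arg)
    return names

def store_aliases(conn, page, section, mdoc_source):
    # Record every name from the NAME section as an alias of this page.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS aliases("
        "alias TEXT, section TEXT, page TEXT)"
    )
    rows = [(n, section, page) for n in extract_names(mdoc_source)]
    conn.executemany("INSERT INTO aliases VALUES (?, ?, ?)", rows)

def resolve(conn, alias):
    # Returns (page, section) pairs that document the given name.
    return conn.execute(
        "SELECT page, section FROM aliases WHERE alias = ?", (alias,)
    ).fetchall()
```

A lookup through such a table would find the page documenting a function even when no MLINKS entry exists for it.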
"""
]]

[[!tag gsoc]]
