File:  [NetBSD Developer Wiki] / wikisrc / bugtracking.mdwn
Revision 1.7
Thu Jun 26 13:52:33 2014 UTC (5 years, 11 months ago) by schmonz
Branches: MAIN
CVS tags: HEAD
Add bsiegert's wish for more meaningful states of "closed".

    1: # Bugtracking in NetBSD
    2: 
    3: Currently NetBSD uses gnats for bugtracking.
    4: Gnats is a horrible legacy tool.
    5: There is a whole page of [[stuff we dislike about gnats]].
    6: 
    7: It has been clear for a long time (years) that we need to migrate to
    8: some other bugtracker.
    9: Various tools have been proposed, usually without much thought being
   10: applied.
   11: 
   12: The way this usually works is that
   13: the topic comes up and then a dozen people say "Let's use
   14: $MY_FAVORITE_TOOL! It's great!"
   15: Then a shouting match ensues and the people who call for requirements
   16: analysis or even a list of criteria to pick one tool over another get
   17: slagged for standing in the way of 'progress'.
   18: These arguments inevitably take place with little understanding of
   19: either the project's needs or the problems that a bugtracker needs to
   20: handle for us.
   21: Little or no information is generated, and nothing happens, except
   22: that the participants tend to get alienated and demotivated.
   23: 
   24: In order to avoid going around this barn again and again I'm creating
   25: this page to try to document some of the genuine issues, as well as
   26: conclusions that have been drawn in the past about requirements and
   27: paths forward.
   28: 
   29: 
   30: ## Problems with bugtracking in NetBSD
   31: 
   32: This section lists and discusses some of the challenges that arise
   33: handling NetBSD's bug reports.
   34: This is not meant to be a list of gripes about gnats -- there's
   35: another page for that (above) -- but a list of things that are still
   36: issues no matter what bugtracker we're using.
   37: 
   38: ### We already have a bug database.
   39: 
   40: Any plan for moving forward has to be able to import the existing bug
   41: database without losing information.
   42: This is basically not negotiable -- we cannot throw away the existing
   43: bug data.
   44: (The alternative to importing the existing data is to keep gnats
   45: running indefinitely in parallel with a new system.
   46: Besides being confusing, this does nothing to solve the problems with
   47: gnats itself.)
   48: 
   49: ### The bug database is large.
   50: 
   51: NetBSD's existing bug database currently contains almost 48,000 bugs.
   52: This is not especially large compared to other large projects
   53: (consider the likely size of the Windows bug database, if you will...)
   54: but it is large compared to _most projects_.
   55: A lot of bugtrackers will creak, groan, and tip over if asked to track
   56: this many bugs.
   57: Or, they may be able to store and retrieve the bugs fine but the user
   58: interface for handling them just fails to scale to the database size.
   59: 
   60: Many people who propose their favorite tool have used it for assorted
   61: projects but never actually tried using it on a large bug database.
   62: Some of these tools turn out to work ok on large databases, and others
   63: don't.
   64: 
   65: There are also currently some 5400 open bug reports; some bugtrackers
   66: that scale adequately to 50,000 bugs in the database turn out to not
   67: be able to handle having so many of them open at once.
   68: 
   69: Another consequence of the database size is that the schema conversion
   70: for any migration must be automatic.
   71: It is not feasible to hand-edit or even review all bugs, or even all
   72: open bugs, as part of a transition to a new system.
   73: This creates problems for many otherwise decent choices for a new
   74: bugtracker, as most bugtrackers (just like gnats itself) have their
   75: own hardcoded assumptions about the schema and about things like what
   76: states bugs can be in, and no two are the same.
   77: 
   78: ### The bug database is broad and not readily subdivided.
   79: 
   80: There is a wide range of software in NetBSD, and an even wider range
   81: in pkgsrc, and we get bug reports on all of it.
   82: There are plenty of identifiable units in this, such as specific
   83: pkgsrc packages, but many bugs can't be linked directly to one of
   84: these.
   85: 
   86: Furthermore, the existing database is only divided into broad
   87: categories (kern, bin, pkg, etc.) and even these don't work all that
   88: well sometimes.
   89: (And the deployed base of send-pr scripts causes new PRs to come in
   90: with only this much classification, something that can be changed only
   91: slowly.)
   92: 
   93: The result of this is that it's hard to find things in the bug
   94: database by looking around.
   95: You can browse the database based on metadata (or you could if gnats
   96: sucked less, currently it's hard) but we don't have the metadata
   97: needed to do this effectively.
   98: You can also search the database based on metadata (even gnats can do
   99: this) but it doesn't really produce useful results for the same
  100: reason.
  101: This will remain true if we just switch to a different bugtracker;
  102: to make progress on this problem we need more and better metadata.
  103: 
  104: Many large bug databases (CPAN's bug database was recently floated as
  105: an example) can be clearly subdivided into individual projects or
  106: subprojects and don't have this problem.
  107: You can just look at bugs for the (sub)project you're interested in
  108: and the number of those is manageable.
  109: 
  110: This problem is not unique to NetBSD (FreeBSD shares it, for example)
  111: but as far as I know it's not common outside of OS projects because
  112: most other projects are not broad in the same way.
  113: 
  114: ### Search doesn't work too well.
  115: 
  116: Because of the nature of the names of Unix entities (programs,
  117: drivers, virtually everything), searching for them in a large text
  118: corpus like the bug database doesn't work too well.
  119: This problem is exacerbated if you're trying to find bugs filed
  120: against programs that often appear incidentally in bug reports, like
  121: make or sh.
  122: 
  123: This is not just a consequence of gnats issues; search won't work all
  124: that well no matter what we do.
  125: (Try typing "sh site:gnats.netbsd.org" into Google.
  126: When Google can't do it, no bugtracker is going to do better.)
  127: 
  128: This means that text search really does not work as an alternative to
  129: being able to find things by browsing or via metadata.
  130: 
  131: 
  132: ## Some observations about the problems
  133: 
  134: The most basic problem we have is _finding stuff_.
  135: Back when I first started tackling the bug database, I found that the
  136: best way to make progress was not to search (either for text or
  137: metadata) or to browse but to ask for a randomly selected open PR.
  138: This basically constitutes a total failure of the bugtracker: it was
  139: completely unable to provide useful information of any kind.
  140: 
  141: I ([[dholland]]) have since learned some tricks and have also accumulated
  142: an external index for the database; this means I can get stuff out of
  143: it now, at least sometimes, but most developers are in the position I
  144: was then: the bug database is a completely useless black hole.
  145: Several developers have recently said so; also we have the same
  146: problem that FreeBSD observed in their database some time back, which
  147: is that new bugs come in and get seen, and maybe they get fixed, but
  148: if they don't get fixed fairly soon they get forgotten and hang around
  149: indefinitely.
  150: 
  151: This is partly a consequence of gnats issues, and this is why gnats
  152: must go.
  153: However, as described above it isn't entirely because of gnats:
  154: finding stuff by navigating (or searching) metadata is hard because we
  155: don't have adequate metadata, and finding stuff by searching for text
  156: is hard because it's a fundamentally hard text-retrieval problem.
  157: 
  158: Therefore, if we want to actually improve the situation, any migration
  159: plan needs to include a way to get more metadata into the database.
  160: This metadata will mostly need to be hand-applied; this is expensive
  161: but not insurmountable (for 5400 open existing PRs) and not that big a
  162: deal for incoming new PRs... provided the new bugtracker has adequate
  163: support for arbitrary metadata, which many don't.
  164: 
  165: Note that in addition to the above analysis we also have some
  166: supporting results.
  167: Based on the analysis I ([[dholland]]) started maintaining an annotated
  168: browseable index (aka the "buglists" pages) of the bug database.
  169: This basically amounted to additional per-bug metadata of several
  170: kinds, organized in a fashion that allowed generating an index as a
  171: tree of web pages.
  172: Unfortunately, because it was a gimcrack thing, only I could update it,
  173: and unfortunately it also needed to be synchronized with the gnats
  174: database by hand, with the result that when my available time dried up
  175: it went out of date and is now pretty much useless.
  176: 
  177: However, while it existed people used it and it helped them.
  178: Before we had it the number of open PRs had been steadily increasing
  179: over time.
  180: (The occasional hackathon brought down the count from time to time,
  181: but never persistently.)
  182: During the time we had it, the number of open PRs remained more or
  183: less stable at around 4800-4900.
  184: Now that we no longer have it, we're up to 5400 open PRs.
  185: The influx has not changed much since I got behind on it; if anything
  186: it's dropped.
  187: What this means is that the rate at which PRs are getting fixed has dropped,
  188: and that's because it's become impossible to find anything again.
  189: 
  190: It seems to me that one of the chief things we want from a new
  191: bugtracker is to be able to provide something like this browseable
  192: index.
  193: Therefore it must be able to support the kinds of metadata that the
  194: buglists tree was using.
  195: 
  196: If we move to a bugtracker without this support, it may solve some of
  197: the more glaring problems with gnats, but it isn't going to help us
  198: _find_ stuff in the bug database and it isn't going to do anything to
  199: help make the large backlog of unfixed bugs go away.
  200: 
  201: 
  202: ## Metadata types
  203: 
  204: After working the bug database for some years and also after
  205: maintaining the buglists pages, I've come to the conclusion that
  206: we need the following _types_ of metadata:
  207: 
  208: * fields containing tags
  209: * fields containing one of an enumerated list of choices
  210: * fields containing a classification according to a hierarchical taxonomy
  211: * fields containing free-form text
  212: 
  213: And also, importantly, we need arbitrarily many such fields, not a
  214: fixed set concocted at the time the database is set up.
  215: 
  216: ### Tags
  217: 
  218: A tag field contains zero or more entries from a list of allowable
  219: choices.
  220: This is a well-understood concept and most bug databases support tags
  221: in one way or another; however, I don't think most support arbitrarily
  222: many different tags fields with their own sets of allowable tags.
  223: 
  224: Two questions immediately arise from this description: why do we need
  225: to restrict tags to allowable values, and why do we need multiple tags
  226: fields instead of just one?
  227: 
  228: The first question is easy: when you have a database of 50,000 things
  229: you need to place some controls on what gets entered or you eventually
  230: end up with trash.
  231: This is just a fact of life with databases when they get big enough.
  232: We have enough problems without having to deal with misspelled tags
  233: and typos.
  234: 
  235: The second follows partly from the first and partly from ensuing human
  236: interface concerns: if you just have one tags field, the number of
  237: possible tag values grows without bound as more and more tags get
  238: added, and before too long the list becomes itself hard to work with.
  239: Grouping tags into logical sets (e.g. keeping in one field all the
  240: releng tags that record which bugs are critical for which releases)
  241: makes it much easier to search for them, and also much easier to
  242: browse the database looking for bugs that aren't tagged but should be.
  243: 
  244: Also this makes it possible to have developer-only tag fields, private
  245: personal tags, and so forth, without undue complications.
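
As a rough illustration of the idea above, here is a minimal Python sketch of several tag fields, each with its own controlled vocabulary. It is not taken from gnats, swallowtail, or any existing bugtracker; the class names, the `releng` field, and the tag values are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TagField:
    """One tag field with its own list of allowable tags (hypothetical design)."""
    name: str
    allowed: frozenset
    developer_only: bool = False

    def validate(self, tags):
        """Return the tags as a set, rejecting anything outside the vocabulary."""
        tags = set(tags)
        unknown = tags - self.allowed
        if unknown:
            raise ValueError(f"{self.name}: unknown tags {sorted(unknown)}")
        return tags


@dataclass
class BugReport:
    number: int
    tags: dict = field(default_factory=dict)  # field name -> set of tags

    def set_tags(self, tag_field, tags):
        self.tags[tag_field.name] = tag_field.validate(tags)


# Made-up field and tag names, for illustration only.
releng = TagField("releng", frozenset({"critical-for-7", "nice-for-7"}))
pr = BugReport(12345)
pr.set_tags(releng, ["critical-for-7"])
```

Because each field validates against its own vocabulary, a misspelled tag is rejected when it is entered rather than quietly accumulating in the database.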
  246: 
  247: ### Enumerations
  248: 
  249: An enumeration field contains exactly one entry from a list of
  250: allowable choices.
  251: This differs from a tag field in certain obvious ways.
  252: We already have some of these in gnats, but they're hardcoded fields 
  253: rather than being instances of a general metadata type.
  254: 
  255: If we're going to have arbitrary metadata fields at all (rather than a
  256: fixed set of predefined fields) we more or less need enumeration
  257: fields to be able to migrate the existing database.
  258: 
  259: Also some things that one might abuse tags for if one only had tags
  260: are perhaps better handled as enumerations; e.g. no bug should be both
  261: "critical" for a release and also "would be nice" for the same
  262: release.
  263: 
  264: That said, a bugtracker that only has tags fields is probably adequate
  265: (though not entirely desirable) because one can abuse tags instead.
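
The difference from a tag field can be sketched in a few lines of Python. This is only an assumption about how such a field might behave: the `ReleaseImportance` name, the release string, and the PR number are invented, and only the values "critical" and "would be nice" come from the text above.

```python
from enum import Enum


class ReleaseImportance(Enum):
    # Hypothetical enumeration; only these two values are named on this page.
    CRITICAL = "critical"
    WOULD_BE_NICE = "would be nice"


# Exactly one value per (PR, release): assigning again replaces the old value.
importance = {}  # (pr number, release) -> ReleaseImportance


def set_importance(pr, release, value):
    # ReleaseImportance(value) raises ValueError for anything not in the list.
    importance[(pr, release)] = ReleaseImportance(value)


set_importance(12345, "7.0", "critical")
set_importance(12345, "7.0", "would be nice")
```

With tags, both values could end up on the same PR at once; with an enumeration the second assignment simply overwrites the first.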
  266: 
  267: ### Hierarchical taxonomy
  268: 
  269: A hierarchical taxonomy is a scheme for identifying (and thus,
  270: finding) things based on a nested series of choices.
  271: The hierarchical taxonomy most people are most familiar with is
  272: probably the scheme for species in biology.
  273: That (especially in its more modern forms) is more complex than we
  274: need but the basic principle is the same: at the top you have
  275: everything, and then you pick one of several kinds of things and then
  276: you're dealing with a restricted subset, and so forth until you get
  277: down to a manageable number of things to look at at once that all have similar properties.
  278: 
  279: The buglists pages supported tags as well but were fundamentally based
  280: on a hierarchical classification of bugs based on where in the system
  281: they occur.
  282: This (as noted above) has been extremely useful in practice.
  283: 
  284: There is another hierarchical taxonomy that I'd like to deploy, but
  285: which I wasn't about to try to do without better tools: classifying
  286: bugs based on their symptoms.
  287: 
  288: If we were going to pick just one of these metadata features that's
  289: the most important, it would be this.
  290: The problem is: most bugtrackers do not support hierarchical
  291: taxonomies.
  292: In fact, so far no existing bugtracker has been found to do so.
  293: 
  294: This is an extremely important point, because it means that we need to
  295: find a way to do it.
  296: This is going to involve writing code, either new code or an extension
  297: to some bugtracker we otherwise like.
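
To make the requirement concrete, here is a minimal Python sketch of one way a hierarchical classification could be stored and browsed: each PR gets a path from the root of the taxonomy, and browsing means counting or listing what sits under a given prefix. This is not how the buglists pages were implemented; the taxonomy nodes and PR numbers are made up.

```python
from collections import Counter

# PR number -> path in a made-up "where in the system" taxonomy.
classification = {
    101: ("kernel", "drivers", "network"),
    102: ("kernel", "drivers", "audio"),
    103: ("kernel", "vm"),
    104: ("userland", "bin", "sh"),
}


def children(prefix):
    """Count PRs under each immediate child of the given taxonomy node."""
    depth = len(prefix)
    counts = Counter()
    for path in classification.values():
        if path[:depth] == prefix and len(path) > depth:
            counts[path[depth]] += 1
    return dict(counts)


def bugs_under(prefix):
    """Return the PRs filed at or below the given taxonomy node."""
    return sorted(pr for pr, path in classification.items()
                  if path[:len(prefix)] == prefix)
```

Calling `children(())` gives the top-level choices with their PR counts, and each further step narrows the set until `bugs_under(...)` returns a manageable list.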
  298: 
  299: ### Free-form text
  300: 
  301: Free-form text fields have two uses: one is for enumerations where the
  302: set of things being enumerated is too large to be manageable as an
  303: enumeration (e.g. pkgsrc packages, or NetBSD version numbers, or
  304: programs in /usr/bin) and the other is for text that we think text
  305: retrieval tools will be able to process usefully.
  306: 
  307: I have no examples of the latter kind on hand but I expect some will
  308: appear; there are several of the former kind that we definitely want
  309: to be able to support.
  310: One is the pkgsrc package (by pkgpath) a PR is about; not all pkgsrc
  311: PRs apply to only one package, but most do.
  312: Another (for base system PRs) is the name of the man page most closely
  313: associated with what's broken.
  314: This has been found (by me and also by FreeBSD) to be a useful way of
  315: organizing things.
  316: The version number field is probably another one.
  317: 
  318: ### Untagged vs. inappropriate
  319: 
  320: Note that the database needs to be able to distinguish between "this
  321: metadata does not apply to this PR", as is the case for the pkgsrc
  322: package field and a bug report on make, and "this PR has not been
  323: tagged with this metadata yet", which in the near to medium term will
  324: be the case for all new incoming PRs.
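
One possible way to keep the two cases apart is an explicit marker for each, rather than treating an empty field as meaning both. The Python sketch below assumes that approach; `FieldState`, `needs_triage`, and the `devel/example` pkgpath are hypothetical names used only for illustration.

```python
from enum import Enum


class FieldState(Enum):
    UNTAGGED = "untagged"              # nobody has looked at this field yet
    NOT_APPLICABLE = "not applicable"  # the field makes no sense for this PR


def needs_triage(pr_fields, field_name):
    """True if the field has not been examined for this PR yet."""
    return pr_fields.get(field_name, FieldState.UNTAGGED) is FieldState.UNTAGGED


# Made-up PRs: one about make, one about a package, one freshly arrived.
make_pr = {"pkgpath": FieldState.NOT_APPLICABLE}
pkg_pr = {"pkgpath": "devel/example"}
new_pr = {}
```

A query for PRs that still need tagging would then pick up `new_pr` but skip `make_pr`, even though neither has a package path.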
  325: 
  326: 
  327: ## Some other points
  328: 
  329: The fact that send-pr comes with the system and doesn't require
  330: signing up for anything (or anything other than a more-or-less working
  331: mail configuration) has long been a strength of the project.  This has
  332: been cited many times by many people, and it's a feature we want to
  333: retain.  There is not, in practice, a problem with people dumping
  334: useless PRs without valid return addresses; it happens occasionally
  335: but not enough to worry about.
  336: 
  337: Subscribing to PRs (so you get notices of changes) is important; once
  338: you find PRs you generally have to be able to follow them too.
  339: This is something most bugtrackers other than gnats do ok, so it isn't
  340: a big issue, but should nonetheless be noted.
  341: 
  342: Merging duplicate PRs is a nice feature but it's not critical; the
  343: chief reason its absence is annoying right now is that
  344: gnats doesn't handle subscribing intelligently.
  345: If you can crossreference the duplicates and everyone involved can
  346: subscribe as needed, merging becomes less significant.
  347: 
  348: Keeping track of which PRs are blocking which other PRs is often cited
  349: as a desirable or even critical feature in a bugtracker.
  350: This is probably true in general, but for us it doesn't matter that
  351: much: because the bug database (and the system) is broad, most bug
  352: reports are independent of one another and blocking dependencies
  353: rarely arise.
  354: 
  355: 
  356: ## Conclusions on requirements
  357: 
  358: These are not set in stone but reflect my best estimate of the
  359: situation and what does and doesn't matter.
  360: This is weighted some towards backend issues, particularly in
  361: connection with gnats.
  362: 
  363: Please don't edit this randomly; talk it over first.
  364: 
  365: Hard requirements:
  366: 
  367: * Must be able to import the existing bug database.
  368: * Doing so must not lose information.
  369: * Must be able to accept incoming email from deployed send-pr scripts.
  370: * Must handle confidential PRs in a way that does not make them
  371: accessible to non-NetBSD people.
  372: * Must be able to accept and file commit messages.
  373: * It is not necessary to sign up to file a problem report.
  374: * Nothing may be written in php.
  375: 
  376: Very strongly desired based on problem analysis:
  377: 
  378: * Support for arbitrary metadata fields not precooked in the database.
  379: * Support for hierarchical taxonomies.
  380: * Support for systems of tags.
  381: * A decent workflow for retrieving incoming PRs and tagging them with
  382: the desired new metadata.
  383: * Support for free-form text metadata fields (for pkgsrc)
  384: 
  385: Desired based on problem analysis:
  386: 
  387: * Support for enumerated metadata fields.
  388: 
  389: Very strongly desired because we have existing workflows and habits:
  390: 
  391: * Command-line access (search, update, administer)
  392: * Web access (search)
  393: 
  394: Desired because we have existing workflows and habits:
  395: 
  396: * Web access (update, maybe also administer)
  397: 
  398: Very strongly desired because we're tired of gnats:
  399: 
  400: * Proper handling of incoming MIME attachments.
  401: * Some mechanism to prevent commit messages from accidentally
  402: spamming the database.
  403: * A way to file comments on a PR from a web browser.
  404: * A web-based search form that works usefully.
  405: * Crosslinks in the web interface to allow browsing.
  406: * Command-line search that doesn't involve query-pr's nasty little
  407: query "language".
  408: * A nondegenerate way to subscribe to PRs, for both developers and
  409: ordinary folks, at least by email and preferably also via RSS.
  410: * A mail ingester that returns broken PR submissions instead of
  411: filing sometimes-mangled versions for manual attention.
  412: * A mail ingester that honors the confidential field of incoming PRs
  413: properly.
  414: * At least slightly automated handling of email bounces.
  415: * A way to update email addresses without hand-editing a bajillion
  416: PRs one at a time.
  417: 
  418: Desired because we're tired of gnats:
  419: 
  420: * A way to file comments on a PR directly from the command line.
  421: * Something like a newsreader for working the bug database.
  422: * Feedback nag mail that comes out such that replying directly to it
  423: does something useful.
  424: * A way to configure the contents of responsible nag mail to sort by
  425: personal priority or other criteria.
  426: * A way to turn off mail for bouncing addresses.
  427: * A way to move misfiled comments from one PR to another.
  428: * A way to mark bugs not merely as "closed" but one of fixed,
  429:   invalid, obsolete, or "won't fix".
  430: 
  431: Some other stuff that would be nice:
  432: 
  433: * Being able to vote PRs up and down from the web interface.
  434: * A smartphone app for working the database.
  435: 
  436: Things that are less important:
  437: 
  438: * Merging multiple PRs on the same subject.
  439: * Explicit crosslinks when one PR is blocking progress on another.
  440: 
  441: Things we don't care that much about:
  442: 
  443: * Padded cells for juvenile developers.
  444: * Click-and-drool support for developers without basic clues.
  445: 
  446: 
  447: ## The (old) plan
  448: 
  449: Given all the above, some years ago a plan was formulated and
  450: even provisionally approved.
  451: This has not materialized owing to (partly) a lack of time and
  452: (partly) a near-total lack of response to requests for feedback or
  453: input or assistance.
  454: At this point some other plan may be better, but nothing has really
  455: changed much in the meantime.
  456: 
  457: There are two key points in the material above:
  458: 
  459: * Schema conversion (to just about anything) without losing
  460: information is going to be hard.
  461: * Nothing that already exists off the shelf is going to handle the
  462: most important thing we/I want anyway.
  463: 
  464: There is another point that is not obvious to those who haven't dealt
  465: with gnats at length:
  466: 
  467: * gnats does very little.
  468: 
  469: Gnats contains a fair amount of code, but most of that code is storage
  470: code (not user interface or analysis or other valuable material) and
  471: it doesn't do a particularly good job of it.
  472: If we moved the data to a real database, the amount of work needed to
  473: replace the functionality of gnats is small: there are half a dozen or
  474: so access programs that gnats comes with, of which the only one that
  475: does anything nontrivial is edit-pr, and a half a dozen or so more
  476: programs that we wrote that we can update as needed.
  477: 
  478: Therefore, the plan was to move the data to a real database, replace
  479: the access programs, and deploy the results.
  480: This was not expected to take long; it hasn't happened because even
  481: the small amount of time required hasn't been available.
  482: Just doing this much would not itself help a lot, but it would leave
  483: us in a much better position for further improvements.
  484: 
  485: First, with the data moved to a real database and all the stuff
  486: accessing it under our control, instead of being legacy gnats code
  487: nobody wants to touch, we'd be in a position to adjust the schema:
  488: make incremental changes with the end goal of working it into
  489: something that can be imported safely into some other bugtracker,
  490: make changes with the goal of improving the metadata, or whatever else
  491: seemed useful.
  492: 
  493: Second, in the course of doing this we could eliminate the worst of
  494: the problems with gnats.
  495: And we'd be in a position to fix up more such problems as desired.
  496: 
  497: Third, we'd now have backend support for the buglists so it wouldn't
  498: need hand-syncing (which has been expensive) and so other people could
  499: help with the tagging.
  500: This would free up more of my time to actually fix bugs (or do other
  501: stuff) instead of administer.
  502: 
  503: The original plan was to build a new system (which got called
  504: "swallowtail" after the Irish jig because swallows are insectivorous),
  505: beat things into a better state than they ever could be with gnats,
  506: and then take stock and decide whether to work on swallowtail or plan
  507: a migration to something else.
  508: 
  509: In the course of trying to figure out which gnats bogosity was the
  510: most critical to deal with, multiple versions of swallowtail got
  511: planned out (more or less) and the possible subsequent migration part
  512: got mostly forgotten, but that was always intended to be one of the
  513: options on the table.
  514: 
  515: 
  516: ## The (new) plan
  517: 
  518: Nothing has actually changed much since the old plan was formulated.
  519: 
  520: It is possible that accurate schema conversion is not going to be as
  521: painful as we thought at the time; but this is unproven.
  522: Importing the gnats data into postgres has been done, but that part is
  523: pretty easy; nobody has actually transformed the data into a schema some
  524: existing bugtracker can use.
  525: 
  526: Nor has anybody seriously looked into adding hierarchical taxonomy
  527: support to any existing bugtracker.
  528: 
  529: When/if that gets done, and if the results are positive, it might make
  530: sense to forget about swallowtail and move directly to that system.
  531: 
  532: However, rushing to adopt something new without considering any of
  533: this, which has been the apparent goal of recent argumentation, is
  534: foolish.
