Annotation of wikisrc/bugtracking.mdwn, revision 1.7
1.1 dholland 1: # Bugtracking in NetBSD
2:
3: Currently NetBSD uses gnats for bugtracking.
4: Gnats is a horrible legacy tool.
5: There is a whole page of [[stuff we dislike about gnats]].
6:
7: It has been clear for a long time (years) that we need to migrate to
8: some other bugtracker.
9: Various tools have been proposed, usually without much thought being
10: applied.
11:
12: The way this usually works is that
13: the topic comes up and then a dozen people say "Let's use
14: $MY_FAVORITE_TOOL! It's great!"
15: Then a shouting match ensues and the people who call for requirements
16: analysis or even a list of criteria to pick one tool over another get
17: slagged for standing in the way of 'progress'.
18: These arguments inevitably take place with little understanding of
19: either the project's needs or the problems that a bugtracker needs to
20: handle for us.
21: Little or no information is generated, and nothing happens, except
22: that the participants tend to get alienated and demotivated.
23:
24: In order to avoid going around this barn again and again I'm creating
25: this page to try to document some of the genuine issues, as well as
26: conclusions that have been drawn in the past about requirements and
27: paths forward.
28:
29:
30: ## Problems with bugtracking in NetBSD
31:
32: This section lists and discusses some of the challenges that arise
33: handling NetBSD's bug reports.
34: This is not meant to be a list of gripes about gnats -- there's
35: another page for that (above) -- but a list of things that are still
36: issues no matter what bugtracker we're using.
37:
38: ### We already have a bug database.
39:
40: Any plan for moving forward has to be able to import the existing bug
41: database without losing information.
42: This is basically not negotiable -- we cannot throw away the existing
43: bug data.
44: (The alternative to importing the existing data is to keep gnats
45: running indefinitely in parallel with a new system.
46: Besides being confusing, this does nothing to solve the problems with
47: gnats itself.)
48:
49: ### The bug database is large.
50:
51: NetBSD's existing bug database currently contains almost 48,000 bugs.
52: This is not especially large compared to other large projects
53: (consider the likely size of the Windows bug database, if you will...)
54: but it is large compared to _most projects_.
55: A lot of bugtrackers will creak, groan, and tip over if asked to track
56: this many bugs.
57: Or, they may be able to store and retrieve the bugs fine but the user
58: interface for handling them just fails to scale to the database size.
59:
60: Many people who propose their favorite tool have used it for assorted
61: projects but never actually tried using it on a large bug database.
62: Some of these tools turn out to work ok on large databases, and others
63: don't.
64:
1.4 dholland 65: There are also currently some 5400 open bug reports; some bugtrackers
1.1 dholland 66: that scale adequately to 50,000 bugs in the database turn out to not
67: be able to handle having so many of them open at once.
68:
69: Another consequence of the database size is that the schema conversion
70: for any migration must be automatic.
71: It is not feasible to hand-edit or even review all bugs, or even all
72: open bugs, as part of a transition to a new system.
73: This creates problems for many otherwise decent choices for a new
74: bugtracker, as most bugtrackers (just like gnats itself) have their
75: own hardcoded assumptions about the schema and about things like what
76: states bugs can be in, and no two are the same.
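
One way to keep such a conversion automatic without silently losing information is to push every state through a mapping table and to report anything the table doesn't cover instead of guessing. The following is a minimal sketch only; the state names, the `STATE_MAP` table, and the record layout are invented for illustration and don't describe gnats or any particular target bugtracker.

```python
from collections import Counter

# Hypothetical mapping from old-tracker states to new-tracker states.
STATE_MAP = {
    "open": "new",
    "analyzed": "confirmed",
    "feedback": "needs-info",
    "closed": "resolved",
}


def convert_states(records, state_map=STATE_MAP):
    """Convert every record whose state is mapped; count the ones that aren't."""
    converted = []
    unmapped = Counter()
    for rec in records:
        old = rec["state"]
        if old in state_map:
            new = dict(rec)                # leave the input record untouched
            new["state"] = state_map[old]
            new["original_state"] = old    # keep the old value alongside the new one
            converted.append(new)
        else:
            unmapped[old] += 1             # surfaced for a human, not dropped
    return converted, unmapped


records = [{"id": 1, "state": "open"}, {"id": 2, "state": "suspended"}]
converted, unmapped = convert_states(records)
# "suspended" is not in STATE_MAP, so PR 2 is reported rather than converted.
```

The unmapped counts give a short list of schema mismatches to resolve once, rather than tens of thousands of bugs to review by hand.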
77:
78: ### The bug database is broad and not readily subdivided.
79:
80: There is a wide range of software in NetBSD, and an even wider range
81: in pkgsrc, and we get bug reports on all of it.
82: There are plenty of identifiable units in this, such as specific
83: pkgsrc packages, but many bugs can't be linked directly to one of
84: these.
85:
86: Furthermore, the existing database is only divided into broad
87: categories (kern, bin, pkg, etc.) and even these don't work all that
88: well sometimes.
89: (And the deployed base of send-pr scripts causes new PRs to come in
90: with only this much classification, something that can be changed only
91: slowly.)
92:
93: The result of this is that it's hard to find things in the bug
94: database by looking around.
95: You can browse the database based on metadata (or you could if gnats
96: sucked less; currently it's hard) but we don't have the metadata
97: needed to do this effectively.
98: You can also search the database based on metadata (even gnats can do
99: this) but it doesn't really produce useful results for the same
100: reason.
101: This will remain true if we just switch to a different bugtracker;
102: to make progress on this problem we need more and better metadata.
103:
104: Many large bug databases (CPAN's bug database was recently floated as
105: an example) can be clearly subdivided into individual projects or
106: subprojects and don't have this problem.
107: You can just look at bugs for the (sub)project you're interested in
108: and the number of those is manageable.
109:
110: This problem is not unique to NetBSD (FreeBSD shares it, for example)
111: but as far as I know it's not common outside of OS projects because
112: most other projects are not broad in the same way.
113:
114: ### Search doesn't work too well.
115:
116: Because of the nature of the names of Unix entities (programs,
117: drivers, virtually everything), searching for them in a large text
118: corpus like the bug database doesn't work too well.
119: This problem is exacerbated if you're trying to find bugs filed
120: against programs that often appear incidentally in bug reports, like
121: make or sh.
122:
123: This is not just a consequence of gnats issues; search won't work all
124: that well no matter what we do.
125: (Try typing "sh site:gnats.netbsd.org" into Google.
126: When Google can't do it, no bugtracker is going to do better.)
127:
128: This means that text search really does not work as an alternative to
129: being able to find things by browsing or via metadata.
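
A small illustration of the effect, using three invented PR summaries and a hypothetical `component` field (neither comes from the real database): a whole-word search for `sh` matches every report, while a metadata lookup finds only the one actually filed against sh.

```python
import re

# Invented sample data; "component" is a hypothetical metadata field.
prs = [
    {"id": 1, "component": "sh", "text": "sh mishandles nested here-documents"},
    {"id": 2, "component": "make", "text": "make fails; run sh build.sh to reproduce"},
    {"id": 3, "component": "kern", "text": "panic at boot; workaround: sh /etc/rc.local"},
]


def text_search(prs, word):
    """IDs of PRs whose text contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [pr["id"] for pr in prs if pattern.search(pr["text"])]


def metadata_search(prs, component):
    """IDs of PRs filed against exactly this component."""
    return [pr["id"] for pr in prs if pr["component"] == component]


text_search(prs, "sh")      # -> [1, 2, 3]
metadata_search(prs, "sh")  # -> [1]
```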
130:
131:
132: ## Some observations about the problems
133:
134: The most basic problem we have is _finding stuff_.
135: Back when I first started tackling the bug database, I found that the
136: best way to make progress was not to search (either for text or
137: metadata) or to browse but to ask for a randomly selected open PR.
138: This basically constitutes a total failure of the bugtracker: it was
139: completely unable to provide useful information of any kind.
140:
1.7 ! schmonz 141: I ([[dholland]]) have since learned some tricks and have also accumulated
1.1 dholland 142: an external index for the database; this means I can get stuff out of
143: it now, at least sometimes, but most developers are in the position I
144: was then: the bug database is a completely useless black hole.
145: Several developers have recently said so; also we have the same
146: problem that FreeBSD observed in their database some time back, which
147: is that new bugs come in and get seen, and maybe they get fixed, but
148: if they don't get fixed fairly soon they get forgotten and hang around
149: indefinitely.
150:
151: This is partly a consequence of gnats issues, and this is why gnats
152: must go.
153: However, as described above it isn't entirely because of gnats:
154: finding stuff by navigating (or searching) metadata is hard because we
155: don't have adequate metadata, and finding stuff by searching for text
156: is hard because it's a fundamentally hard text-retrieval problem.
157:
158: Therefore, if we want to actually improve the situation, any migration
159: plan needs to include a way to get more metadata into the database.
160: This metadata will mostly need to be hand-applied; this is expensive
161: but not insurmountable (for 5400 open existing PRs) and not that big a
162: deal for incoming new PRs... provided the new bugtracker has adequate
163: support for arbitrary metadata, which many don't.
164:
165: Note that in addition to the above analysis we also have some
166: supporting results.
1.7 ! schmonz 167: Based on the analysis I ([[dholland]]) started maintaining an annotated
1.1 dholland 168: browseable index (aka the "buglists" pages) of the bug database.
169: This basically amounted to additional per-bug metadata of several
170: kinds, organized in a fashion that allowed generating an index as a
171: tree of web pages.
172: Unfortunately, because it was a gimcrack thing, only I could update it,
173: and unfortunately it also needed to be synchronized with the gnats
174: database by hand, with the result that when my available time dried up
175: it went out of date and is now pretty much useless.
176:
177: However, while it existed people used it and it helped them.
178: Before we had it the number of open PRs had been steadily increasing
179: over time.
180: (The occasional hackathon brought down the count from time to time,
181: but never persistently.)
182: During the time we had it, the number of open PRs remained more or
183: less stable at around 4800-4900.
184: Now we don't have it again and we're up to 5400 open PRs.
185: The influx has not changed much since I got behind on it; if anything
186: it's dropped.
187: What this means is that the rate at which PRs are getting fixed has dropped,
188: and that's because it's become impossible to find anything again.
189:
190: It seems to me that one of the chief things we want from a new
191: bugtracker is to be able to provide something like this browseable
192: index.
193: Therefore it must be able to support the kinds of metadata that the
194: buglists tree was using.
195:
196: If we move to a bugtracker without this support, it may solve some of
197: the more glaring problems with gnats, but it isn't going to help us
198: _find_ stuff in the bug database and it isn't going to do anything to
199: help make the large backlog of unfixed bugs go away.
200:
201:
202: ## Metadata types
203:
204: After working the bug database for some years and also after
205: maintaining the buglists pages, I've come to the conclusion that
206: we need the following _types_ of metadata:
207:
208: * fields containing tags
209: * fields containing one of an enumerated list of choices
210: * fields containing a classification according to a hierarchical taxonomy
211: * fields containing free-form text
212:
213: And also, importantly, we need arbitrarily many such fields, not a
214: fixed set concocted at the time the database is set up.
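
To make "arbitrarily many such fields" concrete, here is a rough sketch of a schema in which field definitions are ordinary data that can be added at any time. The `Kind`, `FieldDef`, and `Schema` names and the example tag values are all assumptions made up for this illustration; they don't correspond to any existing bugtracker.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    TAGS = "tags"                # zero or more entries from a list of choices
    ENUMERATION = "enumeration"  # exactly one entry from a list of choices
    TAXONOMY = "taxonomy"        # a position in a hierarchy
    TEXT = "text"                # free-form text


@dataclass
class FieldDef:
    name: str
    kind: Kind
    choices: frozenset = frozenset()  # left empty for free-form text fields


@dataclass
class Schema:
    fields: dict = field(default_factory=dict)

    def add_field(self, name, kind, choices=()):
        """Define a new metadata field; may be called long after setup."""
        if name in self.fields:
            raise ValueError(f"field {name!r} is already defined")
        self.fields[name] = FieldDef(name, kind, frozenset(choices))
        return self.fields[name]


schema = Schema()
schema.add_field("releng", Kind.TAGS, {"critical-for-X", "critical-for-Y"})
schema.add_field("pkgpath", Kind.TEXT)
```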
215:
216: ### Tags
217:
218: A tag field contains zero or more entries from a list of allowable
219: choices.
220: This is a well-understood concept and most bug databases support tags
221: in one way or another; however, I don't think most support arbitrarily
222: many different tags fields with their own sets of allowable tags.
223:
224: Two questions immediately arise from this description: why do we need
225: to restrict tags to allowable values, and why do we need multiple tags
226: fields instead of just one?
227:
228: The first question is easy: when you have a database of 50,000 things
229: you need to place some controls on what gets entered or you eventually
230: end up with trash.
231: This is just a fact of life with databases when they get big enough.
232: We have enough problems without having to deal with misspelled tags
233: and typos.
234:
235: The second follows partly from the first and partly from ensuing human
236: interface concerns: if you just have one tags field, the number of
237: possible tag values grows without bound as more and more tags get
238: added, and before too long the list becomes itself hard to work with.
239: Grouping tags into logical sets (e.g. putting all the releng tags that
240: mark which bugs are critical for which releases into one field) makes it
241: much easier to search for them, and also much easier to browse the
242: database looking for bugs that aren't tagged but should be.
243:
244: Also this makes it possible to have developer-only tag fields, private
245: personal tags, and so forth, without undue complications.
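
As a sketch of what separate, restricted tag fields might look like: each field carries its own closed set of allowable tags, so a typo is rejected at entry time, and finding bugs that have no tags in a given field is a one-line query. The `TagField` class, the `devnote` field, and all the tag values are hypothetical.

```python
class TagField:
    """A named tag field with its own closed set of allowable tags."""

    def __init__(self, name, allowed):
        self.name = name
        self.allowed = frozenset(allowed)

    def validate(self, tags):
        """Return the tags as a set, or raise if any of them is not allowed."""
        tags = set(tags)
        unknown = tags - self.allowed
        if unknown:
            raise ValueError(f"{self.name}: unknown tag(s): {sorted(unknown)}")
        return tags


def untagged(prs, tag_field):
    """IDs of PRs that carry no tags at all in the given field."""
    return [pr["id"] for pr in prs if not pr.get(tag_field.name)]


releng = TagField("releng", {"critical-for-X", "critical-for-Y"})
devnote = TagField("devnote", {"has-patch", "needs-hardware"})

prs = [
    {"id": 1, "releng": releng.validate({"critical-for-X"})},
    {"id": 2, "devnote": devnote.validate({"has-patch"})},
]

untagged(prs, releng)  # -> [2]
# releng.validate({"critcal-for-X"}) raises ValueError because of the typo.
```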
246:
247: ### Enumerations
248:
249: An enumeration field contains exactly one entry from a list of
250: allowable choices.
251: This differs from a tag field in certain obvious ways.
252: We already have some of these in gnats, but they're hardcoded fields
253: rather than being instances of a general metadata type.
254:
255: If we're going to have arbitrary metadata fields at all (rather than a
256: fixed set of predefined fields) we more or less need enumeration
257: fields to be able to migrate the existing database.
258:
259: Also some things that one might abuse tags for if one only had tags
260: are perhaps better handled as enumerations; e.g. no bug should be both
261: "critical" for a release and also "would be nice" for the same
262: release.
263:
264: That said, a bugtracker that only has tags fields is probably adequate
265: (though not entirely desirable) because one can abuse tags instead.
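
The difference can be sketched as follows. An enumeration field holds one value, so a second assignment replaces the first; the same information kept as tags can contradict itself and needs an extra consistency check. The function names and the `release-priority` field are invented for this example.

```python
def set_enumeration(pr, field_name, value, choices):
    """Store exactly one choice; assigning again replaces the old value."""
    if value not in choices:
        raise ValueError(f"{field_name}: {value!r} is not one of {sorted(choices)}")
    pr[field_name] = value


def conflicting_tags(tags, exclusive):
    """For a tags-only tracker: return the tags that violate exclusivity."""
    hits = sorted(set(tags) & set(exclusive))
    return hits if len(hits) > 1 else []


PRIORITY = {"critical", "would be nice"}

pr = {"id": 1}
set_enumeration(pr, "release-priority", "would be nice", PRIORITY)
set_enumeration(pr, "release-priority", "critical", PRIORITY)
# pr["release-priority"] is now "critical"; the earlier value is gone.

# The same information kept as tags can end up contradictory:
conflicting_tags({"critical", "would be nice", "has-patch"}, PRIORITY)
# -> ['critical', 'would be nice']
```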
266:
267: ### Hierarchical taxonomy
268:
269: A hierarchical taxonomy is a scheme for identifying (and thus,
270: finding) things based on a nested series of choices.
271: The hierarchical taxonomy most people are most familiar with is
272: probably the scheme for species in biology.
273: That (especially in its more modern forms) is more complex than we
274: need but the basic principle is the same: at the top you have
275: everything, and then you pick one of several kinds of things and then
276: you're dealing with a restricted subset, and so forth until you get
277: down to a manageable number of things to look at at once that all have similar properties.
278:
279: The buglists pages supported tags as well but were fundamentally based
280: on a hierarchical classification of bugs based on where in the system
281: they occur.
282: This (as noted above) has been extremely useful in practice.
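
A rough sketch of the idea, not of how the buglists pages were actually implemented: each PR is classified with a path through the hierarchy, and browsing means repeatedly picking a child until the remaining set is small. The PR numbers and every path component below the top-level category names are made up.

```python
from collections import Counter

# Hypothetical classification: PR number -> path from the root of the taxonomy.
classified = {
    101: ("kern", "fs", "ffs"),
    102: ("kern", "fs", "nfs"),
    103: ("kern", "net"),
    104: ("bin", "sh"),
}


def children(classified, prefix=()):
    """Count the PRs found under each immediate child of `prefix`."""
    prefix = tuple(prefix)
    n = len(prefix)
    counts = Counter()
    for path in classified.values():
        if path[:n] == prefix and len(path) > n:
            counts[path[n]] += 1
    return counts


def prs_under(classified, prefix):
    """All PRs classified at or below `prefix`, in numeric order."""
    prefix = tuple(prefix)
    n = len(prefix)
    return sorted(pr for pr, path in classified.items() if path[:n] == prefix)


children(classified)              # kern: 3, bin: 1
children(classified, ("kern",))   # fs: 2, net: 1
prs_under(classified, ("kern", "fs"))  # -> [101, 102]
```

Each step down the tree shrinks the set of PRs under consideration, which is what makes browsing workable where flat search is not.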
283:
284: There is another hierarchical taxonomy that I'd like to deploy, but
285: which I wasn't about to try to do without better tools: classifying
286: bugs based on their symptoms.
287:
288: If we were going to pick just one of these metadata features that's
289: the most important, it would be this.
290: The problem is: most bugtrackers do not support hierarchical
291: taxonomies.
292: In fact, so far no existing bugtracker has been found to do so.
293:
294: This is an extremely important point, because it means that we need to
295: find a way to do it.
296: This is going to involve writing code, either new code or an extension
297: to some bugtracker we otherwise like.
298:
299: ### Free-form text
300:
301: Free-form text fields have two uses: one is for enumerations where the
302: set of things being enumerated is too large to be manageable as an
303: enumeration (e.g. pkgsrc packages, or NetBSD version numbers, or
304: programs in /usr/bin) and the other is for text that we think text
305: retrieval tools will be able to process usefully.
306:
307: I have no examples of the latter kind on hand but I expect some will
308: appear; there are several of the former kind that we definitely want
309: to be able to support.
310: One is the pkgsrc package (by pkgpath) a PR is about; not all pkgsrc
311: PRs apply to only one package, but most do.
312: Another (for base system PRs) is the name of the man page most closely
313: associated with what's broken.
314: This has been found (by me and also by FreeBSD) to be a useful way of
315: organizing things.
316: The version number field is probably another one.
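
For fields of the first kind, where the set of values is too large to list, one option is to check the shape of the value rather than its membership in a list, and then index on it. This is only a sketch: the `category/package` pattern below is a guess at what a pkgpath looks like, not pkgsrc's actual rules, and the package names and helper names are invented.

```python
import re
from collections import defaultdict

# Assumed shape of a pkgpath: "category/package". Illustrative only.
PKGPATH_RE = re.compile(r"^[a-z0-9][a-z0-9-]*/[A-Za-z0-9][A-Za-z0-9._+-]*$")


def clean_pkgpath(value):
    """Strip whitespace and reject values that don't look like a pkgpath."""
    value = value.strip()
    if not PKGPATH_RE.match(value):
        raise ValueError(f"not a plausible pkgpath: {value!r}")
    return value


def index_by(prs, field_name):
    """Map each value of a free-form field to the PRs that carry it."""
    index = defaultdict(list)
    for pr in prs:
        if field_name in pr:
            index[pr[field_name]].append(pr["id"])
    return dict(index)


prs = [
    {"id": 1, "pkgpath": clean_pkgpath(" devel/examplepkg ")},
    {"id": 2, "pkgpath": clean_pkgpath("devel/examplepkg")},
    {"id": 3, "pkgpath": clean_pkgpath("lang/otherpkg")},
    {"id": 4},  # not a pkgsrc PR
]

index_by(prs, "pkgpath")
# -> {'devel/examplepkg': [1, 2], 'lang/otherpkg': [3]}
```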
317:
318: ### Untagged vs. inappropriate
319:
320: Note that the database needs to be able to distinguish between "this
321: metadata does not apply to this PR", as is the case for the pkgsrc
322: package field and a bug report on make, and "this PR has not been
323: tagged with this metadata yet", which in the near to medium term will
324: be the case for all new incoming PRs.
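
One possible way to represent the distinction is with two explicit markers, treating a missing field the same as "untagged". The `Marker` enum, the `needs_triage` helper, and the sample PRs are assumptions made for this sketch.

```python
from enum import Enum


class Marker(Enum):
    UNTAGGED = "untagged"              # nobody has looked at this field yet
    NOT_APPLICABLE = "not applicable"  # someone decided the field doesn't apply


def needs_triage(pr, field_name):
    """True if the field is absent or explicitly marked as untagged."""
    return pr.get(field_name, Marker.UNTAGGED) is Marker.UNTAGGED


prs = [
    {"id": 1, "pkgpath": Marker.NOT_APPLICABLE},  # e.g. a bug report on make
    {"id": 2},                                    # freshly arrived PR
    {"id": 3, "pkgpath": "devel/examplepkg"},     # already tagged
]

[pr["id"] for pr in prs if needs_triage(pr, "pkgpath")]  # -> [2]
```

With this split, a query for work still to be done returns only PR 2; the report on make never shows up in it again once it has been marked.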
325:
326:
327: ## Some other points
328:
329: The fact that send-pr comes with the system and doesn't require
330: signing up for anything (or anything other than a more-or-less working
331: mail configuration) has long been a strength of the project. This has
332: been cited many times by many people, and it's a feature we want to
333: retain. There is not, in practice, a problem with people dumping
334: useless PRs without valid return addresses; it happens occasionally
335: but not enough to worry about.
336:
337: Subscribing to PRs (so you get notices of changes) is important; once
338: you find PRs you generally have to be able to follow them too.
339: This is something most bugtrackers other than gnats do ok, so it isn't
340: a big issue, but should nonetheless be noted.
341:
342: Merging duplicate PRs is a nice feature but it's not critical; the
343: chief reason that the inability to merge is annoying right now is that
344: gnats doesn't handle subscribing intelligently.
345: If you can crossreference the duplicates and everyone involved can
346: subscribe as needed, merging becomes less significant.
347:
348: Keeping track of which PRs are blocking which other PRs is often cited
349: as a desirable or even critical feature in a bugtracker.
350: This is probably true in general, but for us it doesn't matter that
351: much: because the bug database (and the system) is broad, most bug
352: reports are independent of one another and blocking dependencies
353: rarely arise.
354:
355:
356: ## Conclusions on requirements
357:
358: These are not set in stone but reflect my best estimate of the
359: situation and what does and doesn't matter.
360: This is weighted some towards backend issues, particularly in
361: connection with gnats.
362:
363: Please don't edit this randomly; talk it over first.
364:
365: Hard requirements:
1.3 dholland 366:
1.2 dholland 367: * Must be able to import the existing bug database.
368: * Doing so must not lose information.
369: * Must be able to accept incoming email from deployed send-pr scripts.
1.6 dholland 370: * Must handle confidential PRs in a way that does not make them
371: accessible to non-NetBSD people.
1.2 dholland 372: * Must be able to accept and file commit messages.
373: * It is not necessary to sign up to file a problem report.
374: * Nothing may be written in php.
1.1 dholland 375:
376: Very strongly desired based on problem analysis:
1.3 dholland 377:
1.2 dholland 378: * Support for arbitrary metadata fields not precooked in the database.
379: * Support for hierarchical taxonomies.
380: * Support for systems of tags.
381: * A decent workflow for retrieving incoming PRs and tagging them with
1.1 dholland 382: the desired new metadata.
1.5 wiz 383: * Support for free-form text metadata fields (for pkgsrc)
1.1 dholland 384:
385: Desired based on problem analysis:
1.3 dholland 386:
1.2 dholland 387: * Support for enumerated metadata fields.
1.1 dholland 388:
389: Very strongly desired because we have existing workflows and habits:
1.3 dholland 390:
1.2 dholland 391: * Command-line access (search, update, administer)
392: * Web access (search)
1.1 dholland 393:
394: Desired because we have existing workflows and habits:
1.3 dholland 395:
1.2 dholland 396: * Web access (update, maybe also administer)
1.1 dholland 397:
398: Very strongly desired because we're tired of gnats:
1.3 dholland 399:
1.2 dholland 400: * Proper handling of incoming MIME attachments.
401: * Some mechanism to prevent commit messages from accidentally
1.1 dholland 402: spamming the database.
1.2 dholland 403: * A way to file comments on a PR from a web browser.
404: * A web-based search form that works usefully.
405: * Crosslinks in the web interface to allow browsing.
406: * Command-line search that doesn't involve query-pr's nasty little
1.1 dholland 407: query "language".
1.2 dholland 408: * A nondegenerate way to subscribe to PRs, for both developers and
1.1 dholland 409: ordinary folks, at least by email and preferably also via RSS.
1.2 dholland 410: * A mail ingester that returns broken PR submissions instead of
1.1 dholland 411: filing sometimes-mangled versions for manual attention.
1.2 dholland 412: * A mail ingester that honors the confidential field of incoming PRs
1.1 dholland 413: properly.
1.2 dholland 414: * At least slightly automated handling of email bounces.
415: * A way to update email addresses without hand-editing a bajillion
1.1 dholland 416: PRs one at a time.
417:
418: Desired because we're tired of gnats:
1.3 dholland 419:
1.2 dholland 420: * A way to file comments on a PR directly from the command line.
421: * Something like a newsreader for working the bug database.
422: * Feedback nag mail that comes out such that replying directly to it
1.1 dholland 423: does something useful.
1.2 dholland 424: * A way to configure the contents of responsible nag mail to sort by
1.1 dholland 425: personal priority or other criteria.
1.2 dholland 426: * A way to turn off mail for bouncing addresses.
427: * A way to move misfiled comments from one PR to another.
1.7 ! schmonz 428: * A way to mark bugs not merely as "closed" but one of fixed,
! 429: invalid, obsolete, or "won't fix".
1.1 dholland 430:
431: Some other stuff that would be nice:
1.3 dholland 432:
1.2 dholland 433: * Being able to vote PRs up and down from the web interface.
434: * A smartphone app for working the database.
1.1 dholland 435:
436: Things that are less important:
1.3 dholland 437:
1.2 dholland 438: * Merging multiple PRs on the same subject.
439: * Explicit crosslinks when one PR is blocking progress on another.
1.1 dholland 440:
441: Things we don't care that much about:
1.3 dholland 442:
1.2 dholland 443: * Padded cells for juvenile developers.
444: * Click-and-drool support for developers without basic clues.
1.1 dholland 445:
446:
447: ## The (old) plan
448:
449: Given all the above, some years ago a plan was formulated and
450: even provisionally approved.
451: This has not materialized owing to (partly) a lack of time and
452: (partly) a near-total lack of response to requests for feedback or
453: input or assistance.
454: At this point some other plan may be better, but nothing has really
455: changed much in the meantime.
456:
457: There are two key points in the material above:
1.3 dholland 458:
1.2 dholland 459: * Schema conversion (to just about anything) without losing
1.1 dholland 460: information is going to be hard.
1.2 dholland 461: * Nothing that already exists off the shelf is going to handle the
1.1 dholland 462: most important thing we/I want anyway.
463:
464: There is another point that is not obvious to those who haven't dealt
465: with gnats at length:
1.3 dholland 466:
1.2 dholland 467: * gnats does very little.
1.1 dholland 468:
469: Gnats contains a fair amount of code, but most of that code is storage
470: code (not user interface or analysis or other valuable material) and
471: it doesn't do a particularly good job of it.
472: If we moved the data to a real database, the amount of work needed to
473: replace the functionality of gnats is small: there are half a dozen or
474: so access programs that gnats comes with, of which the only one that
475: does anything nontrivial is edit-pr, and a half a dozen or so more
476: programs that we wrote that we can update as needed.
477:
478: Therefore, the plan was to move the data to a real database, replace
479: the access programs, and deploy the results.
480: This was not expected to take long; it hasn't happened because even
481: the small amount of time required hasn't been available.
482: Just doing this much would not itself help a lot, but it would leave
483: us in a much better position for further improvements.
484:
485: First, with the data moved to a real database and all the stuff
486: accessing it under our control, instead of being legacy gnats code
487: nobody wants to touch, we'd be in a position to adjust the schema:
488: make incremental changes with the end goal of working it into
489: something that can be imported safely into some other bugtracker,
490: make changes with the goal of improving the metadata, or whatever else
491: seemed useful.
492:
493: Second, in the course of doing this we could eliminate the worst of
494: the problems with gnats.
495: And we'd be in a position to fix up more such problems as desired.
496:
497: Third, we'd now have backend support for the buglists so it wouldn't
498: need hand-syncing (which has been expensive) and so other people could
499: help with the tagging.
500: This would free up more of my time to actually fix bugs (or do other
501: stuff) instead of administering.
502:
503: The original plan was to build a new system (which got called
504: "swallowtail" after the Irish jig because swallows are insectivorous),
505: beat things into a better state than they ever could be with gnats,
506: and then take stock and decide whether to work on swallowtail or plan
507: a migration to something else.
508:
509: In the course of trying to figure out which gnats bogosity was the
510: most critical to deal with, multiple versions of swallowtail got
511: planned out (more or less) and the possible subsequent migration part
512: got mostly forgotten, but that was always intended to be one of the
513: options on the table.
514:
515:
516: ## The (new) plan
517:
518: Nothing has actually changed much since the old plan was formulated.
519:
520: It is possible that accurate schema conversion is not going to be as
521: painful as we thought at the time; but this is unproven.
522: Importing the gnats data into postgres has been done but is pretty
523: easy; nobody has actually transformed the data into a schema some
524: existing bugtracker can use.
525:
526: Nor has anybody seriously looked into adding hierarchical taxonomy
527: support to any existing bugtracker.
528:
529: When/if that gets done, and if the results are positive, it might make
530: sense to forget about swallowtail and move directly to that system.
531:
532: However, rushing to adopt something new without considering any of
533: this, which has been the apparent goal of recent argumentation, is
534: foolish.