Mon Sep 2 01:23:40 2013 UTC (7 months, 3 weeks ago) by dholland

## Using Mercurial and mq to work on NetBSD

This page contains directions for using Mercurial as a commit buffer
for NetBSD.

(It will not do you much good if you're trying to convert the master
NetBSD tree to Mercurial or to work with such a converted tree.)

### What it is

Mercurial is a distributed version control system ("DVCS").
mq is an extension to Mercurial for handling patch queues.
The concept of patch queues was introduced by Quilt some years back.

This document assumes you already know more or less how to use
Mercurial but may not have used mq before.

### The model we're using here

What we're going to do is commit a NetBSD CVS working tree into a
Mercurial repository.
You can then use Mercurial to merge; it is better at this than CVS.
You can also commit changes locally and ship them back to the CVS
master later; this is useful in a variety of ways.
You can potentially also clone the Mercurial tree and work jointly
with other people, but there are limits to this as we'll discuss in a
moment.

Because the NetBSD tree is rather large, you will find that if you
commit the whole thing into Mercurial, a number of operations
(anything that scans the working tree for changes) become annoyingly
slow.
It isn't slow enough to be unusable, and it's quite a bit faster than a
comparable CVS operation (like running cvs update from the top level),
but it's slow enough to be annoying.

For this reason, in most cases, I recommend committing only part of
the tree into Mercurial and telling it to ignore the rest. This means
you can't in general usefully clone the resulting Mercurial repo
(although that depends on exactly what you leave out) but this is not
a major problem unless you're specifically trying to work with someone
else.

So the basic model here is that you check out a CVS working tree and
use Mercurial to manage local changes to part of it, then later on
commit those changes back to the master CVS repository.

### Branches vs. patches

There are two ways you can manage your changes: as a branch, or as a
patch queue.
The advantage of a patch queue is that you can easily commit each
patch individually back to CVS, and you can go back and forth between
them and debug and polish each one separately.
The disadvantage is that the merge facilities are (as far as I know
anyway) relatively limited.

Conversely, if you commit your changes to a branch, you get all the
native merging support in Mercurial.
However, it is painful to try to commit anything other than one big
diff for the whole branch back to CVS.
(You might be able to do it via bookmarks and rebasing, but I've never
tried and have no desire to figure out how.)

If you don't want to keep the incremental history of your local
commits, use a branch.
If you do, use a patch queue.

It is possible to use multiple branches to allow you to commit back in
several stages.
However, managing this is a major pain and I don't recommend it -- you
might get away with two branches but more than that is probably a bad
idea.

There's a Mercurial extension called the "patch branch extension" that
lets you manage a whole graph of patches using branches.
I haven't tried using it in some years; at the time it had scaling
problems such that it became horrifyingly slow once you had more than
a handful of such branches.
That might have been improved in the meantime; if you find yourself
wanting to use both branches and patches, it might be worth looking
into.

It is also fairly probable that there is now a solution for merging
with patch queues; it's been a while since I had time to look closely.

### Setting up

First, check out a CVS working tree.
You probably want to use a different one for each project, because
different projects require changing different parts of the tree and so
you will probably want to have Mercurial ignore different subtrees for
different projects.
(At least, I find it so; it depends on what you're working on.)

	% cvs -d cvs.netbsd.org:/cvsroot checkout -P src

Now create a Mercurial repository at the top level.
(If you are working only in a subtree and you are *sure* that you will
never need to change anything in other parts of the tree, you can
create the Mercurial repository in a subtree.
But unless you're absolutely certain, don't take the risk.)

	% cd src
	% hg init

If you're going to be using a patch queue, now enable mq.

	% vi .hg/hgrc

and add

	[extensions]
	hgext.mq =

(Since the extension is built into Mercurial, that's all you need.)
You can if you prefer also put this in your .hgrc so mq is always on.
Then do

	% hg qinit -c

The -c option tells mq that you'll be checkpointing your patches,
which is usually a good idea.

Now prepare a .hgignore file.
This file contains one regular expression per line; Mercurial ignores
files (and subdirectories) whose paths from the repository root match
one of the regexps.
Add at least:
	^CVS$
	/CVS$
to ignore all the CVS control directories in the CVS checkout.
While you can commit these to Mercurial, there's no point and it gets
awkward if owing to mistakes later you end up having to merge them.

If you aren't arranging to put the tree's object directories somewhere
else, then also add
	^obj\.[0-9a-z]+$
	/obj\.[0-9a-z]+$
and you might want
	^sys/arch/[0-9a-z]*/compile/[A-Z]
to ignore kernel build directories.

Ignore subtrees that you aren't working in.
You don't have to bother to be very selective; the goal is to rapidly
rule out a few large subtrees that you definitely don't care about, in
order to avoid wasting time scanning them for changes.
Unless you plan to be working with 3rd-party software,

	^external$
	^gnu/dist$

is a good starting point.
Alternatively, if you aren't going to be working on MD kernel stuff or
bootloaders,

	^sys/arch$

is a good choice as it's also large.

You can always unignore stuff later, so don't worry about remote
possibilities.
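
Putting the patterns above together, a starter .hgignore might look
something like the following sketch.
The helper names (write_hgignore, path_is_ignored) are made up for
illustration, and the obj pattern uses + so that it matches suffixes
longer than one character (obj.amd64, say).
The check uses grep -E, which only approximates how Mercurial applies
its regexps, so treat it as a rough sanity test; "hg status -i" in the
real tree is the authority.

```shell
# Write a starter .hgignore into the directory given as $1.
write_hgignore() {
    cat > "$1/.hgignore" <<'EOF'
^CVS$
/CVS$
^obj\.[0-9a-z]+$
/obj\.[0-9a-z]+$
^sys/arch/[0-9a-z]*/compile/[A-Z]
^external$
^gnu/dist$
EOF
}

# Rough check: a path (relative to the repository root) counts as ignored
# if the path itself or any of its leading directories matches a pattern.
# $1 is the ignore file, $2 the path to test.
path_is_ignored() {
    ignorefile=$1
    path=$2
    while :; do
        if printf '%s\n' "$path" | grep -E -q -f "$ignorefile"; then
            return 0
        fi
        case $path in
            */*) path=${path%/*} ;;   # strip the last component and retry
            *) return 1 ;;            # nothing left to strip: not ignored
        esac
    done
}
```

For example, after "write_hgignore ." at the top of the tree,
"path_is_ignored .hgignore bin/ls/obj.amd64/ls.o" should succeed while
"path_is_ignored .hgignore sys/kern/kern_exec.c" should fail.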

Now commit the .hgignore file:

	% hg add .hgignore
	% hg commit -m 'add .hgignore file' .hgignore

Now add and commit the contents of the working tree:

	% hg add
	% hg commit -m 'HEAD of 20130101'
(or whatever date)

You are now in business.

### Working

If you're using a branch, remember to change branches before you
commit anything:
	% hg branch mystuff
You want to keep the default branch an untouched CVS tree so you can
use Mercurial to merge.
(And also so you can use Mercurial to extract diffs against CVS HEAD
and so forth.)

Similarly, if you're using a patch queue, put everything in patches
and don't commit.
(There's a section below about working with mq if you aren't familiar
with it.)

You can edit and build and test as normal.
Use hg commit or hg qrefresh to sync stuff into Mercurial.

If you're using mq, it's a good idea to checkpoint your patch queue
periodically.
This is done as follows:
	% hg qcommit
The patches directory (.hg/patches) is stored in its own Mercurial
repository, and this commits the patches to that repository.
If necessary you can then fetch older versions of the patches back and
so forth.

### Updating from CVS

First, make sure all your changes are committed.
(If you have unfinished changes that aren't ready to commit, there's a
Mercurial extension for stashing them temporarily.
If you have stuff that you don't want to commit at all, like debugging
printouts or quick hacks, it's often convenient to keep those in their
own mq patch, even if you aren't using mq for development.)

Now go back to a clean CVS tree.
If using branches, go back to the default branch:
	% hg update -r default
If using mq, pop all the patches:
	% hg qpop -a

DO NOT run cvs update until/unless you have done this; it will make a
mess.
When you eventually do this by accident, see the section below on
recovering from mistakes.
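
Since this mistake is so easy to make, it may be worth wrapping the
check in a small guard.
The following is only a sketch: the function name safe_to_cvs_update is
made up, and it assumes your work lives either on a named branch or in
mq patches, as described above.

```shell
# Succeed only if the tree looks like an untouched CVS checkout, i.e. the
# working directory is on the default branch and no mq patches are applied.
safe_to_cvs_update() {
    branch=$(hg branch) || return 1
    if [ "$branch" != "default" ]; then
        echo "on branch '$branch'; run: hg update -r default" >&2
        return 1
    fi
    # hg qapplied lists the applied patches. It fails if mq is not enabled,
    # in which case there is no patch queue to worry about.
    applied=$(hg qapplied 2>/dev/null) || applied=
    if [ -n "$applied" ]; then
        echo "mq patches are applied; run: hg qpop -a" >&2
        return 1
    fi
    return 0
}

# Intended use:
#   safe_to_cvs_update && cvs -q update -dP
```

This does not check for uncommitted changes; "hg status" is still worth
a look first.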

Now run cvs update from the top of the source tree:
	% cvs -q update -dP

You should get no conflicts from CVS and nothing should show as
modified.
(It is usually a good habit to save the cvs update output to a file to
be able to check this.)
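
If you do save the output, a check along these lines can scan it for
you.
This is a sketch; the function name and the log file name are
arbitrary.
It relies on the usual cvs update status letters, where "C" marks a
conflict and "M" a locally modified file.

```shell
# Scan a saved "cvs update" log for lines that should not be there.
# With a clean default branch (or all patches popped) neither "C file"
# nor "M file" lines should appear.
check_cvs_update_log() {
    if grep -E '^[CM] ' "$1"; then
        echo "unexpected conflicts or modifications; stop and investigate" >&2
        return 1
    fi
    return 0
}

# Intended use:
#   cvs -q update -dP 2>&1 | tee update.log
#   check_cvs_update_log update.log
```

Lines beginning with "?" are expected, since CVS does not know about
.hg and .hgignore.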

Tell hg to sync up:
	% hg addremove

Use hg to check what it thinks has changed:
	% hg status

Commit the changes to Mercurial:
	% hg commit -m 'Updated to 20130202'

Now you get to merge.

If you're using a branch, you want to merge the changes into your
branch rather than merge your branch into the changes:
	% hg update -r mystuff
	% hg merge default
	(edit and resolve as needed)
	% hg commit -m 'sync with HEAD'

If it tells you "update crosses branches" when trying to update back
to your branch, update to the parent changeset (the previous version
from CVS) first, as that's an ancestor of your branch.

If you're using mq, the thing to do now is to push all your patches,
and if any reject, clean up the mess and refresh them.

If patch tells you "hunk N succeeded at offset MMM with fuzz Q", it's
a good idea to manually inspect the results -- patch being what it is,
sometimes this means it's done the wrong thing.
Edit if needed.
Then (even if you didn't edit) refresh the patch so it won't happen
again.

As I said above, it's quite likely that by now there's a better scheme
for merging with mq that I don't know about yet.

### Pushing back to CVS

When you're ready to push your changes back to CVS (so they're really
committed), first (unless you're absolutely sure it's not necessary)
update from CVS as above and merge.
Then:

If you're using a branch, go back to the default branch and merge your
changes into it:
	% hg update -r default
	% hg merge mystuff
	% hg commit -m "prepare to commit back to cvs"
Now cvs add any new directories and files; be sure not to forget this.
It is a good idea to crosscheck with cvs diff and/or cvs update:
	% cvs diff -up | less
	% cvs -nq update -dP
Then you can cvs commit:
	% cvs commit
Because of RCSIDs, committing into cvs changes the source files.
So now you need to do:
	% hg commit -m 'cvs committed'
and if you intend to keep working in this tree, you want to merge that
changeset back into your branch to avoid having it cause merge
conflicts later.
Do that as above.
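
For the cvs add step above, one way to catch forgotten files is to
filter the output of the cvs -nq update crosscheck for paths CVS does
not know about.
The sketch below assumes the usual "? path" lines for unknown files;
the function name is made up.

```shell
# Read "cvs -nq update" output on stdin and print the paths CVS does not
# know about ("? path" lines), skipping Mercurial's own files.
# Anything listed that belongs to your change still needs "cvs add".
cvs_unknown_files() {
    sed -n 's/^? //p' | grep -v -E '(^|/)\.hg(ignore)?$' || true
}

# Intended use:
#   cvs -nq update -dP 2>/dev/null | cvs_unknown_files
```

Expect some noise in the list (object directories, .orig and .rej
files left by patch, and so on); CVS reports an unknown directory once
and does not descend into it.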


If you're using a patch queue, usually it's because you want to commit
each patch back to CVS individually.
First pop all the patches:
	% hg qpop -a
Now, for each patch:
	% hg qpush
	% hg qfinish -a
	% cvs commit
	% hg commit -m "cvs committed previous"
With a long patch queue, you'll want to use the patch comments as the
CVS commit messages.
Also, running cvs commit from the top for every patch is horribly slow.
Both these problems can be fixed by putting the following in a script:
	hg log -v -r. | sed '1,/^description:$/d' > patch-message
	cat patch-message
	echo -n 'cvs commit -F patch-message '
	hg log -v -r. | grep '^files:' | sed 's/^files://'
(I call this "dogetpatch.sh") and then the procedure is:
	% hg qpop -a
then for each patch:
	% hg qpush && hg qfinish -a && dogetpatch.sh
	% cvs commit [as directed]
	% hg commit -m "cvs committed previous"
(This could be automated further but doing so seems unwise.)
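
To see what the script picks out, here are its two extractions written
as filters and run on a made-up log entry.
The changeset, user and file names are invented; only the layout is
meant to match what hg log -v prints.

```shell
# Everything after the "description:" line is the commit message.
log_description() {
    sed '1,/^description:$/d'
}

# The "files:" line lists the paths touched by the changeset.
log_files() {
    sed -n 's/^files: *//p'
}

# A made-up log entry in the layout of "hg log -v" output.
sample_log() {
    cat <<'EOF'
changeset:   3:0123456789ab
tag:         tip
user:        A Developer <dev@example.org>
date:        Mon Sep 02 01:23:40 2013 +0000
files:       sys/kern/kern_foo.c sys/sys/foo.h
description:
Fix an off-by-one in foo.


EOF
}

sample_log | log_description
sample_log | log_files
```

Note that the files: line separates names with spaces, so this approach
is ambiguous for paths that themselves contain spaces.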

### Using CVS within Mercurial

You can successfully do any read-only CVS operation in the hybrid
tree: diff, annotate, log, update -p, etc.
Read-write operations should be avoided; if you mix upstream changes
with your changes you will find it much harder to commit upstream
later, and you may get weird merge conflicts or even accidentally
revert other people's changes and cause problems.

If you clone the Mercurial tree and you didn't include the CVS control
files in it, you won't be able to do CVS operations from clones.
Including the CVS control files in the Mercurial tree is one way
around that.

You will find that any large CVS operation on a clone is horribly
slow.
This is because making a clone causes CVS to think all the files in
the clone have been modified since you last ran it; it then re-fetches
every file you ask it about so it can update its own information.
For this reason cloning the Mercurial tree usually isn't worthwhile
and even when it is, including the CVS files in the Mercurial tree
isn't.

Another consequence of this: do not try to cvs update in a cloned
Mercurial repository; use only the original.
Updating a clone basically downloads the entire tree over again from
the CVS server.

DO NOT CVS COMMIT FROM A CLONE.
It is known that some operations that muck with the timestamps in a
CVS working tree can cause CVS to lose data.
It is not clear if hg clone is such an operation; don't be the person
who finds out the hard way.

### Recovering from mistakes

The most common mistake is running cvs update when the Mercurial tree
is not in the proper state for it; e.g. while you are on your branch or
while you have patches applied.

The basic strategy for this is to use hg revert to restore the part of
the tree it knows about, then go back to CVS, clean up the mess there,
and update properly.

If you're using a branch:
	% hg revert -a -C
	% hg update -r default
If you're using a patch queue:
	% hg revert -a -C
	% hg qpop -a

The problem is, CVS will now think you've changed every file that
Mercurial is managing, and that your modifications revert all the
upstream changes made since your previous update.
You do *not* want that to turn into reality.
Hunt down (with cvs -n update) any files that CVS thinks are modified,
then rm them and run cvs update on them.
CVS will print "Warning: foo was lost" and restore an unmodified copy.
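
The hunt-and-restore step can be sketched as two small helpers.
The function names and the modified.list file are made up, and the
sketch assumes "M path" lines in the cvs -n update output.
Look over the list before removing anything.

```shell
# Read "cvs -nq update" output on stdin and print the paths from "M"
# lines, i.e. the files CVS believes are locally modified.
cvs_modified_files() {
    sed -n 's/^M //p'
}

# Read one path per line on stdin; remove each file and let CVS fetch a
# pristine copy. rm is not reversible, so review the list first.
restore_from_cvs() {
    while IFS= read -r f; do
        rm -f -- "$f"
        # Keep cvs (or the ssh it runs) from swallowing the rest of the list.
        cvs update "$f" < /dev/null
    done
}

# Intended use, from the top of the tree:
#   cvs -nq update -dP 2>/dev/null | cvs_modified_files > modified.list
#   (inspect modified.list)
#   restore_from_cvs < modified.list
```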

When you have no files left that CVS thinks are modified, do a CVS
update on the whole tree and merge it as described above.
(You must do this, as the parts of the tree that Mercurial is ignoring
will otherwise be out of sync with the parts it's managing.)

If you stored the CVS control files in Mercurial, then the revert will
restore them, but your tree will still be inconsistent so you still
need to do a proper update and merge immediately.

### mq

The basic idea of mq (like quilt) is that it maintains a series of
patches against the source tree, to be applied in order.
By applying them or removing them one at a time, you can move the tree
to any intermediate state; and then you can update the topmost patch,
insert a new patch, or whatever.

To see the list of patches:
	% hg qseries

To apply the next patch:
	% hg qpush

To remove the current patch:
	% hg qpop

To merge current working tree changes into the current patch:
	% hg qrefresh

To also update the current patch's change comment:
	% hg qrefresh -e

To collect current working tree changes (if any) into a new patch:
	% hg qnew PATCHNAME 

When there's an mq patch applied, you can't commit.
(Doing qrefresh is basically equivalent to committing the current
patch.)
Diff will show the changes against the last refreshed version of the
current patch; to see the complete changes for the current patch
(including current changes), use "hg qdiff".

You can delete patches with "hg qrm" and rename them with "hg qmv".

Patches are applied with patch, unfortunately, which means that if
they don't apply (which can happen if you or someone else changes
something under one) you get .rej files you have to clean up by hand
rather than a Mercurial merge.

When a patch is ready to be committed for real, you do "hg qfinish" on
it.
This removes it from the patch queue and converts it to a normal
Mercurial changeset.

To change the ordering of patches, you edit the file
.hg/patches/series.
If the patches aren't orthogonal you'll have to fix the rejections
when you next apply them.
(Don't do this with patches that are currently applied.)
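
The edit itself is just moving lines around.
As an illustration, the hypothetical helper below moves one patch name
so that it sits immediately before another.
It assumes one plain patch name per line (no guards or comments) and
that you have already popped everything with hg qpop -a.

```shell
# Move one patch name so it sits just before another in a series file.
# $1 is the series file, $2 the patch to move, $3 the patch it should
# precede.
move_patch_before() {
    series=$1
    patch=$2
    before=$3
    [ "$patch" != "$before" ] || return 0
    # Both names must already be present as whole lines.
    grep -Fxq -- "$patch" "$series" || return 1
    grep -Fxq -- "$before" "$series" || return 1
    awk -v p="$patch" -v b="$before" '
        $0 == p { next }        # drop the patch from its old position
        $0 == b { print p }     # emit it just ahead of the target
        { print }
    ' "$series" > "$series.new" && mv "$series.new" "$series"
}

# Intended use, after "hg qpop -a":
#   move_patch_before .hg/patches/series fix-typo.patch add-feature.patch
```

Check the result with hg qseries before pushing anything.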

Use "hg help mq" to see the full list of mq-related commands.

I'm sure there are better mq tutorials out there.

### Using mq

The basic process when using mq is that you start a new patch, edit
and hack for a while, use hg qrefresh to commit it (once or many
times), and when you're done go on to the next one.

If you find a bug in an earlier patch, you can go back to the patch
that introduced it and fix the bug there, creating a new version of
the offending patch that no longer contains the bug.
(Or you can create a new patch that fixes the bug, but insert it
immediately after the patch that created the bug.)

When a patch is ready to be seen by other people, you "finish" it and
then it becomes a normal immutable changeset.

One catch is that you can't push or pop the patch queue while you have
unsynced (uncommitted) changes.
There are two ways around this.
One is a separate "stash" extension that lets you put unfinished
changes aside while you do something else.
The other is to create a new temporary patch holding your unfinished
changes, and then later use hg qfold to combine that with the patch you
originally meant them for.

A variant of this problem is when you discover a bug, open an editor,
fix it, and then realize that you wanted to make the edit in an
earlier patch.
Then you go to pop the queue and it complains that you have a modified
file.
If the modification in question is the only uncommitted change, the
best way to deal with this is to create a new patch for it, then pop
to where you wanted it to go and use hg qfold to apply it there.
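
As a sketch, that sequence looks like the following.
The function name and the temporary patch name tmp-fixup are arbitrary;
hg qgoto pops (or pushes) until the named patch is on top.

```shell
# Move the uncommitted working-tree changes into an earlier patch.
# $1 is the name of the patch that should receive the fix.
fold_fix_into() {
    target=$1
    hg qnew tmp-fixup &&        # capture the current edits in a new patch
    hg qgoto "$target" &&       # pop back until the target patch is on top
    hg qfold tmp-fixup &&       # merge the fixup into the target patch
    hg qpush -a                 # reapply the later patches
}

# Intended use:
#   fold_fix_into fix-foo.patch
```

If a later patch touches the same lines, the final hg qpush -a may stop
with rejects that you then have to clean up and refresh as usual.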
