Source control

by havoc

I’ve been reading version control system manuals lately for some
reason, joining the Colin club. I also talked to Graydon a little bit
about monotone, since he’s in town for the Java summit.

Here’s how I’d like to go about comparing these systems. (I haven’t done
it yet, and I may be too lazy for now.) We should be able to look at how
our software development process works, make a list of the tasks
involved, and see how well the various source control systems support
those tasks. I admit, this ignores the potential for an
earth-shattering revolution in the software development process by
presupposing what the high-level tasks are. But it sounds useful to
see which of these systems most nicely supports our current process
and social organization. Or perhaps we’ll see that they are all
essentially the same.

Here are some example tasks:

  • Start work on a new patch with a defined goal
    (“I want to fix bug #123456”)
  • Submit a patch to the mainline project
    (“Dear maintainer, here is my fix for #123456, please apply”)
  • Review pending patches for one’s own mainline project
  • Accept or reject patches, with comments
  • Make a new development release
  • Make a new bugfixes-only stable release
  • Backport a fix from the development version to the stable
    version
  • Restore a recent (perhaps never-submitted-to-mainline) version of
    a file you’re working on, since you just accidentally deleted it
  • See what changed recently that may have broken the build
  • Get the latest version of the software
  • Adapt a patch to work with a different version than the one you
    wrote it for (aka resolve conflicts)

For each of these tasks, we should be able to describe completely what
you have to type and/or click on to perform it using each
system. Right now many of them involve bugzilla, mailing lists,
etc. in addition to a source control system. Which points to the
potential usefulness of a comprehensive UI for programmer
collaboration, designed for the open source process… someone write
that. 😉

In addition to the steps to do each task, we might want to note any
interesting properties of each task using each system. For example,
whether the task can be done offline; whether it’s slow or fast;
security concerns; etc.
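
To make the comparison idea concrete, here is a rough sketch of how the results could be recorded: one entry per task per system, holding the steps and the interesting properties. Everything in it is made up for illustration; `TaskReport`, `steps_per_system`, and the "system-a"/"system-b" entries are placeholders, not measurements of any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class TaskReport:
    """How one task is performed with one source control system."""
    task: str
    system: str
    steps: list = field(default_factory=list)  # what you type or click, in order
    offline: bool = False                      # can it be done without a network?
    notes: str = ""                            # speed, security concerns, etc.


def steps_per_system(reports, task):
    """Map each system to the number of steps it needs for one task."""
    return {r.system: len(r.steps) for r in reports if r.task == task}


def offline_systems(reports, task):
    """List the systems that can perform the task offline."""
    return sorted(r.system for r in reports if r.task == task and r.offline)


# Placeholder data only; real entries would come from actually trying each tool.
reports = [
    TaskReport("get latest version", "system-a",
               ["run the update command"], offline=False),
    TaskReport("get latest version", "system-b",
               ["fetch from a peer", "merge into the workspace"], offline=False),
    TaskReport("restore deleted file", "system-a",
               ["ask the server for the last committed copy"], offline=False),
    TaskReport("restore deleted file", "system-b",
               ["check out the file from the local database"], offline=True),
]

print(steps_per_system(reports, "get latest version"))
print(offline_systems(reports, "restore deleted file"))
```

Counting steps is a crude measure, but it is the kind of thing a table like this would let us compare at a glance.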

Part of the reason I’m thinking along these lines is that in looking
at monotone, I get the feeling I could do a lot of really neat things
with it but that right now it’s a little too “policy-free” and these
neat things might require too much thinking on my part.

A fun thing about monotone is that it’s peer-to-peer based on public
key crypto; you can make assertions about changes to the source tree,
and you can collect other people’s assertions. To define the
“mainline” of a project you would say something like “use all changes
to branch foo signed by person a, b, or c” – while with CVS you say
something like “all changes to HEAD that are on the server gnome.org.”
Given the recent freedesktop.org hacking incident, I can see the merit
of the monotone approach. Tampering with a source tree would require
stealing a trusted private key and its passphrase; breaking into the
server is not enough.
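
The "changes to branch foo signed by person a, b, or c" rule can be sketched as a simple filter. This is a toy model under loud assumptions: it is not monotone's API or data model, the `Change` class and `mainline` function are invented here, and the `signer` field stands in for a signature that has already been cryptographically verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    change_id: str
    branch: str
    signer: str  # assumption: stands in for an already-verified signature


def mainline(changes, branch, trusted_signers):
    """Return the changes on `branch` that were signed by a trusted person."""
    return [c for c in changes
            if c.branch == branch and c.signer in trusted_signers]


changes = [
    Change("r1", "foo", "a"),
    Change("r2", "foo", "mallory"),  # on the branch, but not a trusted signer
    Change("r3", "bar", "b"),        # trusted signer, but a different branch
    Change("r4", "foo", "c"),
]

selected = mainline(changes, "foo", {"a", "b", "c"})
print([c.change_id for c in selected])
```

Note that nothing in the filter mentions a server: where the changes were fetched from does not matter, only who signed them.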

Graydon said today that he’s looking at including in the repository
itself the list of “committers” at each point in the revision
history. So say I start a project and work on it for a while; at first
the project mainline is defined as all revisions I have signed with my
key. At some point I could change the default committers for the
branch to include Colin; and then anyone checking out the branch who
doesn’t specify otherwise would get stuff either one of us had
signed. Patches Colin submitted before he became a committer, however,
would not suddenly be added. And if I later kicked Colin out of the
project, his post-kickout changes would not be used either.
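
Here is a minimal sketch of that idea as I understood it, not of anything Graydon has actually implemented. It assumes a purely linear history (the real thing would be a graph), and it assumes that only a current committer may change the committer list, which is my guess rather than something he said. The `accepted_changes` function and the event tuples are invented for the example.

```python
def accepted_changes(history, founder):
    """Walk a linear history and keep changes signed by whoever was a
    committer at that point in the history."""
    committers = {founder}
    accepted = []
    for event in history:
        kind, signer, payload = event
        if kind == "set_committers":
            # Assumption: only a current committer may edit the list.
            if signer in committers:
                committers = set(payload)
        elif kind == "change":
            if signer in committers:
                accepted.append(payload)
    return accepted


history = [
    ("change", "havoc", "r1"),
    ("change", "colin", "r2"),                        # not yet a committer
    ("set_committers", "havoc", ["havoc", "colin"]),
    ("change", "colin", "r3"),
    ("set_committers", "havoc", ["havoc"]),           # Colin is kicked out
    ("change", "colin", "r4"),                        # post-kickout
    ("change", "havoc", "r5"),
]

result = accepted_changes(history, "havoc")
print(result)
```

Because the committer list is evaluated at each point in the history, r2 is not added retroactively when Colin becomes a committer, and r4 is ignored after he is removed.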

With monotone today, as I understand it, you have to specify on your
own local system all the keys you trust to make changes, and then all
changes signed by those people end up in the source code you check out.
For something like GNOME you can imagine the mess if our
trusted-persons lists got out of sync and we each had a somewhat
different version of the code.

I know Graydon could explain all this more accurately than I can.

Here are a couple more feature ideas I have.

  • The version control system keeps track of how to submit patches
    (mailing list, bugzilla, etc.) and has a “submit patch” command.
    (Prereq: the system has a concept of a patch you’re working on for submission)
  • The version control system tracks submitted patches for the maintainer
    and supports easy review and acceptance/rejection of them.
  • Easy way to have a conversation about any patch (as we do now in
    bugzilla typically)
  • Allow the maintainer to easily hack on the patch, merge their changes
    with it, and then bounce the patch back to the submitter for more
    work
    – I often want to just fix the nitpicks instead of writing
    them down, then give someone the patch back to fix the big stuff.
  • Often you want to commit a number of times before you submit a patch
    officially (sometimes people will make a CVS branch for this).
    A nice feature would be to avoid the need to do this manually;
    just have the developer tools automatically “commit” every time you
    save from the editor, or even make the whole undo buffer persistent.
    Much more plausible with monotone than with CVS.
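
The commit-on-every-save idea from the last item could look roughly like this. It is an in-memory toy, assuming an editor that calls `save` with the full file contents each time; `SnapshotStore` and its methods are hypothetical names, and a real tool would persist the data and hook into the editor.

```python
import hashlib


class SnapshotStore:
    """Records a snapshot of a file every time the editor saves it."""

    def __init__(self):
        self.blobs = {}    # content digest -> file content
        self.history = {}  # path -> list of digests, oldest first

    def save(self, path, content):
        """Record a snapshot, skipping saves that changed nothing."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        versions = self.history.setdefault(path, [])
        if not versions or versions[-1] != digest:
            self.blobs[digest] = content
            versions.append(digest)
        return digest

    def restore(self, path, steps_back=0):
        """Return the latest snapshot, or an older one `steps_back` saves ago."""
        versions = self.history[path]
        return self.blobs[versions[-1 - steps_back]]


store = SnapshotStore()
store.save("main.c", "int main() { return 0; }")
store.save("main.c", "int main() { return 0; }")  # unchanged, not recorded again
store.save("main.c", "int main() { return 1; }")

print(len(store.history["main.c"]))
print(store.restore("main.c", steps_back=1))
```

The same store would cover the "I just accidentally deleted it" task from the list near the top, since every saved version stays recoverable whether or not it was ever submitted.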

Anyhow, there’s unquestionably a lot of room for improvement in our
developer tools. One of the questions in my mind is how far you can
get if you define the problem as only version control; at some point
the really useful stuff would involve the editor, the bug tracker, and
so forth as well.

(This post was originally found at http://log.ometer.com/2004-11.html#19)
