Sat, 11 Dec 2004
One of the harder aspects of version control is dealing with merging issues. Normal development is straightforward – all you’re essentially doing is providing an annotated “undo” feature. darcs manages that, IMO, perfectly. And to be honest, that’s probably 80% of what I want from a version control system. But dealing with merging different lines of development is important too – it’s probably 80% of the remaining 20% :)
darcs doesn’t actually do too badly there – when you’re working on separate parts of the code, darcs will do a merge for you automatically quite happily. Where it falls apart is when the changes affect the same bit of code, and can’t be resolved automatically; I find myself really disliking darcs’ behaviour there, even independent of the performance issues.
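As a rough illustration of that split, here's a toy model in Python. It is emphatically not how darcs decides what conflicts; the line ranges and the function names (overlaps, conflicting) are made up for the example. Each change is treated as the inclusive range of lines it touches, and two changes only need a human when their ranges overlap.

```python
def overlaps(a, b):
    """a and b are (first_line, last_line) ranges, both ends inclusive."""
    return a[0] <= b[1] and b[0] <= a[1]


def conflicting(mine, theirs):
    """Return every pair of changes that touch the same lines."""
    return [(m, t) for m in mine for t in theirs if overlaps(m, t)]


# Hypothetical changes on two lines of development.
mine = [(10, 12), (40, 41)]
theirs_elsewhere = [(100, 101), (200, 210)]   # separate parts of the file
theirs_same_spot = [(11, 15), (100, 101)]     # one change collides with mine

easy = conflicting(mine, theirs_elsewhere)    # nothing overlaps: merge is automatic
hard = conflicting(mine, theirs_same_spot)    # one overlapping pair needs resolving
```

In this model the first merge has nothing to resolve, and the second has exactly one pair that someone has to look at by hand.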
Wed, 08 Dec 2004
Cripes. This was meant to be a quick followup note about some more quick darcs hacks. So much for that – I’ve had to write an outline for this post for heaven’s sake.
(Side note: if someone wants a new title for their blog, the above’s free of charge!)
So, when last we met, darcs-repo had just come into the world, and we were still choking on the cigar smoke. Following that there were a couple of discussion threads. Interesting mails include this one, so that you just ask for a repository rather than a “branch” of a “project”, and the program works out how that’s stored, or this one (and its followups) about naming a collection of related repositories an “archive”, and changing the name from darcs-repo to darcshive. This one (and followups from December) includes some (applied!) patches to darcs itself to let me get rid of the horrific ssh/scp hacks.
Where does that leave us? Pretty much at the point of moving from a prototype/proof-of-concept darcs-repo to a functional darcshive. It’s been essentially self-hosting from the beginning, but a more challenging task is hosting darcs itself – since it’s likely that darcs exercises most of the interesting features of the darcs repository format.
Thu, 14 Oct 2004
I think it’s reasonable to consider two sorts of “repository” when dealing with darcs – public repositories that are used to reflect a particular line of development, and private working directories that are used to actually do development. Unfortunately there’s some overlap here, pretty much taking the form of “copying your working directory around”.
The differences between the two main classes are nice and clear: for working directories you want as much control over what happens as you can get; and for public repositories you want consistency and accessibility. Which means working directories need to be local, while public repositories can be remote; public repositories need to be consistent and append-only, while working directories can be “unpulled”, “unrecorded” and “reverted” as often as you like.
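To pin down what I mean by “append-only”, here's a minimal sketch in Python. It assumes a repository can be summarised as a list of patch names, which is a simplification, and the is_append_only helper is hypothetical rather than anything darcs provides. A public repository passes the check if no patch it used to have has gone missing.

```python
def is_append_only(before, after):
    """True if every patch present in `before` is still present in `after`."""
    return set(before) <= set(after)


# Hypothetical patch names.
before = ["p1", "p2"]
after_push = ["p1", "p2", "p3"]   # someone pushed p3: fine for a public repository
after_unpull = ["p1"]             # p2 was unpulled: fine in a working directory only

public_ok = is_append_only(before, after_push)
public_broken = is_append_only(before, after_unpull)
```

Comparing sets rather than ordered lists is a deliberate choice in the sketch: only the disappearance of a patch counts as a violation, not a change in the order patches are listed.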
Now, darcs already handles working directories fine; but it’s arguably a bit too flexible as far as public repositories are concerned. We’ll just ignore the “in between” case and, presuming that one or the other extreme will be good enough in practice, work on adding some better support for public repositories.
(As an aside, I’m writing this entry concurrently with designing the actual code, kind-of a weird amalgam of blogging and literate programming. I wonder how it’ll work out.)
Sun, 03 Oct 2004
Continuing the darcs theme, it does seem to be fairly pleasant
to actually use. Having darcs record go through and prompt
you for each change (which you can avoid by saying -a)
makes for interesting habits – I’m finding I’m much more inclined to
commit once per feature addition, and when I happen to fix a bug while
implementing a feature I’m actually feeling encouraged to commit the
two changes separately. For similar reasons it seems like a good match
for refactoring, which encourages you to make a sequence of small,
independent, and trivially correct changes as you hack.
The ability to just copy the _darcs/ directory to another
code tree is pleasant – it really does make it feel like the code
you’re working on and the repository you’re working against are separate,
independent things which seems sensible and appropriate; and the ability
to make an unpacked source tarball suddenly be version controlled,
whether it’s had additional modifications or not, is definitely a feature.
On the downside, darcs and nvi don’t cooperate – apparently you have
to specifically tell nvi its IO is coming from /dev/tty for
it to not die. Oh well. vim, emacs and nano work; and the only reason to
use an editor at all is for long log messages, which I’ve only actually
wanted once so far – all my other changes have been granular enough
to be properly described with a single line. Interestingly, Martin’s
librsync darcs hacking seems to be similar, with only about 4%
of his changes having more than just the single line description.
I haven’t bothered with patch dependencies yet, which is arguably buggy on my behalf. Not sure how much I should care about that – presumably it’ll become obvious with more use.
Fri, 01 Oct 2004
After a little more looking at darcs, I think I’m willing to live with its flaws. I don’t think I mind the lack of a nice repository for long-term storage – I haven’t managed to grow to like any of the others I’ve seen (cvs, tla, subversion, aegis), anyway. Tarballs will do in the meantime, and not having to worry about a heavy-weight repository when I don’t want to is cool.
Not having support for metadata (timestamps, permissions, or ownership) does still concern me though, so I decided to have a poke at darcs’ internals to see if that can be fixed. That happens to mean I need to learn Haskell (which I’ve been meaning to do since 1997, admittedly), so maybe when Andrae continues his programming theory blogging I’ll actually be able to follow what he’s talking about. Scary.
Anyway, Haskell’s a nice language to express darcs in; pattern matching definitely pays off, and monads do seem to keep the code reasonably clear. It’s still pretty complicated: reading three thousand lines of code implementing something you don’t understand in a language you don’t know, with an extended form of a grammar you’ve mostly forgotten anyway doesn’t make for a walk in the park. In any case, I think I grok it enough to think a fix for the metadata issue is possible, and David Roundy (the darcs author) seems to largely agree. Cool. Going from possible to patched isn’t trivial though.
In the meantime, and given I’ve decided not to use darcs as a primary/permanent/public storage format (yay tarballs!), it seems like now’s a good time to check various things into darcs and see what happens. For regular programming it does seem like timestamps shouldn’t matter, and while not having execute bits might be annoying, I can certainly live without everything else.
Tue, 28 Sep 2004
I’ve been coming to realise that I’m not really as satisfied with arch as I’d like to be, in spite of being an ardent fanboy for a while now. My main requirement for software is that it be simple and stay out of my way; and while arch is fairly simple, it’s evidently proven not simple enough for me to actually use it regularly. A brief chat with Greg Black on the topic at HUMBUG over the weekend finally made me decide to look into this again; so inspired by Martin’s occasional raves, I decided to have a poke at darcs.
darcs is yet another reinvention of revision control. Its key change is that it doesn’t bother with a repository per se at all – instead it’s a tool to manage a revision history for a single source tree. It doesn’t even do version numbering for you, let alone branching. What it does do is let you manipulate your history in the form of individual patches, and in particular lets you copy patches from one directory to another. Impressively, this turns out to be all you need.
The cool thing about this is it makes for really lightweight
version control – since there’s no external repository at all, you don’t
have to worry about any setup or how you might affect anyone or anything
else. You can just get a source tree (from darcs, from a tarball, or from
CVS), run darcs init; darcs add -r . and start working. If
you decide you were wasting your time, you can just rm -rf
the directory, and there’s nothing else to worry about cleaning up. That’s
pretty sweet. And since it ignores the issue of versions and branches, you
can set them up to work however you like – branches are just a matter of
making a new directory somewhere on the filesystem, and version numbers
are just a string you use when tagging a version.
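To convince myself that copying patches around really is enough, here's a toy model in Python. It's a sketch only: the Tree class and its methods are invented for illustration and say nothing about how darcs stores or commutes patches. A tree is a list of patch names, a branch is a copy of the tree, a version is a string attached to the current patch set, and pulling copies across whatever patches are missing.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Toy stand-in for a source tree that carries its own patch history."""
    patches: list = field(default_factory=list)  # patch names, in recorded order
    tags: dict = field(default_factory=dict)     # version string -> patches it names

    def record(self, name):
        self.patches.append(name)

    def tag(self, version):
        # A "version number" is just a string naming the current set of patches.
        self.tags[version] = frozenset(self.patches)

    def branch(self):
        # A branch is just a copy of the tree somewhere else.
        return Tree(list(self.patches), dict(self.tags))

    def pull(self, other):
        # Copy over whatever patches the other tree has that this one lacks.
        missing = [p for p in other.patches if p not in self.patches]
        self.patches.extend(missing)
        return missing


main = Tree()
main.record("initial import")
main.tag("0.1")

feature = main.branch()              # "making a new directory somewhere"
feature.record("add frobnicator")
main.record("fix typo in README")

pulled = main.pull(feature)          # copies only the patch main was missing
```

After the pull, main has all three patches while feature still has only its own two, and the “0.1” tag keeps naming just the initial import.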
There’s a few downsides too, of course. Obviously there’s no pre-made
“repository” that’ll store all your branches and handle permissions and
keep them safe from an accidental rm -rf for you. If you
want a version control system to stop your developers from screwing
up and losing stuff, that’s a big loss. Making a repository out of a
bunch of darcs working directories is possible, but not terribly space
efficient, and you have to control access to it yourself (though using
ssh and sudo is directly supported). darcs isn’t terribly fast, either,
if you’re dealing with lots of changesets, aiui. Also annoying is that
darcs doesn’t cope with preserving file metadata, as far as I can see –
so timestamps, permissions and ownerships aren’t in the revision control,
though they are kind-of preserved. On the other hand, the only annoying
filenaming is the _darcs directory where all the darcs
information goes – and it’s only one directory per project/branch which
is less annoying than CVS dirs everywhere, and an underscore’s nowhere
near as obnoxious as curly braces and leading plusses. I’m presuming
it’s an underscore instead of a dot for better Windows compatibility.
