Fri, 10 Oct 2008
Thu, 09 Oct 2008
Lunch for today. Recipe:
- Lamb forequarter chop (200g). Cook on frypan ‘til medium, sprinkle with garlic salt to taste.
- One mandarin, peeled, broken, scattered.
- One cavendish banana, peeled, sliced, scattered.
- Blueberries.
- Strawberries.
Total time to cook, consume, clean, and blog: about an hour. Total cost, about $5. Missing were mango, cheese and beer. Oh well, maybe next time.
Sun, 28 Sep 2008
As part of trying to convince myself to commit to my next project I spent some time wondering just what it might mean if the project really is beyond my abilities. At what point does it make sense to say “don’t even try in the first place?” or “okay, you’ve given it a fair shot, don’t throw good money/time/whatever after bad” or “pshaw, don’t be a quitter”?
Most of the time that’s not terribly hard – you work out if it’s within your abilities, and if it’s not you don’t bother, and if it is, you do it, figure out if you made any mistakes and fix them, and take a bow. But when you get to the very edge of what you’re capable of, just fixing your mistakes can be complicated enough that you can make mistakes there too, and if you make more mistakes fixing those…
Sat, 27 Sep 2008
So it’s been a bit over five years of indolence, time for a change. The original title for this blog was in homage to Andrew’s incoherence log, and found by trolling for interesting sounding words beginning with “in-”. Not seeing any reason to change a winning formula, I tried the same again, and came across the lovely word “inamorata” from the Italian “innamorare”, meaning “to inspire with love”. Hence, “inamerrata”, which clearly means “random stuff I find cool”.
Sun, 03 Aug 2008
Faith is an interesting concept to try to disentangle from religion.
faith
n 1: a strong belief in a supernatural power or powers that
control human destiny; "he lost his faith but not his
morality" [syn: {religion}, {faith}, {religious belief}]
2: complete confidence in a person or plan etc; "he cherished
the faith of a good woman"; "the doctor-patient relationship
is based on trust" [syn: {faith}, {trust}]
3: an institution to express belief in a divine power; "he was
raised in the Baptist religion"; "a member of his own faith
contradicted him" [syn: {religion}, {faith}, {organized
religion}]
4: loyalty or allegiance to a cause or a person; "keep the
faith"; "they broke faith with their investors"
Without religion and supernatural powers, you immediately lose the first and third definitions, and if you’re just left with “complete confidence” and “loyalty”, it seems like you’re missing out on a lot.
Sat, 02 Aug 2008
I’ve been a fan of chaos theory and emergent order for a long time now – the idea that simple rules repeatedly applied build complex and creative results is just beautiful to me, and seeing the same principles apparently apply in different areas – from evolutionary theory to the Wisdom of Crowds – for equally astounding results is outright awe inspiring. Ultimately, in various ways, it hits all my buttons – complicated maths, relevance to lots of different things, egalitarianism and making cool things happen; and as a result of all that, these days it’s a fundamental part of my belief set.
As should be the case with all good beliefs, that’s been challenged lately in a variety of ways. Debian, for example, has had a real mean streak for a while that just doesn’t mesh with my sense of the sort of good result that should spontaneously appear from a bunch of generous people working together. Likewise, Wikipedia seems to lately be suffering increasingly from something like censorship by bureaucracy rather than the wild free-for-all that somehow produces something better than the best experts. You could pretty easily call those errors of taste – maybe they’re good results, and I just happen not to like them – but I don’t think it’s a big call to think otherwise, and it’s certainly not a big call for me to prefer results I happen to like. And, of course, you could just call it an over-generalisation: maybe random anarchy is good to a point, but only to a point.
Tue, 17 Jun 2008
(Alternative title: I aten’t dead)
My name’s Anthony and, like a lot of other people, I suffer from depression. It’s not something I like to talk about – my normal philosophy is just to filter out that part of my life and just talk and think about the interesting and fun parts of life. The downside is that when it gets particularly bad, everything gets filtered out and I more or less just vanish.
Fri, 18 Apr 2008
One of the freedoms I value is the freedom to choose what you spend your time on and who you spend it with. And while I’ve spent a lot of time arguing that people in key roles in Debian still have those freedoms (hey, 2.1(1), don’t you know), reality these days seems to be otherwise. But hey, solving that quandary just requires a mail to DSA.
To folks on the core teams I’ve been involved with: it’s been a pleasure and an honour working with you; if not always, at least mostly. Best of luck, and I hope y’all accept patches.
Tue, 15 Apr 2008
Last month we had a brief discussion on debian-devel about what images would be good to have for lenny – we’re apparently up to about 30 CDs or 4 DVDs per architecture, which over 12 architectures adds to about 430GB in total. That’s a lot, given it’s only one release, and meanwhile the entire Debian archive is only 324GB.
The obvious way to avoid that is to make use of jigdo – which lets you recreate an iso from a small template and the existing Debian mirror network. I’ve personally never used jigdo much, half because I don’t usually use isos anyway, but also because the few times I have tried jigdo it always seemed really unnecessarily slow. So the other day I tried writing my own jigdo download tool focussed on making sure it was as fast as possible.
The official jigdo download tool, ttbomk, is jigdo-lite – which you give a .jigdo file, and the url of a local mirror. It then downloads the first ten files using wget, and once they’re all downloaded, it calls jigdo-file to get them merged into the output image. This gets repeated until all the files have been downloaded.
By doing the download in sequence like this, you miss out on using your full network connection in two ways: one during the connection setup latency when starting to download the next package, and also while jigdo-lite stops downloading to run jigdo-file. And if you’ve got a fast download link, but a slower CPU or disk, you can also find yourself constrained in that you’re maxing those out while running jigdo-file, but leaving them more or less idle while downloading.
To avoid this, you want to do multiple things at once: most importantly, to be writing data to the image at the same time as you’re downloading more data. With jigdodl (the name I’ve given to my little program), I went a little bit overboard, and made it not only do that, but also manage four downloads and the decompression of the raw data from the template. That’s partly due to not being entirely sure what needed to be done to get a speedy jigdo program, and partly because the communicate module I’d just written to deal with this sort of parallelism made that somewhat natural.
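As a rough illustration of the "several downloads at once, write whatever arrives first" idea, here's a minimal Python 3 sketch. It is not jigdodl's code: `fetch` is a hypothetical stand-in for a real network download, and the parts, offsets and worker count are made up for the example.

```python
import io
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(part):
    """Hypothetical stand-in for downloading one file from a mirror."""
    offset, payload = part
    return offset, payload


def assemble(parts, image_size, workers=4):
    """Run up to `workers` fetches at once and write each result into the
    image at its offset as soon as it is ready, in whatever order it arrives."""
    image = io.BytesIO(bytes(image_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch, part) for part in parts]
        for future in as_completed(futures):
            offset, data = future.result()
            image.seek(offset)
            image.write(data)
    return image.getvalue()


parts = [(0, b"abc"), (3, b"def"), (6, b"gh")]
image = assemble(parts, 8)
```

The writer never waits for a whole batch to finish, which is the main difference from the download-ten-then-merge loop described above.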
In the end, it works: from wireless over ADSL to my ISP’s Debian mirror, I get the following output:
Jigsaw download:
Filename: debian-40r3-amd64-CD-1.iso
Length: 675477504
MD5sum: d3924cdaceeb6a3706a6e2136e5cfab2
Total: 679 s; d/l: 586 MB at 883 kB/s; dump: 57 MB at 57 MB/s
Finished!
which is only slightly short of maxing out my downstream bandwidth, taking a total of about 11m20s. Running jigdodl with a closer mirror works pretty well too, though evidently some of my more recent changes weren’t so great, because I’ve gone from 9153 kB/s on a 100 Mbps link down to 7131 kB/s or lower. The CPU usage also seems a bit high, hovering at between five and ten percent at 900 kB/s.
For comparison, running jigdo-lite on the same file took 17m41s, which is about 566 kB/s, with the overhead being about 6m20s. What that means is if I doubled my bandwidth to about 20Mbps, jigdodl would halve its time for the download to about 5m50s, while jigdo-lite would still have about the same non-download overhead, and thus take 12m10s, which is still 69% of its original time. Going from 10Mbps ADSL speed to 100Mbps LAN gets jigdodl down to 1m31s (13% of the time, with optimal being 10%), while jigdo-lite would be expected to still be about 7m51s (43% of its original time).
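The estimate above boils down to "jigdo-lite takes however long jigdodl takes, plus a fixed overhead". A small Python 3 sketch of that arithmetic, using only the timings quoted in the text (the function names are just for illustration):

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


JIGDO_LITE_MEASURED = to_seconds(17, 41)  # measured jigdo-lite run
OVERHEAD = to_seconds(6, 20)              # non-download overhead from the text


def jigdo_lite_estimate(jigdodl_seconds):
    """Assume jigdo-lite pays the same download time as jigdodl plus a
    fixed overhead that doesn't shrink with bandwidth."""
    return jigdodl_seconds + OVERHEAD


def fraction_of_original(estimate_seconds):
    """Estimated jigdo-lite time as a fraction of the measured 17m41s run."""
    return estimate_seconds / JIGDO_LITE_MEASURED


doubled = jigdo_lite_estimate(to_seconds(5, 50))  # jigdodl at ~20Mbps
lan = jigdo_lite_estimate(to_seconds(1, 31))      # jigdodl on a 100Mbps LAN
print(doubled, round(100 * fraction_of_original(doubled)))
print(lan, round(100 * fraction_of_original(lan)))
```

The fixed overhead is the whole story: once the download itself gets cheap, it dominates the sequential tool's running time.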
I suspect the next thing to do is to rewrite the downloading code to use python-curl instead of running curl, so that it can download multiple files over a single connection, and to tweak the code so that it writes the file in order, rather than updating whichever parts are ready first.
Anyway, debs are available for anyone who wants to try it out, along with source in the new git source package format.
In a couple of days, DPL-elect Steve McIntyre takes over as DPL, after being elected by around four hundred of his peers… Because I can’t help myself, I thought I might poke at election numbers and see if anything interesting fell out.
First the basics: I get the same results as the official ones when recounting the vote. Using first-past-the-post, Steve wins with 147 first preference votes against Raphael’s 124, Marc’s 90 and NOTA’s 19 (with votes that specify a tie for first dropped). Using instant-runoff / single transferable vote, the winner is also Steve, with NOTA eliminated first and Marc collecting 5 votes, Steve 4 and Raphael 2, followed by Marc getting eliminated with Steve collecting 50 votes, against Raphael’s 26.
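For anyone unfamiliar with the two counting methods, here's a minimal Python 3 sketch of first-past-the-post and instant-runoff. The ballots are invented for illustration (they are not the election data), ties for first are not handled, and the tie-break on elimination is an arbitrary assumption.

```python
from collections import Counter


def first_past_the_post(ballots):
    """Winner is whoever has the most first preferences."""
    return Counter(b[0] for b in ballots).most_common(1)[0][0]


def instant_runoff(ballots):
    """Repeatedly eliminate the candidate with the fewest top preferences
    until someone holds a majority of the ballots still counting."""
    remaining = {c for ballot in ballots for c in ballot}
    while True:
        tally = Counter()
        for ballot in ballots:
            for candidate in ballot:
                if candidate in remaining:
                    tally[candidate] += 1
                    break
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > sum(tally.values()) or len(remaining) == 1:
            return leader
        # Arbitrary tie-break: fewest votes, then alphabetical order.
        loser = min(remaining, key=lambda c: (tally[c], c))
        remaining.remove(loser)


# Hypothetical ballots, most preferred first.
ballots = [("A", "B", "C")] * 4 + [("B", "C", "A")] * 3 + [("C", "B", "A")] * 2
fptp_winner = first_past_the_post(ballots)
irv_winner = instant_runoff(ballots)
```

With these made-up ballots the two methods disagree, which is exactly the sort of thing the recount is checking for.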
So, as usual, different voting systems would have given the same result, presuming people voted in basically the same way.
NOTA really didn’t fare well at all in this election, with a majority of voters ranking it beneath all candidates (268 of 401, 53.5%). For comparison, only 18 voters ranked all candidates beneath NOTA, with 9 of those voters then ranking all candidates equally. (For comparison, in 2007, 312 of 482 voters (about 65%) ranked some candidate below NOTA, though that drops to 225 voters (47%) if you ignore voters that just left some candidates unranked. Only 98 voters (20%) voted every candidate above NOTA)
With NOTA excluded from consideration, things simplify considerably, with only 13 possible different votes remaining. Those come in four categories: ranking everyone equal (17 votes, 9 below NOTA as mentioned above, and 8 above NOTA), ranking one candidate below the others (13 votes total, 7 ranking Raphael last, 3 each for Steve and Marc), ranking one candidate above the others (66 votes; 30 ranking Steve first, 18 each ranking Raphael and Marc first), and the remainder with full preferences between the candidates:
70 V: 213
63 V: 123
56 V: 132
52 V: 231
38 V: 312
26 V: 321
The most interesting aspect of that I can see is that of the people who ranked Raphael first, there was a 1.8:1 split in preferring Steve to Marc, and for those who preferred Marc first, there was a 2:1 split preferring Steve to Raphael. For those who preferred Steve, there was only a 1.1:1 split favouring Raphael over Marc.
I think it’s fair to infer from that that not only was Steve the preferred candidate overall, but that he’s considered a good compromise candidate for supporters of both the alternative candidates (though if all the people who ended up supporting Steve hadn’t been voting, Raphael would have won by something like 26 votes (129:103) with a 1.25:1 majority; if they had been voting, but Steve hadn’t been a candidate, Raphael’s margin would’ve increased absolutely to 33 votes (192:159) but decreased in ratio to a 1.2:1 majority).
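The splits and head-to-head figures above can be recomputed from the tallies with a few lines of Python 3. One assumption to flag loudly: the three digits of each vote are read here as the ranks given to Steve, Raphael and Marc in that order. That ordering isn't stated in the text; it's simply the reading that reproduces the quoted ratios. The partial-ballot counts (18, 18, 7, 3) come from the categories listed earlier.

```python
NAMES = ("Steve", "Raphael", "Marc")  # assumed digit order, see above
FULL = {"213": 70, "123": 63, "132": 56, "231": 52, "312": 38, "321": 26}


def second_choice_split(first):
    """Among full-preference ballots ranking `first` top, count second choices."""
    idx = NAMES.index(first)
    split = {}
    for ranks, count in FULL.items():
        if ranks[idx] == "1":
            second = NAMES[ranks.index("2")]
            split[second] = split.get(second, 0) + count
    return split


def raphael_vs_marc(include_steve_first=True):
    """Head-to-head Raphael:Marc count, optionally dropping Steve-first ballots."""
    raphael = 18 + 3  # ranked Raphael alone first, or Marc alone last
    marc = 18 + 7     # ranked Marc alone first, or Raphael alone last
    for ranks, count in FULL.items():
        if ranks[0] == "1" and not include_steve_first:
            continue
        if ranks[1] < ranks[2]:
            raphael += count
        else:
            marc += count
    return raphael, marc


for name in NAMES:
    print(name, second_choice_split(name))
print(raphael_vs_marc(include_steve_first=False), raphael_vs_marc())
```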
Thu, 10 Apr 2008
One of the loveliest things about Unix is the select() function (or its replacement, poll()), and the way it lets a single thread handle a host of concurrent tasks efficiently by just using file descriptors as work queues.
Unfortunately, it can be a nuisance to use – you end up having to structure your program as a state machine around the select() invocation, rather than the actual procedure you want to have happen. You can avoid that by not using select() and instead just having a separate thread/process for every task you want to do – but that creates a bunch of tedious overhead for the OS (and admin) to worry about.
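To make the "state machine around select()" shape concrete, here's a minimal Python 3 sketch of one thread draining two pipes. It assumes a Unix-like system (select() on pipes doesn't work on Windows), and the `drain` helper is purely illustrative.

```python
import os
import select


def drain(read_fds):
    """Read from several file descriptors in one thread until all hit EOF."""
    chunks = {fd: b"" for fd in read_fds}
    open_fds = list(read_fds)
    while open_fds:
        # Block until at least one descriptor has data (or EOF) for us.
        ready, _, _ = select.select(open_fds, [], [])
        for fd in ready:
            data = os.read(fd, 4096)
            if data:
                chunks[fd] += data
            else:
                open_fds.remove(fd)  # EOF on this descriptor
    return chunks


r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.write(w1, b"one")
os.close(w1)
os.write(w2, b"two")
os.close(w2)
result = drain([r1, r2])
os.close(r1)
os.close(r2)
```

Even in this tiny case the logic is organised around the select() loop rather than around "read this, then read that", which is the nuisance being described.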
But magically making state machines is what Python’s generators are all about; so for my little pet project that involves forking a bunch of subprocesses to do the interesting computational work my python program wants done, I thought I’d see if I could use that to make my code more obvious.
What I want to achieve is to have a bunch of subprocesses accepting some setup data, then a bunch of two byte ids, terminated by two bytes of 0xFF, and for each of the two byte inputs to output a line of text giving the calculation result. For the time being at least, I want the IO to be asynchronous: so I’ll give it as many inputs as I can, rather than waiting for the result before sending the next input.
So basically, I want to write something like:
def send_inputs(f, s, n):
    f.write(s) # write setup data
    for i in xrange(n):
        f.write(struct.pack("!H", i))
    f.write(struct.pack("!H", 0xFFFF))

def read_output(f):
    for line in f:
        if is_interesting(line):
            print line
Except of course, that doesn’t work directly because writing some data or reading a line can block, and when it does, I want it to be doing something else (reading instead of writing or vice-versa, or paying attention to another process).
Generators are the way to do that in Python, with the “yield” keyword passing control flow and some information back somewhere else, so adopting the theory that: (a) I’ll only resume from a “yield” when it’s okay to write some more data, (b) if I “yield None” there’s probably no point coming back to me unless you’ve got some more data for me to read, and (c) I’ll provide a single parameter which is an iterator that will give me input when it’s available and None when it’s not, I can code the above as:
def send_inputs(_): # s, n declared in enclosing scope
    yield s
    for i in xrange(n):
        yield struct.pack("!H", i)
    yield struct.pack("!H", 0xFFFF)

def read_output(f):
    for line in f:
        if line is None: yield None; continue
        if is_interesting(line):
            print line
There’s a few complications there. For one, I could be yielding more data than can actually be written, so I might want to buffer there to avoid blocking. (I haven’t bothered; just as I haven’t worried about “print” possibly blocking) Likewise, I might only receive part of a line, or I might receive more than one line at once, and afaics a buffer there is unavoidable. If I were doing fixed size reads (instead of line at a time), that might be different.
So far, the above seems pretty pleasant to me – those functions describe what I want to have happen in a nice procedural manner (almost as if they had a thread all to themselves) with the only extra bit the “None, None, continue” line, which I’m willing to accept in order not to use threads.
Making that actually function does need a little grunging around, but happily we can hide that away in a module – so my API looks like:
p = subprocess.Popen(["./helper"], stdin=PIPE, stdout=PIPE, close_fds=True)
comm = communicate.Communication()
comm.add(send_inputs, p.stdin, None)
comm.add(read_output, None, p.stdout, communicate.ByLine())
comm.communicate()
The comm.add() function takes a generator function, an output fd (ie, the subprocess’s stdin), an input fd (the subprocess’s output), and an (optional) iterator. The generator gets created when communication starts, with the iterator passed as the argument. The iterator needs to have an “add” function (which gets given the bytes received), a “waiting” function, which returns True or False depending on whether it can provide any more input for the generator, and a “finish” function that gets called once EOF is hit on the input. (Actually, it doesn’t strictly need to be an iterator, though it’s convenient for the generator if it is)
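A rough sketch of what a line-splitting iterator with that interface might look like. This is not the real communicate.ByLine, whose source isn't shown here: the class name `LineBuffer` and its internals are assumptions, and it's written for Python 3 (with `str` data) rather than the Python 2 of the surrounding snippets.

```python
class LineBuffer:
    """Hypothetical iterator with the add/waiting/finish interface."""

    def __init__(self):
        self._partial = ""   # trailing data not yet terminated by a newline
        self._lines = []     # complete lines ready to hand out
        self._eof = False

    def add(self, data):
        """Called with whatever was just read from the file."""
        self._partial += data
        *complete, self._partial = self._partial.split("\n")
        self._lines.extend(complete)

    def finish(self):
        """Called once EOF is hit; flush any unterminated last line."""
        self._eof = True
        if self._partial:
            self._lines.append(self._partial)
            self._partial = ""

    def waiting(self):
        """True when there is nothing to hand out until more data arrives."""
        return not self._lines and not self._eof

    def __iter__(self):
        return self

    def __next__(self):
        if self._lines:
            return self._lines.pop(0)
        if self._eof:
            raise StopIteration
        return None  # no input yet: the generator should "yield None"
```

Returning None instead of blocking is what lets a `for line in f` loop hand control back to the select() loop when there's nothing to read yet.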
The generator functions once “executed” return an object with a next() method that’ll run the function you defined until the next “yield” (in which case next() will return the value yielded), or a “return” is hit (in which case the StopIteration exception is raised).
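A tiny demonstration of that behaviour, in Python 3, where the method is spelled `next(gen)` (calling `gen.__next__()`) rather than the Python 2 `gen.next()` used in the rest of this post. The `countdown` generator is just an illustrative example.

```python
def countdown(n):
    while n > 0:
        yield n   # control returns to the caller here
        n -= 1
    return        # falling off the end raises StopIteration in the caller


gen = countdown(2)   # nothing runs yet; we just get a generator object
first = next(gen)    # runs until the first yield
second = next(gen)   # resumes after the yield, runs to the next one
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True
```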
So what we then want to do to have this all work then, is this: (a) do a select() on all the files we’ve been given; (b) for the ones we can read from, read them and add() to the corresponding iterators; (c) for the generators that don’t have an output file, or whose output file we can write to, invoke next() until either: they raise StopIteration, they yield a value for us to output, or they yield None and their iterator reports that it’s waiting. Add in some code to ensure that reads from the file descriptors don’t block, and you get:
def communicate(self):
    readable, writable = [], []
    for g,o,i,iter in self.coroutines:
        if i is not None:
            fcntl.fcntl(i, fcntl.F_SETFL,
                        fcntl.fcntl(i, fcntl.F_GETFL) | os.O_NONBLOCK)
            readable.append(i)
        if o is not None:
            writable.append(o)

    while readable != [] or writable != []:
        read, write, exc = select.select(readable, writable, [])
        for g,o,i,iter in self.coroutines:
            if i in read:
                x = i.read()
                if x == "": # eof
                    iter.finish()
                    readable.remove(i)
                else:
                    iter.add(x)

            if o is None or o in write:
                x = None
                try:
                    while x is None and not iter.waiting():
                        x = g.next()
                    if x is not None:
                        o.write(x)
                except StopIteration:
                    if o is not None:
                        writable.remove(o)
    return
You can break it by: (a) yielding more than you can write without blocking (it’ll block rather than buffer, and you might get a deadlock), (b) yielding a value from a generator that doesn’t have a file associated with it (None.write(x) won’t work), (c) having generators that don’t actually yield, and (d) probably some other ways. And it would’ve been nice if I could have somehow moved the “yield None” into the iterator so that it was implicit in the “for line in f”, rather than explicit.
But even so, I quite like it.
Sat, 22 Mar 2008
One of the challenges maintaining the Debian archive kit (dak) is dealing with Debian-specific requirements: fundamentally because there are a lot of them, and they can get quite hairy – yet at the same time, you want to keep them as separate as possible both so dak can be used elsewhere, and just so you can keep your head around what’s going on. You can always add in hooks, but that tends to make the code even harder to understand, and it doesn’t really achieve much if you hadn’t already added the hook.
However, dak’s coded in python, and being an interpreted language
with lots of support for introspection, that more or less means there’s
already hooks in place just about everywhere. For example, if you don’t
like the way some function in some other module/class works, you can
always change it (other_module.function = my_better_function).
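A minimal Python 3 sketch of that kind of replacement. The `other_module` object here is a throwaway stand-in created with `types.ModuleType`, not anything from dak.

```python
import types

# Hypothetical stand-in for some module we don't control.
other_module = types.ModuleType("other_module")


def function():
    return "original"


other_module.function = function


def caller():
    # Looks the function up on the module at call time, so it sees any
    # replacement made later.
    return other_module.function()


def my_better_function():
    return "better"


before = caller()
other_module.function = my_better_function
after = caller()
```

The catch is that code which did `from other_module import function` keeps its own reference and won't see the replacement, which is part of why some care is needed.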
Thus, with some care and a bit of behind the scenes kludging, you can have python load a module from a file specified in dak.conf that can both override functions/variables in existing modules, and be called directly from other modules where you’ve already decided a configurable hook would be a good idea.
So, at the moment, as a pretty simple example there’s an
init() hook invoked from the main dak.py
script, which simply says if userext.init is not None:
userext.init(cmdname).
But more nifty is the ability to replace functions, simply by writing something like:
# Replace process_unchecked.py's check_signed_by_key
@replace_dak_function("process-unchecked", "check_signed_by_key")
def check_signed_by_key(old_chk_key):
    changes = dak_module.changes
    reject = dak_module.reject
    ...
    old_chk_key()
That’s made possible mostly by the magic of python decorators – the
little @-sign basically passes the new check_signed_by_key
function to replace_dak_function (or, more accurately,
the function replace_dak_function(...) returns), which does
the dirty work replacing the function in the real module. To be just a
little bit cleverer, it doesn’t replace it with the function we define,
but its own function which simply invokes our function with an additional
argument to whatever the caller supplied, so we can invoke the original
function if we choose (the old_chk_key parameter – the
original function takes no arguments, so our function only takes one).
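Here's a sketch of how a decorator along those lines could work, in Python 3. It is not dak's actual replace_dak_function: the `dak_modules` registry, the stand-in module and the return strings are all assumptions made so the example runs on its own.

```python
import types

# Hypothetical registry mapping dak command names to their loaded modules.
dak_modules = {"process-unchecked": types.ModuleType("process_unchecked")}


def _original_check():
    return "original check"


dak_modules["process-unchecked"].check_signed_by_key = _original_check


def replace_dak_function(module_name, name):
    """Return a decorator that installs a wrapper in the named module; the
    wrapper passes the original function as an extra first argument."""
    def decorator(new_func):
        module = dak_modules[module_name]
        old_func = getattr(module, name)

        def wrapper(*args, **kwargs):
            return new_func(old_func, *args, **kwargs)

        setattr(module, name, wrapper)
        return new_func
    return decorator


@replace_dak_function("process-unchecked", "check_signed_by_key")
def check_signed_by_key(old_chk_key):
    # Do any extra site-specific checks here, then defer to the original.
    return "extra check, then " + old_chk_key()


result = dak_modules["process-unchecked"].check_signed_by_key()
```

Callers inside the module keep calling the function with no arguments, and the override still gets a handle on the code it replaced.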
Right now, we don’t do much interesting with it; but that should change once Ganneff’s little patch is finished, which should be RSN…
Hopefully, this might start making it easier to keep dak maintained in a way that’s useful for non-Debian installs – particularly if we can get to the point where hacking it for Debian generally just implies changing configuration and extension stuff – then we can treat updating all the real scripts as a regular software upgrade, just like it is outside Debian.
Fri, 07 Mar 2008
Continuing from where we left off…
The lower bound for me becoming a DD was 8th Feb ‘98 when I applied; for comparison, the upper bound as best I can make out was 23rd Feb, when I would have received this mail through the debian-private list:
Resent-Date: 23 Feb 1998 18:18:57 -0000
From: Martin Schulze
To: Debian Private
Subject: New accepted maintainers
Hi folks,
I wish you a pleasant beginning of the week. Here are the first good
news of the week (probably).
This is the weekly progress report about new-maintainers. These
people have been accepted as new maintainer for Debian GNU/Linux
within the last week.
[...]
Anthony Towns <ajt@debian.org>
Anthony is going to package the personal proxy from
distributed.net - we don't have the source... He may adopt the
transproxy package, too.
Regards,
Joey
I never did adopt transproxy – apparently Adam Heath started fixing bugs in it a few days later anyway, and it was later taken over by Bernd Eckenfels (ifconfig upstream!) who’s maintained it ever since. Obviously I did do other things instead, which brings us back to where we left off…
Sun, 02 Mar 2008
So, sometime over the past few weeks I clocked up ten years as a Debian developer:
From: Anthony Towns <aj@humbug.org.au>
Subject: Wannabe maintainer.
Date: Sun, 8 Feb 1998 18:35:28 +1000 (EST)
To: new-maintainer@debian.org
Hello world,
I'd like to become a debian maintainer.
I'd like an account on master, and for it to be subscribed to the
debian-private list.
My preferred login on master would have been aj, but as that's taken
ajt or atowns would be great.
I've run a debian system at home for half a year, and a system at work
for about two months. I've run Linux for two and a half years at home,
two years at work. I've been active in my local linux users' group for
just over a year. I've written a few programs, and am part way through
packaging the distributed.net personal proxy for Debian (pending
approval for non-free distribution from distributed.net).
I've read the Debian Social Contract.
My PGP public key is attached, and also available as
<http://azure.humbug.org.au/~aj/aj_key.asc>.
If there's anything more you need to know, please email me.
Thanks in advance.
Cheers,
aj
--
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. PGP encrypted mail preferred.
On Netscape GPLing their browser: ``How can you trust a browser that
ANYONE can hack? For the secure choice, choose Microsoft.''
-- <oryx@pobox.com> in a comment on slashdot.org
Apparently that also means I’ve clocked up ten and a half years as a Debian user; I think my previous two years of Linux (mid-95 to mid-97) were split between Slackware and Red Hat, though I couldn’t say for sure at this point.
There’s already been a few other grand ten-year reviews, such as Joey Hess’s twenty-part serial, or LWN’s week-by-week review, or ONLamp’s interview with Bruce Perens, Eric Raymond and Michael Tiemann on ten years of “open source”. I don’t think I’m going to try matching that sort of depth though, so here are some of my highlights (after the break).
Wed, 16 Jan 2008
Oh yay, another argument about sexism. I thought we were over this. Aigars writes:
Trying to restrict what words people can or can not use (by labeling them sexist, racist or obscene) is the bread and butter of modern day media censorship. It is censorship and not “just political correctness”. While I would not want people trying to limit contributions to Debian only to “smart and educated white people” (racism) or “logically thinking males” (sexism), going the other way and excluding people from Debian because their remarks or way of thinking might offend someone is just offensive to me.
One: it’s not censorship for Debian to limit discussion of various things on Debian channels. It’s censorship when you prevent discussion of something anywhere.
Two: if you think excluding people is bad, then supporting jerks whose misogyny repulses people isn’t compatible with that.
Three: doing something in someone else’s name, that isn’t supported by them, is wrong. If you’re in a channel called “debian-something” don’t act in ways that don’t match Debian’s goals. If you want to be free to go against the principles of the DFSG by, eg, discriminating against people or groups, make up your own name for a channel.
Sat, 12 Jan 2008
Wow. Pretty.
Fri, 11 Jan 2008
An article by Sam Varghese appeared on ITwire today, entitled linux.conf.au: What is Novell doing here?:
A GNU/Linux system does not normally load modules that are not released under an approved licence. So why should Australia’s national Linux conference take on board a sponsor who engages in practices that are at odds with the community?
What am I talking about? A company which should not be in the picture has poked its nose in as a sponsor. Novell, which indicated the level of its commitment to FOSS by signing a deal with Microsoft in November 2006, will be one of the supporting sponsors for the conference.
Novell was also a minor sponsor of the 2007 conference, and Sam wrote an article in January expressing similar thoughts, which included this quote from Bruce Perens:
“I’d rather they hadn’t accepted a Novell sponsorship. It wasn’t very clueful of them, given Novell’s recent collaboration with Microsoft in spreading fear and doubt about Linux and software patents,” Perens said.
Ultimately, I think that’s a mistaken view. Linux.conf.au is what it is thanks to the contributions of four groups:
- the organisers, who create the conference, get a venue, organise a schedule of events, help the speakers and attendees to get there, and generally make it easy for everyone to just get immersed in cool Linux stuff
- the speakers, who provide the core of the schedule, the reason for attendees to go, and a core depth of awesome technical knowledge and ideas
- the attendees, who fill in the organisational/content gaps that the organisers and speakers miss, who make for fascinating corridor and dinner conversations, who make side events like the miniconfs, the hackfest or open day interesting, and who pay the rego fees that let the conference happen
- the sponsors, who provide a chunk of money to fill out the conference budget letting us commit to venues and events earlier (when we might otherwise have to wait to see how many people come), and let us do extra things that registration fees alone wouldn’t cover
Obviously sometimes you have to exclude people from participating, but that’s mostly only if they’re actually causing trouble for the event. For sponsors, that pretty much means trying to interfere in the conference itself, or not paying on time. Otherwise, if you’re contributing to the conference, and not causing problems, you certainly should be recognised for that, as far as I can see.
For me, the same thing would apply if Microsoft was offering to sponsor the conference – if they’re willing to contribute, and not cause problems, I’m all for it. If they happen to not be doing anything constructive in Linux-space anywhere else, well, it seems perfectly fine to me to start contributing by helping make linux.conf.au awesome.
In Microsoft’s case that would be hard, because all the people going “oh my gosh, Microsoft, Linux! Wolves, sheeps! Hell, snow!” along with possible mixed messages from Microsoft and our long-term major sponsors HP and IBM about the future of Linux and whatnot could really distract us from all the cool technical stuff the conference is fundamentally about. I don’t think there’s anything Microsoft could offer to justify that much disruption, but having more of the world’s software companies involved in free software would probably be worth a bit of hassle, if the disruption could be minimised.
Ultimately, I guess my disagreement comes down to these couple of comments from Sam’s article:
Asked whether it was right that Novell should be allowed to be a sponsor for a conference such as this - which, in my view, is a privilege - […]
[…] Novell, obviously, is hoping that, as public memory is woefully short, it will be able to wriggle its way back into the community. Providing such leeway is, in my opinion, a big mistake.
In my opinion, the ability to contribute to open source isn’t a privilege, it’s something that should be open to everyone, including people who’ve made mistakes in the past: and that’s precisely what the “free” in free software is all about.
OTOH, if you want to see who’s been participating most in the Linux world lately, you’re much better off looking at the list of speakers than sponsors. Novell (or at least SuSE) folks giving talks in the main conference this year seem to include John Johansen and Nick Piggin. Interestingly, the count of HP folks seems a bit low this year, with only two that I can see, which leaves them not only merely equalling Novell/SuSE, but beaten by both Intel and Catalyst. Tsk! I guess we’ll have to wait and see if that changes when we can see the list of attendees’ companies in the booklet this year…
Tue, 08 Jan 2008
With the whole incipient git obsession I’ve been cleaning out some of my scratch dirs. In one, last touched in mid-2006, I found:
Oh. My. God. Becky, look at that bloat!
It's so big...
It looks like one of those Microsoft products...
Who understands those Microsoft guys anyway?
They only code that crap because they're paid by the line...
I mean the bloat...
It's just so slow...
I can't believe it's so laggy...
It's just bloated...
I mean, gross...
Look, that just ain't a Hack.
I like big apps and I cannot lie.
You other bruthas can't deny,
That when some perl comes by, not a symbol to waste
Like line-noise, cut and paste --
You're bewitched;
But now my context's switched,
Coz I notice that glest's got glitz.
Oh BABY! I wanna apt-get ya,
Coz you got pictures,
Those hackers tried to warn me,
But the bling you got
/Make me so horny/
Oooo, app fantastic,
You say you wanna fill up my drive?
Well, use me, use me, coz you ain't that average GUI.
I've seen them typing,
To hell with reciting,
I point, and click, and never miss a single trick.
I'm tired of tech websites,
Sayin' command lines are the thing.
Ask the average power user what makes them tick --
You gotta point and click.
So hackers! (Yeah!) Hackers! (Yeah!)
Has your UI got the G? (Hell Yeah!)
Well click it (click it), click it (click it), and use that healthy glitz,
Baby got bloat.
(vi code with a KDE UI...)
And before you ask, no, I don’t know what I was drinking…
Sat, 05 Jan 2008
Inspired mostly by Joey’s nonchalant way of dealing with the death of his laptop…
This seems less of a disaster than other times a laptop’s disk has died on me. When did it start to become routine? […] My mr and etckeeper setup made it easy to check everything back out from revision control. […]
…I’ve been looking at getting all my stuff version controlled too. I’ve just gotten round to checking all my dotfiles into git, and it crossed my mind that it’d be nice if I could just set an environment variable to tell apps to create their random new dot-files directly in my “.etc-garbage” repo. I figured using “$USER_ETC/foo” instead of “$HOME/.foo” would be pretty easy, and might be a fun release goal that other Debian folks might be interested in, so I did a quick google to see if something similar had already been suggested.
The first thing I stumbled upon was a mail from the PLD Linux folks, who apparently were using $HOME_ETC at one time. That sounded pretty good, though it doesn’t seem to have gotten anywhere. The thread did include a pointer to the system that has gotten somewhere: the XDG spec.
It’s actually pretty good, if you don’t mind it being ugly as all hell.
They define three classes of directory – configuration stuff, non-essential/cached data, and other data. That more or less matches the /etc, /var/cache and /var/lib directories for the system-wide equivalents, though if the “other data” is stuff that can be distributed by the OS vendor it might go in /usr/lib or /usr/share (or the /usr/local/ equivalents) too.
Which is all well and good. Where it gets ugly is the naming.
For the “/etc” configuration stuff, we have the environment variable $XDG_CONFIG_HOME, which defaults to ~/.config, and has a backup path defined by $XDG_CONFIG_DIRS, which defaults to /etc/xdg.
For the “/var/lib” other data stuff, we have the environment variable $XDG_DATA_HOME, which defaults to ~/.local/share, and has a backup path defined by $XDG_DATA_DIRS, which defaults to /usr/local/share:/usr/share. (Though if you’re using gdm, it’ll get set for you to also include /usr/share/gdm.)
And for the “/var/cache” stuff, we have the environment variable $XDG_CACHE_HOME, which defaults to ~/.cache.
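To make those defaults concrete, here is a minimal Python sketch of the lookup. The function name `xdg_paths` and the dictionary keys are mine, purely for illustration; the variable names and defaults are the ones listed above, and I'm assuming an unset or empty variable falls back to its default.

```python
import os


def xdg_paths(env=None):
    """Resolve the XDG base directories from an environment mapping.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    env = os.environ if env is None else env
    home = env.get("HOME") or os.path.expanduser("~")

    def single(var, default):
        # An unset or empty variable falls back to the default.
        return env.get(var) or default

    def search(var, default):
        # Search paths are colon-separated, like $PATH.
        return [p for p in (env.get(var) or default).split(":") if p]

    return {
        "config_home": single("XDG_CONFIG_HOME", f"{home}/.config"),
        "config_dirs": search("XDG_CONFIG_DIRS", "/etc/xdg"),
        "data_home": single("XDG_DATA_HOME", f"{home}/.local/share"),
        "data_dirs": search("XDG_DATA_DIRS", "/usr/local/share:/usr/share"),
        "cache_home": single("XDG_CACHE_HOME", f"{home}/.cache"),
    }
```

With nothing but `HOME=/home/aj` set, `xdg_paths({"HOME": "/home/aj"})["data_home"]` comes out as `/home/aj/.local/share`.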
That seems to me like exactly the right idea, with way too much crap on it. If you simplify it obsessively – using existing names, dropping the desktop-centrism – you end up with:
Put configuration files in $HOME_ETC/foo or $HOME/.foo. For shared/fallback configuration, search $PATH_ETC if it’s set, or just /etc if it’s not.
Put data files in $HOME_LIB/foo or $HOME/.foo. For shared data, search $PATH_LIB if it’s set, or look through /var/lib, /usr/local/{lib,share} and /usr/{lib,share} if it’s not.
Put caches in $HOME_CACHE/foo or $HOME/.foo. For shared caches, search $PATH_CACHE if it’s set, or just look in /var/cache if it’s not.
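The three rules above could be sketched in Python roughly like this. Everything here is hypothetical: `user_path` and `shared_dirs` are names I made up, I'm assuming the $PATH_* variables are colon-separated like $PATH, and I'm assuming $HOME is always set.

```python
import os

# Fallback search locations when $PATH_ETC / $PATH_LIB / $PATH_CACHE are unset,
# taken from the three rules above.
DEFAULT_SHARED = {
    "ETC": ["/etc"],
    "LIB": ["/var/lib", "/usr/local/lib", "/usr/local/share",
            "/usr/lib", "/usr/share"],
    "CACHE": ["/var/cache"],
}


def user_path(kind, name, env=None):
    """Per-user location for program `name`; kind is "ETC", "LIB" or "CACHE"."""
    env = os.environ if env is None else env
    base = env.get(f"HOME_{kind}")
    if base:
        # New style: $HOME_ETC/foo, $HOME_LIB/foo, $HOME_CACHE/foo
        return f"{base}/{name}"
    # Old style: a plain dotfile in the home directory (assumes $HOME is set)
    return f"{env['HOME']}/.{name}"


def shared_dirs(kind, env=None):
    """System-wide directories to search, in order."""
    env = os.environ if env is None else env
    value = env.get(f"PATH_{kind}")
    if value:
        # Assumption: colon-separated, like $PATH
        return [p for p in value.split(":") if p]
    return list(DEFAULT_SHARED[kind])
```

So with `HOME_ETC=/home/aj/.etc-garbage` set, `user_path("ETC", "foo")` gives `/home/aj/.etc-garbage/foo`, and without it you get plain old `/home/aj/.foo`.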
That seems much simpler to me, to the point of being self-explanatory, and much more in keeping with traditional Unix style. It’s also backwards compatible if you use both old and new versions of a program with the same home directory (or you happen to like dotfiles). And having the XDG variables set based on the above seems pretty easy too.
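As a rough idea of what deriving the XDG variables might look like, here is a small Python sketch. The mapping is my own guess at the obvious correspondence (nothing specifies it), and `derive_xdg` is a made-up name. Note there is no XDG equivalent for $PATH_CACHE, so it is simply left out.

```python
# Assumed correspondence between the simplified names and the XDG ones.
HOME_TO_XDG = {
    "HOME_ETC": "XDG_CONFIG_HOME",
    "PATH_ETC": "XDG_CONFIG_DIRS",
    "HOME_LIB": "XDG_DATA_HOME",
    "PATH_LIB": "XDG_DATA_DIRS",
    "HOME_CACHE": "XDG_CACHE_HOME",
}


def derive_xdg(env):
    """Return XDG_* settings for whichever HOME_*/PATH_* variables are set."""
    return {xdg: env[src] for src, xdg in HOME_TO_XDG.items() if env.get(src)}
```

Anything not set is left alone, so the usual XDG defaults still apply for it.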
I wonder what other people think – does {HOME,PATH}_{ETC,LIB,CACHE} seem sensible, or is XDG_{CONFIG,DATA,CACHE}_{HOME,DIRS} already entrenched enough that it’s best just to accept what’s fated?
Thu, 03 Jan 2008
I blogged a fair bit about darcs some time ago, but since then I’ve not been able to get comfortable with the patch algebra’s approach to dealing with conflicting merges – I think mostly because it doesn’t provide a way for the user to instruct darcs on how to recover from a conflict and continue on. I’ve had a look at bzr since then, but it just feels slow, to the point where I tend to rsync things around instead of using it properly, and it just generally hasn’t felt comfortable.
On the other hand, a whole bunch of other folks I respect have been a bit more decisive than I have on this, and from where I sit, there’s been a notable trend:
- Keith Packard, Oct 2006
  - Repository formats matter, Tyrannical SCM selection
- Ted Tso, Mar 2007
  - Git and hg
- Joey Hess, Oct 2007
  - Git transitions, etckeeper, git archive as distro package format
Of course, Rusty swings the other way, as do the OpenSolaris guys. The OpenSolaris conclusions seem mostly out of date if you’re able to use git 1.5, and I haven’t learnt quilt well enough to miss its mode of operation the way Rusty does. And as far as the basics go, Carl Worth did an interesting exercise in translating an introduction to Mercurial into the equivalent for git, so that looks okay for git too.

