Britney’s Delay

(For those coming in late, “britney” is the name of the scripts that manage Debian’s testing distribution; it uses a number of criteria, including time since upload, number of open bugs, which architectures a package has been ported to, and dependency information, to maintain a set of packages that meet Debian’s release goals. I’m the Debian release manager at the moment, and britney’s my baby. See also: Debian, testing distro, RC bugs.)

The biggest problem with britney at the moment seems to be that people aren’t taking the hint to fix release-critical bugs throughout the release cycle. That causes individual packages to get stuck, which then causes other packages to get stuck, and the result is a huge logjam. Unfortunately “testing” itself, which lets people get their updates into a “released” distribution within 10 days, is evidently not enough of an incentive for people to fix RC bugs quickly, so we probably need to set up some new ones.

One idea is to mess with britney’s delaying tactics some more. Really, ten days is frequently too long a period (for people like Joey Hess, who make minor updates to many of their packages every few days), and even more frequently far too short a period. We don’t really have a reliable QA group finding problems, and we’ve even gotten to the point where people tend not to bother filing bugs because they expect they won’t be fixed anyway, with the result that packages get into testing in spite of still having serious bugs.

Two factors thus influence the delay: we want to get packages into testing fairly quickly so that people can start working on new stuff in unstable (and so people following testing get the latest updates), but we want them to stay in unstable for a little while so that we can have some confidence the package isn’t too horrible. The idea, then, is to delay every package exactly long enough to find all its RC bugs. There are obvious problems with the knowability of that, so instead we need to make the delay proportional to the “riskiness” of the package, which depends on how big the most recent changes were and falls off the longer it’s been since the last major ones.

So if you upload version 2.0-1 on the 1st, which is a major rewrite, and 2.1-1 on the 4th with some minor changes to the documentation, you’ve got a high risk (and a long delay) carried on from 2.0-1, and a small risk and a short delay beginning with 2.1-1. If you upload 2.1-2 with a couple of packaging changes later in the month, your previous risks have run out, so you’ve only got a very small risk, and a short delay, left. The delays could then look like:

  • 1st: 2.0-1, 14 day delay, ’til the 15th
  • 4th: 2.1-1, 2 day delay (6th), carried over ’til the 15th
  • 15th: 2.1-1 goes into testing
  • 20th: 2.1-2, 2 day delay (22nd)
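The carry-over rule in that schedule can be sketched in a few lines of Python. This is only an illustration, not britney’s actual code: the function name `eligible_dates`, the tuple layout, and the choice of January 2024 as the month are all assumptions made for the example.

```python
from datetime import date, timedelta

def eligible_dates(uploads):
    """Work out when each upload could enter testing (hypothetical helper).

    uploads: list of (version, upload_date, delay_days), in upload order.
    Each upload waits for its own delay, and also for any delay still
    pending from an earlier upload of the same package.
    """
    results = []
    carried = date.min  # no delay pending before the first upload
    for version, uploaded, delay_days in uploads:
        own = uploaded + timedelta(days=delay_days)
        eligible = max(own, carried)  # the later of the two dates wins
        carried = eligible
        results.append((version, eligible))
    return results

# The example above, with an assumed month and year.
schedule = eligible_dates([
    ("2.0-1", date(2024, 1, 1), 14),   # major rewrite
    ("2.1-1", date(2024, 1, 4), 2),    # minor documentation changes
    ("2.1-2", date(2024, 1, 20), 2),   # packaging changes
])
for version, when in schedule:
    print(version, when.isoformat())
# 2.0-1 2024-01-15
# 2.1-1 2024-01-15
# 2.1-2 2024-01-22
```

Taking the later of the two dates is what makes 2.1-1 wait until the 15th, while 2.1-2 only has its own two days to sit out.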

The question then is how to work out what a risky change is. One way is just to have the maintainer nominate a value in the changelog, and keep track of that in a similar way to how urgency values are already tracked. This should mostly work, but probably needs to be supplemented by something a little less discretionary. Another option would be to say that every change that fixes an RC bug is a major one. That’s probably justifiable (at the very least, it should be true that the change that caused the RC bug was major, and making the delay start at the time the fix is uploaded instead of when the bug was introduced shouldn’t be too troublesome).
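Combining the two rules might look something like the sketch below. The risk labels, the `DELAY_DAYS` table (borrowing the 2 and 14 day figures from the example above), and both function names are made up for illustration; nothing like this exists in the changelog format today.

```python
# Hypothetical mapping from risk level to days spent in unstable.
DELAY_DAYS = {"minor": 2, "major": 14}

def effective_risk(nominated, fixes_rc_bug):
    """Return the risk level used for the delay.

    nominated: the value the maintainer put in the changelog (assumed
    to be "minor" or "major"). An upload that fixes an RC bug is always
    treated as major, whatever the maintainer nominated.
    """
    if nominated not in DELAY_DAYS:
        raise ValueError(f"unknown risk level: {nominated!r}")
    return "major" if fixes_rc_bug else nominated

def delay_for(nominated, fixes_rc_bug):
    """Days the upload should wait before entering testing."""
    return DELAY_DAYS[effective_risk(nominated, fixes_rc_bug)]
```

So `delay_for("minor", fixes_rc_bug=True)` comes out at 14 days, which is the “less discretionary” part: the maintainer can’t talk an RC fix down to a minor change.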

Another thing you could do is allow people to review changes and, having done so, mark them as less risky. So it might take 30 days of sitting in unstable for X 4.3 to be considered acceptable for testing, but the QA team, having looked over the patches and taken other things into account, might be willing to vouch for it and get it in after only 20 or 15 days.
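As a rough sketch of how vouching could feed into the delay, assume each review knocks a number of days off, down to some floor. The function name, the per-review reductions, and the `minimum_days` floor are all assumptions for the example, not a worked-out policy.

```python
def reviewed_delay(base_days, reductions, minimum_days=0):
    """Delay left after reviewers have vouched for an upload.

    base_days: the delay the upload would get with no review.
    reductions: days taken off by each review (hypothetical values).
    minimum_days: floor that reviews can never push the delay below.
    """
    return max(minimum_days, base_days - sum(reductions))
```

With a 30 day base, `reviewed_delay(30, [10])` gives 20 and `reviewed_delay(30, [10, 5])` gives 15, matching the X 4.3 numbers above; the floor just stops a pile of reviews from removing the delay entirely.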

Hrm. While that provides some extra incentives, and I think makes the delays a bit more sensible, I don’t think it’s remotely close to solving the real problem of getting packages better maintained.

One Comment

  1. hothead says:

    Hi,
    I think you srsly can’t judge a packaged software’s quality by version number, imagine you do a major repack/release and often these new versions include many new bugs as developers do often work on new major versions without any public review or feedback process.
    Maybe something like a weighting system that calculates package “priorities” or importance of availability (maybe through popcon?) evaluates all “risk factors” of dependency packages and decides upon the result. this way britney could also find the most “bad-weighing” dependencies that hold back many packages, or maybe even show which bugfixes enable the most packages to be moved to testing and thus reduce the total time that is minimally needed, because developers can set priorities accordingly.
    i’m not really familiar with that topic, but maybe this could be an idea to start with.

    regards,

    hothead
