Rambles around computer science

Diverting trains of thought, wasting precious time

Wed, 16 Mar 2011

Config filesystems, not config files

Like most computer scientists, I really hate tinkering with computers. Actually, that's not true. Like many computer scientists, I like tinkering with computers in a constructive way that achieves something interesting and novel (read: practical CS research). But I hate tinkering that is provoked by stuff not working. I use a lot of software that has episodes of the latter kind---most of it is free software (and, erm, arguably most free software is of that kind, but that's another story).

One recurring pain is that I learn how to configure software the way I like it, but then all that hard-learnt knowledge is eroded by change: change in what software is “current” or “supported”, and also change in the way any given bit of software manages its configuration. If you like, software's semantics often change, particularly at the configuration level.

So often, I'm faced with a bunch of hassle just to keep my configuration working the way I like it. Recent examples have included: KDE 4 clobbering my X resources when it starts up (in a way that KDE 3 didn't); Xorg now forcing me to use xrandr to set up multiple monitors; wireless networks now being handled by NetworkManager in preference to ifupdown.

In dealing with this complexity, one recurring principle has been clear to me. The closer a configuration system stays to the Unix filesystem abstraction, and the less abstraction it tries to re-build on top of the filesystem, the easier it is to deal with. This is because, with a direct filesystem encoding, I can use standard tools, from a shell, to inspect and search and modify (and even generate) my configuration, and to debug problems. This is also why gconf sucks, just as the Windows registry sucks: they represent hierarchical data using a custom encoding layered on flat files, rather than embracing the hierarchy already given to them by the filesystem. (This approach is even less excusable on Unix than on Windows, because Unix filesystems are somewhat optimised for small files in a way that Windows-land filesystems traditionally weren't, as exemplified by FAT.)
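To make the "direct filesystem encoding" idea concrete, here is a minimal sketch in which every setting is one small file and the key hierarchy is simply the directory hierarchy. The key names (power/lid/on_ac and so on) and the helper functions are hypothetical, invented for illustration; no real program stores its settings exactly this way. I've used Python only to keep the example self-contained; from a shell, the search step would amount to something like a recursive grep over the same tree.

```python
import os
import tempfile

def write_setting(root, key, value):
    """Store one setting as one small file; '/' in the key becomes a directory level."""
    path = os.path.join(root, *key.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(value + "\n")

def read_setting(root, key):
    """Read a single setting back; the key is just a relative path."""
    with open(os.path.join(root, *key.split("/"))) as f:
        return f.read().strip()

def find_settings(root, needle):
    """Return the sorted keys whose name or value contains needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            key = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path) as f:
                value = f.read()
            if needle in key or needle in value:
                hits.append(key)
    return sorted(hits)

def demo():
    # Hypothetical settings tree, built in a throwaway directory.
    with tempfile.TemporaryDirectory() as root:
        write_setting(root, "power/lid/on_battery", "suspend")
        write_setting(root, "power/lid/on_ac", "nothing")
        write_setting(root, "display/monitors", "2")
        # Changing a setting is just overwriting one file.
        write_setting(root, "power/lid/on_ac", "suspend")
        return read_setting(root, "power/lid/on_ac"), find_settings(root, "lid")

if __name__ == "__main__":
    print(demo())
```

Nothing here needs a special query tool or a daemon: because the hierarchy lives in the filesystem, anything that can walk directories and read files can inspect or generate the configuration.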

In some quarters there's been a drive to actively embrace the filesystem as a means of capturing the structure of configuration data. Configuration directories (like xorg.conf.d) are one example, although that one has now gone away; symlink structures like the traditional System V init runlevel directories are another; the run-parts idea of directories encoding control structures is a third. Configuration is easy to understand and modify when it's laid out transparently in the filesystem. When it's instead recorded as opaque data in random files somewhere, it is neither.
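As a rough illustration of a directory encoding a control structure, the sketch below merges a configuration directory of key=value fragments in lexical filename order, so that a later file overrides an earlier one. The numeric prefixes, the .conf suffix, the setting names and the load_conf_dir function are all assumptions made for the example; real tools such as run-parts or Xorg each have their own rules about which filenames they accept and what the files contain.

```python
import os
import tempfile

def load_conf_dir(conf_dir):
    """Merge key=value fragments in lexical filename order; later files win."""
    settings = {}
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        # Skip subdirectories, and fragments "disabled" by renaming them.
        if not os.path.isfile(path) or not name.endswith(".conf"):
            continue
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # blank line or comment
                key, sep, value = line.partition("=")
                if sep:
                    settings[key.strip()] = value.strip()
    return settings

def demo():
    # Hypothetical fragments, written to a throwaway directory.
    fragments = {
        "10-defaults.conf": "lid_on_ac=nothing\nlid_on_battery=suspend\n",
        "50-local.conf": "# my override\nlid_on_ac=suspend\n",
        "90-experiment.conf.disabled": "lid_on_battery=hibernate\n",
    }
    with tempfile.TemporaryDirectory() as conf_dir:
        for name, text in fragments.items():
            with open(os.path.join(conf_dir, name), "w") as f:
                f.write(text)
        return load_conf_dir(conf_dir)

if __name__ == "__main__":
    print(demo())
```

The ordering and the enabled/disabled state are both visible from a plain directory listing, which is the property that makes this style easy to debug.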

Unfortunately this drive towards transparency is not universal. Today I've been debugging a configuration issue with power management on a recent Ubuntu release. When I close my laptop lid with no AC power, it suspends to RAM. When I close the lid on AC power, it doesn't---but I want it to. I had assumed the matter was under the control of the scripts in /etc/acpi/, but a quick inspection of the lid.sh script revealed that it didn't deal with suspending to RAM at all. It turns out that KDE 4 has something called “PowerDevil” and that this is responsible. I can configure it using KDE's graphical systemsettings application. But this whole set-up is unsatisfactory. How does it interact with other system settings, such as the /etc/acpi/ scripts? Why is a KDE-specific tool replicating functionality that is already provided at a lower level? And now I have one more chunk of configuration to deal with, at one more path in the filesystem, and one more model of the settings domain to understand---squirreled away inside its own configuration file (mercifully in plain-text format).

Now, the researcher will say that there's a problem here: why should a simple need, such as the gconf-like desire to implement a familiar abstraction (or something close to it) with different performance characteristics, bring such a huge cost in tool support, convenience and integration? It's not really an answer, as I have proposed, to “just use the filesystem, stupid”. For the same reason, even filesystem-embracing approaches don't go so far as to have one file per setting, say; there is always some amount of filesystem-opaque flat structure. I'll save some comments on this for a (near-)future post.
