Rambles around computer science

Diverting trains of thought, wasting precious time

Sun, 14 Nov 2010

Completeness

As researchers, we are naturally wary of addressing problems which don't have a well-defined end point. We like to be able to say, “we've solved it”, or perhaps “we've solved it for the following formally-characterized subclass of problem”. In my work, that doesn't really make sense. Cake is a practical tool for tackling interface diversity. No such tool can be complete, because for any nontrivial language, or interface encoding scheme, there is no limit to the crazy encodings that could be dreamed up. When Kent enumerated “the many forms of a single fact”, he wasn't making a complete list, which would clearly be impossible---but he was trying to come up with a catalogue that included a reasonably high proportion of what people actually do in practice.

Although I could try to formally characterise the class of mismatch which my current Cake language is capable of reconciling, I've never seen much worth in doing so. It's hard, if not impossible, to come up with a formal characterisation that actually has practical meaning. In any such exercise, what tends to happen is that a bunch of practically meaningful cases are excluded, and a bunch of practically meaningless cases are included. One reason for this is that humans brains are messy, and don't respect the formal boundaries induced by conventional mathematical thinking. Mathematical formalisms strive to tackle complexity primarily by purposely engineered compositionality; human evolution has done so by a random walk. The point is that humans are full of specialisations, resource limitations and arbitrary cut-offs. Too often, we embrace the formal concerns and ignore the human reality. This is the formalism wagging the research, and it's something that irritates me.

One of my favourite examples of formalism-versus-reality is our use of nesting in sentences, and in particular, what linguists call “centre-embedded” or “self-embedding” structures. (The concepts are distinct, but the distinction is somewhat confused in the literature.) Sentences such as “The rat the cat the dog chased ate died.” are perfectly grammatical according to any formal grammar of English you might reasonably come up with---that is, any grammar that could also generate the sentences we actually use in real English. If you buy that the role of grammars is to model the languages that people actually use, then this is clearly a failure of modelling---yet is a consequence of the recursively generative nature of grammars loved by formalists. In practice, we humans don't work in this neatly recursive way---the complex, messy architecture of our brains means that only some kinds of embedding are easily processed, and it turns out, as expounded by Richard Hudson, that the actual criteria are far more complex than anyone would have thought. For example, it appears to make a difference whether the subjects of the clauses being nested are pronouns or not.

A related and much more familiar example for most readers will be the halting problem. We know that no tool which attempts to answer an undecidable question can be complete. Therefore, a lot of researchers just avoid those problems. Byron Cook, who in recent years has been doing stellar and pioneering work on proving program termination, has been known to talk about the variety of bafflement, disdain and ridicule which his work provoked in its inception. It's simply anathema to conventional CS research approaches to attack problems for which it's known that there can be no complete solution. In fact, far too many self-professing computer scientists don't even understand the distinction between a partial and total solution. We have to get over this, simply because lots of important problems are in this class! I like to think my work on Cake is doing its bit. The halting problem is somewhat different from Cake-like problems, in that its' a formally defined, provably-undecidable problem, whereas the incompleteness of systems like Cake lies in that we can't even completely define the problem space in any formal way. I should add that I'm not putting my work in the same bracket as Byron's! But there is at least this small thematic similarity in the flavour of criticism that tends to come up. Speaking of which, I could do with remembering this argument when it's time for my viva....

[/research] permanent link contact

validate this page