Rambles around computer science

Diverting trains of thought, wasting precious time

Wed, 17 Jun 2015

“Type safety” and observability: an irony

As I've written before, people mean a lot of different things when they say “type-safety”.

The meaning I like best is that of Ungar et al.: “the behavior of any program, correct or not, can be easily understood in terms of the source-level language semantics”. This is a dynamic property. It means we can explain failures in terms of the source language; they can't make the program do strange or arbitrary things. This is usually what people mean when they talk informally about “type-safety” in relatively low-level languages like C, C++, or even Java; it is also the sense used in dynamic languages (the context Ungar and friends were coming from), and the one adopted by Krishnamurthi and Felleisen. Language safety is an absence of corrupting failures; only understandable, explainable failures are allowed. (I find it irksome that “type-safety” has come to denote this general safety property rather than anything directly to do with type errors. Oh well....)

If type safety is something to do with explaining, what is an “explanation”? The most common kind of explanation is a stack trace. Stack traces are interesting because they combine a certain amount of live execution state and a certain amount of execution history. They tell us not only how the program was trying to proceed (its continuation), but also how it got where it is (its history). An explanatory stack trace will be heavy on the historical information, even if that information is not needed for continuation, because “how it got there” generally offers a clearer explanation than “what it would have done next”. For example, an ideal stack trace would include not only a list of active function calls on the stack, but also the values of their parameters—ideally the parameters' values on entry. Many of these values will be “dead” when the trace is generated: execution has moved on since the entry to the call, so those values are history (literally). This history is useless to the program in continuing its execution, but is invaluable to the programmer reading a stack trace, because it serves as explanation. This phenomenon—that “dead” state can nevertheless be useful as explanation—is one we'll return to.
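To make that concrete, here is a minimal OCaml sketch (my own illustration, not a mechanism taken from any real implementation) of recording parameters by hand on entry to each call, so that a failure can be explained using values the continuation no longer needs:

    (* A hand-rolled "explanatory trace": each call pushes a description of
       its arguments on entry.  By the time the failure occurs these values
       are dead for the purposes of continuing the computation, but they are
       exactly what explains how we got here. *)
    let trace : string list ref = ref []

    let rec lookup key alist =
      trace := Printf.sprintf "lookup %S in a %d-entry list" key (List.length alist)
               :: !trace;
      match alist with
      | [] ->
          List.iter prerr_endline (List.rev !trace);   (* print the history... *)
          failwith ("lookup: key not found: " ^ key)   (* ...then fail *)
      | (k, v) :: rest -> if k = key then v else lookup key rest

    let () = ignore (lookup "z" [ ("a", 1); ("b", 2) ])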

If one kind of “type safety” is defined in terms of explanation of failure states, what's another sense of type safety? It's that “well-typed programs don't go wrong”. This is a completely different property, and I prefer to call it type-soundness. (Also note that it's using a different meaning of “type”.) In a type-sound language, by the time we run a program, we've proved that certain errors won't happen. The system for doing that proof is called the “type system”.
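A tiny OCaml illustration of the soundness side (mine, not drawn from any particular paper):

    (* Well-typed: the compiler has already proved that (+) is only ever
       applied to ints here, so the compiled code needs no run-time check. *)
    let double x = x + x
    let () = print_int (double 21)        (* prints 42 *)

    (* An ill-typed use such as   double "hi"   is rejected at compile time,
       so "applied + to a string" is an error the running program never has
       to detect, handle, or explain. *)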

Now for the irony. Type-soundness is used to justify elimination of explanation in ML implementations. ML-family languages' type-soundness property means that the implementation can do away with run-time metadata (tags). This has the side-effect of removing the explanatory value that these tags would otherwise have. According to our initial definition, “type safety” was all about explanation.

But if we've proven type errors won't happen, there'll be no need to explain them, right? This is true, but the problem is that type errors are not the only errors that type tags can help explain. Even though no type errors will occur, other run-time failures will happen, and they are in need of explanation. Type tags are an obviously useful artifact for generating such explanations at run time. If we want to generate a stack trace, type tags make it easy to implement a more useful explanation—including the valuations of local variables, say. Without them, this becomes very hard.
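A small OCaml sketch of what is lost (again my own illustration, not any implementation's API): once the source-level type is not recorded in the value's representation, generic code, and by the same token a backtrace routine wanting to show locals, has nothing to inspect.

    (* The run-time representation does not record the source-level type:
       to the runtime, a pair, a cons cell and a two-field record can all
       look alike.  So a generic printer (or a debugger, or a backtrace
       routine wanting to show local variables) cannot reconstruct what it
       is looking at; the best it can honestly print is an opaque label.
       The toplevel prints values only because the compiler, not the
       runtime, knows their types. *)
    let describe _value = "<some value>"

    let () = print_endline (describe (Some [1; 2; 3]))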

Are other run-time errors really inevitable? The short answer is yes. Any general-purpose programming language, no matter how “safe”, has to accept the fact that some run-time errors, for example unhandled exceptions or assertion failures, will remain possible. This is simply because computation is necessarily input-dependent. Even if you prove that your program never fails when run on in-spec input, out-of-spec input has to be detected and handled somewhere. No useful program runs in a completely closed system. (If you can safely assume you don't get any out-of-spec input, it's because it's been filtered out somewhere else in the wider system.)
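A one-line OCaml example of the point (assuming the input arrives on stdin):

    (* Perfectly well-typed, yet whether it fails depends entirely on what
       arrives at run time: no type system can promise that stdin holds a
       number.  Fed the line "hello", this raises Failure "int_of_string". *)
    let () =
      let line = read_line () in
      print_int (int_of_string line)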

Are type tags really necessary to present useful stack traces? No, not strictly. We can think of various schemes for restoring this ability, and my previous post goes into this in some depth. But these are much harder than just using tags, and it shows: implementers don't bother with them. On most implementations of ML-family languages, if a fatal exception occurs, the explanation offered will be, at best, a list of source-code locations (function names, possibly with file and line numbers) on the stack. It won't include valuations of locals, or anything about the concrete types of values. That's because the run-time tags which would allow printing out these things have been removed. In short, run-time failures are actively made less explainable by the erasure of type tags. (I'm told that AliceML is better than other MLs in this respect, though I haven't yet tried it myself.)
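To illustrate with OCaml (a sketch; the exact output depends on compiling with -g, and inlining may drop frames):

    (* With backtrace recording turned on (below, or via OCAMLRUNPARAM=b)
       and the code compiled with -g, the uncaught Failure is reported as a
       list of file/line locations for the frames of parse.  What it cannot
       report is that depth was 3, 2, 1, 0, or that line was "not a number":
       the metadata needed to print those values is not there. *)
    let () = Printexc.record_backtrace true

    let rec parse depth line =
      if depth = 0 then int_of_string line        (* the failure point *)
      else 1 + parse (depth - 1) line             (* non-tail, so frames survive *)

    let () = print_int (parse 3 "not a number")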

We should not be impressed by the “look, no tags!” property of ML unless some equivalent mechanism is in place to retain the degree of explainability which those tags afforded.

In theoretically-inclined PL literature, hard-to-formalise concepts such as debuggability or explainability—we could say “usability” generally—don't factor into how a design is evaluated. This is fine; it's simply Dijkstra's “separation of concerns” in action. Formalisation necessarily discards all details that are inessential to some specific (formalised) problem. A corollary is that problems that can't be formally stated are never addressed by theoretical research. Again, that's fine too.

But producing a practical artifact calls for some kind of “systems thinking”: taking a much broader view than that of the theoretical results which may have inspired it. Any practical scenario is the superimposition of a collection of problems. Even for those that are amenable to formalisation, theoretically pleasing approaches to each one in isolation won't necessarily compose into a useful whole.

The lesson of ML is that we have to be incredibly circumspect about transferring theoretical results, like erasability of tags, into programming practice. There must be a serious attempt to elicit and accommodate unformalised requirements. This needn't be deep or difficult: it's not rocket science to anticipate that we might want to observe the run-time state of an ML program, such as when explaining a run-time error. I find it sad that a succession of real ML implementations, all built by incredibly smart people, have adopted an unobservable, unexplainable approach for no good reason. Mistakes like this are holding us back.
