Rambles around computer science

Diverting trains of thought, wasting precious time

Fri, 27 Feb 2015

Talking about polymorphism

[In my last post, I talked about debugging, and more generally “observability”, with reference to ML-family languages. This post is continuing the theme.]

Before I go into what's different about polymorphism between compile time and run time, I need to get some terminology straight. What is polymorphism anyway?

People traditionally talk about “ad-hoc polymorphism” and “parametric polymorphism”. These terms go back a long way, to Strachey in the mid-1960s. They're unhelpful today, for two reasons. Firstly, and most trivially, some flavours of apparently non-parametric polymorphism are nevertheless rather systematic, so ideally shouldn't be called “ad-hoc”. Type classes in functional languages and method overriding in object-oriented languages both illustrate this.

Secondly, calling polymorphism “parametric” confuses people into thinking that if code doesn't have type parameters, it's not parametrically polymorphic. But, to me at least, this kind of polymorphism is not about how code is presented; it's a deeper property, about how code is expressed. It's easy to find code expressed in apparently polymorphic ways in all languages—even ones where there is no additional metalanguage of types to describe that polymorphism. Some C code is chock-full of such polymorphism—take Linux, for example. In such code there are no type parameters, only void*. Similarly, one of the main strengths of dynamic languages is that code often comes out with some degree of polymorphism by default. If you write a sort function, say, you tend to get something that can operate on any kind of ordered sequence.
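
To make that concrete, here is a minimal sketch in C++ (written in a deliberately C-like style) of the kind of type-parameter-free polymorphism I mean, using the standard qsort. The function is written once, committed to no element type; the helper name compare_ints and the example data are just illustrative choices of mine.

    #include <cstdlib>
    #include <cstdio>

    // qsort has no type parameters: the element type is "described" only by
    // an element size and a comparator, i.e. by indirection through void*.
    static int compare_ints(const void *a, const void *b)
    {
        int lhs = *static_cast<const int *>(a);
        int rhs = *static_cast<const int *>(b);
        return (lhs > rhs) - (lhs < rhs);
    }

    int main()
    {
        int xs[] = { 3, 1, 2 };
        // qsort itself stays unspecialised; our call supplies the element
        // size and ordering that specialise it to ints.
        std::qsort(xs, sizeof xs / sizeof xs[0], sizeof xs[0], compare_ints);
        for (int x : xs) std::printf("%d ", x);
        std::printf("\n");
        return 0;
    }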

So although “polymorphism” sounds fancy, it's really just talking about delaying specialisation. If we write code in a way that avoids commitment to details that would specialise it to one kind of data or another, we're writing polymorphic code. Fancy type systems are a means of reasoning about polymorphism, hence enabling static safety guarantees about polymorphic code. But they're not what enables polymorphism itself.

(I am probably departing from Strachey at this point, although it's not clear. His presentation doesn't directly address the question of whether polymorphism is really polymorphism without a metalanguage to describe it. For the purposes of my subsequent writings, I declare that it is, and that the metalanguage is a separate concern. Even if you disagree, it won't affect the substance of anything that I'm about to say; only which choice of words is an agreeable way to say it.)

Expressing code in a way that defers specialisation invariably means introducing some kind of indirection. Here I mean “indirection” in a very general sense. Run-time indirection via pointers is certainly one very familiar way of introducing polymorphism: a list node having a void* payload has avoided specialisation, whereas one with int hasn't. But there can also be compile-time indirection, such as between an identifier and its binding. And then there is also run-time indirection across not only storage, but also computation. (The latter is the trick which, when applied to its fullest extent, gives us objects in the sense of Cook.)
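
Here is a minimal sketch of those two list nodes; the type names are just illustrative.

    // Specialised: this node has committed to holding ints and nothing else.
    struct int_node {
        int payload;
        struct int_node *next;
    };

    // Unspecialised: the payload sits behind a void*, so the commitment to a
    // particular payload type is deferred to whoever allocates and reads it.
    struct generic_node {
        void *payload;
        struct generic_node *next;
    };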

Somewhere along the line I have stopped using the qualifier “parametric”. In fact, the view I've just expounded—that polymorphism is the deferral of specialisation—covers ad-hoc polymorphism too, if we look at it the right way. When we say a source language supports “ad-hoc polymorphism”, it means that it lets us group together collections of specialised definitions. Such a collection of definitions, viewed as a unit, starts to look like a single, unspecialised definition—in other words, a polymorphic definition. Again, indirection is what enables this polymorphism. If I write the token “+” in a language with overloading or type classes—Haskell or C++, it doesn't matter—it denotes the whole collection of defined addition operators. The code thereby avoids directly selecting any particular one. Typically, the selection is done indirectly. Usually it happens in the compiler, since the language defines some rules for selecting which one will actually be used. But it might happen very late in compilation, since these rules might allow dependencies on the code's context of use (think how “+” is resolved in a C++ template definition, for example). And all this could equally well happen at run time; multimethods allow us to do this for binary operators like plus, while the usual single dispatch implements precisely the restricted unary case: deferring specialisation, on (only) the leftmost argument, until the execution of each individual call.
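
A small C++ sketch of such a “collection of definitions viewed as a unit”: the name plus (standing in for “+”, and purely my choice of name) denotes the whole overload set, and each call site leaves the selection to the compiler's overload-resolution rules.

    #include <iostream>
    #include <string>

    // Three specialised definitions, grouped under one name.
    int plus(int a, int b) { return a + b; }
    double plus(double a, double b) { return a + b; }
    std::string plus(const std::string &a, const std::string &b) { return a + b; }

    int main()
    {
        // Each use names the whole collection; the compiler, not the code,
        // picks the specialised definition that applies.
        std::cout << plus(1, 2) << "\n";
        std::cout << plus(1.5, 2.25) << "\n";
        std::cout << plus(std::string("poly"), std::string("morphism")) << "\n";
        return 0;
    }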

If polymorphism is this incredibly general thing, we need some way of getting down to specifics—like the question I started with, about what “polymorphism” is present during execution of compiled OCaml programs on a particular OCaml runtime. One thing that helps is to qualify any mention of polymorphism with a particular time: perhaps compile time, perhaps run time. At any given time, we can describe a polymorphic entity (perhaps a definition in source code, perhaps an object at run time) as somewhere on a spectrum between strong “specialisation” and strong absence of specialisation, which we can happily call “genericity”. (I say “strong” rather than “full” because the logical extremities of this spectrum tend to be degenerate, like “the output of the program for a given input”—the most specialised state!—and “not written any code yet”—the most generic possible state.)

This “one time” restriction forces us to distinguish a source-level view from a run-time view. It's clear that this distinction matters. Polymorphism in source code can sometimes be implemented by specialisation before execution. C++ templates are the classic example: I can define a generic template, but the compiler will implement it by generating lots of specialised instances. Meanwhile, certain whole-program optimisations in ML compilers such as MLton do effectively the same. This kind of compile-time specialisation is sometimes called “whole-program monomorphisation”. Just as “polymorphism” is a fancy word for the deferral of specialisation, “monomorphisation” is a fancy word for applying some kind of specialisation.
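
A minimal sketch of that compile-time specialisation, using a hypothetical largest template: the definition is polymorphic in the source, but the compiler implements it by generating one specialised instance per type it is used at.

    // One generic definition in the source...
    template <typename T>
    T largest(T a, T b) { return a < b ? b : a; }

    int main()
    {
        // ...but the compiled program contains two specialised instances,
        // largest<int> and largest<double>, and no generic code at all.
        int    i = largest(3, 4);
        double d = largest(2.5, 1.5);
        return (i == 4 && d == 2.5) ? 0 : 1;
    }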

Polymorphism only defers specialisation; it doesn't avoid it. As I'll talk about more in the next post, all polymorphism eventually ends up being specialised away somehow. What varies is where and when this specialisation occurs. It might be done early, in a compiler, or late, during execution, by “elaborating” a particular trace of machine instructions in a particular machine context—i.e. by actually running the generic code! In the latter case, each execution is “specialised” in the sense that it runs in a different machine context and so produces a different outcome. Either way, execution is a gradual process of specialisation, entailing the gradual removal of polymorphism.

Phew. Perhaps that was a lot of belabouring the point. But it certainly helped me clarify (to myself) in what ways “polymorphism” might be present in ML code at run time. I'll revisit this precise question in the next post.
