Rambles around computer science

Diverting trains of thought, wasting precious time

Wed, 14 May 2014

Instrumenting casts in C++

A reviewer of my libcrunch paper wanted to see me back up my claim that its architecture is “multi-language” by demonstrating support for another language. In particular, for some strange reason, he wanted C++ support.

I had shied away from C++ support because C++ front-ends are infamously complex. I don't know of a particularly friendly system for instrumenting C++ code. But on reflection, perhaps this isn't necessary. One of the joys of C++ (yes, you heard me) is that it's designed so that user-defined constructs are more-or-less on par with built-in constructs. My instrumentation needs to instrument pointer casts which would otherwise be unchecked. If you follow good C++ style, these are done using static_cast and reinterpret_cast. (Note that dynamic_cast is already checked by the runtime.)

Can we define our own checked_cast template that is a drop-in replacement for these two built-in operators? Then all we need is to #define the latter to the former and we've done the instrumentation.

I've managed to get something that appears to work well enough, but this is easier said than done. We might be tempted to write a function template like the following.

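// To comes first so that checked_cast<T>(expr) names the target type
// explicitly and lets From be deduced from the argument
template <class To, class From>
To checked_cast(const From& f)
{
    // ... do our check, then...
    return static_cast<To>(f);  // or reinterpret_cast
}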

But that's no good, because we need to distinguish the case where To is a pointer from where it's not. We can't partially specialise function templates, so we can't do this. It's also no good because some instances of static_cast need to be constexpr. Indeed, static_cast to and from integers and chars is used extensively in header files, in constexpr contexts such as expressions which size arrays, or the bodies of inline library functions that are themselves constexpr. We need a more fine-grained approach.
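To make both objections concrete, here is a minimal sketch rather than anything from libcrunch itself. The names needs_check and naive_checked_cast are hypothetical, introduced only for illustration. A class template can be partially specialised on the target being a pointer, which is exactly what a function template cannot do, and a plain function template cannot appear where the language demands a constant.

```cpp
#include <cstddef>

// Primary template: the target type is not a pointer, so nothing to check.
template <class To>
struct needs_check { static constexpr bool value = false; };

// Partial specialisation: any pointer target is a cast we want to check.
template <class T>
struct needs_check<T*> { static constexpr bool value = true; };

// The naive function template. It is not constexpr, so a call to it
// cannot be used where a constant expression is required.
template <class To, class From>
To naive_checked_cast(const From& f)
{
    return static_cast<To>(f);
}

static_assert(!needs_check<int>::value, "casts to int need no check");
static_assert(needs_check<const char*>::value, "casts to pointers need a check");

// The built-in operator is fine in a constant expression such as an array bound...
char buffer[static_cast<std::size_t>(2.5 * 4)];
// ...whereas naive_checked_cast<std::size_t>(2.5 * 4) would be rejected here.
```

The array bound is the kind of context that turns up in header files, which is why a single non-constexpr function cannot stand in for every use of static_cast.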

Let's try a class template. This is awkward because casts have to look like function invocations. More specifically, we're instrumenting expressions of the form cast_operator<T>(expr) and we need something that drops in where the cast_operator bit goes; we can't modify the T or expr parts, or add anything afterwards. What seems to work is defining a class template such that the cast expression becomes a constructor invocation of a class we provide, say checked_static_cast_t<T>. We then define a type conversion operator to turn it back into something of the result type. The compiler inserts the conversion implicitly, so we have a drop-in replacement.

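#include <utility>  // for std::move

template <typename To>
struct checked_static_cast_t
{
    To temp;
    // implicit conversion back to the result type of the cast
    operator To() const { return std::move(temp); }

    // construct a temporary wrapper holding the cast result;
    // the conversion operator above turns it back into a To
    template <class From>
    checked_static_cast_t(const From& arg) : temp(static_cast<To>(arg)) {}

    // overloaded constructors go here
};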

// ... specialisations go here

#define static_cast checked_static_cast_t
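As a rough illustration of why this drops in, the sketch below uses a renamed copy of the wrapper (demo_cast_t, a hypothetical name, as are Shape, Circle and the two helper functions) and invokes it by name instead of through the #define, so that the example stays independent of the macro. Each use sits exactly where static_cast<T>(expr) would.

```cpp
// Hypothetical stand-in for the wrapper above, without any checking logic.
template <typename To>
struct demo_cast_t
{
    To temp;
    // Implicit conversion back to the target type.
    operator To() const { return temp; }

    template <class From>
    demo_cast_t(const From& arg) : temp(static_cast<To>(arg)) {}
};

struct Shape { int id = 1; };
struct Circle : Shape { int radius = 2; };

// Non-pointer target: the wrapper is constructed, then converted to int
// when it initialises i.
inline int to_int(double x)
{
    int i = demo_cast_t<int>(x);
    return i;
}

// Pointer target: a downcast routed through the wrapper's constructor,
// which is where a check could be performed.
inline int read_radius(Shape* p)
{
    Circle* c = demo_cast_t<Circle*>(p);
    return c->radius;
}
```

In both functions the temporary wrapper never escapes: the conversion operator fires as soon as the result is used to initialise a variable of the target type.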

We need to specialise the above to make various cases constexpr. In particular, we want all casts to integral types to be constexpr, and all casts from integral types to be constexpr if they're not going to a pointer type. We handle the former by specialising the whole template, instantiating To for the various built-in integral types. We handle the latter by overloading constructors to be constexpr (which, perhaps surprisingly, does work). We also need constructors that take rvalue references. All that expands out to a lot of tedious code, but the result happily compiles various standard library headers that make use of static_cast and reinterpret_cast. We can then supply the header to our compiler using a command-line -include option, and hey presto: we've instrumented casts in C++ using just the C++ language itself.
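The shape of those specialisations might look something like the following. This is a cut-down sketch under my own assumptions, not the real header: sketch_cast_t, checks_run, Animal and Dog are hypothetical names, only int is specialised where the real thing would cover every integral type, and the "check" is just a counter.

```cpp
// Hypothetical counter standing in for whatever runtime check is performed.
inline int& checks_run() { static int n = 0; return n; }

// Primary template: the generic path, plus one overloaded constexpr
// constructor so that casts *from* int remain constant expressions.
template <typename To>
struct sketch_cast_t
{
    To temp;
    constexpr operator To() const { return temp; }

    template <class From>
    sketch_cast_t(const From& arg) : temp(static_cast<To>(arg)) {}

    constexpr sketch_cast_t(int arg) : temp(static_cast<To>(arg)) {}
};

// Full specialisation: every cast *to* int is constexpr.
template <>
struct sketch_cast_t<int>
{
    int temp;
    constexpr operator int() const { return temp; }

    template <class From>
    constexpr sketch_cast_t(const From& arg) : temp(static_cast<int>(arg)) {}
};

// Partial specialisation: casts to pointers are never constexpr, and this
// is where the check runs.
template <typename T>
struct sketch_cast_t<T*>
{
    T* temp;
    operator T*() const { return temp; }

    template <class From>
    sketch_cast_t(const From& arg) : temp(static_cast<T*>(arg))
    {
        ++checks_run();  // placeholder for the real check
    }
};

constexpr int kCount = sketch_cast_t<int>(2.0 * 3);  // integral target
constexpr double kOne = sketch_cast_t<double>(1);    // integral source
int table[kCount];                                   // usable as an array bound

struct Animal {};
struct Dog : Animal { int legs = 4; };

inline int downcast_demo()
{
    Dog d;
    Animal* a = &d;
    Dog* p = sketch_cast_t<Dog*>(a);  // goes through the checking constructor
    return p->legs;
}
```

The non-template int constructor wins over the constructor template when the argument is an int, which is what lets the constexpr overload take effect.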

It's not perfect, though. We can't instrument C-style casts. It's also certainly possible to construct code that compiles without the instrumentation but not with it, by requiring pointer casts whose results are constexpr (and which, by definition, we refuse to provide). In particular, it'd be nice if constexpr were a modifier that could be a basis for overloading; then we could ensure that a cast of constexpr input always yields a constexpr result, by overloading this case explicitly. As it is, I'm hoping it's good enough; I'll post back later when I have more experience of using it.
