C++ template instantiation problem (and some solutions)


Disclaimer: Really long post. If you're not interested in C++ templates (or in correcting me when I write about them), don't read this! Also, if you know more about C++ than I do, please respond with a better solution!

Lately I've been reading one of Herb Sutter's wonderful books (Exceptional C++ Style), and one piece of advice he gives is "Where possible, prefer writing functions as nonmembers nonfriends". His arguments seemed pretty solid, so I decided to give it a shot. However, not a day after reading that, I've already found something that keeps me from doing it in certain types of templated code.

For the small engine I've been writing, I've coded a templated Vec2<T> class. While I was writing some code to test it, g++ gave me this wonderful error:

error: no match for 'operator*' in 'geom::Vec2<float>(((const float&)((const float*)(&-1.0e+1f))), ((const float&)((const float*)(&0.0f)))) * ((#'float_expr' not supported by dump_expr#<expression error> * 5.00000000000000010408340855860842566471546888351e-3) + 1.0e+0)'

The line that gave that error is this one:

geom::Vec2<float> v = geom::Vec2<float>(-10,0) * (1+rand()%10*0.005);

The problem (leaving aside some really weird things about float_expr and dump_expr) seems to be that the compiler can't find operator*, even though I did write one. So, what's happening?

Let's see. operator* is defined as a nonmember function, like this:

template <typename T> const Vec2<T> operator*(const Vec2<T> &lhs, T rhs)
{
    return Vec2<T>(lhs.x*rhs,lhs.y*rhs);
}

And if we look closely at the line where it fails, we see it's a multiplication just like

Vec2<float>(10,0) * 0.005

See the problem now?

The compiler, when it tries to use the operator* template, has to deduce T from the arguments of the call, but the two arguments disagree: the Vec2<float> on the left says T is float, while the double on the right says T is double. What it's looking for is operator*(Vec2<float>, double), but there's only operator*(Vec2<T>,T) defined and no single T fits both parameters, so it just sighs and proclaims "What the hell do you want me to do with this?".

In fact, what we probably want it to do is to convert that double to a float, and then choose the float version of operator*. However, the compiler won't do that: implicit conversions are not considered while it is deducing template arguments, so the double is never turned into a float and deduction simply fails. As it happens, template argument deduction and automatic type conversion don't usually mix very well (I read something about it in one of Sutter's books, but I don't remember the exact details). So what can we do?
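To make the deduction conflict concrete, here is a minimal sketch. The stripped-down Vec2 and the helper name deduction_demo are assumptions made for this example, not the engine's real code. It shows that the call works when both arguments agree on T, and also when T is spelled out explicitly, because then nothing is left to deduce and the usual double-to-float conversion is allowed again.

```cpp
#include <cassert>

namespace geom {

// Stripped-down stand-in for the real class (an assumption for this sketch).
template <typename T>
struct Vec2 {
    T x, y;
    Vec2(T x_, T y_) : x(x_), y(y_) {}
};

// Same shape as the operator in the post: T appears in both parameters,
// so both arguments take part in deducing it.
template <typename T>
const Vec2<T> operator*(const Vec2<T>& lhs, T rhs) {
    return Vec2<T>(lhs.x * rhs, lhs.y * rhs);
}

} // namespace geom

// Hypothetical helper, only here to exercise the operator.
inline void deduction_demo() {
    geom::Vec2<float> v(-10.0f, 0.0f);

    // geom::Vec2<float> bad = v * 0.5;  // error: T would be both float and double

    // Both arguments say T = float, so deduction succeeds.
    geom::Vec2<float> a = v * 0.5f;

    // T is given explicitly, so the second parameter is plain 'float'
    // and the double literal is converted implicitly.
    geom::Vec2<float> b = geom::operator*<float>(v, 0.5);

    assert(a.x == -5.0f);
    assert(b.x == -5.0f);
}
```

Nobody wants to write geom::operator*<float>(v, 0.5) in real code, of course; it's only there to show that the conversion itself is fine and that deduction is what gets in the way.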


Option 1 (bad)

One option would be to give the compiler a little nudge (well, not so little really) so it chooses the correct template instantiation. For example, this would compile cleanly:

geom::Vec2<float> v = geom::Vec2<float>(-10,0) * (float)(1+rand()%10*0.005);

However, it's not very polite of us, as utility class programmers, to deny the class user the option to multiply a float vector by a double scalar. So what other option is there?


Option 2 (better?)

Another option would be to make operator* a template with two type parameters, one for each side of the operation. Like this:

template <typename T,typename Y> 
const Vec2<T> operator*(const Vec2<T> &lhs, Y rhs)
{
    return Vec2<T>(lhs.x*rhs,lhs.y*rhs);
}

This would work pretty well. 

As a side note, we would have problems if we tried to do multiplications with the scalar on the left side, like:

Vec2<float> v = 2.0 * Vec2<float>(10,0);

So we would have to define the swapped version too:

template <typename T,typename Y> 
const Vec2<T> operator*(Y lhs, const Vec2<T> &rhs)
{
    return Vec2<T>(rhs.x*lhs,rhs.y*lhs);
}
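Putting both overloads together, a self-contained sketch of Option 2 could look like this. The minimal Vec2 and the helper option2_demo are assumptions for the example. Note that the float*double product is computed as a double and then narrowed back to float when the result vector is constructed.

```cpp
#include <cassert>

namespace geom {

// Minimal stand-in for the real class (an assumption for this sketch).
template <typename T>
struct Vec2 {
    T x, y;
    Vec2(T x_, T y_) : x(x_), y(y_) {}
};

// Vector on the left: T comes from the vector and Y from the scalar,
// so the two deductions can no longer conflict.
template <typename T, typename Y>
const Vec2<T> operator*(const Vec2<T>& lhs, Y rhs) {
    return Vec2<T>(lhs.x * rhs, lhs.y * rhs);
}

// Scalar on the left: the swapped overload.
template <typename T, typename Y>
const Vec2<T> operator*(Y lhs, const Vec2<T>& rhs) {
    return Vec2<T>(rhs.x * lhs, rhs.y * lhs);
}

} // namespace geom

// Hypothetical helper, only here to exercise both overloads.
inline void option2_demo() {
    geom::Vec2<float> v(10.0f, 0.0f);

    geom::Vec2<float> a = v * 0.5;  // T = float, Y = double
    geom::Vec2<float> b = 2 * v;    // Y = int,   T = float

    assert(a.x == 5.0f);
    assert(b.x == 20.0f);
}
```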

The problem with this approach is that every different scalar type we use will instantiate another version of the code. For this particular function it won't really matter because the compiler will probably inline it anyway. But for more complicated functions it will just instantiate another full copy, creating a serious case of template bloat. For example (assuming they are not inlined), each of these expressions would instantiate another version of operator*:

Vec2<float> v1 = Vec2<float>(10,0) * 65; 
Vec2<float> v2 = Vec2<float>(10,0) * 65.0; 
Vec2<float> v3 = Vec2<float>(10,0) * 65.0f; 
Vec2<float> v4 = Vec2<float>(10,0) * 'a';


In fact, if we compile those lines with the -ggdb flag, and then inspect them with gdb, we'll see all the instantiations appear:

(gdb) info function geom::operator*

All functions matching regular expression "geom::operator":

File test.cpp:

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, char>(geom::Vec2<float> const&, char);

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, double>(geom::Vec2<float> const&, double);

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, float>(geom::Vec2<float> const&, float);

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, int>(geom::Vec2<float> const&, int);


I know this is an exaggerated example, but template bloat (excessive template instantiation) can be a real problem. So, in the end, I'm not sure this option is a very good idea.
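If you'd rather see this without gdb, here is a rough sketch that records which scalar types the two-parameter operator* actually gets called with. The registry seen_scalar_types and the helper count_instantiations_used are hypothetical names invented for this illustration. It doesn't measure code size; it only shows that the four expressions above really go through four different instantiations.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <typeindex>
#include <typeinfo>

namespace geom {

// Minimal stand-in for the real class (an assumption for this sketch).
template <typename T>
struct Vec2 {
    T x, y;
    Vec2(T x_, T y_) : x(x_), y(y_) {}
};

// Hypothetical registry, for illustration only: remembers every scalar
// type Y that operator* has been called with.
inline std::set<std::type_index>& seen_scalar_types() {
    static std::set<std::type_index> seen;
    return seen;
}

template <typename T, typename Y>
const Vec2<T> operator*(const Vec2<T>& lhs, Y rhs) {
    // Each distinct Y is a distinct instantiation of this function.
    seen_scalar_types().insert(std::type_index(typeid(Y)));
    return Vec2<T>(lhs.x * rhs, lhs.y * rhs);
}

} // namespace geom

// Runs the four expressions from the post and reports how many
// different scalar types (and so instantiations) were involved.
inline std::size_t count_instantiations_used() {
    geom::seen_scalar_types().clear();
    geom::Vec2<float> v(10.0f, 0.0f);

    geom::Vec2<float> v1 = v * 65;     // Y = int
    geom::Vec2<float> v2 = v * 65.0;   // Y = double
    geom::Vec2<float> v3 = v * 65.0f;  // Y = float
    geom::Vec2<float> v4 = v * 'a';    // Y = char
    (void)v1; (void)v2; (void)v3; (void)v4;

    return geom::seen_scalar_types().size();
}
```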


Option 3 (almost good?)

Another option (even though we wanted to avoid it from the start to follow Mr. Sutter's advice) is a member function. As simple as this:

template <typename T> struct Vec2
{
    ...
    Vec2<T> operator*(T scalar) const
    {
        return Vec2<T>(x*scalar,y*scalar);
    }
};

This function has no deduction problem, because it isn't a template itself: once Vec2<float> has been instantiated, its parameter is simply a float, and the compiler converts the scalar to T as it would for any ordinary function. And for the same reason, it would only instantiate once per Vec2 type (again, assuming the compiler won't inline it because of size or something).
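A small sketch of that claim, assuming a C++11 compiler (for static_assert and <type_traits>) and a minimal Vec2 with the hypothetical helper member_demo. The static_assert checks that Vec2<float>::operator* has a single, fixed signature taking a float, which is why double and int scalars are accepted through ordinary conversions.

```cpp
#include <cassert>
#include <type_traits>

// Minimal stand-in for the real class (an assumption for this sketch).
template <typename T>
struct Vec2 {
    T x, y;
    Vec2(T x_, T y_) : x(x_), y(y_) {}

    // Not a template itself: once Vec2<float> exists, this is an ordinary
    // member function whose parameter type is float.
    Vec2<T> operator*(T scalar) const {
        return Vec2<T>(x * scalar, y * scalar);
    }
};

// There is exactly one operator* per Vec2<T>, with a fixed signature.
static_assert(
    std::is_same<decltype(&Vec2<float>::operator*),
                 Vec2<float> (Vec2<float>::*)(float) const>::value,
    "Vec2<float>::operator* takes a float");

// Hypothetical helper, only here to exercise the member operator.
inline void member_demo() {
    Vec2<float> v(10.0f, 0.0f);

    Vec2<float> a = v * 0.5;  // double converted to float
    Vec2<float> b = v * 3;    // int converted to float

    assert(a.x == 5.0f);
    assert(b.x == 30.0f);
}
```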

But we would still have the problem with the swapped version. If we try to do a multiplication like 2.0*Vec2<float>(10,0), we'll get a nice compiler error. What can we do about that?

The best way I've found is to do this:

template <typename T, typename Y> 
const Vec2<T> operator*(Y lhs, const Vec2<T> &rhs)
{
    return rhs*lhs;
}

That is, create a nonmember swapped version with two template parameters, and make it just call the straight member version. Hopefully the compiler will inline it (it's a small function after all) and we'll avoid the associated template bloat I talked about previously.
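For completeness, here is how the whole of Option 3 might fit together in one piece. Again, the minimal Vec2 and the helper option3_demo are assumptions for the sketch, not the actual engine code.

```cpp
#include <cassert>

namespace geom {

// Minimal stand-in for the real class (an assumption for this sketch).
template <typename T>
struct Vec2 {
    T x, y;
    Vec2(T x_, T y_) : x(x_), y(y_) {}

    // Straight version: vector * scalar, scalar converted to T.
    Vec2<T> operator*(T scalar) const {
        return Vec2<T>(x * scalar, y * scalar);
    }
};

// Swapped version: scalar * vector. It only forwards to the member,
// so each instantiation is a thin wrapper.
template <typename T, typename Y>
const Vec2<T> operator*(Y lhs, const Vec2<T>& rhs) {
    return rhs * lhs;
}

} // namespace geom

// Hypothetical helper, only here to exercise both directions.
inline void option3_demo() {
    geom::Vec2<float> v(10.0f, 4.0f);

    geom::Vec2<float> a = v * 0.5;  // member operator*, double -> float
    geom::Vec2<float> b = 2.0 * v;  // nonmember wrapper, then the member

    assert(a.x == 5.0f && a.y == 2.0f);
    assert(b.x == 20.0f && b.y == 8.0f);
}
```

The wrapper still gets one instantiation per scalar type, but since its body is a single forwarding call, there is very little code to duplicate even if it isn't inlined.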

So, in the straight version we've got no template bloat, and in the swapped one we will most likely avoid it too, thanks to the compiler's inlining capabilities. We would, however, be forced to use a member function and ignore Sutter's advice this time.


And I've already run out of ideas to get this to work properly. If anyone knows of a more elegant way to do it, or has found an error in code, logic, technique, grammar or whatever, please leave a comment :-)

