Friday, 29 May 2009

Function Pointer Curiosities in C++

As a small exercise in my spare time I am writing a generic object serializer for C++. There are most likely plenty of serializers for C++ out there already, and that's perfectly fine - I am writing my own as a learning experience, to get up to speed with the language. In this small project I have come across some curiosities that seemed quite odd at first glance but made sense after some investigation - the only truly odd thing about function pointers is the notation, which can be very confusing.

I have previously written custom serializers for both Java and C#, but those languages provide the concept of Reflection, which makes the job much easier compared to C++. You cannot ask C++ to list all methods or fields on a given class. Not through the standard at least.

But to stay on the topic of function pointers, let me just quickly sketch out my simple solution for this serializer:

  • a static list of function pointers to each getter/setter of each field in each class


So each serializable class will have a list of function pointers to all the getter/setter pairs in that class.

In order to make the serializer as generic as possible, I have to make any kind of no-parameter getter method and one-parameter setter method available and usable in the serializer. In Java that would have been easy - a method can only take or return objects or primitives. In C++ the matter is somewhat more complicated. You can have objects, primitives, structs and pointers, passed by reference or by value, and on top of that const qualifiers on various levels. So a function pointer can come in many different shapes and sizes, even when it is only for single-value setters and getters. In the end, it actually turned out to be the const qualifiers that gave me the most trouble.

But let me start out with the simplest example of a getter and a setter function pointer: no pointers, no references and no const qualifiers.

I use typedefs to declare my function pointers and have them all defined in a template class with two templated types: one for the serializable class that the function belongs to (necessary when dealing with class member function pointers in C++) and one for the type of the field behind the getter and setter. The type of the field is expected to be raw, i.e. stripped of pointers and references. Those are covered later.

template<class T_class, typename T_type>
class Property {
protected:
  typedef T_type(T_class::*Pt2Getter)(void);
  typedef void(T_class::*Pt2Setter)(T_type);
public:
  Property(Pt2Getter pGetter, Pt2Setter pSetter);
};
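To make the member function pointer syntax concrete, here is a minimal sketch of how such a Property could store the two pointers and call them on an object. The Person class, the get and set helpers and the roundTripAge function are hypothetical names made up for this illustration, and the sketch assumes a C++11 compiler (for nullptr).

```cpp
#include <cassert>

// Hypothetical serializable class with a single by-value field.
class Person {
public:
  Person() : age_(0) {}
  int getAge(void) { return age_; }
  void setAge(int age) { age_ = age; }
private:
  int age_;
};

template<class T_class, typename T_type>
class Property {
protected:
  typedef T_type(T_class::*Pt2Getter)(void);
  typedef void(T_class::*Pt2Setter)(T_type);
public:
  Property(Pt2Getter pGetter, Pt2Setter pSetter)
    : getter_(pGetter), setter_(pSetter) {}

  // Assumed helpers: call the stored member functions on a given object.
  // Note the (obj.*ptr)(args) syntax; the parentheses are required.
  T_type get(T_class& obj) const {
    assert(getter_ != nullptr);
    return (obj.*getter_)();
  }
  void set(T_class& obj, T_type value) const {
    assert(setter_ != nullptr);
    (obj.*setter_)(value);
  }
private:
  Pt2Getter getter_;
  Pt2Setter setter_;
};

// Writes a value through the property and reads it back through the getter.
int roundTripAge(int age) {
  Property<Person, int> ageProperty(&Person::getAge, &Person::setAge);
  Person person;
  ageProperty.set(person, age);
  return ageProperty.get(person);
}
```

The address-of operator and the class qualification are both mandatory when taking the pointer: &Person::getAge, never just getAge.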

Note: Don't get confused by the use of the keywords class and typename in the template declaration - semantically they are equivalent. I only use them to differentiate between the meaning of the two templated types.

Introducing pointers and references does not really make the function pointer declaration more complicated:

typedef T_type*(T_class::*Pt2GetterPointer)(void);
typedef void(T_class::*Pt2SetterPointer)(T_type*);

and

typedef T_type&(T_class::*Pt2GetterRef)(void);
typedef void(T_class::*Pt2SetterRef)(T_type&);

By adding the above function pointer declarations to the Property class and introducing new constructors that accept them, the class can be instantiated with pointer, reference or by-value types.
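As a rough sketch of how the extra constructors could sit side by side, the following version of Property has one constructor per pointer flavour and lets overload resolution pick the right one. The Box class, the Kind enum and checkConstructorSelection are assumptions made for this illustration only, and the sketch merely records which constructor was chosen instead of storing the pointers.

```cpp
#include <cassert>

// Hypothetical class exposing the same field by value, by pointer and by reference.
class Box {
public:
  Box() : value_(0) {}
  int getValue(void) { return value_; }
  void setValue(int v) { value_ = v; }
  int* getValuePtr(void) { return &value_; }
  void setValuePtr(int* v) { value_ = *v; }
  int& getValueRef(void) { return value_; }
  void setValueRef(int& v) { value_ = v; }
private:
  int value_;
};

template<class T_class, typename T_type>
class Property {
protected:
  typedef T_type(T_class::*Pt2Getter)(void);
  typedef void(T_class::*Pt2Setter)(T_type);
  typedef T_type*(T_class::*Pt2GetterPointer)(void);
  typedef void(T_class::*Pt2SetterPointer)(T_type*);
  typedef T_type&(T_class::*Pt2GetterRef)(void);
  typedef void(T_class::*Pt2SetterRef)(T_type&);
public:
  // Assumed bookkeeping for the sketch: remember which constructor ran.
  enum Kind { ByValue, ByPointer, ByReference };

  Property(Pt2Getter, Pt2Setter) : kind_(ByValue) {}
  Property(Pt2GetterPointer, Pt2SetterPointer) : kind_(ByPointer) {}
  Property(Pt2GetterRef, Pt2SetterRef) : kind_(ByReference) {}

  Kind kind() const { return kind_; }
private:
  Kind kind_;
};

// Each getter/setter pair matches exactly one constructor.
void checkConstructorSelection() {
  typedef Property<Box, int> IntProperty;
  assert(IntProperty(&Box::getValue, &Box::setValue).kind() == IntProperty::ByValue);
  assert(IntProperty(&Box::getValuePtr, &Box::setValuePtr).kind() == IntProperty::ByPointer);
  assert(IntProperty(&Box::getValueRef, &Box::setValueRef).kind() == IntProperty::ByReference);
}
```

The three typedef families are distinct types here because T_type is a raw type (int), which is exactly the assumption stated earlier about stripping pointers and references.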

My initial thought was that this would be enough to fully capture all function pointer variants for a getter and a setter. I was wrong.

It turns out that C++, understandably enough, differentiates between the various const qualifiers of a method's signature. A method that takes a const MyClass* as parameter is not the same as a method that just takes a MyClass*.

Let us start with the const method. A const method cannot modify the object it is called on (mutable members aside). So this really only makes sense for getters. Setters will always change the object, i.e. give a new value to a given field.

I will give the example using the pointer declaration:

typedef T_type*(T_class::*Pt2ConstGetterPointer)(void) const;

Const can also qualify what a returned pointer, or a pointer/reference parameter, refers to, and here things start to get messy.

typedef const T_type*(T_class::*Pt2GetterConstPointer)(void);
typedef void(T_class::*Pt2SetterConstPointer)(const T_type*);

And on top of that the getter can still be a const method:

typedef const T_type*(T_class::*Pt2ConstGetterConstPointer)(void) const;
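A quick way to convince yourself that these really are four different getter types is to let the compiler compare them. The sketch below assumes C++11 (static_assert, decltype and type_traits); the Node class and its getter names are hypothetical, with T_class = Node and T_type = int substituted by hand.

```cpp
#include <type_traits>

// Hypothetical class whose getters cover the four const combinations.
class Node {
public:
  explicit Node(int* payload) : payload_(payload) {}
  int* get(void) { return payload_; }
  int* getFromConst(void) const { return payload_; }
  const int* getReadOnly(void) { return payload_; }
  const int* getReadOnlyFromConst(void) const { return payload_; }
private:
  int* payload_;
};

typedef int*(Node::*Pt2GetterPointer)(void);
typedef int*(Node::*Pt2ConstGetterPointer)(void) const;
typedef const int*(Node::*Pt2GetterConstPointer)(void);
typedef const int*(Node::*Pt2ConstGetterConstPointer)(void) const;

// Each member function has exactly one of the four pointer types.
static_assert(std::is_same<decltype(&Node::get), Pt2GetterPointer>::value,
              "plain getter");
static_assert(std::is_same<decltype(&Node::getFromConst), Pt2ConstGetterPointer>::value,
              "const method");
static_assert(std::is_same<decltype(&Node::getReadOnly), Pt2GetterConstPointer>::value,
              "const return type");
static_assert(std::is_same<decltype(&Node::getReadOnlyFromConst), Pt2ConstGetterConstPointer>::value,
              "const method with const return type");

// And no two of them are interchangeable.
static_assert(!std::is_same<Pt2GetterPointer, Pt2ConstGetterPointer>::value,
              "const on the method changes the type");
static_assert(!std::is_same<Pt2GetterPointer, Pt2GetterConstPointer>::value,
              "const on the pointed-to return type changes the type");
static_assert(!std::is_same<Pt2ConstGetterPointer, Pt2ConstGetterConstPointer>::value,
              "both qualifiers are independent");
```

If any of these assumptions were wrong, the file would simply fail to compile.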

But when we try to define a function pointer for a setter that takes a plain by-value const parameter, C++ suddenly becomes cranky:

typedef void(T_class::*Pt2Setter)(const T_type);

where C++ claims that this is equivalent to:

typedef void(T_class::*Pt2Setter)(T_type);

Now what is this? It worked fine for pointers and for references, after all.

To explain this, we will need to start with the declaration of const variables (taken from the C++ FAQ Lite on Const Correctness):

  • const Fred* p means "p points to a Fred that is const" — that is, the Fred object can't be changed via p.

  • Fred* const p means "p is a const pointer to a Fred" — that is, you can change the Fred object via p, but you can't change the pointer p itself.

  • const Fred* const p means "p is a const pointer to a const Fred" — that is, you can't change the pointer p itself, nor can you change the Fred object via p.


And the fact that:

const Fred* p;

is equivalent to:

Fred const* p;

(See also: The C++ 'const' Declaration: Why & How)
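The rules above can be checked mechanically. This is a small sketch assuming C++11 and the type_traits header; Fred is just the placeholder struct from the FAQ bullets, given a single made-up member.

```cpp
#include <type_traits>

struct Fred { int value; };

// const before or after the type name means the same thing.
static_assert(std::is_same<const Fred*, Fred const*>::value,
              "const Fred* and Fred const* are one type");

// Moving const to the right of the * makes the pointer itself const instead.
static_assert(!std::is_same<const Fred*, Fred* const>::value,
              "pointer-to-const differs from const pointer");

// std::is_const only looks at the outermost (top-level) qualifier.
static_assert(!std::is_const<const Fred*>::value,
              "a pointer to const Fred is not itself const");
static_assert(std::is_const<Fred* const>::value,
              "a const pointer to Fred is const");
static_assert(std::is_const<const Fred* const>::value,
              "a const pointer to const Fred is const too");
```

A handy reading aid: const applies to whatever is immediately to its left, and only when nothing is to its left does it apply to what is to its right.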

But what does that have to do with the above example of method equivalence?

Well, it turns out that in C++ the signature of a method includes the const'ness of what a parameter points or refers to, but not the const'ness of the parameter variable itself (its top-level const). This means that the following two declarations have the same signature. They are therefore the same method, not overloads of each other, and cannot both be declared in the same class:

void myMethod(int* const i);
void myMethod(int* i);

But the following two methods will have different signatures, and will be overloads of each other:

void myMethod(const int* i);
void myMethod(int* i);

This makes perfect sense when you think about it. When calling myMethod, you actively choose to pass either a const int* or a non-const int*, and C++ is able to pick between the two overloads. Whether the parameter variable itself is const makes no difference to the caller, because the pointer is copied into the method anyway. So C++ only looks at whether the pointed-to type is const.
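Both halves of this can be demonstrated in a few lines. The sketch assumes C++11 and uses free functions rather than methods, because a free function may be declared more than once, which makes the equivalence easy to show; whichOverload and checkOverloads are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// The const on the parameter variable is dropped from the function type...
static_assert(std::is_same<void(int* const), void(int*)>::value,
              "int* const and int* give the same signature");
// ...but the const on the pointed-to type is kept.
static_assert(!std::is_same<void(const int*), void(int*)>::value,
              "const int* and int* give different signatures");

// This declaration and the definition right below it name the same function.
int whichOverload(int* const i);
int whichOverload(int*) { return 1; }

// This one is a genuine overload.
int whichOverload(const int*) { return 2; }

// The argument type decides which overload is called.
void checkOverloads() {
  int mutableValue = 0;
  const int constValue = 0;
  assert(whichOverload(&mutableValue) == 1);
  assert(whichOverload(&constValue) == 2);
}
```

If the first declaration were a separate function, the call with &mutableValue would be ambiguous; it compiles precisely because the compiler treats it as a redeclaration.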

So consider again

typedef void(T_class::*Pt2Setter)(const T_type);

and

typedef void(T_class::*Pt2Setter)(T_type);

Here const T_type is equivalent to T_type const, so the const qualifier applies to the parameter variable itself, not to anything it points or refers to. And since C++ ignores the const'ness of parameter variables when forming a signature, the two typedefs declare exactly the same type.
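Here is the same conclusion as a compile-time check, again a sketch assuming C++11. The Counter class and the typedef name Pt2SetterConstValue are made up for the example, with T_class = Counter and T_type = int filled in by hand.

```cpp
#include <type_traits>

// Hypothetical class whose setter declares its by-value parameter const.
class Counter {
public:
  Counter() : value_(0) {}
  // The const only stops the body from modifying its own copy of value.
  void setValue(const int value) { value_ = value; }
  int getValue(void) const { return value_; }
private:
  int value_;
};

typedef void(Counter::*Pt2Setter)(int);
typedef void(Counter::*Pt2SetterConstValue)(const int);

// Both typedefs name one and the same type.
static_assert(std::is_same<Pt2Setter, Pt2SetterConstValue>::value,
              "top-level const on a by-value parameter is ignored");

// The setter above therefore has the plain, non-const pointer type.
static_assert(std::is_same<decltype(&Counter::setValue), Pt2Setter>::value,
              "setValue(const int) is a void(Counter::*)(int)");

// For contrast: const on the pointed-to type is not ignored.
static_assert(!std::is_same<void(Counter::*)(const int*), void(Counter::*)(int*)>::value,
              "const int* and int* parameters stay distinct");
```

So a single by-value setter typedef in Property already covers setters declared with a const parameter.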

To end the article with a grand finale, here is another valid function pointer declaration:

typedef const T_type* const(T_class::*Pt2Getter)(void) const;

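The declaration is accepted, and it shows that the notation lets you qualify both the value that the returned pointer points to and the returned pointer itself. In practice only the first matters to callers: the pointer is returned by value, so the caller gets its own copy either way.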

Phew..