C++ PIMPL idiom & C++ automatic allocation

1 - Forewords

The C++ PIMPL idiom is a technique used in C++ to provide a private implementation in classes.

The basic idea is to hide all private members (fields and methods) from the public class definition (and public header).

Such a technique provides several benefits:

1 - Compilation time
Since all the private stuff is not in the public headers, modifying private members (like adding/removing private member fields or methods) won't require a recompilation of the compilation units depending on the header.
This can be very handy with large C++ projects with a lot of dependencies.
Such a technique can also minimise the inclusion of other headers in the public one, thus also requiring less compilation time.

2 - Cleaner headers
Hiding the private members from the public interface also makes your public code cleaner.
A developer using your class will only see the stuff he can actually use. This surely removes some burden.
This technique can also be quite useful to protect your implementation details when providing for instance a compiled library to a client.
This is obviously not a silver bullet for protecting your code, but at least it will keep private stuff private.

The PIMPL technique also has a few drawbacks, but we'll cover them later on.

Table Of Contents

  1. Forewords
  2. Additional readings
  3. Forward declarations
  4. Implementation
    1. Interface
    2. Implementation
      1. IMPL class
      2. Constructors
      3. Copy constructor
      4. Destructor
      5. Assignment operator
      6. Swap function
      7. Member methods
  5. Conclusion
    1. Drawbacks
      1. Memory usage
      2. Performance
    2. Possible enhancements
    3. Generic PIMPL

2 - Additional readings

In this tutorial, we'll be designing a PIMPL class, fully compatible with C++ automatic allocation.

I won't cover here the implementation details for automatic allocation, as it has been covered in the previous C++ tutorials:

Be sure to fully understand the concepts presented in the above articles before proceeding with this one.

3 - Forward declarations

The PIMPL idiom relies on forward class declarations.

Conceptually, a forward declaration is the declaration of an identifier with no formal definition.
Such an identifier can be a variable, a type, a constant, etc.

Let's look at an example.
Imagine the following header file, assuming the Foo type represents a C++ class:

#ifndef HEADER_H
#define HEADER_H

#include "Foo.hpp"

Foo * GetFooObject( void );

#endif /* HEADER_H */

This header includes "Foo.hpp", and declares a function returning a pointer to a Foo object. Quite usual...

But at this point, the definition of the Foo class is not needed, so we could also get rid of the inclusion of "Foo.hpp" using a forward declaration:

#ifndef HEADER_H
#define HEADER_H

class Foo;

Foo * GetFooObject( void );

#endif /* HEADER_H */

Here, we simply tell the compiler that the Foo identifier is a C++ class.
The compiler won't know anything about it, except that it is a class.

As we are returning a pointer to such a class, the above code is perfectly valid.
The compiler doesn't need to know anything about the class definition to handle a pointer.

Note that a forward declaration only gives you an incomplete type, which you can only really use through pointers or references.
The following header becomes a problem as soon as the function is defined or called:

#ifndef HEADER_H
#define HEADER_H

class Foo;

Foo GetFooObject( void );

#endif /* HEADER_H */

Here, the Foo object is returned as a copy, so there's no way for the compiler to know what to do without a complete definition of Foo.

As it is returned by value, the compiler will have to allocate memory for the object.
And with a forward declaration, there's no way for the compiler to know the size of the object.
Also the class may have special methods, like a copy constructor or an assignment operator.
In such a case, the compiler also needs to know about it, so it can emit the correct code to handle the object.

So in the last example, you will have to include the "Foo.hpp" header file.
But with pointers or references, the forward declaration is enough, allowing you to get rid of unnecessary includes.
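To make the distinction concrete, here is a minimal single-file sketch of what an incomplete type allows. The value member, the GetValue and CopyFooObject functions and the number 42 are made up for this illustration. Note that merely declaring a function that returns Foo by value is accepted by the compiler; it is defining or calling it that requires the complete type.

```cpp
#include <cassert>

class Foo;                         // Forward declaration: Foo is an incomplete type here

Foo * GetFooObject( void );        // OK: pointer to an incomplete type
int   GetValue( const Foo & o );   // OK: reference to an incomplete type
Foo   CopyFooObject( void );       // Declaration only: accepted, but cannot be defined or called yet

// Foo f;                          // Error at this point: the size of Foo is unknown
// sizeof( Foo );                  // Error at this point, for the same reason

// From here on, the complete definition is visible (normally through "Foo.hpp").
class Foo
{
    public:
        
        int value;                 // Hypothetical member, for illustration only
};

Foo * GetFooObject( void )
{
    static Foo instance = { 42 };
    
    return &instance;
}

int GetValue( const Foo & o )
{
    return o.value;                // Accessing a member requires the complete type
}

Foo CopyFooObject( void )
{
    return *( GetFooObject() );    // Copying requires the complete type
}
```

Everything above the class definition is what a header relying on the forward declaration could contain. Everything below it needs the real definition.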

4 - Implementation

So let's implement a C++ PIMPL class.
As in the first tutorial about automatic allocation, we'll design a simple Point class, representing a 2D point.

4.1 - Interface

Here's the public interface, Point.hpp:

#ifndef POINT_HPP
#define POINT_HPP

class Point
{
    public:
        
        Point( void );
        Point( int x, int y );
        Point( const Point & o );
        ~Point( void );
        
        Point & operator =( Point o );
        
        friend void swap( Point & o1, Point & o2 );
        
        int GetX( void ) const;
        int GetY( void ) const;
        
        void SetX( int x );
        void SetY( int y );
        
    private:
        
        class IMPL;
        
        IMPL * impl;
};

#endif /* POINT_HPP */

Assuming you've read the previous tutorial, the public part should be straightforward.

Parameter-less constructor, copy constructor, assignment operator, swap function, etc.
We also added getter and setter methods for the X and Y properties.

Now let's talk about the private part.
Here, we are forward-declaring an inner class, whose qualified name is Point::IMPL:

class IMPL;

This inner class will contain our private fields and methods.
We don't provide a definition of it here, as we want to keep everything private.

We also declare a single private member, which is a pointer to a Point::IMPL object:

IMPL * impl;

And that's all!
Unless we add public members, the definition of the Point class will never have to change.

4.2 - Implementation

Now let's take a look at the actual implementation for such a class, Point.cpp.

4.2.1 - IMPL class

First of all, as we are implementing the class, we'll now need a complete definition for the Point::IMPL class:

#include "Point.hpp"
#include <utility> /* std::swap */

class Point::IMPL
{
    public:
        
        IMPL( void ):
            _x( 0 ),
            _y( 0 )
        {}
        
        IMPL( const IMPL & o ):
            _x( o._x ),
            _y( o._y )
        {}
        
        ~IMPL( void )
        {}
        
        int _x;
        int _y;
};

Nothing fancy.
We have a parameter-less constructor, a copy-constructor and a destructor, as well as two integer members, _x and _y.

Note that everything is public here.
There's no need to have private members, as we're here in the context of the Point class implementation.

In other words, this is a single compilation unit, and the definition of Point::IMPL only exists in that compilation unit.

Now let's take a look at the implementation of the Point class.

4.2.2 - Constructors

First of all, the parameter-less constructor.
Remember that the Point class has a single member, named impl, which is a pointer to a Point::IMPL object.

Here, we need to ensure that this pointer is valid, by creating a new Point::IMPL object:

Point::Point( void ): impl( new IMPL )
{}

Using the initializer list, we initialize the impl field with a new instance of the Point::IMPL class (new IMPL).
We don't need to initialize its members, as this is done by the Point::IMPL constructor.

Now the second constructor:

Point::Point( int x, int y ): impl( new IMPL )
{
    this->impl->_x = x;
    this->impl->_y = y;
}

We also initialize the impl field, as in the previous example, and then we simply set the _x and _y fields with the values passed as arguments.

4.2.3 - Copy constructor

Now let's see how the copy constructor is implemented:

Point::Point( const Point & o ): impl( new IMPL( *( o.impl ) ) )
{}

That's it...
We simply rely on the copy constructor of the Point::IMPL class for initialising the impl field.

Note that the copy constructor takes a reference, and the impl field is a pointer.
That's why we dereference it (*( o.impl )), hence effectively calling the copy constructor.

4.2.4 - Destructor

Nothing complicated in the destructor, we simply call the delete operator on the impl field:

Point::~Point( void )
{
    delete this->impl;
}

No need to check for nullptr, as deleting a null pointer is fine, and because there's actually no way the impl field can point to anything other than a valid object.

4.2.5 - Assignment operator

As explained in the first tutorial about automatic allocation, our assignment operator takes its value by copy rather than by reference, so we can simply use our swap function, thus relying on our copy constructor to do the work:

Point & Point::operator =( Point o )
{
    swap( *( this ), o );
    
    return *( this );
}

4.2.6 - Swap function

With the PIMPL idiom, the swap function is actually very straightforward.
As the Point class only has a single member, which is a pointer, we only need to swap them:

void swap( Point & o1, Point & o2 )
{
    using std::swap;
    
    swap( o1.impl, o2.impl );
}

That's it... Swapping two objects only means swapping their implementations.

4.2.7 - Member methods

The only thing left is the member methods.
Nothing fancy here as well:

int Point::GetX( void ) const
{
    return this->impl->_x;
}

int Point::GetY( void ) const
{
    return this->impl->_y;
}

void Point::SetX( int x )
{
    this->impl->_x = x;
}

void Point::SetY( int y )
{
    this->impl->_y = y;
}

Again, no need to check for a null pointer, as there's no way the impl field can be invalid.
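To see how the pieces fit together, here is a condensed single-file sketch of the class, followed by a small check that copies really are independent. In a real project the first part would live in Point.hpp and the rest in Point.cpp. The CopiesAreIndependent function is a hypothetical helper added for this illustration, and IMPL relies on its implicit copy constructor here to keep things short.

```cpp
#include <utility> // std::swap

// --- What would live in Point.hpp ---
class Point
{
    public:
        
        Point( void );
        Point( int x, int y );
        Point( const Point & o );
        ~Point( void );
        
        Point & operator =( Point o );
        
        friend void swap( Point & o1, Point & o2 );
        
        int  GetX( void ) const;
        int  GetY( void ) const;
        void SetX( int x );
        void SetY( int y );
        
    private:
        
        class IMPL;
        
        IMPL * impl;
};

// --- What would live in Point.cpp ---
class Point::IMPL
{
    public:
        
        IMPL( void ): _x( 0 ), _y( 0 ) {}
        
        int _x;
        int _y;
};

Point::Point( void ):            impl( new IMPL )                {}
Point::Point( const Point & o ): impl( new IMPL( *( o.impl ) ) ) {}
Point::~Point( void )                                            { delete this->impl; }

Point::Point( int x, int y ): impl( new IMPL )
{
    this->impl->_x = x;
    this->impl->_y = y;
}

Point & Point::operator =( Point o )
{
    swap( *( this ), o ); // o is already a copy: take its implementation
    
    return *( this );
}

void swap( Point & o1, Point & o2 )
{
    using std::swap;
    
    swap( o1.impl, o2.impl ); // Only the two pointers are exchanged
}

int  Point::GetX( void ) const { return this->impl->_x; }
int  Point::GetY( void ) const { return this->impl->_y; }
void Point::SetX( int x )      { this->impl->_x = x; }
void Point::SetY( int y )      { this->impl->_y = y; }

// Hypothetical helper: checks that each copy owns its own implementation
bool CopiesAreIndependent( void )
{
    Point p1( 1, 2 );
    Point p2( p1 );   // Copy constructor: new IMPL copied from p1
    Point p3;
    
    p2.SetX( 10 );    // Must not affect p1
    p3 = p2;          // Assignment: copy, then swap
    p3.SetY( 20 );    // Must not affect p2
    
    return p1.GetX() == 1  && p1.GetY() == 2
        && p2.GetX() == 10 && p2.GetY() == 2
        && p3.GetX() == 10 && p3.GetY() == 20;
}
```

Calling CopiesAreIndependent from a main function should return true: modifying a copy never touches the object it was copied from.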

5 - Conclusion

As you can see, implementing the PIMPL idiom is quite straightforward.
It provides several benefits, and the actual implementation is really simple.

I personally think it also improves the overall readability of the code, as it keeps all member methods simple and generic.
All the work is basically done by the copy constructor of the IMPL class, so it even makes automatic allocation patterns easier, in my own opinion.

5.1 - Drawbacks

As stated in the beginning of this article, the PIMPL idiom also has a few drawbacks.

5.1.1 - Memory usage

First of all, the PIMPL idiom forces us to use dynamic memory allocation for the impl pointer field.

Each time we create a PIMPL object, there's actually a call to new to allocate memory for the impl field.

In most common scenarios, this is clearly not a problem, but that's definitely something you'll have to keep in mind.

5.1.2 - Performance

As stated in the previous point, we need to allocate dynamic memory for the impl field.

The calls to new and delete obviously have a small performance impact, but this is also true for accessing fields of the IMPL object, as every access goes through the impl pointer and means an extra memory dereference.

Again, not a very big deal in most common scenarios, but if you're really concerned about runtime performance, PIMPL might be a thing you'll want to avoid.

5.2 - Possible enhancements

I tried to keep things simple in this tutorial, but the PIMPL idiom can actually be extended in a lot of ways.

For instance, you could also decide not to copy the impl field each time an object is copied, but instead use some kind of reference counting mechanism, or stuff like std::shared_ptr.

Then you'll have different instances sharing the same implementation.
This way, you can have the benefits of shared objects, while having the benefits of automatic allocation.
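As a rough idea of what this could look like, here is a minimal sketch using std::shared_ptr. The SharedPoint class, its UseCount method and the CopiesShareState helper are hypothetical names made up for this example, and the class is reduced to a single property to keep it short. The important difference from the Point class is that copies now share the same implementation, so a change made through one instance is visible through the others.

```cpp
#include <memory> // std::shared_ptr, std::make_shared

class SharedPoint
{
    public:
        
        SharedPoint( int x, int y );
        
        // Copy constructor, assignment operator and destructor are the
        // compiler-generated ones: std::shared_ptr handles the counting.
        
        int  GetX( void ) const;
        void SetX( int x );
        long UseCount( void ) const;
        
    private:
        
        class IMPL;
        
        std::shared_ptr< IMPL > impl;
};

class SharedPoint::IMPL
{
    public:
        
        IMPL( int x, int y ): _x( x ), _y( y ) {}
        
        int _x;
        int _y;
};

SharedPoint::SharedPoint( int x, int y ): impl( std::make_shared< IMPL >( x, y ) )
{}

int  SharedPoint::GetX( void ) const     { return this->impl->_x; }
void SharedPoint::SetX( int x )          { this->impl->_x = x; }
long SharedPoint::UseCount( void ) const { return this->impl.use_count(); }

// Hypothetical helper: checks that a copy shares its state with the original
bool CopiesShareState( void )
{
    SharedPoint p1( 1, 2 );
    SharedPoint p2( p1 );  // No new IMPL: the reference count is incremented
    
    p2.SetX( 10 );         // Also visible through p1
    
    return p1.GetX() == 10 && p1.UseCount() == 2;
}
```

Whether shared state is what you want depends on the class. For a value type like a 2D point it would probably be surprising, so treat this as an illustration of the mechanism rather than a recommendation.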

5.3 - Generic PIMPL

If you're interested in using the PIMPL idiom, I recommend checking my Generic C++11 PIMPL implementation.

It consists of a template class that you can inherit from, and which will handle most of the PIMPL implementation for you.

Comments

Author
Vishal Oza
Date
09/28/2014 01:46
Could you substitute references for pointer or at least use smart pointer for ownership? Also how does auto fit in to this mess. I think auto might help with PIMPL. If not the future versions of C++ or maybe C++ 17 with class deduction of template code might help
Author
Jean-David Gadina
Date
09/28/2014 01:46
You can't really use a reference for the IMPL field, as, well, the reference has to point to something, but you can indeed use smart pointers if you find it easier.

About auto, you can also use it, but I personally avoid it in such situations, as I like my code to be explicit.