C++ Automatic Allocation - Part 1

About

This tutorial explains how to design C++ classes for automatic allocation.

It is composed of several parts:

  1. C++ Automatic Allocation - Part 1 (this tutorial)
  2. C++ Automatic Allocation - Part 2
  3. C++ PIMPL idiom & Automatic Allocation

Table Of Contents

  1. C++ allocation types
  2. Automatic allocation rules
    1. Copy constructor
    2. Assignment operator
    3. Swapping
    4. C++11

1 - C++ allocation types

When using C++ classes, you can basically use automatic or dynamic allocation.

Let's take a look at the following code:

#include <string>

int main( void )
{
    std::string   s1( "hello, world" );
    std::string * s2 = new std::string( "hello, universe" );
    
    delete s2;
    
    return 0;
}

Here, s1 uses automatic allocation, while s2 uses dynamic allocation.
Note the call to the new and delete operators, as well as the pointer type for s2.

With automatic allocation, the compiler automatically allocates memory for the object (here from the stack) and calls the constructor. When the object goes out of scope, it automatically calls the destructor and reclaims the memory. Quite simple.

Dynamic allocation, using the new operator, will allocate memory from the heap. The constructor will be called, obviously, but then the compiler won't manage the lifetime of the object, and you'll have to manually call the delete operator when you're done using the object. delete will call the destructor and free the memory.
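
To make the difference in lifetime visible, here's a minimal sketch. The Tracer class and the lifetimeDemo function are hypothetical names, introduced only for this illustration: Tracer simply counts how many of its instances are currently alive.

```cpp
#include <cassert>

// Hypothetical helper, only for illustration: counts live instances.
struct Tracer
{
    static int alive;
    
    Tracer( void )           { ++alive; }
    Tracer( const Tracer & ) { ++alive; }
    ~Tracer( void )          { --alive; }
};

int Tracer::alive = 0;

// Returns the number of live Tracer objects right after the inner scope ends.
int lifetimeDemo( void )
{
    Tracer * dynamicObject = nullptr;
    
    {
        Tracer automaticObject;          // Automatic allocation: constructed here
        
        dynamicObject = new Tracer();    // Dynamic allocation: constructed here
        
        assert( Tracer::alive == 2 );
    }                                    // automaticObject is destroyed here
    
    int aliveAfterScope = Tracer::alive; // Only the dynamic object remains
    
    delete dynamicObject;                // Destructor called, memory freed
    
    return aliveAfterScope;
}
```

Calling lifetimeDemo() returns 1: at that point the automatic object is already gone, while the dynamic one stays alive until delete is called.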

Both allocation styles are useful. One is not better than the other. It just depends on the context.
I won't expand on this specific topic here, as it's not the purpose of this tutorial.

2 - Automatic allocation rules

Now let's concentrate on automatic allocation. In order to make a class fully compatible with automatic allocation, a few rules need to be respected.

Let's take as an example a very basic Point class, representing two coordinates:

class Point
{
    public:
        
        int x;
        int y;
        
        Point( void ):
            x( 0 ),
            y( 0 )
        {}
        
        Point( int x_, int y_ ):
            x( x_ ),
            y( y_ )
        {}
};

We first have a parameter-less constructor, which allows declaring an empty/default Point object. Members will be initialised to zero:

Point p;

And also another constructor, allowing the initialisation of the object with some useful values:

Point p( 42, 42 );

Now with such a basic example, the compiler will generate all the necessary code to make this class compatible with automatic allocation.
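
As a quick check, here's a small sketch relying only on the members generated by the compiler. The implicitMembersWork function is a hypothetical helper, not part of the tutorial's code.

```cpp
class Point
{
    public:
        
        int x;
        int y;
        
        Point( void ):
            x( 0 ),
            y( 0 )
        {}
        
        Point( int x_, int y_ ):
            x( x_ ),
            y( y_ )
        {}
        
        // No copy constructor or assignment operator declared here:
        // the compiler generates them, copying each member.
};

// Hypothetical helper: uses only the compiler-generated members.
bool implicitMembersWork( void )
{
    Point p1( 42, 42 );
    Point p2( p1 );   // Compiler-generated copy constructor
    Point p3;
    
    p3 = p1;          // Compiler-generated assignment operator
    
    return p2.x == 42 && p2.y == 42 && p3.x == 42 && p3.y == 42;
}
```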

But let's pretend it won't, and see what we'll need to do.

2.1 - Copy constructor

First of all, we need a copy constructor. That is, a constructor taking as parameter an object of the same type, which initialises the object being constructed by copying the values from the other object.

The copy constructor will be called by the compiler in the following situations:

2.1.1 - Explicit call

Point p1( 42, 42 );
Point p2( p1 );

Here, p2 is constructed from p1. Both objects will have the same values.

2.1.2 - Passing parameters by value

Imagine the following function:

void foo( Point p );

The p parameter is here passed by value.
So when calling that function, like:

Point p( 42, 42 );

foo( p );

The function will receive a copy of the p local variable. The compiler automatically handles this by allocating the necessary memory and calling the copy constructor.

This obviously won't be the case when using a reference type (note the &):

void foo( Point & p );
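
The difference between both parameter types can be observed by counting calls to the copy constructor. The following is only a sketch: CountedPoint, byValue, byReference and the two helper functions are hypothetical names used for the illustration.

```cpp
// Hypothetical variant of Point, counting calls to its copy constructor.
struct CountedPoint
{
    static int copies;
    
    int x;
    int y;
    
    CountedPoint( int x_, int y_ ):
        x( x_ ),
        y( y_ )
    {}
    
    CountedPoint( const CountedPoint & p ):
        x( p.x ),
        y( p.y )
    {
        ++copies;
    }
};

int CountedPoint::copies = 0;

int byValue( CountedPoint p )             { return p.x + p.y; }
int byReference( const CountedPoint & p ) { return p.x + p.y; }

// Number of copy constructor calls triggered by a single call to byValue.
int copiesMadeByValue( void )
{
    CountedPoint p( 42, 42 );
    int          before = CountedPoint::copies;
    
    byValue( p );   // The parameter is a copy of p
    
    return CountedPoint::copies - before;
}

// Number of copy constructor calls triggered by a single call to byReference.
int copiesMadeByReference( void )
{
    CountedPoint p( 42, 42 );
    int          before = CountedPoint::copies;
    
    byReference( p );   // The parameter refers to p itself
    
    return CountedPoint::copies - before;
}
```

Here, copiesMadeByValue() returns 1, while copiesMadeByReference() returns 0.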

2.1.3 - Returning values

Now let's imagine the following function:

Point bar( void )
{
    Point p( 42, 42 );
    
    return p;
}

Here, p is a local variable, belonging to the bar function. Memory is allocated from its stack frame, and is released when the function returns.

So when returning p, the compiler conceptually makes a copy, using the copy constructor, so the object exists in the caller's stack frame. Note that compilers are allowed to optimise this copy away (copy elision), so the copy constructor might not actually be called here.

Now let's take a look at the implementation of the copy constructor, for the Point class:

Point( const Point & p ):
    x( p.x ),
    y( p.y )
{}

Note this is the usual signature for a copy constructor.

As you can see, it takes as a parameter a reference to another Point object. It can't take its parameter by value, as passing the argument would itself require a call to the copy constructor, and so on infinitely. In fact, the compiler rejects such a declaration.

The parameter is also const, meaning you shall not modify any of its values.
Your goal here is simply to construct yourself by copying the object's properties, and that's precisely what we do.
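
Put together, the Point class with its copy constructor could look like the following. The copyIsIndependent function is a hypothetical helper, only here to show that the copy has its own values.

```cpp
class Point
{
    public:
        
        int x;
        int y;
        
        Point( void ):
            x( 0 ),
            y( 0 )
        {}
        
        Point( int x_, int y_ ):
            x( x_ ),
            y( y_ )
        {}
        
        // Copy constructor: initialises our members from the other object
        Point( const Point & p ):
            x( p.x ),
            y( p.y )
        {}
};

// Hypothetical helper: copies a point, then modifies the original.
bool copyIsIndependent( void )
{
    Point p1( 42, 42 );
    Point p2( p1 );   // Explicit call to the copy constructor
    
    p1.x = 0;         // Modifying p1 doesn't affect p2
    
    return p2.x == 42 && p2.y == 42;
}
```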

2.2 - Assignment operator

In order to make a class compatible with automatic allocation, we also need to declare an assignment operator.

The assignment operator will be used, for instance, in the following case:

Point p1;
Point p2( 42, 42 );

p1 = p2;

Here, p1 and p2 are valid, fully constructed objects.
In the last line, p2 is assigned to p1.

The assignment operator is in a way close to the copy constructor, as it also needs to set its members by copying from another object.
The main difference is that, when using the assignment operator, the object is already constructed, so it may have to free its previous resources prior to copying the other object's ones.

In our Point class example, nothing to worry about, as our only members are integers.
So our assignment operator can simply be:

Point & operator=( const Point & p )
{
    this->x = p.x;
    this->y = p.y;
    
    return *( this );
}

We simply assign our member values by copying the member values from the object passed by reference.

Now as you can see, the assignment operator also returns a reference type. In most situations, we simply return the object itself, which allows chaining assignments such as a = b = c. Keep in mind you could also return another object, if needed.

This is not the only possible signature for an assignment operator, but we'll cover this in the next tutorial.
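
For a class owning resources, the assignment operator has more work to do, as it needs to free the previous resources. Here's a minimal sketch, assuming a hypothetical Buffer class which owns an array of integers. It's only one possible way of writing it.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical class owning a dynamically allocated array of integers.
class Buffer
{
    public:
        
        Buffer( std::size_t size, int value ):
            _size( size ),
            _data( new int[ size ] )
        {
            std::fill( _data, _data + _size, value );
        }
        
        Buffer( const Buffer & o ):
            _size( o._size ),
            _data( new int[ o._size ] )
        {
            std::copy( o._data, o._data + o._size, _data );
        }
        
        ~Buffer( void )
        {
            delete[] _data;
        }
        
        Buffer & operator=( const Buffer & o )
        {
            if( this != &o )   // Nothing to do when assigning to ourself
            {
                // Allocate and copy first, so we stay unchanged if new throws
                int * data = new int[ o._size ];
                
                std::copy( o._data, o._data + o._size, data );
                
                delete[] _data;   // Free our previous resources
                
                _data = data;
                _size = o._size;
            }
            
            return *( this );
        }
        
        std::size_t size( void )         const { return _size; }
        int         at( std::size_t i )  const { return _data[ i ]; }
        
    private:
        
        std::size_t _size;
        int       * _data;
};
```

After an assignment, both objects own their own array, so destroying one of them doesn't affect the other.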

2.3 - Swapping

While not a requirement for automatic allocation, it's usually a good idea to provide a swap function.
As we'll see in the next tutorial, it might also help us get rid of some redundant code.

The idea of the swap function is to exchange two objects.
Here's an example implementation for our Point class:

friend void swap( Point & p1, Point & p2 )
{
    using std::swap;
    
    swap( p1.x, p2.x );
    swap( p1.y, p2.y );
}

Quite straightforward, we simply exchange all of the members of the objects passed by reference.

Note the function is declared as friend, so it can also access private members.
The using std::swap line makes std::swap available as a fallback: the unqualified call to swap picks an overload specific to the type of the arguments if there's one, and std::swap otherwise.
This allows specialized swap functions to be used seamlessly, based on the type of the members.
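
To see how this works with members of class type, here's a sketch with a hypothetical Segment class, made of two Point objects. Segment is not part of the tutorial's code, it's only there to show the same idiom reusing the swap function of Point.

```cpp
#include <utility>   // std::swap

class Point
{
    public:
        
        int x;
        int y;
        
        Point( int x_, int y_ ):
            x( x_ ),
            y( y_ )
        {}
        
        friend void swap( Point & p1, Point & p2 )
        {
            using std::swap;
            
            swap( p1.x, p2.x );   // int members: falls back to std::swap
            swap( p1.y, p2.y );
        }
};

// Hypothetical class, only for illustration.
class Segment
{
    public:
        
        Point start;
        Point end;
        
        Segment( const Point & start_, const Point & end_ ):
            start( start_ ),
            end( end_ )
        {}
        
        friend void swap( Segment & s1, Segment & s2 )
        {
            using std::swap;
            
            // Point members: the swap function of Point is found and preferred
            swap( s1.start, s2.start );
            swap( s1.end,   s2.end );
        }
};
```

Swapping two Segment objects is then simply a matter of calling swap( s1, s2 ), without any namespace qualification.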

2.4 - C++11

C++11 introduced rvalue references, denoted by &&.
I won't cover what they are here, but if targeting C++11, you can also declare a move constructor and a move assignment operator.

The basic idea behind this is that sometimes, copy construction or assignment can be avoided.

While the use of rvalue reference overloads can be manually triggered by using std::move, the compiler will also select them automatically in some situations.

Such situations include the generation of temporary objects. In such a case, using only copy construction or assignment may have performance impacts.

Let's take a look at the following example:

Point foo( void )
{
    Point p( 42, 42 );
    
    return p;
}

int main( void )
{
    Point p;
    
    p = foo();
    
    return 0;
}

Without rvalue references, here's what a compiler might do (I'm not saying it will necessarily be the case, but this will illustrate the need for rvalue references):

  1. In main, a Point object is allocated. Parameter-less constructor is invoked.
  2. foo is called.
  3. A Point object is allocated in foo, with x = 42 and y = 42.
  4. The Point object is returned. As we saw before, this will create another Point object, calling the copy constructor.
  5. The temporary Point copy returned by foo is finally assigned to the local p object from main, invoking the assignment operator.

As we can see in points 4 and 5, we have a temporary copy which is then assigned to an existing object.
With our basic Point class, using only integers, there's nothing to worry about.

But our class might be different and allocate/acquire dynamic resources.
In such a case, we'll end up copying the object twice: once when it is returned (copy constructor), and once more when it's assigned (assignment operator).

Depending on the implementation, this can have an impact on performance.

C++11 rvalue references solve this problem by avoiding the extra copy in the final step.

In the above example, by providing a move assignment operator, we'll be able to directly use the resources of the assigned object, as it is a temporary, rather than copy them.
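
The choice made by the compiler can be observed with a small sketch. Traced, makeTraced and assignFromTemporary are hypothetical names: Traced simply counts the calls to its copy and move members.

```cpp
// Hypothetical class counting which of its special members get called.
struct Traced
{
    static int copyConstructions;
    static int copyAssignments;
    static int moveConstructions;
    static int moveAssignments;
    
    Traced( void ) {}
    
    Traced( const Traced & ) { ++copyConstructions; }
    Traced( Traced && )      { ++moveConstructions; }
    
    Traced & operator=( const Traced & ) { ++copyAssignments; return *( this ); }
    Traced & operator=( Traced && )      { ++moveAssignments; return *( this ); }
};

int Traced::copyConstructions = 0;
int Traced::copyAssignments   = 0;
int Traced::moveConstructions = 0;
int Traced::moveAssignments   = 0;

Traced makeTraced( void )
{
    Traced t;
    
    // Either elided or moved, depending on the compiler,
    // but not copied, as a move constructor is available.
    return t;
}

void assignFromTemporary( void )
{
    Traced t;
    
    // The returned temporary binds to Traced &&,
    // so the move assignment operator is selected.
    t = makeTraced();
}
```

After calling assignFromTemporary(), Traced::moveAssignments is 1, while Traced::copyAssignments and Traced::copyConstructions are still 0.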

Don't worry if this isn't clear right now, we'll see more of this in the next tutorial.

For reference, here's the signature of a move constructor for our Point class:

Point( Point && p ):
    x( p.x ),
    y( p.y )
{}

And here's the move assignment operator:

Point & operator =( Point && p )
{
    this->x = p.x;
    this->y = p.y;
    
    return *( this );
}

Again, this is clearly not the best example with such a simple class.
But the point here is that objects are passed as rvalue references (note the &&), and are not marked const.

This means we can actually steal/move their resources, rather than copying them.
Again, not a big deal with our Point class, but more about this in the next tutorial.
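
To give an idea of what stealing resources means, here's a minimal sketch, assuming a hypothetical MovableBuffer class owning an array of integers. Copying is disabled only to keep the example short. This is one possible implementation, not necessarily the one used in the next tutorial.

```cpp
#include <cstddef>
#include <utility>   // std::move

// Hypothetical class owning a dynamically allocated array of integers.
class MovableBuffer
{
    public:
        
        explicit MovableBuffer( std::size_t size ):
            _size( size ),
            _data( new int[ size ]() )
        {}
        
        // Copying is left out, to keep the sketch short
        MovableBuffer( const MovableBuffer & )             = delete;
        MovableBuffer & operator=( const MovableBuffer & ) = delete;
        
        // Move constructor: takes over the array of the other object
        MovableBuffer( MovableBuffer && o ) noexcept:
            _size( o._size ),
            _data( o._data )
        {
            o._size = 0;
            o._data = nullptr;   // The other object must not free the array
        }
        
        // Move assignment operator: frees our array, then takes over the other one
        MovableBuffer & operator=( MovableBuffer && o ) noexcept
        {
            if( this != &o )
            {
                delete[] _data;
                
                _size = o._size;
                _data = o._data;
                
                o._size = 0;
                o._data = nullptr;
            }
            
            return *( this );
        }
        
        ~MovableBuffer( void )
        {
            delete[] _data;   // Deleting a null pointer is safe
        }
        
        std::size_t size( void ) const { return _size; }
        const int * data( void ) const { return _data; }
        
    private:
        
        std::size_t _size;
        int       * _data;
};
```

No array is allocated or copied when moving: the pointer simply changes owner, and the moved-from object is left empty.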