Two Weeks of C++ - Day 04

Intro

This is day 04 of the Two weeks of C++ series. Today is about classes and Object Oriented Programming.

A class is a user-defined type provided to represent a concept in the code of a program. […] A program built out of a well chosen set of classes is far easier to understand and get right than one that builds everything directly in terms of the built-in types.

~ Bjarne Stroustrup

If it’s your first time here, please, read the disclaimers before moving on.

Concrete classes

Concrete classes basically behave like built-in types. If you can instantiate an object directly using a class, this class is definitely a concrete class.

const as suffix of method

When const is applied to a class member function (a method), this function cannot modify the state of the object it is called for:

    class Person {
    private:
        int age;
    public:
        int getAge() const;
    };

    int Person::getAge() const {
        age = 42; // KO: compile error, a const method cannot change state
        return age;
    }

default constructor

If a constructor can be invoked with no arguments, it’s called a default constructor. Defining one eliminates the possibility of uninitialized objects of that type.
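
Here’s a minimal sketch of what that looks like in practice. The Person class is reduced to a single member, and default_ctor_demo is a name I made up for this example.

```cpp
#include <cassert>

class Person {
    int age;
public:
    Person() : age{0} {}                 // default constructor: callable with no arguments
    explicit Person(int a) : age{a} {}
    int getAge() const { return age; }
};

void default_ctor_demo() {
    Person p;                            // calls Person(), so age is 0 and never garbage
    assert(p.getAge() == 0);

    Person people[3];                    // an array like this requires a default constructor
    assert(people[2].getAge() == 0);
}
```

Without the default constructor, neither `Person p;` nor the array declaration would compile.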

Operator overloading

I’ve talked about operator overloading a little bit in day 02. I realized that I didn’t understand the difference between the subscript operator and operator overload back then, so I updated that post.

It’s quite an amazing language feature I didn’t know before C++. It’s one of those features that make me feel so powerful.

Operator overloading is when you give an operator more than one meaning (overloading it) by implementing your own logic depending on your user-defined type.

Refer to the References section below for examples.

It’s good to note that the syntax for overloaded operators is fixed by the language. You cannot define, for example, a unary / operator. It’s also not possible to change the meaning of an operator for built-in types: you cannot redefine the + operator to subtract integers.
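
To make this less abstract, here’s a small sketch of operator overloading on a user-defined type. Point and point_demo are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical 2D point type, used only for illustration.
struct Point {
    int x;
    int y;

    // Compound assignment is a member: it modifies the left operand.
    Point& operator+=(const Point& other) {
        x += other.x;
        y += other.y;
        return *this;
    }
};

// Symmetric operators are written as non-member functions.
Point operator+(Point lhs, const Point& rhs) {
    lhs += rhs;   // reuse the member operator
    return lhs;
}

bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

void point_demo() {
    Point a{1, 2};
    Point b{3, 4};
    Point expected{4, 6};
    Point c = a + b;          // calls operator+(a, b)
    assert(c == expected);    // calls operator==(c, expected)
}
```

The + still takes two operands and keeps its usual precedence; only its meaning for Point is mine.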

The destructor

In higher-level languages, you usually don’t need to write a destructor because a garbage collector reclaims memory for you. In C++, the destructor is where a class releases the resources (memory, for instance) that it acquired during initialization, which is a good way to prevent leaks.

You define a destructor by using the complement operator (~, AKA tilde) on a function of the same name as the class.

    class Vector {
    private:
        double* elem;
        int sz;
    public:
        Vector(int s) : elem{new double[s]}, sz{s} // Acquire resource
        {
            for (int i = 0; i < s; ++i) // Initialize elements
                elem[i] = 0;
        }
        ~Vector() { delete[] elem; } // Release resource
    };

Now the user of that class doesn’t have to worry about class-level memory leaks. It’s the same as using an int or char type.

Resource Acquisition Is Initialization (RAII)

RAII is the technique of acquiring a resource in a constructor and releasing it in the destructor, just like the example above.

Allocating memory in a large scope can lead to error-prone code and memory leaks if you’re not careful enough.
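
The nice part of RAII is that the release also happens when an exception cuts a function short. Here’s a rough sketch of that, where a plain counter (open_handles, a made-up name) stands in for a real resource:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter standing in for a real resource (memory, file, lock).
int open_handles = 0;

class Handle {
public:
    Handle() { ++open_handles; }    // acquire
    ~Handle() { --open_handles; }   // release
    Handle(const Handle&) = delete; // copying would release the resource twice
    Handle& operator=(const Handle&) = delete;
};

void risky(bool fail) {
    Handle h;
    assert(open_handles > 0);       // the resource is held while h is alive
    if (fail)
        throw std::runtime_error("something went wrong");
}   // h is destroyed here, on normal return and during stack unwinding alike

int handles_left_after_failure() {
    try {
        risky(true);
    } catch (const std::runtime_error&) {
        // by the time we get here, h has already been destroyed
    }
    return open_handles;
}
```

Because h lives in the smallest scope that needs it, there is no path out of risky that forgets the release.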

Initializing containers

When it comes to initializing a container, the first method that comes to mind is to create it with an appropriate number of elements and then assign to them. There are more elegant ways, and the author shares his two favorite methods:

  1. An initializer-list constructor
  2. A push_back method

1. Initializer-list constructor

By using constructor overloading, we can make it possible to initialize the container using a braced-init-list:

    #include <algorithm>        // std::copy
    #include <initializer_list>

    class Vector {
        // ...
    public:
        Vector(std::initializer_list<double> lst)
            : elem{new double[lst.size()]}, sz{static_cast<int>(lst.size())} {
            std::copy(lst.begin(), lst.end(), elem);
        }
        // ...
    };

2. push_back method

A push_back method adds a new element at the end of the container. It increases the size of the container by one and reallocates the stored space if, and only if, the new size is greater than the current capacity.

Here’s an example of its usage:

    Vector read(std::istream& is)
    {
        Vector v;
        for (double d; is >> d;)
            v.push_back(d);
        return v;
    }
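
The post doesn’t show how push_back itself could be written, so here’s a minimal sketch under my own assumptions: a `space` member tracks the capacity, and the capacity doubles when it runs out (starting at 8). It also declares a move constructor, a topic covered further down, so that a Vector can be returned by value.

```cpp
#include <cassert>

// A minimal growable Vector; `space` and the doubling policy are assumptions.
class Vector {
    double* elem = nullptr;
    int sz = 0;     // number of elements in use
    int space = 0;  // number of elements allocated
public:
    Vector() = default;
    ~Vector() { delete[] elem; }
    Vector(const Vector&) = delete;             // copying is out of scope here
    Vector& operator=(const Vector&) = delete;
    Vector(Vector&& a) noexcept : elem{a.elem}, sz{a.sz}, space{a.space} {
        a.elem = nullptr;
        a.sz = 0;
        a.space = 0;
    }

    void push_back(double d) {
        if (sz == space) {                      // full: reallocate before inserting
            int new_space = (space == 0) ? 8 : 2 * space;
            double* p = new double[new_space];
            for (int i = 0; i < sz; ++i)
                p[i] = elem[i];                 // carry the old elements over
            delete[] elem;
            elem = p;
            space = new_space;
        }
        elem[sz++] = d;
    }

    double& operator[](int i) { return elem[i]; }
    int size() const { return sz; }
    int capacity() const { return space; }
};

void push_back_demo() {
    Vector v;
    for (int i = 0; i < 10; ++i)
        v.push_back(i);
    assert(v.size() == 10);
    assert(v.capacity() == 16);   // grew 0 -> 8 -> 16
    assert(v[9] == 9);
}
```

Doubling means a reallocation only happens when the size outgrows the capacity, which matches the description above.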

Abstract classes

An abstract class is a class that cannot be instantiated. Here’s an example:

An abstract class

    class Container {
    public:
        virtual double& operator[](int) = 0;
        virtual int size() const = 0;
        virtual ~Container() {}
    };

The reason it cannot be instantiated is that it has no implementation, only an interface.

A virtual function is a function that may be redefined later in a class derived from this one.

The = 0 at the end of the virtual functions above means that the functions must be redefined later by a derived class for this derived class to become concrete. They’re known as pure virtual functions.

A class with a pure virtual function is called an abstract class.

A derived class that implements all the pure virtual functions of its base class is a concrete class. Such a class can be instantiated, as we saw earlier.

An abstract class can only act as the interface to a class that implements its pure virtual functions.

A class that acts as the interface to a variety of other classes is often called a polymorphic class.

A class that derives from a base class is called a subclass and the base class is called a superclass.

The subclass inherits members from its superclass. We call this class inheritance.

When a subclass redefines a virtual function of its superclass, it’s called overriding.

Most abstract classes don’t have a constructor since they have no data to initialize and don’t know which resources their implementations will need. They do have a destructor, and it is virtual because derived objects are usually manipulated (and possibly destroyed) through the base class interface, for instance by one generic function. Let’s see an example:

A generic function

    void use(Container& c) {
        const int sz = c.size();
        for (int i = 0; i != sz; ++i)
            std::cout << c[i] << std::endl;
    }

A concrete class, subclass of Container

    class Vector_container : public Container {
        Vector v;
    public:
        Vector_container(int s) : v(s) {}
        Vector_container(std::initializer_list<double> il) : v(il) {} // needed by the usage below
        ~Vector_container() {}
        double& operator[](int i) { return v[i]; }
        int size() const { return v.size(); }
    };

Another subclass of Container

    class List_container : public Container {
        std::list<double> ld;
    public:
        List_container() {}
        List_container(std::initializer_list<double> il) : ld{il} {}
        ~List_container() {}
        double& operator[](int i);
        int size() const { return ld.size(); }
    };

    double& List_container::operator[](int i)
    {
        for (auto& x : ld) {
            if (i == 0) return x;
            --i;
        }
        throw std::out_of_range("List Container");
    }

Usage

    int main()
    {
        Vector_container vc {42, 21, 4, 3, 22, 14, 0};
        use(vc);
        List_container lc = {34, 55, 67, 2, 46, 1};
        use(lc);
    }

Notes:

  • The use function doesn’t know anything about implementation details. It simply uses the size method and the subscript operator without any idea of which class provides their implementation.
  • The destructor of the Vector instance is implicitly called by the destructor of Vector_container.

Virtual functions

The vtbl

For a call made through the base class to resolve to the right function of the subclass, as in the example above, the compiler typically creates a table of pointers to functions for each class that has virtual functions, and each object carries a pointer to the table of its class. This table is often called the virtual function table or vtbl.

The virtual destructor

A virtual destructor is essential for an abstract class because an object of a derived class is usually manipulated through the interface provided by its abstract base class.
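
Here’s a small sketch of why that matters. Base, Derived and the derived_destroyed counter are hypothetical names; the point is only that deleting through a base pointer still runs the derived destructor when the base destructor is virtual.

```cpp
#include <cassert>
#include <memory>

int derived_destroyed = 0;  // hypothetical counter, only for the demo

struct Base {
    virtual ~Base() {}              // virtual: the derived destructor runs too
    virtual int id() const = 0;
};

struct Derived : Base {
    ~Derived() override { ++derived_destroyed; }
    int id() const override { return 42; }
};

int destroy_through_base() {
    derived_destroyed = 0;
    {
        std::unique_ptr<Base> p{new Derived};
        assert(p->id() == 42);      // resolved at run time to Derived::id
    }                               // delete through a Base*: ~Derived() runs, then ~Base()
    return derived_destroyed;
}
```

If ~Base() were not virtual, deleting a Derived through a Base pointer would be undefined behavior.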

Explicit overriding

A function in a subclass overrides a virtual function in a superclass if those functions have exactly the same name and type. In large hierarchies, however, it’s not always obvious if overriding was intended. To avoid confusion, a programmer can explicitly state that a function is meant to override using the override specifier:

Explicit overriding

    class List_container : public Container {
        std::list<double> ld;
    public:
        List_container() {}
        List_container(std::initializer_list<double> il) : ld{il} {}
        ~List_container() {}
        double& operator[](int i) override;
        int size() const override { return ld.size(); }
    };

    double& List_container::operator[](int i)
    {
        for (auto& x : ld) {
            if (i == 0) return x;
            --i;
        }
        throw std::out_of_range("List Container");
    }

Notes:

  • I would get an error if I mistyped size. (no such method in the base class)
  • I would also get an error if I used an operator other than the subscript operator.

Hierarchy navigation

In our example above, the use function accepts a container by reference so we can treat all Containers alike. However, what if we wanted to use a method from a specific type of Container?

We can use the dynamic_cast operator:

Using dynamic_cast

    void use(Container& c) {
        const int sz = c.size();
        if (List_container* lc = dynamic_cast<List_container*>(&c)) {
            // Use the method we want from List_container
        } else {
            // "c" is not an instance of List_container,
            // dynamic_cast returned nullptr
        }
        for (int i = 0; i != sz; ++i)
            std::cout << c[i] << std::endl;
    }

When a different type is unacceptable, we should dynamic_cast to a reference type. If the object is not of the expected type, std::bad_cast is thrown:

    try {
        Some_type& st = dynamic_cast<Some_type&>(*t);
        // Use st
    } catch (const std::bad_cast& e) {
        // Not the type we expected
    }

Notice how code is cleaner when dynamic_cast is used with restraint.
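
Putting both forms side by side, here’s a self-contained sketch. The Shape, Circle and Square types are invented for the example and have nothing to do with the Container hierarchy above.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical hierarchy, only for illustration.
struct Shape {
    virtual ~Shape() {}
};
struct Circle : Shape {
    double radius = 1.0;
};
struct Square : Shape {
    double side = 2.0;
};

// Pointer form: a mismatch is a normal outcome, reported as nullptr.
bool is_circle(Shape& s) {
    return dynamic_cast<Circle*>(&s) != nullptr;
}

// Reference form: a mismatch is an error, reported as std::bad_cast.
double radius_of(Shape& s) {
    Circle& c = dynamic_cast<Circle&>(s);
    return c.radius;
}

void cast_demo() {
    Circle c;
    Square sq;
    assert(is_circle(c));
    assert(!is_circle(sq));

    bool threw = false;
    try {
        radius_of(sq);
    } catch (const std::bad_cast&) {
        threw = true;
    }
    assert(threw);
}
```

Note that dynamic_cast only works on polymorphic types, which is why Shape has a virtual destructor.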

If we can avoid using type information, we can write simpler and more efficient code.

~ Bjarne Stroustrup

Avoiding resource leaks

Functions returning a pointer to an object allocated on the free store are dangerous.

~ Bjarne Stroustrup

This is a really important matter. If a function returns a pointer to a dynamically allocated object, there’s a high chance the user of that function forgets to free the allocated memory.

Not using unique_ptr

    Some_type* some_function()
    {
        // ...
        return new Some_type; // the caller must remember to delete this
    }

The best solution for this is to return unique_ptrs instead:

Using unique_ptr

    std::unique_ptr<Some_type> some_function()
    {
        // ...
        return std::unique_ptr<Some_type>{new Some_type};
    }

Now, the user of this function doesn’t need to worry about memory leaks: the allocated object is destroyed implicitly as soon as its unique_ptr goes out of scope.
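
Since C++14 there is also std::make_unique, which avoids writing new altogether. Here’s a sketch with a made-up live_objects counter to show that the object really is destroyed at the end of the scope:

```cpp
#include <cassert>
#include <memory>

int live_objects = 0;  // hypothetical counter, only for the demo

struct Some_type {
    Some_type() { ++live_objects; }
    ~Some_type() { --live_objects; }
};

// std::make_unique (C++14) allocates and wraps the object in one step.
std::unique_ptr<Some_type> some_function() {
    return std::make_unique<Some_type>();
}

int live_after_scope() {
    {
        auto p = some_function();
        assert(live_objects == 1);  // the object is owned by p
    }                               // p goes out of scope: the object is deleted
    return live_objects;
}
```

There is no delete anywhere in this code, which is the whole point.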

Copy and Move

Copying containers

An object, whether it’s of a user-defined type or a built-in type, can be copied. By default, copying means that each of its members gets copied to the other object. This kind of “copy” is called a memberwise copy.

When designing a class, we must always consider if and how an object might be copied. Memberwise copy is often the right semantics for copy when it comes to simple concrete types;

A slightly simplified version of the standard library "complex"

    class complex {
        double re, im;
    public:
        complex(double r, double i) : re{r}, im{i} {}
        complex(double r) : re{r}, im{0} {}
        complex() : re{0}, im{0} {}
        double real() const { return re; }
        void real(double r) { re = r; }
        double imag() const { return im; }
        void imag(double i) { im = i; }
        complex& operator+=(complex z) { re += z.re; im += z.im; return *this; }
        complex& operator-=(complex z) { re -= z.re; im -= z.im; return *this; }
        complex& operator*=(complex z); // defined somewhere, out-of-class
        complex& operator/=(complex z); // defined somewhere, out-of-class
    };

Using memberwise copy on a simple concrete type

    void test(complex z1)
    {
        complex z2{z1}; // z2 now has the same values as z1
        complex z3;
        z3 = z2; // z3 also has the same values as z1
        // ...
    }

but when it comes to more sophisticated concrete types - a resource handle - it’s not such a good idea.

Using memberwise copy on a resource handle

    void bad_copy(Vector v1)
    {
        Vector v2 = v1;
        v1[0] = 42; // v2's first element is now also 42
        v2[1] = 21; // v1's second element is now also 21
        // ...
    }

The fact that Vector has a destructor is already a sign that memberwise copy is not the right copy semantics for this class. So we must also define a copy assignment and a copy constructor. It’s called the rule of three.

A better Vector interface

    class Vector {
    private:
        double* elem;
        int sz;
    public:
        Vector(int s); // constructor: establish invariant, acquire resources
        ~Vector(); // destructor: release resources
        Vector(const Vector& a); // copy constructor
        Vector& operator=(const Vector& a); // copy assignment
        double& operator[](int i);
        const double& operator[](int i) const; // element access on a const Vector
        int size() const;
    };

Here’s what a copy constructor for Vector looks like:

Implementation of the copy constructor

    Vector::Vector(const Vector& a)
        : elem{new double[a.sz]}, sz{a.sz}
    {
        for (int i = 0; i < sz; ++i)
            elem[i] = a.elem[i]; // read a's array directly, "a" is const
    }

And the copy assignment:

Implementation of the copy assignment

    Vector& Vector::operator=(const Vector& a)
    {
        if (&a == this) return *this; // No need to copy the same object
        double* p = new double[a.sz];
        for (int i = 0; i < a.sz; ++i)
            p[i] = a.elem[i];
        delete[] elem; // delete old elements
        elem = p;
        sz = a.sz;
        return *this; // "this" is predefined in a method and points to the current instance of the class
    }
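
To check that the two copy operations really produce independent objects, here’s a compact version of the class with everything defined inline, plus a good_copy function (my own name, as a counterpart to bad_copy above). Treat it as a sketch rather than the definitive Vector.

```cpp
#include <cassert>

// Compact Vector with the rule of three applied, assembled for illustration.
class Vector {
    double* elem;
    int sz;
public:
    explicit Vector(int s) : elem{new double[s]}, sz{s} {
        for (int i = 0; i < sz; ++i) elem[i] = 0;
    }
    ~Vector() { delete[] elem; }

    Vector(const Vector& a) : elem{new double[a.sz]}, sz{a.sz} {
        for (int i = 0; i < sz; ++i) elem[i] = a.elem[i];
    }

    Vector& operator=(const Vector& a) {
        if (&a == this) return *this;
        double* p = new double[a.sz];
        for (int i = 0; i < a.sz; ++i) p[i] = a.elem[i];
        delete[] elem;
        elem = p;
        sz = a.sz;
        return *this;
    }

    double& operator[](int i) { return elem[i]; }
    const double& operator[](int i) const { return elem[i]; }
    int size() const { return sz; }
};

void good_copy() {
    Vector v1(3);
    Vector v2 = v1;        // copy constructor: v2 gets its own array
    v1[0] = 42;
    assert(v2[0] == 0);    // v2 is unaffected

    Vector v3(1);
    v3 = v1;               // copy assignment: old array freed, new one filled
    v1[0] = 7;
    assert(v3[0] == 42);
    assert(v3.size() == 3);
}
```

With the default memberwise copy, both asserts on v2 and v3 would be at the mercy of a shared array, and the destructors would delete it more than once.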

Moving Containers

Copying can be costly for large containers. We already know how to avoid the cost of copying while passing an object to a function by passing it by reference, but it’s not possible to return a local object by reference from a function.

Consider this addition operator overload for Vector:

An addition operator overload for Vector

    // A non-member function: a member operator+ would take only one argument
    Vector operator+(const Vector& a, const Vector& b)
    {
        if (a.size() != b.size())
            throw Vector_size_mismatch{}; // let's pretend we defined this exception earlier
        Vector res(a.size());
        for (int i = 0; i < a.size(); ++i)
            res[i] = a[i] + b[i];
        return res;
    }

Now, with a use case like this:

Addition operator usage

    void f(const Vector& x, const Vector& y, const Vector& z)
    {
        Vector w(x.size()); // the interface above has no default constructor
        // ...
        w = x + y + z;
        // ...
    }

We are actually copying a Vector at least twice, once for each use of the + operator (x + y, then (x + y) + z). Now imagine one of those Vectors being about 10000 doubles large. That would be quite a lot to copy around.

Remember, res is just a temporary result that we want to get out of the addition operator overload. We never wanted to copy it, we wanted to move it. This is when you realize your class needs a move constructor and a move assignment so you can state, at the right moment, the intent of moving instead of copying. This extends the rule of three into what is called the rule of five.

An even better Vector interface

    class Vector {
    private:
        double* elem;
        int sz;
    public:
        Vector(int s); // constructor: establish invariant, acquire resources
        ~Vector(); // destructor: release resources
        Vector(const Vector& a); // copy constructor
        Vector& operator=(const Vector& a); // copy assignment
        Vector(Vector&& a); // move constructor
        Vector& operator=(Vector&& a); // move assignment
        double& operator[](int i);
        const double& operator[](int i) const; // element access on a const Vector
        int size() const;
    };

Implementation of the move constructor

    Vector::Vector(Vector&& a)
        : elem{a.elem}, // Grab the value from the other Vector
          sz{a.sz}
    {
        a.elem = nullptr; // Makes sure it's empty
        a.sz = 0;
    }

Implementation of the move assignment

    Vector& Vector::operator=(Vector&& a)
    {
        if (&a == this) return *this; // no need to move the same object
        delete[] elem; // Get rid of old elements
        elem = a.elem; // Grab the elements
        sz = a.sz;
        a.elem = nullptr; // Makes sure it's empty
        a.sz = 0;
        return *this;
    }

Now, when we return a Vector object from a function, the compiler moves it (or skips the operation entirely) instead of copying it. So w = x + y + z no longer involves copying a Vector.

The rvalue reference

It’s important to notice that we use && as the argument when it comes to the move constructor and assignment. It means rvalue reference.

An rvalue is, roughly speaking, a temporary value that does not persist beyond the expression that uses it, such as the result of x + y. The name comes from the fact that such a value traditionally appears on the right-hand side of an assignment.
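
One way to see the difference is to overload a function on both kinds of reference and check which overload gets picked. This is only a sketch; binds_to_rvalue and make_name are names I invented for it.

```cpp
#include <cassert>
#include <string>

// Two hypothetical overloads that only report which one was selected.
bool binds_to_rvalue(const std::string&) { return false; }
bool binds_to_rvalue(std::string&&) { return true; }

std::string make_name() { return "temporary"; }

void binding_demo() {
    std::string name = "named";
    assert(!binds_to_rvalue(name));          // a named object is an lvalue
    assert(binds_to_rvalue(make_name()));    // a function's return value is an rvalue
    assert(binds_to_rvalue(name + "!"));     // so is the result of an expression
}
```

The copy and move constructors are selected by exactly the same overload rule.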

The move constructor and move assignment are allowed to grab or “steal” the resources held by their argument (e.g. a pointer to a dynamically allocated object). They must leave the argument in a valid state, one that can still be destroyed or assigned to, which is why we set the other Vector’s elem to nullptr and its sz to 0 after grabbing them.

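If you want to be explicit about a move operation (you’re sure you will not use an object anymore), you can use the standard library std::move function. It doesn’t move anything by itself: it casts its argument to an rvalue reference so that the move constructor or move assignment gets picked: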

Explicit move operation

    Vector f()
    {
        Vector a(1000);
        Vector b(1000);
        Vector c(1000);
        a = b; // An implicit copy
        c = std::move(b); // An explicit move
        return a; // An implicit move (since we have the move constructor)
    }
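
To convince myself that a move really transfers the array instead of copying it, here’s a sketch of a move-only Vector. The data() accessor is an addition of mine so the demo can compare addresses, and the copy operations are deleted just to keep the example short.

```cpp
#include <cassert>
#include <utility>   // std::move

class Vector {
    double* elem;
    int sz;
public:
    explicit Vector(int s) : elem{new double[s]()}, sz{s} {}  // zero-initialized
    ~Vector() { delete[] elem; }

    Vector(const Vector&) = delete;             // copies left out to keep this short
    Vector& operator=(const Vector&) = delete;

    Vector(Vector&& a) noexcept : elem{a.elem}, sz{a.sz} {
        a.elem = nullptr;
        a.sz = 0;
    }
    Vector& operator=(Vector&& a) noexcept {
        if (&a == this) return *this;
        delete[] elem;
        elem = a.elem;
        sz = a.sz;
        a.elem = nullptr;
        a.sz = 0;
        return *this;
    }

    const double* data() const { return elem; }  // hypothetical accessor for the demo
    int size() const { return sz; }
};

void move_demo() {
    Vector b(1000);
    const double* storage = b.data();

    Vector c(10);
    c = std::move(b);               // move assignment: c takes over b's array
    assert(c.data() == storage);    // same allocation, nothing was copied
    assert(c.size() == 1000);
    assert(b.size() == 0);          // b is left empty but still destructible

    Vector d = std::move(c);        // move constructor
    assert(d.data() == storage);
    assert(c.data() == nullptr);
}
```

Only pointers and sizes change hands, so the cost of a move does not depend on the number of elements.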

Here are the five situations when an object is copied or moved:

  1. As the source of an assignment
  2. As an object initializer
  3. As a function argument
  4. As a function return value
  5. As an exception

In each case, the copy or move constructor (or assignment) is applied, unless the compiler can optimize the operation away.

The default, delete and explicit specifiers

The default constructor, the copy constructor, the move constructor, the copy assignment, the move assignment and the destructor are called special member functions/methods of a class.

If you do not define those special member functions, the compiler will implicitly do it for you.

When you are explicit about some special member functions, the compiler will not generate some of the others. For example, if you define a destructor, the compiler will not implicitly generate a move constructor.

You can force the compiler to generate the move constructor by using the default specifier:

Using the 'default' specifier

    class SomeClass {
    public:
        SomeClass(someType);
        ~SomeClass(); // a destructor takes no arguments
        SomeClass(SomeClass&&) = default; // I really do want the default move constructor
    };

Before C++20, those special member functions were the only functions that could be defaulted (C++20 also allows defaulting comparison operators).
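
The effect of a user-declared destructor is easy to miss because the code still compiles: a “move” quietly falls back to a copy. Here’s a sketch that makes it visible with a hypothetical Tracer member that records how it was constructed. All the type names are made up.

```cpp
#include <cassert>
#include <utility>   // std::move

// Hypothetical member that records how it was constructed.
struct Tracer {
    bool moved_into = false;
    Tracer() = default;
    Tracer(const Tracer&) : moved_into{false} {}
    Tracer(Tracer&&) noexcept : moved_into{true} {}
};

// User-declared destructor: no implicit move constructor is generated,
// so "moving" silently falls back to the copy constructor.
struct WithDestructor {
    Tracer t;
    ~WithDestructor() {}
};

// Same idea, but the move constructor is explicitly defaulted.
struct WithDefaultedMove {
    Tracer t;
    WithDefaultedMove() = default;
    ~WithDefaultedMove() {}
    WithDefaultedMove(WithDefaultedMove&&) = default;
};

void default_demo() {
    WithDestructor a;
    WithDestructor b = std::move(a);      // the copy constructor is selected
    assert(!b.t.moved_into);

    WithDefaultedMove c;
    WithDefaultedMove d = std::move(c);   // the defaulted move constructor is selected
    assert(d.t.moved_into);
}
```

Note that declaring the move constructor also removes the implicit default constructor, hence the extra `= default` line in WithDefaultedMove.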

Sometimes, having a default copy or move operation is a bad idea. It’s almost always the case when your class acquires resources. To make sure the compiler doesn’t generate a default copy or move operation, you use the delete specifier:

Using the 'delete' specifier

    class SomeClass {
    public:
        SomeClass(someType);
        ~SomeClass(); // a destructor takes no arguments
        SomeClass(SomeClass&&) = delete; // No default move constructor
        SomeClass& operator=(SomeClass&&) = delete; // No default move assignment
        SomeClass(const SomeClass&) = delete; // No default copy constructor
        SomeClass& operator=(const SomeClass&) = delete; // No default copy assignment
    };

Now, trying to move/copy an instance of SomeClass will raise an error.

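It’s important to note that deleting the move operations was not strictly necessary in this example: as I said earlier, declaring a destructor already prevents the compiler from generating them. The copy operations are a different story. The compiler still generates them when only a destructor is declared (that behavior is merely deprecated), so deleting them explicitly is worthwhile.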
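
The standard type traits can confirm what is and isn’t allowed without having to write code that fails to compile. A quick sketch, with Socket as a made-up resource-owning class:

```cpp
#include <type_traits>

// Hypothetical resource-owning type with all copy/move operations deleted.
class Socket {
public:
    Socket() = default;
    ~Socket() = default;
    Socket(const Socket&) = delete;
    Socket& operator=(const Socket&) = delete;
    Socket(Socket&&) = delete;
    Socket& operator=(Socket&&) = delete;
};

// These checks run at compile time; a wrong assumption stops the build.
static_assert(!std::is_copy_constructible<Socket>::value, "no copy construction");
static_assert(!std::is_copy_assignable<Socket>::value, "no copy assignment");
static_assert(!std::is_move_constructible<Socket>::value, "no move construction");
static_assert(!std::is_move_assignable<Socket>::value, "no move assignment");
static_assert(std::is_default_constructible<Socket>::value, "still constructible");
```

Writing `Socket t = s;` for an existing Socket s would be rejected with an error about a deleted function.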

If we take a look at our Vector class from the previous day, it allows for an implicit conversion from int to Vector:

    Vector v = 42; // v now has 42 elements

This is considered bad practice. To avoid it, we make the constructor explicit by using the explicit specifier.

Using the 'explicit' specifier

    class Vector {
    public:
        explicit Vector(int s); // No implicit conversion from int to Vector
        double& operator[](int i);
        int size() const;
    private:
        double* elem;
        int sz;
    };

Now, using it looks like:

    Vector v1(42); // OK, v1 has 42 elements
    Vector v2 = 21; // KO, no implicit conversion from int to Vector
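
The same difference can be checked at compile time with type traits. In this sketch, Implicit_size and Explicit_size are two made-up types that differ only by the explicit keyword:

```cpp
#include <type_traits>

// Two hypothetical one-argument constructors, differing only in `explicit`.
struct Implicit_size {
    Implicit_size(int) {}
};
struct Explicit_size {
    explicit Explicit_size(int) {}
};

// Copy-initialization (T x = 42;) needs an implicit conversion.
static_assert(std::is_convertible<int, Implicit_size>::value,
              "int converts implicitly");
static_assert(!std::is_convertible<int, Explicit_size>::value,
              "no implicit conversion");

// Direct-initialization (T x(42);) works either way.
static_assert(std::is_constructible<Implicit_size, int>::value,
              "direct construction works");
static_assert(std::is_constructible<Explicit_size, int>::value,
              "explicit construction works too");
```

The practical payoff is that a function taking an Explicit_size can no longer be called with a bare int by accident.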

In many languages, resource management is primarily delegated to a garbage collector. C++ also offers a garbage collection interface so that you can plug in a garbage collector. However, I consider garbage collection the last alternative after cleaner, more general, and better localized alternatives to resource management have been exhausted.

~ Bjarne Stroustrup

Conclusion

Whew lad! This is the longest article in this series so far. Of course, what is C++ without OOP? Yeah, it’s practically C.

I really don’t expect anyone to follow this series, but if you do, you must have noticed the big gap between this article and the last one. Yeah, a lot has been happening in my little bubble lately. Freelance work, job interviews, personal projects… etc. On top of it all, today’s subject is hard to master and requires a lot of research to get the gist of.
You only realize how hard OOP is when you get to do it in low level languages.

I had to include a disclaimer section at the top of all the articles because I recently got told by some more experienced C++ programmers that my articles are a shame to the community and that I don’t know what I’m writing about.

Of course I don’t know what I’m writing about, this is exactly why I’m writing about it. To make sure I understand it. I already said so in Day 00: The targeted readers for this series are Me, Myself and I.

The reason why I put it online is because I believe explaining a subject and getting feedback from someone else makes you understand it better. Problem is, I don’t have anyone around to listen to me explaining C++ to them. XD, so I use this blog to explain it and I add a comment section below for people to share their opinion.

So yeah, do not take anything I say here as “the right way”. Instead, try understanding it, question it and do your own research. If you believe I’m wrong, comment below and show me the light. :)

Here’s what to remember for the day:

  • Express ideas directly in code
  • A concrete type is the simplest kind of class. Where applicable, prefer a concrete type over more complicated classes and over plain data structures
  • Use concrete classes to represent simple concepts and performance-critical components
  • Define a constructor to handle initialization of objects
  • Make a function a member only if it needs direct access to the representation of a class
  • Define operators primarily to mimic conventional usage
  • Use nonmember functions for symmetric operators
  • Declare a member function that does not modify the state of its object const
  • If a constructor acquires a resource, its class needs a destructor to release the resource
  • Avoid “naked” new and delete operations
  • Use resource handles and RAII to manage resources
  • If a class is a container, give it an initializer-list constructor
  • Use abstract classes as interfaces when complete separation of interface and implementation is needed
  • Access polymorphic objects through pointers and references
  • An abstract class typically doesn’t need a constructor
  • Use class hierarchies to represent concepts with inherent hierarchical structure
  • A class with a virtual function should have a virtual destructor
  • Use override to make overriding explicit in large class hierarchies
  • When designing a class hierarchy, distinguish between implementation inheritance and interface inheritance
  • Use dynamic_cast where class hierarchy navigation is unavoidable
  • Use dynamic_cast to a reference type when failure to find the required class is considered a failure
  • Use dynamic_cast to a pointer type when failure to find the required class is considered a valid alternative
  • Use unique_ptr or shared_ptr to avoid forgetting to delete objects created using new
  • Redefine or prohibit copying if the default is not appropriate for a type
  • Return containers by value (relying on move for efficiency)
  • For large operands, use const reference argument types
  • If a class has a destructor, it probably needs user-defined or deleted copy and move operations (Rule of 5)
  • Control construction, copy, move, and destruction of objects
  • Design constructors, assignments, and the destructor as a matched set of operations
  • If a default constructor, assignment, or destructor is appropriate, let the compiler generate it
  • By default, declare single-argument constructors explicit
  • If a class has a pointer or reference member, it probably needs a destructor and non-default copy operations
  • Provide strong resource safety; that is, never leak anything that you think of as a resource
  • If a class is a resource handle, it needs a constructor, a destructor, and non-default copy operations

References

Difference between a concrete class and an abstract class: Stack Overflow

Operator overloading: Microsoft Docs

The move constructor: CppReference

The move assignment operator: CppReference

Lvalues and Rvalues: MSDN