C++ Idioms and Best Practices

Back

Constructors, Destructors and Assignment Operators

Use initializer lists in constructors.

Use initializer lists instead of assignment to initialize member variables in constructors. This will improve both the readability and the performance of your code. List all of your member variables (in the order that you declared them) in the initializer list. If you don't use the correct order the compiler will probably complain because the variables must be initialized in the order they were declared.

Suppose that you wer to use assignment instead of an initializer list as shown below.

class Circle {
private
  std::tr1::<double, 2> _center;
  double _radius;
public:
  Circle(const std::tr1::<double, 2>& center, const double radius) {
    _center = center;
    _radius = radius;
  }
  ...
};
This will hurt performance because the variables will first be default initialized and then assigned a value. In effect, the following operations are performed.
// Initialization.
_center[0] = 0;
_center[1] = 0;
_radius = 0;
// Assignment.
_center[0] = center[0];
_center[1] = center[1];
_radius = radius;
It is better to use the following form.
 Circle(const std::tr1::<double, 2>& center, const double radius) :
    _center(center),
    _radius(radius) {
  }

Below is an example of the correct use of initializer lists. Note that you can only default initialize arrays. That is, you cannot set the array values in the initialization list. You must perform that assignment in the body of the constructor. (Note that you should prefer std::vector or std::tr1::array to plain arrays.)

class Foo {
private:
  double _parameter;
  double _coordinates[3];
  float _density;
  std::vector _values;
  std::string _name;

public:
  Foo(const double parameter, const double x, const double y,
      const double z, const std::size_t numValues, const std::string& name) :
    _parameter(parameter),
    _coordinates(),
    _density(0),
    _sizes(numValues),
    _name(name) {
    _coordinates[0] = x;
    _coordinates[1] = y;
    _coordinates[2] = z;
  }
  ...
};

Macros

Don't use macros to define constants.

The C++ preprocessor is basically a text substitution program. It doesn't know about C++ syntax. Don't use macros to define constants. The C++ language has several better ways of doing that. Instead of using

#define N 256
prefer the following
const std::size_t ArraySize = 256;

Using the preprocessor to define constants in a source file is ill-advised. Doing so in a header file is downright evil. The macro substitition rules spill over into an source file that include the header.

Don't use macros to define functions.

Sometimes people use macros to define functions to that the function works for a variety of argument types. Use templated functions instead. Macros don't know anything about C++ syntax. It is better to use language features. Instead of
#define MIN(x,y) (x < y : x ? y)
use
template<typename T>
inline
min(const T x, const T y) {
  return (x < y : x ? y);
}

Namespaces

Static

In C, the keyword static can be used to indicate that a function or object is local to a translation unit. For example, one can have:

static int a;
static void f() {...}
in multiple source files. Each translation unit will have its own variable a and function f(). This usage of static is deprecated in C++. (See "The C++ Programming Language" by Bjarne Stroustrup.) Instead, use unnamed namespaces:
namespace {
  int a;
  void f() {...}
}

Using

Avoid using using declarations to strip namespaces. Statements like

using namespace std;
using namespace ads;
are dangerous. They dump the contents of a namespace into the local scope. If you are lucky, this may lead to name collisions that are detected at compile time. However, it may also lead to the worst kind of error, a program that compiles and runs but gives unexpected results. Consider the following program.
#include <iostream>

double
distance(const double* x, const double* y) {
  return *y - *x;
}

int 
main() {
  double x = 1, y = 2.5;
  std::cout << distance(&x, &y) << "\n";
  return 0;
}
It gives the result 1.5. A novice user might put in a using statement to avoid having to qualify cout with std::.
#include <iostream>

double
distance(const double* x, const double* y) {
  return *y - *x;
}

int 
main() {
  using namespace std;
  double x = 1, y = 2.5;
  cout << distance(&x, &y) << "\n";
  return 0;
}
Now the program gives a different result, -1 when I ran it. In the second program, an STL function that computes distance between iterators is called.
template<class InputIterator>
typename iterator_traits<InputIterator>::difference_type
distance(InputIterator first, InputIterator last);
Some developers consider their code to be more aesthetically pleasing when they don't have to use namespace qualifications, and thus carelessly strip off namespaces with using statements. This is bad programming and can lead to insidous errors. While using using in this manner in a source code file is a bad idea, putting it in a header file is truly evil. Then this dangerous practice is propagated to every source file that includes that header.

Interfaces and Argument Dependent Lookup

If a class is defined in a namespace, put all associated free functions in that namespace as well. Note that the C++ rules on argument dependent lookup will enable you to use those free functions without explicitly qualifying them. In the example below, the output operator is declared in the foo namespace. When it is used in main, there is no need to explicitly qualify the namespace, i.e. foo::operator<<(std::cout, bar).

namespace foo {
  class Bar {...};

  std::ostream&
  operator<<(std::ostream&, const Bar& bar) {...}
}

int
main() {
  foo::Bar bar;
  std::cout << bar;
  return 0;
}

Including Files

Relative Paths

Within a library, use relative paths to include files. Do not rely on the user adding an include path at compile time. The user should be able to include a header in the library and compile with a command as simple as:

g++ test.cc

Dependency Errors

Dependency errors occur when a header B.h uses functionality from another header A.h which it does not include. Then using
#include "A.h"
#include "B.h"
in a .cc file works fine, but using
#include "B.h"
#include "A.h"
results in a compilation error. You can avoid dependency errors by using unit tests and not using convenience headers in your tests and examples.

Convenience Headers

In a project with many packages, it is a good idea to have convenience headers to make it easier for the user to access an entire package. For each source code directory, there is a corresponding header file, which includes all of the files in that directory. For example, if one were using bounding boxes from the computational geometry package, one could use:

#include "stlib/packages/geom/kernel/BBox.h"
If one wanted to use several geometric primitives in the kernel sub-package one could use
#include "stlib/packages/geom/kernel.h"
to include all of the header files in the kernel directory. One can also include all of the classes and functions in the computational geometry package.
#include "stlib/packages/geom.h"

It's a good idea to provide convenience headers because for the user they are... well, convenient. For someone who is not very familiar with a library, they allow one to remain blissfully unaware of the file structure. Using convenience headers comes at the cost of increased compilation time and decreased portability. Obviously, including unnecessary files increases the compilation time. If one is using a compiler that is only partially supported, then one would want to include as little code as possible. The more header files you include, the more likely you are to encounter a construct that your compiler doesn't like.

The developer should not use the convenience headers in the source code, the test code, or example code. The primary issues are correctness and portability; compilation time is of secondary importance. Using convenience headers in the source code decreases the portability because it increases the unnecessary headers that will be included in a user's application code. Using convenience headers in the test code or example code may hide dependency errors.

Exceptions

Don't use exception specifications such as the following.

void f() throw();
void g() throw(A, B);
void h() throw(...);
Part of the idea behind exception specifications was that they would help the compiler optimize code because it can make assumptions like: This function will never throw an exception. The reality is that these constructs may hurt performance because the compiler generates extra code that checks the exception specifications at run time.

Inheritance

Virtual Bases

When using ordinary (non-virtual) inheritance, a class may only initialize its direct bases. Below is an example.

class A {
public:
  A(int n);
}

class B: public A {
public:
  B(const int n) : A(n) {}
}

class C: public A {
public:
  C(const int n) : A(n) {}
}

class D: public B, public C {
public:
  D(const int n) : B(n), C(n) {}
}
The code below is erroneous. A is not a direct base of D.
class D: public B, public C {
public:
  // Error: A is not a direct base, so cannot call its constructor.
  D(const int n) : A(n), B(n), C(n) {}
}

By contrast, when you use virtual bases, it is the responsibility of the most derived class to initialize them. (Check out the Poisson random deviate classes in the packages/numerical/random directory. They each have UsesUniformRandomDeviate as a virtual base.) The code below illustrates this point. A is a virtual base for B and C. Hence it is a virtual base for D. D must explicitly call A's constructor to get the correct behavior, it will not be called through either B or C.

class A {
public:
  A(int n);
}

class B: public virtual A {
public:
  B(const int n) : A(n) {}
}

class C: public virtual A {
public:
  C(const int n) : A(n) {}
}

class D: public B, public C {
public:
  D(const int n) : A(n), B(n), C(n) {}
}
If D did not explicitly call A's constructor, then its default constructor (not the integer constructor) would implicitly be called. This behavior can cause logic errors that are difficult to detect. If the virtual base does not need a default constructor, make it private. This will force derived classes to explicitly initialize the virtual base.
class A {
private:
  // This forces derived classes to use the integer constructor.
  A();
public:
  A(int n);
}
Then the following code will result in a compilation error.
class D: public B, public C {
public:
  D(const int n) : B(n), C(n) {}
}

Number Types

Integer Types

Many hackers use int whenever an integer type is needed. They have various reasons for doing so:

In spite of these excellent reasons, using int is actually a bad idea. It can obfuscate the purpose of your code and introduce wicked bugs. Let's consider a typical code snippet:

std::vector<double> x;
...
for (int i = 0; i != x.size(); ++i) {
...

Compiling such code with warning enabled will give you an annoying message like "warning: comparison between signed and unsigned integer expressions." The problem is that the size() member function returns a std::size_t, which is either a 32-bit or a 64-bit unsigned integer, depending on whether you are making a 32-bit or a 64-bit executable. The compiler issues a warning because comparing signed and unsigned integers is a bad idea. To rid oneself of this annoyance, the typical solution is to either disable warnings or to truncate the size type, i.e.

for (int i = 0; i != int(x.size()); ++i) {
However, the correct solution is to use the size type for the iteration variable.
for (std::size_t i = 0; i != x.size(); ++i) {

Why are the first two loops incorrect? Because they give erroneous results whenever the number of elements in the vector exceeds 231. In the first example we compare a signed integer to a size type. If the vector is too large then this loop will never terminate. When the integer index reaches 2,147,483,647, incrementing it results in an overflow and it takes the value -2,147,483,648. Of course you would notice if your program entered an infinite loop. The second version has an error that may be more difficult to detect. The problem is in casting the size type to an int. If the size type has 32 bits then sizes greater that 231 will be converted to a negative integer. For 64-bit applications the truncation may result in an incorrect value that is either positive or negative. Depending on the circumstances you may encounter an infinite loop, a segmentation fault, or your program may complete and just give incorrect results.

Of course you are probably thinking "Yeah, but I would never create a vector with 2 billion elements." That may be true, but the fact that using int usually works is part of what makes the resulting errors so insidious. If they do appear, you will have no idea what is causing your code to crash or give incorrect results.

I think that most people use int because they don't know what type they should be using. Maybe they think that using the "right" approach would complicate their lives. If you are such an unfortunate soul, then rejoice and prepare to be enlightened! The rules for properly using integer types are really simple. Following them will improve the robustness and readability of your code.

See, that wasn't so bad. The first two cases are applicable 99% of the time. Thinking is rarely necessary. Below are some examples.

const std::size_t Dimension = 3;
typedef std::tr1::array<double, Dimension> Point;
std::vector<Point> centers;
...
// Initialize each coordinate to zero.
const Point origin = {{}};
bool hasPointAtOrigin = std::find(centers.begin(), centers.end(), origin) != centers.end();
...
// The return type of count() is an iterator difference type (for a good reason).
// Here it is sensible to assign that to a size type.
const std::size_t numberAtOrigin = std::count(centers.begin(), centers.end(), origin);
for (std::size_t i = 0; i != centers.size(); ++i) {
    centers[i] = ...
}
...
class Matrix {
...
    Matrix(const std::size_t numRows, const std::size_t numCols);
...
    std::size_t size() const;
    bool empty() const;
};

Now it's time for a quiz. What is troubling about the following code?

class Foo {
    ...
    int id() const {
        return int(this);
    }
    ...
};

If you said that you may lose precision in casting a pointer to an integer, then good for you! That was a lucky guess. In a 64-bit executable, casting a pointer to an integer truncates a 64-bit pointer to a 32-bit int. That means the identifiers may not be unique. Depending on your compiler, the above may or may not compile. Switching to std::size_t will get rid of the truncation problem and the compiler error.

    std::size_t id() const {
        return std::size_t(this);
    }

If at this point you are dissatisfied then you get a gold star! The deeply disturbing thing about the above example is that the address of the object is being used as its identifier. Switching from int to std::size_t makes the code syntactically correct, but it is not semantically correct. That is, the identifier produced above does not behave like an identifier should. I have no idea what a Foo is, but I would expect that if I copied one the identifier in the original and the copy would be the same. Consider the following uses of Foo.

Foo a;
...
Foo b = a; // Uh-oh. b.id() != a.id()
...
std::swap(a, b); // Uh-oh. The identifiers are not swapped.
...
std::vector<Foo> output;
for (std::size_t i = 0; i != 10; ++i) {
     Foo x;
     output.push_back(x); // Uh-oh. The identifiers get changed whenever the vector is resized.
}
std::map dict;
dict[a.id()] = a; // Uh-oh. Inserting the object changes its identifier.

Floating-Point Types

Const

If a variable or member function is constant, declare it as such.

const int Dimension = 3;

class A {
private:
  int _value;
public:
  int getValue() const {
    return _value;
  }
}
Concerning function arguments: when defining a function, use const for arguments that are not modified.
int
f(const int n) {
  ...
}
However, do not use const when declaring a function.
int
f(int n);
This is because int f(int n) and int f(const int n) have the same signature. The const is an implementation detail of the function. It has no effect on the caller.


http://www.its.caltech.edu/~sean/ / at(sean, dot(caltech, edu)) / Last Modified: September 3, 2009