Use initializer lists instead of assignment to initialize member variables in constructors. This will improve both the readability and the performance of your code. List all of your member variables (in the order that you declared them) in the initializer list. If you don't use the correct order the compiler will probably complain because the variables must be initialized in the order they were declared.
Suppose that you wer to use assignment instead of an initializer list as shown below.
class Circle {
private
std::tr1::<double, 2> _center;
double _radius;
public:
Circle(const std::tr1::<double, 2>& center, const double radius) {
_center = center;
_radius = radius;
}
...
};
This will hurt performance because the variables will first be default
initialized and then assigned a value. In effect, the following
operations are performed.
// Initialization. _center[0] = 0; _center[1] = 0; _radius = 0; // Assignment. _center[0] = center[0]; _center[1] = center[1]; _radius = radius;It is better to use the following form.
Circle(const std::tr1::<double, 2>& center, const double radius) :
_center(center),
_radius(radius) {
}
Below is an example of the correct use of initializer lists. Note that you can only default initialize arrays. That is, you cannot set the array values in the initialization list. You must perform that assignment in the body of the constructor. (Note that you should prefer std::vector or std::tr1::array to plain arrays.)
class Foo {
private:
double _parameter;
double _coordinates[3];
float _density;
std::vector _values;
std::string _name;
public:
Foo(const double parameter, const double x, const double y,
const double z, const std::size_t numValues, const std::string& name) :
_parameter(parameter),
_coordinates(),
_density(0),
_sizes(numValues),
_name(name) {
_coordinates[0] = x;
_coordinates[1] = y;
_coordinates[2] = z;
}
...
};
The C++ preprocessor is basically a text substitution program. It doesn't know about C++ syntax. Don't use macros to define constants. The C++ language has several better ways of doing that. Instead of using
#define N 256prefer the following
const std::size_t ArraySize = 256;
Using the preprocessor to define constants in a source file is ill-advised. Doing so in a header file is downright evil. The macro substitition rules spill over into an source file that include the header.
#define MIN(x,y) (x < y : x ? y)use
template<typename T>
inline
min(const T x, const T y) {
return (x < y : x ? y);
}
In C, the keyword static can be used to indicate that a function or object is local to a translation unit. For example, one can have:
static int a;
static void f() {...}
in multiple source files. Each translation unit will have its own variable
a and function f(). This usage of
static is deprecated in C++.
(See "The C++ Programming Language" by Bjarne Stroustrup.)
Instead, use unnamed namespaces:
namespace {
int a;
void f() {...}
}
Avoid using using declarations to strip namespaces. Statements like
using namespace std; using namespace ads;are dangerous. They dump the contents of a namespace into the local scope. If you are lucky, this may lead to name collisions that are detected at compile time. However, it may also lead to the worst kind of error, a program that compiles and runs but gives unexpected results. Consider the following program.
#include <iostream>
double
distance(const double* x, const double* y) {
return *y - *x;
}
int
main() {
double x = 1, y = 2.5;
std::cout << distance(&x, &y) << "\n";
return 0;
}
It gives the result 1.5. A novice user might put in a using
statement to avoid having to qualify cout with
std::.
#include <iostream>
double
distance(const double* x, const double* y) {
return *y - *x;
}
int
main() {
using namespace std;
double x = 1, y = 2.5;
cout << distance(&x, &y) << "\n";
return 0;
}
Now the program gives a different result, -1 when I ran it. In the second
program, an STL function that computes distance between iterators is called.
template<class InputIterator> typename iterator_traits<InputIterator>::difference_type distance(InputIterator first, InputIterator last);Some developers consider their code to be more aesthetically pleasing when they don't have to use namespace qualifications, and thus carelessly strip off namespaces with using statements. This is bad programming and can lead to insidous errors. While using using in this manner in a source code file is a bad idea, putting it in a header file is truly evil. Then this dangerous practice is propagated to every source file that includes that header.
If a class is defined in a namespace, put all associated free functions in
that namespace as well. Note that the C++ rules on
argument dependent lookup will enable you to use those
free functions without explicitly qualifying them. In the example below,
the output operator is declared in the foo namespace. When
it is used in main, there is no need to explicitly qualify
the namespace, i.e. foo::operator<<(std::cout, bar).
namespace foo {
class Bar {...};
std::ostream&
operator<<(std::ostream&, const Bar& bar) {...}
}
int
main() {
foo::Bar bar;
std::cout << bar;
return 0;
}
Within a library, use relative paths to include files. Do not rely on the user adding an include path at compile time. The user should be able to include a header in the library and compile with a command as simple as:
g++ test.cc
B.h uses
functionality from another header A.h which it does not include.
Then using
#include "A.h" #include "B.h"in a
.cc file works fine, but using
#include "B.h" #include "A.h"results in a compilation error. You can avoid dependency errors by using unit tests and not using convenience headers in your tests and examples.
In a project with many packages, it is a good idea to have convenience headers to make it easier for the user to access an entire package. For each source code directory, there is a corresponding header file, which includes all of the files in that directory. For example, if one were using bounding boxes from the computational geometry package, one could use:
#include "stlib/packages/geom/kernel/BBox.h"If one wanted to use several geometric primitives in the kernel sub-package one could use
#include "stlib/packages/geom/kernel.h"to include all of the header files in the kernel directory. One can also include all of the classes and functions in the computational geometry package.
#include "stlib/packages/geom.h"
It's a good idea to provide convenience headers because for the user they are... well, convenient. For someone who is not very familiar with a library, they allow one to remain blissfully unaware of the file structure. Using convenience headers comes at the cost of increased compilation time and decreased portability. Obviously, including unnecessary files increases the compilation time. If one is using a compiler that is only partially supported, then one would want to include as little code as possible. The more header files you include, the more likely you are to encounter a construct that your compiler doesn't like.
The developer should not use the convenience headers in the source code, the test code, or example code. The primary issues are correctness and portability; compilation time is of secondary importance. Using convenience headers in the source code decreases the portability because it increases the unnecessary headers that will be included in a user's application code. Using convenience headers in the test code or example code may hide dependency errors.
Don't use exception specifications such as the following.
void f() throw(); void g() throw(A, B); void h() throw(...);Part of the idea behind exception specifications was that they would help the compiler optimize code because it can make assumptions like: This function will never throw an exception. The reality is that these constructs may hurt performance because the compiler generates extra code that checks the exception specifications at run time.
When using ordinary (non-virtual) inheritance, a class may only initialize its direct bases. Below is an example.
class A {
public:
A(int n);
}
class B: public A {
public:
B(const int n) : A(n) {}
}
class C: public A {
public:
C(const int n) : A(n) {}
}
class D: public B, public C {
public:
D(const int n) : B(n), C(n) {}
}
The code below is erroneous. A is not a direct base of
D.
class D: public B, public C {
public:
// Error: A is not a direct base, so cannot call its constructor.
D(const int n) : A(n), B(n), C(n) {}
}
By contrast, when you use virtual bases,
it is the responsibility of the most derived
class to initialize them. (Check out the Poisson random deviate classes in
the packages/numerical/random directory. They each have
UsesUniformRandomDeviate as a virtual base.) The code below
illustrates this point. A is a virtual base for B
and C. Hence it is a virtual base for D.
D must explicitly call A's constructor to get
the correct behavior, it will not
be called through either B or C.
class A {
public:
A(int n);
}
class B: public virtual A {
public:
B(const int n) : A(n) {}
}
class C: public virtual A {
public:
C(const int n) : A(n) {}
}
class D: public B, public C {
public:
D(const int n) : A(n), B(n), C(n) {}
}
If D did not explicitly call A's constructor,
then its default constructor (not the integer constructor) would implicitly
be called. This behavior can cause logic errors that are difficult to detect.
If the virtual base does not need a default constructor, make it private.
This will force derived classes to explicitly initialize the virtual base.
class A {
private:
// This forces derived classes to use the integer constructor.
A();
public:
A(int n);
}
Then the following code will result in a compilation error.
class D: public B, public C {
public:
D(const int n) : B(n), C(n) {}
}
Many hackers use int whenever an integer type is needed.
They have various reasons for doing so:
short is too small, and
long is too big, but int is just right.
int is easy to type and using it makes code look
"clean."
int is The One true integer type. You can use even
use it for Boolean values; zero is false and nonzero is true.
In spite of these excellent reasons, using int is actually a bad
idea. It can obfuscate the purpose of your code and introduce wicked bugs.
Let's consider a typical code snippet:
std::vector<double> x;
...
for (int i = 0; i != x.size(); ++i) {
...
Compiling such code with warning enabled will give you an annoying message
like "warning: comparison between signed and unsigned integer
expressions." The problem is that the size() member
function returns a std::size_t, which is either a 32-bit
or a 64-bit unsigned integer, depending on whether you are making a 32-bit
or a 64-bit executable. The compiler issues a warning because comparing
signed and unsigned integers is a bad idea. To rid oneself of this
annoyance, the typical solution is to either disable warnings or to
truncate the size type, i.e.
for (int i = 0; i != int(x.size()); ++i) {
However, the correct solution is to use the size type for the iteration
variable.
for (std::size_t i = 0; i != x.size(); ++i) {
Why are the first two loops incorrect? Because they give erroneous results
whenever the number of elements in the vector exceeds 231.
In the first example we compare a signed integer to a size type. If the vector
is too large then this loop will never terminate. When the integer index
reaches
2,147,483,647, incrementing it results in an overflow and it takes the value
-2,147,483,648. Of course you would notice if your program entered
an infinite loop. The second version has an error that may be more difficult
to detect. The problem is in casting the size type to an int.
If the size type has 32 bits then sizes greater that 231
will be converted to a negative integer. For 64-bit applications
the truncation may result in an incorrect value that is either positive or
negative. Depending on the circumstances you may encounter an infinite loop,
a segmentation fault, or your program may complete and just give incorrect
results.
Of course you are probably thinking "Yeah, but I would never create
a vector with 2 billion elements." That may be true, but the fact that
using int usually works is part of what makes the resulting
errors so insidious. If they do appear, you will have no idea what is
causing your code to crash or give incorrect results.
I think that most people use int because they don't know what type
they
std::size_t for loops, array indexing, and quantities
that are sizes or counts.
bool for Boolean values.
std::ptrdiff_t for distances between pointers.
const std::size_t Dimension = 3;
typedef std::tr1::array<double, Dimension> Point;
std::vector<Point> centers;
...
// Initialize each coordinate to zero.
const Point origin = {{}};
bool hasPointAtOrigin = std::find(centers.begin(), centers.end(), origin) != centers.end();
...
// The return type of count() is an iterator difference type (for a good reason).
// Here it is sensible to assign that to a size type.
const std::size_t numberAtOrigin = std::count(centers.begin(), centers.end(), origin);
for (std::size_t i = 0; i != centers.size(); ++i) {
centers[i] = ...
}
...
class Matrix {
...
Matrix(const std::size_t numRows, const std::size_t numCols);
...
std::size_t size() const;
bool empty() const;
};
Now it's time for a quiz. What is troubling about the following code?
class Foo {
...
int id() const {
return int(this);
}
...
};
If you said that you may lose precision in casting a pointer to an integer,
then good for you! That was a lucky guess. In a 64-bit executable, casting
a pointer to an integer truncates a 64-bit pointer to a 32-bit
int. That means the identifiers may not be unique.
Depending on your compiler, the
above may or may not compile. Switching to std::size_t
will get rid of the truncation problem and the compiler error.
std::size_t id() const {
return std::size_t(this);
}
If at this point you are dissatisfied then you get a gold star! The
deeply disturbing thing about the above example is that
the address of the object is being used as its identifier. Switching
from int to std::size_t makes the code
syntactically correct, but it is not semantically correct. That is,
the identifier produced above does not behave like an identifier
should. I have no idea what a Foo is, but I would expect that if I
copied one the identifier in the original and the copy would be the
same. Consider the following uses of Foo.
Foo a;
...
Foo b = a; // Uh-oh. b.id() != a.id()
...
std::swap(a, b); // Uh-oh. The identifiers are not swapped.
...
std::vector<Foo> output;
for (std::size_t i = 0; i != 10; ++i) {
Foo x;
output.push_back(x); // Uh-oh. The identifiers get changed whenever the vector is resized.
}
std::map dict;
dict[a.id()] = a; // Uh-oh. Inserting the object changes its identifier.
If a variable or member function is constant, declare it as such.
const int Dimension = 3;
class A {
private:
int _value;
public:
int getValue() const {
return _value;
}
}
Concerning function arguments: when defining a function, use
const for arguments that are not modified.
int
f(const int n) {
...
}
However, do not use const when declaring a function.
int f(int n);This is because
int f(int n) and int f(const int n)
have the same signature. The