Friday, August 6, 2010

Cpp Theory Questions 8

Do inline functions improve performance?

 

Yes and no. Sometimes. Maybe.

 

There are no simple answers. inline functions might make the code faster, they might make it slower. They might make the executable larger, they might make it smaller. They might cause thrashing, they might prevent thrashing. And they might be, and often are, totally irrelevant to speed.

 

inline functions might make it faster: Procedural integration (the compiler folding the called function's body into the caller) might remove a bunch of unnecessary instructions, which might make things run faster.

 

inline functions might make it slower: Too much inlining might cause code bloat, which might cause "thrashing" on demand-paged virtual-memory systems. In other words, if the executable size is too big, the system might spend most of its time going out to disk to fetch the next chunk of code.

 

inline functions might make it larger: This is the notion of code bloat, as described above. For example, if a system has 100 inline functions each of which expands to 100 bytes of executable code and is called in 100 places, that's an increase of 1MB. Is that 1MB going to cause problems? Who knows, but it is possible that that last 1MB could cause the system to "thrash," and that could slow things down.

 

inline functions might make it smaller: The compiler often generates more code to push/pop registers/parameters than it would by inline-expanding the function's body. This happens with very small functions, and it also happens with large functions when the optimizer is able to remove a lot of redundant code through procedural integration — that is, when the optimizer is able to make the large function small.

 

inline functions might cause thrashing: Inlining might increase the size of the binary executable, and that might cause thrashing.

 

inline functions might prevent thrashing: The working set size (number of pages that need to be in memory at once) might go down even if the executable size goes up. When f() calls g(), the code is often on two distinct pages; when the compiler procedurally integrates the code of g() into f(), the code is often on the same page.

 

inline functions might increase the number of cache misses: Inlining might cause an inner loop to span across multiple lines of the memory cache, and that might cause thrashing of the memory-cache.

 

inline functions might decrease the number of cache misses: Inlining usually improves locality of reference within the binary code, which might decrease the number of cache lines needed to store the code of an inner loop. This ultimately could cause a CPU-bound application to run faster.

 

inline functions might be irrelevant to speed: Most systems are not CPU-bound. Most systems are I/O-bound, database-bound or network-bound, meaning the bottleneck in the system's overall performance is the file system, the database or the network. Unless your "CPU meter" is pegged at 100%, inline functions probably won't make your system faster. (Even in CPU-bound systems, inline will help only when used within the bottleneck itself, and the bottleneck is typically in only a small percentage of the code.)

 

There are no simple answers: You have to play with it to see what is best. Do not settle for simplistic answers like, "Never use inline functions" or "Always use inline functions" or "Use inline functions if and only if the function is less than N lines of code." These one-size-fits-all rules may be easy to write down, but they will produce sub-optimal results.
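
One way to "play with it" is to time the hot loop with and without the inline keyword (or at different optimization levels) and compare. The following is only a minimal sketch of such a measurement: square(), sum_of_squares() and time_sum_us() are hypothetical names made up for illustration, and a single wall-clock sample like this is far too noisy to settle anything on its own.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical candidate for inlining; try building with and without "inline".
inline int square(int v) { return v * v; }

// The hot loop we want to measure.
long long sum_of_squares(const std::vector<int>& data)
{
    long long total = 0;
    for (std::size_t k = 0; k < data.size(); ++k)
        total += square(data[k]);
    return total;
}

// Runs the loop once; stores the sum in result and returns the elapsed
// wall-clock time in microseconds.
long long time_sum_us(const std::vector<int>& data, long long& result)
{
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    result = sum_of_squares(data);
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

In practice you would repeat the measurement many times on realistic data, and keep using the computed result so the optimizer cannot discard the loop.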

---------------------------------------------------------------------------------------------------------------------

 

How do you tell the compiler to make a non-member function inline?

 

When you declare an inline function, it looks just like a normal function:

 

 

void f(int i, char c);

But when you define an inline function, you prepend the function's definition with the keyword inline, and you put the definition into a header file:

 

 

inline

void f(int i, char c)

{

...

}

Note: It's imperative that the function's definition (the part between the {...}) be placed in a header file, unless the function is used only in a single .cpp file. In particular, if you put the inline function's definition into a .cpp file and you call it from some other .cpp file, you'll get an "unresolved external" error from the linker.
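
As a rough illustration of that layout, here is what such a header might look like. The file name clamp_util.h and the function clamp_to_byte() are invented for this sketch; they are not from any real library.

```cpp
// clamp_util.h (hypothetical header name)
#ifndef CLAMP_UTIL_H
#define CLAMP_UTIL_H

// Declaration: looks just like a normal function.
int clamp_to_byte(int value);

// Definition: marked inline and kept in the header, so every .cpp file
// that includes this header sees the function body.
inline int clamp_to_byte(int value)
{
    if (value < 0)   return 0;
    if (value > 255) return 255;
    return value;
}

#endif // CLAMP_UTIL_H
```

Any .cpp file that includes this header can call clamp_to_byte() without the linker complaining.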

---------------------------------------------------------------------------------------------------------------------

 

Is there any difference between List x; and List x();?

 

A big difference!

 

Suppose that List is the name of some class. Then function f() declares a local List object called x:

 

 

void f()

{

List x; // Local object named x (of class List)

...

}

But function g() declares a function called x() that returns a List:

 

 

void g()

{

List x(); // Function named x (that returns a List)

...

}
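
If you want the compiler to confirm the difference, something along these lines should do it, assuming a C++11 compiler (for decltype and the type traits). The List class here is an empty stand-in, and the two helper function names are made up for the sketch. Many compilers also warn about the second declaration.

```cpp
#include <type_traits>

// Hypothetical stand-in for the List class discussed above.
class List { };

// Returns true if "List x;" made x an object of type List.
bool x_is_object()
{
    List x;      // local object named x
    return std::is_same<decltype(x), List>::value;
}

// Returns true if "List x();" made x a function.
bool x_is_function()
{
    List x();    // declares a function named x that returns a List
    return std::is_function<decltype(x)>::value;
}
```

Both helpers return true, which is the compiler's way of agreeing that the two declarations mean different things.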

---------------------------------------------------------------------------------------------------------------------

 

Can one constructor of a class call another constructor of the same class to initialize the this object?

 

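Not before C++11.

 

Let's work an example. Suppose you want your constructor Foo::Foo(char) to call another constructor of the same class, say Foo::Foo(char,int), in order that Foo::Foo(char,int) would help initialize the this object. In C++98 and C++03 there's no way to do this. C++11 added delegating constructors, which let the member-initializer list name another constructor of the same class, as in Foo::Foo(char x) : Foo(x, 0) { }. The rest of this answer covers what to do (and what not to do) without them.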

 

Some people do it anyway. Unfortunately it doesn't do what they want. For example, the line Foo(x, 0); does not call Foo::Foo(char,int) on the this object. Instead it calls Foo::Foo(char,int) to initialize a temporary, local object (not this), then it immediately destructs that temporary when control flows over the ;.

 

 

class Foo {

public:

Foo(char x);

Foo(char x, int y);

...

};

 

Foo::Foo(char x)

{

...

Foo(x, 0); // this line does NOT help initialize the this object!!

...

}
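
To see the temporary for yourself, here is a small self-contained variation of the class above. The member y_ and its accessor y() are additions made up for this sketch; the sentinel value -1 is just there to show that the two-argument constructor never touches the this object.

```cpp
class Foo {
public:
    Foo(char x);
    Foo(char x, int y);
    int y() const { return y_; }
private:
    int y_;
};

Foo::Foo(char x)
    : y_(-1)             // sentinel: shows whether anyone else sets y_
{
    Foo(x, 0);           // constructs and destroys a temporary; this->y_ is untouched
}

Foo::Foo(char x, int y)
    : y_(y)
{
    (void)x;             // x is unused in this sketch
}
```

After Foo f('a'); the call f.y() still returns -1, not 0.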

You can sometimes combine two constructors via a default parameter:

 

 

class Foo {

public:

Foo(char x, int y=0); // this line combines the two constructors

...

};

If that doesn't work, e.g., if there isn't an appropriate default parameter that combines the two constructors, sometimes you can share their common code in a private init() member function:

 

 

class Foo {

public:

Foo(char x);

Foo(char x, int y);

...

private:

void init(char x, int y);

};

 

Foo::Foo(char x)

{

init(x, int(x) + 7);

...

}

 

Foo::Foo(char x, int y)

{

init(x, y);

...

}

 

void Foo::init(char x, int y)

{

...

}

BTW do NOT try to achieve this via placement new. Some people think they can say new(this) Foo(x, int(x)+7) within the body of Foo::Foo(char). However that is bad, bad, bad. Please don't write me and tell me that it seems to work on your particular version of your particular compiler; it's bad. Constructors do a bunch of little magical things behind the scenes, but that bad technique steps on those partially constructed bits. Just say no.
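
If your compiler supports C++11 or later, a delegating constructor is another option: the member-initializer list of one constructor names another constructor of the same class. A minimal sketch, where the data members x_ and y_ and their accessors are made up for illustration:

```cpp
class Foo {
public:
    // Delegating constructor (C++11): Foo(char,int) initializes this object,
    // then this constructor's body runs.
    Foo(char x) : Foo(x, int(x) + 7) { }

    Foo(char x, int y) : x_(x), y_(y) { }

    char x() const { return x_; }
    int  y() const { return y_; }

private:
    char x_;
    int  y_;
};
```

Note that when a constructor delegates, the target constructor must be the only entry in its member-initializer list.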

---------------------------------------------------------------------------------------------------------------------

 

Is the default constructor for Fred always Fred::Fred()?

 

No. A "default constructor" is a constructor that can be called with no arguments. One example of this is a constructor that takes no parameters:

 

 

class Fred {

public:

Fred(); // Default constructor: can be called with no args

...

};

Another example of a "default constructor" is one that can take arguments, provided they are given default values:

 

 

class Fred {

public:

Fred(int i=3, int j=5); // Default constructor: can be called with no args

...

};
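
Here is a small sketch of that second form in action. The data members, the accessors and the helper default_sum() are hypothetical additions so the example has something to check.

```cpp
class Fred {
public:
    Fred(int i = 3, int j = 5) : i_(i), j_(j) { }   // also the default constructor
    int i() const { return i_; }
    int j() const { return j_; }
private:
    int i_, j_;
};

// Hypothetical helper: default-constructs a Fred and returns i + j.
int default_sum()
{
    Fred f;              // no arguments: uses i=3, j=5
    return f.i() + f.j();
}
```

The same constructor also handles Fred g(10); (so j defaults to 5) and Fred h(10, 20);.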

---------------------------------------------------------------------------------------------------------------------

 

Which constructor gets called when I create an array of Fred objects?

 

Fred's default constructor (except as discussed below).

 

 

class Fred {

public:

Fred();

...

};

 

int main()

{

Fred a[10]; // calls the default constructor 10 times

Fred* p = new Fred[10]; // calls the default constructor 10 times

...

}

If your class doesn't have a default constructor, you'll get a compile-time error when you attempt to create an array using the above simple syntax:

 

 

class Fred {

public:

Fred(int i, int j); // Assume there is no default constructor

...

};

 

int main()

{

Fred a[10]; // ERROR: Fred doesn't have a default constructor

Fred* p = new Fred[10]; // ERROR: Fred doesn't have a default constructor

...

}

However, even if your class already has a default constructor, you should try to use std::vector rather than an array (arrays are evil). std::vector lets you decide to use any constructor, not just the default constructor:

 

 

#include <vector>

 

int main()

{

std::vector<Fred> a(10, Fred(5,7)); // The 10 Fred objects in std::vector a will be initialized with Fred(5,7)

...

}

Even though you ought to use a std::vector rather than an array, there are times when an array might be the right thing to do, and for those, you might need the "explicit initialization of arrays" syntax. Here's how:

 

 

class Fred {

public:

Fred(int i, int j); // Assume there is no default constructor

...

};

 

int main()

{

Fred a[10] = {

Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), // The 10 Fred objects are

Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7) // initialized using Fred(5,7)

};

...

}

Of course you don't have to do Fred(5,7) for every entry; you can put in any numbers you want, even parameters or other variables.

 

Finally, you can use placement-new to manually initialize the elements of the array. Warning: it's ugly: the raw array can't be of type Fred, so you'll need a bunch of pointer-casts to do things like compute array index operations. Warning: it's compiler- and hardware-dependent: you'll need to make sure the storage is aligned with an alignment that is at least as strict as is required for objects of class Fred. Warning: it's tedious to make it exception-safe: you'll need to manually destruct the elements, including in the case when an exception is thrown part-way through the loop that calls the constructors. But if you really want to do it anyway, read up on placement-new. (BTW placement-new is the magic that is used inside of std::vector. The complexity of getting everything right is yet another reason to use std::vector.)
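
For the curious, here is a rough sketch of what that might look like, assuming a C++11 compiler (for alignas and = delete). The FredArray wrapper and the sum() member are invented for this illustration, and very pedantic C++17 code would additionally pass the pointer through std::launder. It is here to show how much bookkeeping is involved, not to recommend it.

```cpp
#include <cstddef>
#include <new>

class Fred {
public:
    Fred(int i, int j) : i_(i), j_(j) { }   // no default constructor
    int sum() const { return i_ + j_; }
private:
    int i_, j_;
};

// Hypothetical wrapper: raw, suitably aligned storage for N Fred objects.
template <std::size_t N>
class FredArray {
public:
    FredArray(int i, int j) : count_(0)
    {
        try {
            for (; count_ < N; ++count_)
                new (slot(count_)) Fred(i, j);   // placement-new into raw storage
        } catch (...) {
            destroy();                           // undo the elements built so far
            throw;
        }
    }

    ~FredArray() { destroy(); }

    FredArray(const FredArray&) = delete;
    FredArray& operator=(const FredArray&) = delete;

    Fred& operator[](std::size_t k) { return *reinterpret_cast<Fred*>(slot(k)); }
    std::size_t size() const { return N; }

private:
    void* slot(std::size_t k) { return storage_ + k * sizeof(Fred); }

    // Manually destructs the constructed elements in reverse order.
    void destroy()
    {
        while (count_ > 0)
            (*this)[--count_].~Fred();
    }

    alignas(Fred) unsigned char storage_[N * sizeof(Fred)];
    std::size_t count_;                          // number of constructed elements
};
```

Usage would be something like FredArray<10> a(5, 7); followed by a[3].sum(). Compare that with the one-line std::vector version and draw your own conclusions.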

 

By the way, did I ever mention that arrays are evil? Or did I mention that you ought to use a std::vector unless there is a compelling reason to use an array?

---------------------------------------------------------------------------------------------------------------------

 

What is the "Named Constructor Idiom"?

 

A technique that provides more intuitive and/or safer construction operations for users of your class.

 

The problem is that constructors always have the same name as the class. Therefore the only way to differentiate between the various constructors of a class is by the parameter list. But if there are lots of constructors, the differences between them become somewhat subtle and error prone.

 

With the Named Constructor Idiom, you declare all the class's constructors in the private or protected sections, and you provide public static methods that return an object. These static methods are the so-called "Named Constructors." In general there is one such static method for each different way to construct an object.

 

For example, suppose we are building a Point class that represents a position on the X-Y plane. Turns out there are two common ways to specify a 2-space coordinate: rectangular coordinates (X+Y) and polar coordinates (Radius+Angle). (Don't worry if you can't remember these; the point isn't the particulars of coordinate systems; the point is that there are several ways to create a Point object.) Unfortunately the parameters for these two coordinate systems are the same: two floats. That means the two overloaded constructors would have identical signatures, which is an error:

 

 

class Point {

public:

Point(float x, float y); // Rectangular coordinates

Point(float r, float a); // Polar coordinates (radius and angle)

// ERROR: Both constructors have the same signature, Point::Point(float,float)

};

 

int main()

{

Point p = Point(5.7, 1.2); // Ambiguous: Which coordinate system?

...

}

One way to solve this ambiguity is to use the Named Constructor Idiom:

 

 

#include <cmath> // To get sin() and cos()

 

class Point {

public:

static Point rectangular(float x, float y); // Rectangular coord's

static Point polar(float radius, float angle); // Polar coordinates

// These static methods are the so-called "named constructors"

...

private:

Point(float x, float y); // Rectangular coordinates

float x_, y_;

};

 

inline Point::Point(float x, float y)

: x_(x), y_(y) { }

 

inline Point Point::rectangular(float x, float y)

{ return Point(x, y); }

 

inline Point Point::polar(float radius, float angle)

{ return Point(radius*cos(angle), radius*sin(angle)); }

Now the users of Point have a clear and unambiguous syntax for creating Points in either coordinate system:

 

 

int main()

{

Point p1 = Point::rectangular(5.7, 1.2); // Obviously rectangular

Point p2 = Point::polar(5.7, 1.2); // Obviously polar

...

}

Make sure your constructors are in the protected section if you expect Point to have derived classes.

 

The Named Constructor Idiom can also be used to make sure your objects are always created via new.
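
A minimal sketch of that use, assuming C++11 for std::unique_ptr: the constructor is private, so the only way for users to get a Fred is through the static create() method, which always uses new. The names create() and value() are made up for this illustration.

```cpp
#include <memory>

class Fred {
public:
    // Named constructor: the only public way to make a Fred, and it always uses new.
    static std::unique_ptr<Fred> create(int i)
    {
        return std::unique_ptr<Fred>(new Fred(i));
    }

    int value() const { return i_; }

private:
    explicit Fred(int i) : i_(i) { }   // private: users can't create a local or static Fred
    int i_;
};
```

With this in place, Fred::create(3) compiles but a plain local declaration such as Fred f(3); does not. Returning a smart pointer rather than a raw Fred* is a choice made for this sketch so that callers don't have to remember to delete.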

 

Note that the Named Constructor Idiom, at least as implemented above, is just as fast as directly calling a constructor — modern compilers will not make any extra copies of your object.

---------------------------------------------------------------------------------------------------------------------
