What is a reference?
An alias (an alternate name) for an object.
References are frequently used for pass-by-reference:
void swap(int& i, int& j)
{
int tmp = i;
i = j;
j = tmp;
}
int main()
{
int x, y;
...
swap(x,y);
...
}
Here i and j are aliases for main's x and y respectively. In other words, i is x not a pointer to x, nor a copy of x, but x itself. Anything you do to i gets done to x, and vice versa.
OK. That's how you should think of references as a programmer. Now, at the risk of confusing you by giving you a different perspective, here's how references are implemented. Underneath it all, a reference i to object x is typically the machine address of the object x. But when the programmer says i++, the compiler generates code that increments x. In particular, the address bits that the compiler uses to find x are not changed. A C programmer will think of this as if you used the C style pass-by-pointer, with the syntactic variant of (1) moving the & from the caller into the callee, and (2) eliminating the *s. In other words, a C programmer will think of i as a macro for (*p), where p is a pointer to x (e.g., the compiler automatically dereferences the underlying pointer; i++ is changed to (*p)++; i = 7 is automatically changed to *p = 7).
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object. It is not a pointer to the object, nor a copy of the object. It is the object.
What happens if you assign to a reference?
You change the state of the referent (the referent is the object to which the reference refers).
Remember: the reference is the referent, so changing the reference changes the state of the referent. In compiler writer lingo, a reference is an "lvalue" (something that can appear on the left hand side of an assignment operator).
---------------------------------------------------------------------------------------------------------------------
What happens if you return a reference?
The function call can appear on the left hand side of an assignment operator.
This ability may seem strange at first. For example, no one thinks the expression f() = 7 makes sense. Yet, if a is an object of class Array, most people think that a[i] = 7 makes sense even though a[i] is really just a function call in disguise (it calls Array::operator[](int), which is the subscript operator for class Array).
class Array {
public:
int size() const;
float& operator[] (int index);
...
};
int main()
{
Array a;
for (int i = 0; i < a.size(); ++i)
a[i] = 7; // This line invokes Array::operator[](int)
...
}
How can you reseat a reference to make it refer to a different object?
No way.
You can't separate the reference from the referent.
Unlike a pointer, once a reference is bound to an object, it can not be "reseated" to another object. The reference itself isn't an object (it has no identity; taking the address of a reference gives you the address of the referent; remember: the reference is its referent).
In that sense, a reference is similar to a const pointer such as int* const p (as opposed to a pointer to const such as const int* p). In spite of the gross similarity, please don't confuse references with pointers; they're not at all the same.
When should I use references, and when should I use pointers?
Use references when you can, and pointers when you have to.
References are usually preferred over pointers whenever you don't need "reseating". This usually means that references are most useful in a class's public interface. References typically appear on the skin of an object, and pointers on the inside.
The exception to the above is where a function's parameter or return value needs a "sentinel" reference a reference that does not refer to an object. This is usually best done by returning/taking a pointer, and giving the NULL pointer this special significance (references should always alias objects, not a dereferenced NULL pointer).
Note: Old line C programmers sometimes don't like references since they provide reference semantics that isn't explicit in the caller's code. After some C++ experience, however, one quickly realizes this is a form of information hiding, which is an asset rather than a liability. E.g., programmers should write code in the language of the problem rather than the language of the machine.
---------------------------------------------------------------------------------------------------------------------
Do inline functions improve performance?
Yes and no. Sometimes. Maybe.
There are no simple answers. inline functions might make the code faster, they might make it slower. They might make the executable larger, they might make it smaller. They might cause thrashing, they might prevent thrashing. And they might be, and often are, totally irrelevant to speed.
inline functions might make it faster: As shown above, procedural integration might remove a bunch of unnecessary instructions, which might make things run faster.
inline functions might make it slower: Too much inlining might cause code bloat, which might cause "thrashing" on demand-paged virtual-memory systems. In other words, if the executable size is too big, the system might spend most of its time going out to disk to fetch the next chunk of code.
inline functions might make it larger: This is the notion of code bloat, as described above. For example, if a system has 100 inline functions each of which expands to 100 bytes of executable code and is called in 100 places, that's an increase of 1MB. Is that 1MB going to cause problems? Who knows, but it is possible that that last 1MB could cause the system to "thrash," and that could slow things down.
inline functions might make it smaller: The compiler often generates more code to push/pop registers/parameters than it would by inline-expanding the function's body. This happens with very small functions, and it also happens with large functions when the optimizer is able to remove a lot of redundant code through procedural integration that is, when the optimizer is able to make the large function small.
inline functions might cause thrashing: Inlining might increase the size of the binary executable, and that might cause thrashing.
inline functions might prevent thrashing: The working set size (number of pages that need to be in memory at once) might go down even if the executable size goes up. When f() calls g(), the code is often on two distinct pages; when the compiler procedurally integrates the code of g() into f(), the code is often on the same page.
inline functions might increase the number of cache misses: Inlining might cause an inner loop to span across multiple lines of the memory cache, and that might cause thrashing of the memory-cache.
inline functions might decrease the number of cache misses: Inlining usually improves locality of reference within the binary code, which might decrease the number of cache lines needed to store the code of an inner loop. This ultimately could cause a CPU-bound application to run faster.
inline functions might be irrelevant to speed: Most systems are not CPU-bound. Most systems are I/O-bound, database-bound or network-bound, meaning the bottleneck in the system's overall performance is the file system, the database or the network. Unless your "CPU meter" is pegged at 100%, inline functions probably won't make your system faster. (Even in CPU-bound systems, inline will help only when used within the bottleneck itself, and the bottleneck is typically in only a small percentage of the code.)
There are no simple answers: You have to play with it to see what is best. Do not settle for simplistic answers like, "Never use inline functions" or "Always use inline functions" or "Use inline functions if and only if the function is less than N lines of code." These one-size-fits-all rules may be easy to write down, but they will produce sub-optimal results.
Why should I use inline functions instead of plain old #define macros?
Because #define macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4. Sometimes you should use them anyway, but they're still evil.
Unlike #define macros, inline functions avoid infamous macro errors since inline functions always evaluate every argument exactly once. In other words, invoking an inline function is semantically just like invoking a regular function, only faster:
// A macro that returns the absolute value of i
#define unsafe(i) \
( (i) >= 0 ? (i) : -(i) )
// An inline function that returns the absolute value of i
inline
int safe(int i)
{
return i >= 0 ? i : -i;
}
int f();
void userCode(int x)
{
int ans;
ans = unsafe(x++); // Error! x is incremented twice
ans = unsafe(f()); // Danger! f() is called twice
ans = safe(x++); // Correct! x is incremented once
ans = safe(f()); // Correct! f() is called once
}
Also unlike macros, argument types are checked, and necessary conversions are performed correctly.
Macros are bad for your health; don't use them unless you have to.
---------------------------------------------------------------------------------------------------------------------
No comments:
Post a Comment