Infinite Programming Tips: Advanced Concepts in C++

Monday, May 16, 2011

Advanced Concepts in C++

Inheritance

Inheritance captures the idea that certain classes of objects are related to each other in useful ways. For example, lists and sorted lists have quite similar behavior - they both allow the user to insert, delete, and find elements that are on the list. There are two benefits to using inheritance:

You can write generic code that doesn't care exactly which kind of object it is manipulating. For example, inheritance is widely used in windowing systems. Everything on the screen (windows, scroll bars, titles, icons) is its own object, but they all share a set of member functions in common, such as a routine Repaint to redraw the object onto the screen. This way, the code to repaint the entire screen can simply call the Repaintfunction on every object on the screen. The code that calls Repaint doesn't need to know which kinds of objects are on the screen, as long as each implements Repaint.
You can share pieces of an implementation between two objects. For example, if you were to implement both lists and sorted lists in C, you'd probably find yourself repeating code in both places - in fact, you might be really tempted to only implement sorted lists, so that you only had to debug one version. Inheritance provides a way to re-use code between nearly similar classes. For example, given an implementation of a list class, in C++ you can implement sorted lists by replacing the insert member function - the other functions, delete, isFull, print, all remain the same.

Shared Behavior

Let me use our Stack example to illustrate the first of these. Our Stack implementation above could have been implemented with linked lists, instead of an array. Any code using a Stack shouldn't care which implementation is being used, except that the linked list implementation can't overflow. (In fact, we could also change the array implementation to handle overflow by automatically resizing the array as items are pushed on the stack.)To allow the two implementations to coexist, we first define an abstract Stack, containing just the public member functions, but no data.

class Stack {
  public:
    Stack();
    virtual ~Stack();          // deallocate the stack
    virtual void Push(int value) = 0; 
                               // Push an integer, checking for overflow.
    virtual bool Full() = 0;   // Is the stack is full?
};

// For g++, need these even though no data to initialize.
Stack::Stack {}             
Stack::~Stack() {}

The Stack definition is called a base class or sometimes a superclass. We can then define two different derived classes, sometimes called subclasses which inherit behavior from the base class. (Of course, inheritance is recursive - a derived class can in turn be a base class for yet another derived class, and so on.) Note that I have prepended the functions in the base class is prepended with the keyword virtual, to signify that they can be redefined by each of the two derived classes. The virtual functions are initialized to zero, to tell the compiler that those functions must be defined by the derived classes.Here's how we could declare the array-based and list-based implementations of Stack. The syntax : public Stack signifies that both ArrayStack and ListStack are kinds of Stacks, and share the same behavior as the base class.

class ArrayStack : public Stack {  // the same as in Section 2
  public:
    ArrayStack(int sz);   // Constructor:  initialize variables, allocate space.
    ~ArrayStack();        // Destructor:   deallocate space allocated above.
    void Push(int value); // Push an integer, checking for overflow.
    bool Full();     // Returns TRUE if the stack is full, FALSE otherwise.
  private:
    int size;        // The maximum capacity of the stack.
    int top;         // Index of the lowest unused position.
    int *stack;      // A pointer to an array that holds the contents.
};

class ListStack : public Stack {
  public:
    ListStack(); 
    ~ListStack();
    void Push(int value);
    bool Full();
  private:
    List *list;         // list of items pushed on the stack
};

ListStack::ListStack() { 
    list = new List;
}

ListStack::~ListStack() { 
    delete list; 
}

void ListStack::Push(int value) {
    list->Prepend(value); 
}

bool ListStack::Full() {
    return FALSE;       // this stack never overflows!
}

The neat concept here is that I can assign pointers to instances of ListStack or ArrayStack to a variable of type Stack, and then use them as if they were of the base type.

Stack *s1 = new ListStack;
    Stack *s2 = new ArrayStack(17);

    if (!stack->Full())
        s1->Push(5);
    if (!s2->Full())
        s2->Push(6);

    delete s1;
    delete s2;

The compiler automatically invokes ListStack operations for s1, and ArrayStack operations for s2; this is done by creating a procedure table for each object, where derived objects override the default entries in the table defined by the base class. To the code above, it invokes the operations Full, Push, and delete by indirection through the procedure table, so that the code doesn't need to know which kind of object it is.In this example, since I never create an instance of the abstract class Stack, I do not need to implement its functions. This might seem a bit strange, but remember that the derived classes are the various implementations of Stack, and Stack serves only to reflect the shared behavior between the different implementations.
Also note that the destructor for Stack is a virtual function but the constructor is not. Clearly, when I create an object, I have to know which kind of object it is, whether ArrayStack or ListStack. The compiler makes sure that no one creates an instance of the abstract Stack by mistake - you cannot instantiate any class whose virtual functions are not completely defined (in other words, if any of its functions are set to zero in the class definition).
But when I deallocate an object, I may no longer know its exact type. In the above code, I want to call the destructor for the derived object, even though the code only knows that I am deleting an object of class Stack. If the destructor were not virtual, then the compiler would invoke Stack's destructor, which is not at all what I want. This is an easy mistake to make (I made it in the first draft of this article!) - if you don't define a destructor for the abstract class, the compiler will define one for you implicitly (and by the way, it won't be virtual, since you have a really unhelpful compiler). The result for the above code would be a memory leak, and who knows how you would figure that out!

Shared Implementation

What about sharing code, the other reason for inheritance? In C++, it is possible to use member functions of a base class in its derived class. (You can also share data between a base class and derived classes, but this is a bad idea for reasons I'll discuss later.)Suppose that I wanted to add a new member function, NumberPushed(), to both implementations of Stack. The ArrayStack class already keeps count of the number of items on the stack, so I could duplicate that code in ListStack. Ideally, I'd like to be able to use the same code in both places. With inheritance, we can move the counter into the Stack class, and then invoke the base class operations from the derived class to update the counter.

class Stack {
  public:
    virtual ~Stack();           // deallocate data
    virtual void Push(int value); // Push an integer, checking for overflow.
    virtual bool Full() = 0;    // return TRUE if full
    int NumPushed();            // how many are currently on the stack?
  protected:
    Stack();                    // initialize data
  private:
    int numPushed;
};

Stack::Stack() { 
    numPushed = 0; 
}

void Stack::Push(int value) { 
    numPushed++; 
}

int Stack::NumPushed() { 
    return numPushed; 
}

We can then modify both ArrayStack and ListStack to make use the new behavior of Stack. I'll only list one of them here:

class ArrayStack : public Stack {
  public:
    ArrayStack(int sz);   
    ~ArrayStack();        
    void Push(int value); 
    bool Full();     
  private:
    int size;        // The maximum capacity of the stack.
    int *stack;      // A pointer to an array that holds the contents.
};

ArrayStack::ArrayStack(int sz) : Stack() { 
    size = sz;
    stack = new int[size];   // Let's get an array of integers.
}

void
ArrayStack::Push(int value) {
    ASSERT(!Full());
    stack[NumPushed()] = value;
    Stack::Push();           // invoke base class to increment numPushed
}

There are a few things to note:

The constructor for ArrayStack needs to invoke the constructor for Stack, in order to initialize numPushed. It does that by adding : Stack() to the first line in the constructor:
```
ArrayStack::ArrayStack(int sz) : Stack()
```
The same thing applies to destructors. There are special rules for which get called first - the constructor/destructor for the base class or the constructor/destructor for the derived class. All I should say is, it's a bad idea to rely on whatever the rule is - more generally, it is a bad idea to write code which requires the reader to consult a manual to tell whether or not the code works!
I introduced a new keyword, protected, in the new definition of Stack. For a base class, protected signifies that those member data and functions are accessible to classes derived (recursively) from this class, but inaccessible to other classes. In other words, protected data is public to derived classes, and private to everyone else. For example, we need Stack's constructor to be callable by ArrayStack andListStack, but we don't want anyone else to create instances of Stack. Hence, we make Stack's constructor a protected function. In this case, this is not strictly necessary since the compiler will complain if anyone tries to create an instance of Stack because Stack still has an undefined virtual functions, Push. By defining Stack::Stack as protected, you are safe even if someone comes along later and definesStack::Push.Note however that I made Stack's data member private, not protected. Although there is some debate on this point, as a rule of thumb you should never allow one class to see directly access the data in another, even among classes related by inheritance. Otherwise, if you ever change the implementation of the base class, you will have to examine and change all the implementations of the derived classes, violating modularity.
The interface for a derived class automatically includes all functions defined for its base class, without having to explicitly list them in the derived class. Although we didn't define NumPushed() in ArrayStack, we can still call it for those objects:
```
ArrayStack *s = new ArrayStack(17);

    ASSERT(s->NumPushed() == 0);        // should be initialized to 0
```
Conversely, even though we have defined a routine Stack::Push(), because it is declared as virtual, if we invoke Push() on an ArrayStack object, we will get ArrayStack's version of Push:
```
Stack *s = new ArrayStack(17);

    if (!s->Full())             // ArrayStack::Full
        s->Push(5);             // ArrayStack::Push
```
Stack::NumPushed() is not virtual. That means that it cannot be re-defined by Stack's derived classes. Some people believe that you should mark all functions in a base class as virtual; that way, if you later want to implement a derived class that redefines a function, you don't have to modify the base class to do so.
Member functions in a derived class can explicitly invoke public or protected functions in the base class, by the full name of the function, Base::Function(), as in:
```
void ArrayStack::Push(int value)
{
    ...
    Stack::Push();           // invoke base class to increment numPushed
}
```
Of course, if we just called Push() here (without prepending Stack::, the compiler would think we were referring to ArrayStack's Push(), and so that would recurse, which is not exactly what we had in mind here.

Whew! Inheritance in C++ involves lots and lots of details. But it's real downside is that it tends to spread implementation details across multiple files - if you have a deep inheritance tree, it can take some serious digging to figure out what code actually executes when a member function is invoked.So the question to ask yourself before using inheritance is: what's your goal? Is it to write your programs with the fewest number of characters possible? If so, inheritance is really useful, but so is changing all of your function and variable names to be one letter long - "a", "b", "c" - and once you run out of lower case ones, start using upper case, then two character variable names: "XX XY XZ Ya ..." (I'm joking here.) Needless to say, it is really easy to write unreadable code using inheritance.
So when is it a good idea to use inheritance and when should it be avoided? My rule of thumb is to only use it for representing shared behavior between objects, and to never use it for representing shared implementation. With C++, you can use inheritance for both concepts, but only the first will lead to truly simpler implementations.
To illustrate the difference between shared behavior and shared implementation, suppose you had a whole bunch of different kinds of objects that you needed to put on lists. For example, almost everything in an operating system goes on a list of some sort: buffers, threads, users, terminals, etc.
A very common approach to this problem (particularly among people new to object-oriented programming) is to make every object inherit from a single base class Object, which contains the forward and backward pointers for the list. But what if some object needs to go on multiple lists? The whole scheme breaks down, and it's because we tried to use inheritance to share implementation (the code for the forward and backward pointers) instead of to share behavior. A much cleaner (although slightly slower) approach would be to define a list implementation that allocated forward/backward pointers for each object that gets put on a list.
In sum, if two classes share at least some of the same member function signatures - that is, the same behavior, and if there's code that only relies on the shared behavior, then there may be a benefit to using inheritance. In Nachos, locks don't inherit from semaphores, even though locks are implemented using semaphores. The operations on semaphores and locks are different. Instead, inheritance is only used for various kinds of lists (sorted, keyed, etc.), and for different implementations of the physical disk abstraction, to reflect whether the disk has a track buffer, etc. A disk is used the same way whether or not it has a track buffer; the only difference is in its performance characteristics.

Templates

Templates are another useful but dangerous concept in C++. With templates, you can parameterize a class definition with a type, to allow you to write generic type-independent code. For example, our Stackimplementation above only worked for pushing and popping integers; what if we wanted a stack of characters, or floats, or pointers, or some arbitrary data structure?In C++, this is pretty easy to do using templates:

template <class T> 
class Stack {
  public:
    Stack(int sz);    // Constructor:  initialize variables, allocate space.
    ~Stack();         // Destructor:   deallocate space allocated above.
    void Push(T value); // Push an integer, checking for overflow.
    bool Full();      // Returns TRUE if the stack is full, FALSE otherwise.
  private:
    int size;         // The maximum capacity of the stack.
    int top;          // Index of the lowest unused position.
    T *stack;       // A pointer to an array that holds the contents.
};

To define a template, we prepend the keyword template to the class definition, and we put the parameterized type for the template in angle brackets. If we need to parameterize the implementation with two or more types, it works just like an argument list: template <class T, class S>. We can use the type parameters elsewhere in the definition, just like they were normal types.When we provide the implementation for each of the member functions in the class, we also have to declare them as templates, and again, once we do that, we can use the type parameters just like normal types:

// template version of Stack::Stack
template <class T> 
Stack<T>::Stack(int sz) {
    size = sz;
    top = 0;
    stack = new T[size];   // Let's get an array of type T
}

Infinite Programming Tips

Monday, May 16, 2011

Advanced Concepts in C++

Inheritance

Shared Behavior

Shared Implementation

Templates

No comments:

Post a Comment

Featured Posts

Enable shared folders in ubuntu in vmware?