Review of recursion

Let’s say we want to test the Collatz conjecture (which I’ve mentioned before) but using recursion. Remember that the Collatz conjecture says that if you start with a number n and then

If n is even, divide by 2
If n is odd, multiply by 3 and add 1

And repeat this over and over, eventually, n will equal 1. We’re going to write a recursive function that will return the number of steps that it took to get to 1 (if n is 1 from the start then that’s 0 steps).

int collatz(int n) {

}

What’s our base case?

int collatz(int n) {
    if(n == 1)
        return 0;
    else
        ...
}

In the recursive case, we have to consider whether n is even or odd:

int collatz(int n) {
    if(n == 1)
        return 0;
    else
        if(n % 2 == 0)
            return collatz(n / 2);
        else
            return collatz(3 * n + 1);
}

If we trace through the call collatz(7) what do we get?

Inheritance

We’ve talked about how, with exceptions, there is a hierarchy so that every out_of_range is also logic_error and in turn a exception. Similarly, every ofstream (output file stream) is also a ostream which is in turn an ios. How are these relationships expressed in code? The answer is through inheritance. Inheritance is a mechanism by which we say that two classes are related to each other, either because one uses some of the implementation of the other (i.e., private stuff), or because one uses the interface of the other (public stuff). Let’s look at an example.

Suppose that every dog is also and animal. What attributes do animals have?

class animal {
  public:
    string color;
    float weight; 
};

To say “every dog is also an animal” we write

class dog : public animal {
  public:
    string name;
    string breed;
    int age;  
};

(The public before animal means that we are “publically inheriting” from animal. This just means that dog isn’t hiding anything that animal leaves public.)

One thing to note already: we didn’t give dog a color or weight member. Was this a mistake?

dog fido;
fido.color = "Brown";
fido.weight = 37;
fido.name = "Fido";

No, this code works just fine. Because every dog is an animal, and every animal as color and weight, dog gets them as well, for free, while adding its own members (name, breed, and age). This is the first important fact about inheritance:

A subclass inherits all methods and members from its parent class.

(Here, dog is the subclass and animal is the parent class, sometimes called the superclass.)

This means that we don’t have to repeat ourselves when creating classes that have methods/members in common. (Although normally you shouldn’t use inheritance just to avoid writing extra code; the classes should actually have some kind of conceptual “is-a” relationship.)

Sometimes we say that classes connected by inheritance have an “IS-A” relationship: a dog IS-A animal, a istream IS-A ios, and so forth. This is in constrast to the relationship “HAS-A”: a dog HAS-A string name. A class “has” its data members, but “is” its superclass.

This means that we can build a class in stages, by using an inheritance chain:

class organism {
  public:
    float weight;
};

class animal : public organism {
  public:
    string color;
};

class dog : public organism {
  public:
    string name, breed;
    int age;
};

class corgi : public dog {
  public:
    float derpiness; 
};

Pointers and references to base classes

There are two very similar scenarios that have very different meanings:

dog fido;
animal a = fido; 

animal* ap = &fido;
animal& ar = fido;

The first creates a copy of a dog and tries to store it inside an animal. But how will that work, an animal is only big enough to hold a weight. In this case, C++ performs slicing, which means that it literally chops off everything that makes a dog a dog, keeping only the animal parts. That’s kind of bad, if we wanted to keep those things. In particular, this means that if you write a function

void breed(animal a, animal b);

what the function breed receives are “pure” animals, with no dogginess to them.

In the the second two cases, we create a animal pointer/reference to a dog. Now, no slicing occurs. Instead, we can think of this as viewing a dog through an animal-shaped window. We can’t use ap or ar to access any of the specifically doggy parts of fido, but they’re still there. This will become more important as we add methods.

Virtual methods: overriding behavior

Subclasses inherit the attributes (members) of their parent class. What about behavior (methods)? Subclasses inherit those, too, but with some complexity.

class animal {
  public:
    bool omnivore() { return false; }
};

class dog : public animal {
  ...
};

Can we now do?

dog fido;

if(fido.omnivore())
  ...

Yes, you can. But can dog change what omnivore does? (Dogs are omnivores, so dog::omnivore() should return true.) That’s a little more complex.

We can just override omnivore in dog:

class dog : public animal {
  public: 
    bool omnivore() { return true; } 
};

This will seem to work:

dog fido;
if(fido.omnivore())
  cout << "I'll eat that.";
else
  cout << "Meat OR veggies, please.";

This will print I'll eat that, because dogs are omnivores.

But problems arise when we start using pointers/references to fido:

dog fido;
animal& a = fido;

if(a.omnivore())
  cout << "I'll eat that.";
else
  cout << "Meat OR veggies, please.";

Remember that a is not actually an animal, but rather an animal-view into something that is actually a dog. We would expect that its dogginess would be preserved but it’s not. In fact, this will print Meat OR veggies. The problem is that “normal” methods use the type of the reference or pointer to determine what to do: we have an animal reference, so it’s animal::omnivore that gets used. Most of the time, this is not what we want.

However, if we make a simple change to the animal class:

class animal {
  public:
    virtual bool omnivore() { return false; }
};

Adding the keyword virtual tells C++ that when we use this method through a pointer or reference then it should look at what the object really is, rather than looking at just the type of the pointer/reference. This means that when we do

animal& a = fido;
a.omnivore()

omnivore will look inside a, see that it is really a dog (and not just an animal), and use dog::omnivore, returning true.

Since this is nearly always the behavior you want, you should get in the habit of writing virtual before methods if:

You are writing a method in a class.
You intend for subclasses to be able to override that method, changing what it does.

In short:

Virtual methods can be overridden in subclasses, and the subclass’s version will be used, even if accessed through a pointer/reference to the parent class.

Note that if you create a slice, then the “correct” method cannot be used, because it no longer exists:

animal a = fido; // Not a pointer or reference
a.omnivore()     // false

After we slice fido, there’s nothing doggy left about a, so a.omnivore() refers to animal::omnivore.

Polymorphism

Taken together, inheritance and virtual methods give us powerful tools for writing code that magically “does the right thing” without knowing exactly what that is. Consider the following problem: we want to create monsters for a game, where each monster can move and attack. But we want to allow for different kinds of monsters. We’ll create a base class monster to keep track of all the stuff common to all monsters, and then we’ll create subclasses for each particular type of monster:

class monster {
  public:
    virtual void move() { } 
    virtual void attack(monster& target) { } 

  private:
    float health = 100; 
};

(Note that the default implementations of both move and attack don’t do anything.)

Note that monster::attack takes its argument by reference, so that if it uses any virtual methods on target, they can be customized by each monster type.

Now we can proceed to define different kinds of monsters:

class wolf : public monster {
  public:
    void move() {
      pursue_prey();
    }

    void attack(monster& target) {
      ?
    }
}

Let’s think a bit about who monsters can attack… Just other monsters? What about the player? If we want to allow for both options, we’ll have to create a player class and then somehow fit it into a hierarchy, so that player and monster share a common base class. Then we can attack anything of that base class. Let’s call the new base class entity:

class entity {
  public:
    virtual void hurt(float damage) { }
};

class player : public entity { 
  public:
    void hurt(float damage) {
      cout << "Ow!" << endl;
    }
};

class monster : public entity {
  public:
    void attack(entity& target) { } 
    void hurt(float damage) {
      health -= damage;
    }
  private:
    float health;
};

class wolf : public monster {
  public:
    void attack(entity& target) {
      target.hurt(10); // Attack for 10 damage
    }
}

One class can inherit from another:

class animal {
  public:
    float weight;
};

class dog : public animal {
  public:
    string name;
};

This has two effects:

When we are writing the dog class, we can use all the resources from animal to help us do it. E.g., if animal has some methods that are useful, we can use them (or override them, if they are virtual). Thus, inheritance helps with DRY, because we only have to write common things once, in the base class, and then we can use them without repeating them in all subclasses.
Outside the classes, anywhere where we expect a animal* or animal& we can actually give it a dog*/dog&. This is called subtyping: because dog is a subtype of animal (a dog can do anything an animal can do), we can use a dog anywhere where an animal is expected. Again, this helps with DRY, because all our code outside the class only needs to be written once, for animals and it will automatically work for dogs, too. We don’t need to write separate functions for dogs vs. animals.

These two things are actually separate in C++: we can get one without the other, although if we just do “normal” inheritance (i.e., dog : public animal) we get both.

Composition

The first effect is what is called composition. This is just a fancy word for using one class to invisibly help in the definition of another. Just as we might use one function to help write another, by calling it, we might use one class to define another, for example, by storing an instance of it. Here’s an example:

class engine {
  public:
    void throttle(float amount);  
};

class car {
  public:
    void gas(float amount) {
        e.throttle(amount);
    }  
  private:
    engine e;
};

Here, a car has an engine, because cars are (partly) built out of engines. That is, we form the car class via composition, using the engine class. Note that a car IS NOT AN engine. We can’t use a car somewhere where an engine is expected. A car doesn’t support the same operations as engines. So there is no subtype relationship between cars and engines; cars just use engines, internally.

Another way to phrase this in C++ is via private inheritance:

class engine {

};

class car : private engine {

};

This effectively “brings in” all the members of engine into car, but makes them private, so that they are only accessible from within car. So again, a car does not appear, to the outside world, to be an engine, but from the perspective of a car, there is an engine present that can be used:

class engine {
  public:
    void throttle(float amount);
};

class car : private engine {
  public:
    void gas(float amount) {
        throttle(amount);
    }
};

Note the only difference between using “pure” composition, where we just make an engine a member of car and this version, using private inheritance: if we have a member engine e then we have to call throttle through e. If we inherit privately, then the engine is sort of “hidden” in the background, so we can just call throttle directly. But note that throttle is private in car, even though it is public in engine! If we try to do

car c;
c.throttle(12);

We’ll get an error, because throttle is still private.

Using an engine member is more flexible: e.g., we could have more than one engine if we wanted to. Using private inheritance is easier to write; we don’t have to worry about e, we can just call throttle directly. Private inheritance is kind of like bringing in a “library” of tools (consisting of everything in the class that we privately inherit from) for use within our class.

Private vs. Public inheritance

When we write class dog : public animal or class car : private engine we are telling C++ how to bring in the things from animal or engine into the subclasses dog/car. In particular, we need to tell it how to handle things that are public vs. private: should things that are public still be public in the subclass, or should they be changed to private?

public inheritance means that everything in animal stays at the same access level: things that are public in animal are also public in dog, and likewise things that are private stay private.
private inheritance means that everything in engine, public and private, becomes private in car. That is, things that were public in engine are now private, inside of car. (The things in engine that were private do not change.)

Note that inheritance can only be used to either keep things the same, or make them more private: there is no way (through inheritance) to make something more public, less private. That involves writing member functions to expose private members.

Abstract base classes

What about the second one, can we create a class that is purely a subtype of another, without getting any of the “guts”? Well, sort of. In order to do this we have to specifically construct our base class so that it has no “guts” in the first place; it only defines an interface, what something looks like, and doesn’t specify any of the details. A class like this, which just describes the methods without saying how they work is called an abstract base class and it looks something like this:

class animal {
  public:
    virtual void eat() = 0; 
    ...
};

The method eat() is called “pure virtual”. A class with any pure virtual methods is an abstract class; in particular, we cannot create an animal:

animal a; // ERROR!

And this should make sense to us: raw “animals” do not exist in the world, only specific types of animals like dogs and pigs and cows. So the class animal only describes what an animal looks like, what kinds of operations (member functions) it supports, but it doesn’t say anything about how they actually work. That’s left for the subclasses to define:

class dog : public animal {
  public:
    virtual void eat() {
        cout << "Woof!" << endl;
    }
};

When a class dog inherits from animal it must override (replace) all the pure virtual methods in animal, otherwise dog will still be abstract. Having given an implementation for eat(), dog is now a concrete class so we can create instances of it:

dog d;
d.eat();

If we can’t create animals, then what good is it? Well again, just because we can’t create an actual animal doesn’t mean we can’t store something like a dog in an animal* or animal&. That will work just fine:

animal& a = d;
a.eat(); // Prints Woof!

This means we can write functions that take animal& and they will work with any kind of animal, without knowing what specific kind of animal it is:

void feed_animal(animal& a) {
  a.eat();
}

feed_animal(d); // Works just fine

An abstract base class defines an interface, a set of operations that we can expect to be supported. Thus, it fits the second half of our idea of inheritance: subtyping. (But note that nothing stops you from putting things other than pure virtual methods into an abstract class; then subclasses will inherit all of those things as well.)

Polymorphism

class monster {
  public:
    virtual void move() { } 
    virtual void attack(monster& target) { } 

  private:
    float health = 100; 
};

(Note that the default implementations of both move and attack don’t do anything.)

Note that monster::attack takes its argument by reference, so that if it uses any virtual methods on target, they can be customized by each monster type.

Now we can proceed to define different kinds of monsters:

class wolf : public monster {
  public:
    void move() {
      pursue_prey();
    }

    void attack(monster& target) {
      ?
    }
}

class entity {
  public:
    virtual void hurt(float damage) { }
};

class player : public entity { 
  public:
    void hurt(float damage) {
      cout << "Ow!" << endl;
    }
};

class monster : public entity {
  public:
    void attack(entity& target) { } 
    void hurt(float damage) {
      health -= damage;
    }
  private:
    float health;
};

class wolf : public monster {
  public:
    void attack(entity& target) {
      target.hurt(10); // Attack for 10 damage
    }
}

Abstract Data Types

A type like entity or monster or point2d or almost any class is called an abstract data type. It’s abstract in the sense that the ADT manages itself: we don’t deal with the guts of the ADT (those are private), indeed, we don’t need to know anything about how it works internally to use it. That’s the power of ADTs: they free us from having to think about those things. We just think about how to use them, not about how they work.

I should stress that ADTs are not classes; an ADT is an idea. But in C++, classes are how we build ADTs. (If we didn’t have classes, we could still have ADTs, we’d just have to build them some other way.)

The hallmark of an ADT is encapsulation and/or data hiding. That is, we either wrap up some data into a single object, or we hide some data away, or both.

As an example of a “complete” ADT, let’s look back to our polynomial class and think about what it would take to make it a first-class citizen in C++ land.

Overload all the arithmetic operators. Use << and >> to implement shifting.
Overload insertion and extraction.
Maybe build a class hierarchy for “algebraic structures”, things that support arithmetic operators and variables. Base class methods would include insertion and extractions, and evaluation at a point (by giving) a value for the variable. Maybe we want to distinguish between univariate and multivariate objects? Domain/range (floats vs. ints).

class poly {
  public:
    poly() { }
    poly(float v) {
      coeff.push_back(v);
    }
    explicit poly(int degree) {
      coeff.resize(degree, 0);
    }

    int degree() {
      return coeff.size() - 1;
    }

    void expand(int degree) {
      coeff.resize(degree, 0);
    }

    void normalize() {
      int i = degree();
      while(i >= 0) {
        if(coeff.at(i) != 0.0)
          break;
        i--;
      }
      coeff.resize(i + 1);
    }

    poly operator- () {
      for(float& v : coeffs)
        v = -v;
    }

    poly operator+ (poly b) {
      poly result{max(degree(), b.degree())};

      expand(result.degree());
      b.expand(result.degree());

      for(int i = 0; i <= result.degree(); ++i)
        result.coeff.at(i) = coeff.at(i) + b.coeff.at(i);

      normalize();
      b.normalize();
      result.normalize();

      return result;
    }

    poly operator- (poly b) {
      return (*this) + -b;
    }

    private:
      vector<float> coeff;
}

Inheritance for abstraction

Inheritance and virtual methods give us another way to do abstraction: by writing a function that takes a reference or pointer to a base class, we can pass it any instances of a subclass. Thus, we write one function, but it can operate on many different kinds of things (including things that haven’t even been written yet!). This is how functions on istream and ostream help us: we don’t have to write separate functions for files, cin/cout, string streams, etc.

As an example, let’s look at animals again:

class animal ;
class dog : public animal ;
class cat : public animal ;

Now suppose I want to create a class for pet owners. I don’t want to nail myself down to just cat owners or just dog owners. How can I do this?

class owner {
  public:

    owner(string n) {
      name = n;
    }

    string name;
    vector<animal*> pets;
};

Now an owner can own any combination of cats, dogs, or even other animals (including subclasses of animal that we haven’t even written yet).

cat whiskers{"Whiskers", "calico"};
dog fido{"Fido", "Corgi", "brown"};

owner jsmith{"John Smith"};
jsmith.pets.push_back(&whiskers);
jsmith.pets.push_back(&fido);

The catch is that the owner class (or anything else that deals purely in animal*) doesn’t really know what kinds of animals it owns: it has no real way of finding out what the “real” type of a given animal* is. Instead, if we want to do something different for cats vs. dogs, we have to write it as a virtual method on animal and then override it on both cat and dog. E.g., suppose we wanted to print out an animal. You might think of writing a function like this:

void print(animal* a) {
  if(a ISA cat)
    // Print cat details
  else if(a ISA dog)
    // Print dog details
  else ...
    // Etc. for all other animal subclasses
}

The problem with this is that it is brittle; we have to add extra branches to the if-else chain whenever we add a new subclass of animal.

The proper way to do this is to not even try to “centralize” printing like this: instead, we let each animal subclass be responsible for its own printing:

class animal {
  virtual void print() = 0;
};

class dog : public animal {
  virtual void print() { 
    // print dog details
  }
};

class cat : public animal {
  virtual void print() {
    // print cat details
  }
};

This way, each class is responsible for its own printing behavior. There is no central print function, but if we have an animal* a then we can do

a->print();

and it will be printed, regardless of what particular kind of animal it is. In addition, if we want to add another animal subclass, we only need to give that subclass its own print method; we don’t need to change an existing function, just add a new one.

Ideally, the owner class shouldn’t have anything in it that is specific to cats or dogs. Instead, whenever we find ourselves wanting to do some dog- or cat-specific operation, we should think about how we could make that operation into a virtual method on animal. If it’s something that really only applies to cats or dogs, that’s OK! Just give the other subclasses do-nothing versions of the method. (This is the reason for all the operations on ios that are input- or output- specific; they just have do-nothing implementations on the subclasses where they don’t apply.)