Review of recursion
Let’s say we want to test the Collatz conjecture (which I’ve mentioned before) but using recursion. Remember that the Collatz conjecture says that if you start with a number n and then
If n is even, divide by 2
If n is odd, multiply by 3 and add 1
And repeat this over and over, eventually, n will equal 1. We’re going to write a recursive function that will return the number of steps that it took to get to 1 (if n is 1 from the start then that’s 0 steps).
int collatz(int n) {
}
What’s our base case?
int collatz(int n) {
if(n == 1)
return 0;
else
...
}
In the recursive case, we have to consider whether n is even or odd:
int collatz(int n) {
if(n == 1)
return 0;
else
if(n % 2 == 0)
return collatz(n / 2);
else
return collatz(3 * n + 1);
}
If we trace through the call collatz(7)
what do we get?
Inheritance
We’ve talked about how, with exceptions, there is a hierarchy so that every
out_of_range
is also logic_error
and in turn a exception
. Similarly,
every ofstream
(output file stream) is also a ostream
which is in turn
an ios
. How are these relationships expressed in code? The answer is
through inheritance. Inheritance is a mechanism by which we say that two
classes are related to each other, either because one uses some of the
implementation of the other (i.e., private
stuff), or because one uses
the interface of the other (public
stuff). Let’s look at an example.
Suppose that every dog
is also and animal
. What attributes do animals
have?
class animal {
public:
string color;
float weight;
};
To say “every dog
is also an animal
” we write
class dog : public animal {
public:
string name;
string breed;
int age;
};
(The public
before animal means that we are “publically inheriting” from
animal
. This just means that dog
isn’t hiding anything that animal
leaves public
.)
One thing to note already: we didn’t give dog
a color
or weight
member.
Was this a mistake?
dog fido;
fido.color = "Brown";
fido.weight = 37;
fido.name = "Fido";
No, this code works just fine. Because every dog
is an animal, and every
animal
as color
and weight
, dog
gets them as well, for free,
while adding its own members (name
, breed
, and age
). This is the first
important fact about inheritance:
A subclass inherits all methods and members from its parent class.
(Here, dog
is the subclass and animal
is the parent class, sometimes called
the superclass.)
This means that we don’t have to repeat ourselves when creating classes that have methods/members in common. (Although normally you shouldn’t use inheritance just to avoid writing extra code; the classes should actually have some kind of conceptual “is-a” relationship.)
Sometimes we say that classes connected by inheritance have an “IS-A”
relationship: a dog
IS-A animal
, a istream
IS-A ios
, and so forth.
This is in constrast to the relationship “HAS-A”: a dog
HAS-A string name
.
A class “has” its data members, but “is” its superclass.
This means that we can build a class in stages, by using an inheritance chain:
class organism {
public:
float weight;
};
class animal : public organism {
public:
string color;
};
class dog : public organism {
public:
string name, breed;
int age;
};
class corgi : public dog {
public:
float derpiness;
};
Pointers and references to base classes
There are two very similar scenarios that have very different meanings:
dog fido;
animal a = fido;
animal* ap = &fido;
animal& ar = fido;
The first creates a copy of a dog
and tries to store it inside an
animal
. But how will that work, an animal
is only big enough to hold
a weight. In this case, C++ performs slicing, which means that it literally
chops off everything that makes a dog
a dog
, keeping only the animal
parts. That’s kind of bad, if we wanted to keep those things. In particular,
this means that if you write a function
void breed(animal a, animal b);
what the function breed
receives are “pure” animals, with no dogginess to
them.
In the the second two cases, we create a animal
pointer/reference to a
dog
. Now, no slicing occurs. Instead, we can think of this as viewing a
dog
through an animal
-shaped window. We can’t use ap
or ar
to access
any of the specifically doggy parts of fido
, but they’re still there.
This will become more important as we add methods.
Virtual methods: overriding behavior
Subclasses inherit the attributes (members) of their parent class. What about behavior (methods)? Subclasses inherit those, too, but with some complexity.
class animal {
public:
bool omnivore() { return false; }
};
class dog : public animal {
...
};
Can we now do?
dog fido;
if(fido.omnivore())
...
Yes, you can. But can dog
change what omnivore
does? (Dogs are omnivores,
so dog::omnivore()
should return true
.) That’s a little more complex.
We can just override omnivore
in dog
:
class dog : public animal {
public:
bool omnivore() { return true; }
};
This will seem to work:
dog fido;
if(fido.omnivore())
cout << "I'll eat that.";
else
cout << "Meat OR veggies, please.";
This will print I'll eat that
, because dogs are omnivores.
But problems arise when we start using pointers/references to fido
:
dog fido;
animal& a = fido;
if(a.omnivore())
cout << "I'll eat that.";
else
cout << "Meat OR veggies, please.";
Remember that a
is not actually an animal
, but rather an animal
-view
into something that is actually a dog
. We would expect that its dogginess
would be preserved but it’s not. In fact, this will print Meat OR veggies
.
The problem is that “normal” methods use the type of the reference or
pointer to determine what to do: we have an animal
reference, so it’s
animal::omnivore
that gets used. Most of the time, this is not what we
want.
However, if we make a simple change to the animal
class:
class animal {
public:
virtual bool omnivore() { return false; }
};
Adding the keyword virtual
tells C++ that when we use this method
through a pointer or reference then it should look at what the object
really is, rather than looking at just the type of the pointer/reference.
This means that when we do
animal& a = fido;
a.omnivore()
omnivore
will look inside a
, see that it is really a dog
(and not just
an animal
), and use dog::omnivore
, returning true
.
Since this is nearly always the behavior you want, you should get in the habit
of writing virtual
before methods if:
You are writing a method in a class.
You intend for subclasses to be able to override that method, changing what it does.
In short:
Virtual methods can be overridden in subclasses, and the subclass’s version will be used, even if accessed through a pointer/reference to the parent class.
Note that if you create a slice, then the “correct” method cannot be used, because it no longer exists:
animal a = fido; // Not a pointer or reference
a.omnivore() // false
After we slice fido
, there’s nothing doggy left about a
, so a.omnivore()
refers to animal::omnivore
.
Polymorphism
Taken together, inheritance and virtual methods give us powerful tools for
writing code that magically “does the right thing” without knowing exactly
what that is. Consider the following problem: we want to create monsters for
a game, where each monster can move and attack. But we want to allow for
different kinds of monsters. We’ll create a base class monster
to
keep track of all the stuff common to all monsters, and then we’ll create
subclasses for each particular type of monster:
class monster {
public:
virtual void move() { }
virtual void attack(monster& target) { }
private:
float health = 100;
};
(Note that the default implementations of both move
and attack
don’t do
anything.)
Note that monster::attack
takes its argument by reference, so that if it
uses any virtual methods on target
, they can be customized by each
monster type.
Now we can proceed to define different kinds of monsters:
class wolf : public monster {
public:
void move() {
pursue_prey();
}
void attack(monster& target) {
?
}
}
Let’s think a bit about who monsters can attack… Just other monsters?
What about the player? If we want to allow for both options, we’ll have to
create a player class and then somehow fit it into a hierarchy, so that
player
and monster
share a common base class. Then we can attack
anything
of that base class. Let’s call the new base class entity
:
class entity {
public:
virtual void hurt(float damage) { }
};
class player : public entity {
public:
void hurt(float damage) {
cout << "Ow!" << endl;
}
};
class monster : public entity {
public:
void attack(entity& target) { }
void hurt(float damage) {
health -= damage;
}
private:
float health;
};
class wolf : public monster {
public:
void attack(entity& target) {
target.hurt(10); // Attack for 10 damage
}
}
One class can inherit from another:
class animal {
public:
float weight;
};
class dog : public animal {
public:
string name;
};
This has two effects:
When we are writing the
dog
class, we can use all the resources fromanimal
to help us do it. E.g., ifanimal
has some methods that are useful, we can use them (or override them, if they are virtual). Thus, inheritance helps with DRY, because we only have to write common things once, in the base class, and then we can use them without repeating them in all subclasses.Outside the classes, anywhere where we expect a
animal*
oranimal&
we can actually give it adog*
/dog&
. This is called subtyping: becausedog
is a subtype ofanimal
(adog
can do anything ananimal
can do), we can use adog
anywhere where ananimal
is expected. Again, this helps with DRY, because all our code outside the class only needs to be written once, foranimal
s and it will automatically work fordog
s, too. We don’t need to write separate functions fordog
s vs.animal
s.
These two things are actually separate in C++: we can get one without the other,
although if we just do “normal” inheritance (i.e., dog : public animal
) we
get both.
Composition
The first effect is what is called composition. This is just a fancy word for using one class to invisibly help in the definition of another. Just as we might use one function to help write another, by calling it, we might use one class to define another, for example, by storing an instance of it. Here’s an example:
class engine {
public:
void throttle(float amount);
};
class car {
public:
void gas(float amount) {
e.throttle(amount);
}
private:
engine e;
};
Here, a car
has an engine
, because cars are (partly) built out of engines.
That is, we form the car
class via composition, using the engine
class.
Note that a car IS NOT AN engine. We can’t use a car somewhere where an
engine is expected. A car doesn’t support the same operations as engines.
So there is no subtype relationship between cars and engines; cars just
use engines, internally.
Another way to phrase this in C++ is via private inheritance:
class engine {
};
class car : private engine {
};
This effectively “brings in” all the members of engine
into car
, but makes
them private
, so that they are only accessible from within car
. So again,
a car does not appear, to the outside world, to be an engine, but from the
perspective of a car, there is an engine present that can be used:
class engine {
public:
void throttle(float amount);
};
class car : private engine {
public:
void gas(float amount) {
throttle(amount);
}
};
Note the only difference between using “pure” composition, where we just
make an engine
a member of car
and this version, using private inheritance:
if we have a member engine e
then we have to call throttle through e
. If
we inherit privately, then the engine is sort of “hidden” in the background,
so we can just call throttle
directly. But note that throttle
is private
in car
, even though it is public in engine! If we try to do
car c;
c.throttle(12);
We’ll get an error, because throttle
is still private.
Using an engine
member is more flexible: e.g., we could have more than
one engine if we wanted to. Using private inheritance is easier to write;
we don’t have to worry about e
, we can just call throttle
directly. Private
inheritance is kind of like bringing in a “library” of tools (consisting of
everything in the class that we privately inherit from) for use within our class.
Private vs. Public inheritance
When we write class dog : public animal
or class car : private engine
we
are telling C++ how to bring in the things from animal
or engine
into
the subclasses dog
/car
. In particular, we need to tell it how to handle
things that are public vs. private: should things that are public still be
public in the subclass, or should they be changed to private?
public
inheritance means that everything inanimal
stays at the same access level: things that are public inanimal
are also public indog
, and likewise things that are private stay private.private
inheritance means that everything inengine
, public and private, becomes private incar
. That is, things that were public inengine
are now private, inside ofcar
. (The things inengine
that were private do not change.)
Note that inheritance can only be used to either keep things the same, or make them more private: there is no way (through inheritance) to make something more public, less private. That involves writing member functions to expose private members.
Abstract base classes
What about the second one, can we create a class that is purely a subtype of another, without getting any of the “guts”? Well, sort of. In order to do this we have to specifically construct our base class so that it has no “guts” in the first place; it only defines an interface, what something looks like, and doesn’t specify any of the details. A class like this, which just describes the methods without saying how they work is called an abstract base class and it looks something like this:
class animal {
public:
virtual void eat() = 0;
...
};
The method eat()
is called “pure virtual”. A class with any pure virtual
methods is an abstract class; in particular, we cannot create an
animal
:
animal a; // ERROR!
And this should make sense to us: raw “animals” do not exist in the world,
only specific types of animals like dogs and pigs and cows. So the class
animal
only describes what an animal looks like, what kinds of operations
(member functions) it supports, but it doesn’t say anything about how they
actually work. That’s left for the subclasses to define:
class dog : public animal {
public:
virtual void eat() {
cout << "Woof!" << endl;
}
};
When a class dog
inherits from animal
it must override (replace) all the
pure virtual methods in animal
, otherwise dog
will still be abstract. Having
given an implementation for eat()
, dog
is now a concrete class so we can
create instances of it:
dog d;
d.eat();
If we can’t create animals, then what good is it? Well again, just because we
can’t create an actual animal
doesn’t mean we can’t store something like a
dog
in an animal*
or animal&
. That will work just fine:
animal& a = d;
a.eat(); // Prints Woof!
This means we can write functions that take animal&
and they will work with
any kind of animal, without knowing what specific kind of animal it is:
void feed_animal(animal& a) {
a.eat();
}
feed_animal(d); // Works just fine
An abstract base class defines an interface, a set of operations that we can expect to be supported. Thus, it fits the second half of our idea of inheritance: subtyping. (But note that nothing stops you from putting things other than pure virtual methods into an abstract class; then subclasses will inherit all of those things as well.)
Polymorphism
Taken together, inheritance and virtual methods give us powerful tools for
writing code that magically “does the right thing” without knowing exactly
what that is. Consider the following problem: we want to create monsters for
a game, where each monster can move and attack. But we want to allow for
different kinds of monsters. We’ll create a base class monster
to
keep track of all the stuff common to all monsters, and then we’ll create
subclasses for each particular type of monster:
class monster {
public:
virtual void move() { }
virtual void attack(monster& target) { }
private:
float health = 100;
};
(Note that the default implementations of both move
and attack
don’t do
anything.)
Note that monster::attack
takes its argument by reference, so that if it
uses any virtual methods on target
, they can be customized by each
monster type.
Now we can proceed to define different kinds of monsters:
class wolf : public monster {
public:
void move() {
pursue_prey();
}
void attack(monster& target) {
?
}
}
Let’s think a bit about who monsters can attack… Just other monsters?
What about the player? If we want to allow for both options, we’ll have to
create a player class and then somehow fit it into a hierarchy, so that
player
and monster
share a common base class. Then we can attack
anything
of that base class. Let’s call the new base class entity
:
class entity {
public:
virtual void hurt(float damage) { }
};
class player : public entity {
public:
void hurt(float damage) {
cout << "Ow!" << endl;
}
};
class monster : public entity {
public:
void attack(entity& target) { }
void hurt(float damage) {
health -= damage;
}
private:
float health;
};
class wolf : public monster {
public:
void attack(entity& target) {
target.hurt(10); // Attack for 10 damage
}
}
Abstract Data Types
A type like entity
or monster
or point2d
or almost any class is called
an abstract data type. It’s abstract in the sense that the ADT manages itself:
we don’t deal with the guts of the ADT (those are private), indeed, we don’t need
to know anything about how it works internally to use it. That’s the power
of ADTs: they free us from having to think about those things. We just think
about how to use them, not about how they work.
I should stress that ADTs are not classes; an ADT is an idea. But in C++, classes are how we build ADTs. (If we didn’t have classes, we could still have ADTs, we’d just have to build them some other way.)
The hallmark of an ADT is encapsulation and/or data hiding. That is, we either wrap up some data into a single object, or we hide some data away, or both.
As an example of a “complete” ADT, let’s look back to our polynomial class and think about what it would take to make it a first-class citizen in C++ land.
Overload all the arithmetic operators. Use
<<
and>>
to implement shifting.Overload insertion and extraction.
Maybe build a class hierarchy for “algebraic structures”, things that support arithmetic operators and variables. Base class methods would include insertion and extractions, and evaluation at a point (by giving) a value for the variable. Maybe we want to distinguish between univariate and multivariate objects? Domain/range (floats vs. ints).
class poly {
public:
poly() { }
poly(float v) {
coeff.push_back(v);
}
explicit poly(int degree) {
coeff.resize(degree, 0);
}
int degree() {
return coeff.size() - 1;
}
void expand(int degree) {
coeff.resize(degree, 0);
}
void normalize() {
int i = degree();
while(i >= 0) {
if(coeff.at(i) != 0.0)
break;
i--;
}
coeff.resize(i + 1);
}
poly operator- () {
for(float& v : coeffs)
v = -v;
}
poly operator+ (poly b) {
poly result{max(degree(), b.degree())};
expand(result.degree());
b.expand(result.degree());
for(int i = 0; i <= result.degree(); ++i)
result.coeff.at(i) = coeff.at(i) + b.coeff.at(i);
normalize();
b.normalize();
result.normalize();
return result;
}
poly operator- (poly b) {
return (*this) + -b;
}
private:
vector<float> coeff;
}
Inheritance for abstraction
Inheritance and virtual methods give us another way to do abstraction: by
writing a function that takes a reference or pointer to a base class, we can
pass it any instances of a subclass. Thus, we write one function, but it can
operate on many different kinds of things (including things that haven’t even
been written yet!). This is how functions on istream
and ostream
help us:
we don’t have to write separate functions for files, cin
/cout
, string streams,
etc.
As an example, let’s look at animals again:
class animal ;
class dog : public animal ;
class cat : public animal ;
Now suppose I want to create a class for pet owners. I don’t want to nail myself down to just cat owners or just dog owners. How can I do this?
class owner {
public:
owner(string n) {
name = n;
}
string name;
vector<animal*> pets;
};
Now an owner
can own any combination of cat
s, dog
s, or even other animal
s
(including subclasses of animal
that we haven’t even written yet).
cat whiskers{"Whiskers", "calico"};
dog fido{"Fido", "Corgi", "brown"};
owner jsmith{"John Smith"};
jsmith.pets.push_back(&whiskers);
jsmith.pets.push_back(&fido);
The catch is that the owner
class (or anything else that deals purely in
animal*
) doesn’t really know what kinds of animals
it owns: it has no real way of finding out what the “real” type of a given
animal*
is. Instead, if we want to do something different for cats vs. dogs,
we have to write it as a virtual method on animal
and then override it on
both cat
and dog
. E.g., suppose we wanted to print out an animal. You might
think of writing a function like this:
void print(animal* a) {
if(a ISA cat)
// Print cat details
else if(a ISA dog)
// Print dog details
else ...
// Etc. for all other animal subclasses
}
The problem with this is that it is brittle; we have to add extra branches to
the if-else
chain whenever we add a new subclass of animal
.
The proper way to do this is to not even try to “centralize” printing like this:
instead, we let each animal
subclass be responsible for its own printing:
class animal {
virtual void print() = 0;
};
class dog : public animal {
virtual void print() {
// print dog details
}
};
class cat : public animal {
virtual void print() {
// print cat details
}
};
This way, each class is responsible for its own printing behavior. There is no
central print
function, but if we have an animal* a
then we can do
a->print();
and it will be printed, regardless of what particular kind of animal it is.
In addition, if we want to add another animal subclass, we only need to give
that subclass its own print
method; we don’t need to change an existing
function, just add a new one.
Ideally, the owner
class shouldn’t have anything in it that is specific to
cats or dogs. Instead, whenever we find ourselves wanting to do some
dog- or cat-specific operation, we should think about how we could make that
operation into a virtual method on animal
. If it’s something that really only
applies to cats or dogs, that’s OK! Just give the other subclasses do-nothing
versions of the method. (This is the reason for all the operations on ios
that
are input- or output- specific; they just have do-nothing implementations on the
subclasses where they don’t apply.)