An end to inheritance
The End Of Object Inheritance & The Beginning Of A New Modularity
- Augie Fackler & Nathaniel Manista
- Google, Inc.
- 25 July 2013
Augie is a contributor to Mercurial and various Python libraries.
Nathaniel is a contributor to Pylint and Tech Lead of Melange, the
application that runs Google Summer of Code and Google Code-in.
Three Premises About Software
These should be completely uncontroversial, but I want
to state them clearly and succinctly since I'm going to rely
on them heavily.
1. We use types for nouns.
Of course they overlap, because at different times we
wish to speak precisely about different aspects of the same
things (and at different granularities of their properties).
2. We express ourselves structurally in code.
We prefer to program as structurally as possible.
When we say structurally, we mean in the shape of the
code. We've found that our intent is most explicit when it's in the
_actual code_, not merely in a docstring or comment. After all, how
many people ignore comments? They've got a good chance of being stale,
even in the presence of regular code review. NEXT: Two parts to this
structure, namespacing via directories and visibility via various
annotations.
We organize code in directories. As further proof of the power
of structure, note how frustrating all those IDEs that try to
completely remove directory structure from your view end up
being. Namespaces are good!
public hello();
protected sekrit();
private cantTouchThis();
def hello(self):
def _sekrit(self):
def __cant_touch_this(self):
We use visibility annotations to reveal only part of our
code. Of course Java's visibility and Python's visibility are
different - Java's is machine-enforced, while Python's is half
mostly machine-enforced (dunder, via name mangling) and half
socially enforced (under). But both are fundamentally different
from visibility specified only by documentation (such as in a
hypothetical language that says "please use the accessor method
rather than altering this field directly").
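To make the dunder/under distinction concrete, here is a minimal sketch. The class name `Greeter` and the return values are our own illustration; only the three method names come from the slide.

```python
class Greeter:
    def hello(self):
        # Public: part of the API.
        return "hello"

    def _sekrit(self):
        # Single underscore: private by convention only.
        return "sekrit"

    def __cant_touch_this(self):
        # Double underscore: inside the class body, Python mangles this
        # name to _Greeter__cant_touch_this.
        return "hammer time"


g = Greeter()
print(g.hello())
print(g._sekrit())                      # works; only convention discourages it
print(hasattr(g, "__cant_touch_this"))  # False: the plain name is not defined
print(g._Greeter__cant_touch_this())    # the mangled name is still reachable
```

So even the "machine-enforced" half is a deterrent rather than a wall, which is why we call it mostly machine-enforced.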
3. Most programming is parametric programming.
The behavior of our classes depends on the behaviors
of their members. The behavior and outputs of our functions
depend on their arguments. Our stateful algorithms depend
on system state. Almost all the programming that we do is
dependent, partial, or abstract in some way. Leaf nodes do
exist - sometimes an operation really is given a timeout of
sixty seconds, full stop, or the message echoed to the user
really is a given string, and certainly pi is pi. But all of
the hard work of programming is done partially. These
premises are VIRTUES.
A Note On Visual Terminology
This is a software architecture talk and we're going
to be talking about how to build complex behaviors out of
simple behaviors. We want you to have no trouble following
along, so we're going to introduce a simple visual grammar
for our illustrations.
An open symbol indicates an abstract description of
some behavior - like an interface or an abstract method. A
closed or filled-in symbol indicates a concrete implementation
of that specification.
We're going to use different tokens to represent
different unrelated behaviors.
So the other day I'm programming...
... and because it's a day ending in "y", my program is
abstract, partial, parameterized programming.
Now orthodox Object Orientation, at least the way we learned it, looks
at this situation and says "that's easy! one class!", and lays it out like this:
we have a single class, and it implements that enclosing interface, and our one
class has public methods conforming to the enclosing interface, and the
implementations of those methods call these abstract methods, and then of course
there are the methods of that innermost stage. If you're using Python, there are
probably underscores here and here. If you're using C++, there's probably "public"
here and "friend" here and here. If you're using Java, there's probably "public"
here and "protected" here and here.
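In Python, the one-class layout might look roughly like the sketch below. All names here (`Widget`, `ShoutingWidget`, and the color-coded methods) are hypothetical, and mapping orange to the outer stage, green to the middle, and blue to the inner is our reading of the picture.

```python
class Widget:
    """One class carrying all three stages (hypothetical example)."""

    # Outer stage: the public interface that clients call.
    def orange_method(self, text):
        return "[" + self._green_method(text) + "]"

    # Middle stage: abstract, to be filled in by extending classes.
    def _green_method(self, text):
        """Override me. Do not call orange_method() in your implementation."""
        raise NotImplementedError()

    # Inner stage: helpers offered to extending classes.
    def _blue_method(self, text):
        return text.upper()


class ShoutingWidget(Widget):
    def _green_method(self, text):
        return self._blue_method(text) + "!"


print(ShoutingWidget().orange_method("hi"))  # [HI!]
```

The underscores mark the stages, but nothing other than the docstring tells the author of `ShoutingWidget` which methods may call which.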
This is a pretty picture but it has both theoretical and
practical problems. Let's take on the practical problems first. The
picture is only pretty when you've gotten it right and it's hard to
get right.
In our case, we've got to export to our clients our three-stage implementation. And
that only happens in annotation, not in the code itself. Visibility helps a little...
def GreenMethod(self):
    """
    ... Do not call OrangeMethod() in
    your implementation ...
    """
    raise NotImplementedError()
...but the rest happens in documentation, if you're lucky. It's
more than likely there will be public methods your code must not
call. Whatever internal staging you have, you've got to both
successfully communicate it to the authors of extending classes and
hope that they'll respect it. How many of you have ever used language
like "reserve the right" in your documentation? As in "these methods
reserve the right to call those other methods, and so those other
methods should not contain calls to these methods"? Next is
explaining is losing.
There's an expression about contemporary American electoral
politics: "Even if you're right, if you're explaining, you're
losing." Between separating and documenting your stages of
abstraction and communicating what must be overridden, and how, and
what may be overridden, and how, properly designing a class for
inheritance is made up entirely of explaining. It's true that the
practical issues can be overcome with experience, mentoring, and
establishment of conventions, but there's also a theoretical problem...
We're violating premises 1 and 2 as we fulfill premise 3.
Who remembers our three premises? One: using types to organize,
two: expressing ourselves primarily structurally, three: most
programming is parametric.
The implementation stages are nouns!
Did you notice that we keep saying "the" when talking about them? They're things!
We describe them. We refer to them. They deserve types.
Give those nouns some types.
Nouns love types.
The implementation stages are structure!
... and we said "we prefer to express ourselves as structurally
as possible". Not with visibility annotations. Not with doc strings
and Javadoc and design docs outside of the code that say what is and
isn't intended, safe, or guaranteed. DO NOT COVER COMPOSITION YET
This may have looked contrived or arbitrary, but I actually claim that our example
here applies to all nondegenerate cases of object inheritance. Every class meant for
inheritance has at least two of these three levels - maybe the abstract part is on top,
maybe on the bottom, maybe there's some bleed between levels (particularly if a method
both appears in the public interface *and* is suitable for use by other methods in the
public interface), but there are no classes in which inheritance is justifiably used that
have only one internal stage of abstraction. Even if all methods are at the same stage,
chances are that the object's fields constitute the second, inner stage.
So what's a way that is compatible with our premises to accomplish the same task?
Composition
"composition over inheritance" is a popular saying, but we keep
seeing inheritance-based code.
It's an explicitly one-way relationship. Traditional OO
features superclasses calling abstract methods that will be filled in
by concrete subclasses. Or their subclasses, etc. Instead, we require
clients of our class to provide the bundle of behavior (some type that
meets our stated requirements) at construction time. In most languages,
this is trivial to check at compile time, and in Python you still get
the benefits of simpler, easier-to-comprehend code.
Note how it's impossible for green to accidentally depend on
orange. Also note how if someday, it turns out blue is wrong for a
customer, we can just pass in something else that implements the
interface.
Sometimes interfaces overlap.
Often one is a superset of the other. If a component
part (from composition) already offers the right behavior, just
delegate to it. Inheritance feels like it gets this for "free",
but that's only true as measured in code on the page.
Composition is _right_, and it's also not explaining, which
is winning. Next is "Fault Tolerance."
Fault Tolerance
Inheritance and composition have very different fault tolerance properties - and I'm
talking about fault tolerance in their architecture, not fault tolerance
of the coded system itself.
Fault (In)Tolerance For Software Architecture
- Calling immediate attention to the error is good.
- Allowing the error to silently pass unremarked is bad.
This is the inversion of the virtue of fault tolerance in the coded
system itself. In an executing system, we think it's good if a small
problem in one small subsystem doesn't crash (or even escalate to)
the full system itself. In a software architecture, we think it's a
good thing if a programming defect *does* crash the whole system as
early and eagerly as possible (ideally parse-time or compile-time).
Object Inheritance Fails At Fault Intolerance
- What happens if an extending class violates its parent's stages of abstraction?
- You might get silent state corruption if the class stores mutable state.
- You might get an infinite loop.
- You might get an infinite loop... sometimes.
- You might get perfect behavior until a year later when the library in which the superclass is defined is upgraded.
(3) could be one of those happens-to-work-but-is-not-supported-to-work
cases that are so derided in short-term development and so crucial in
long-term development.
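Here is a small sketch of one such violation, using hypothetical classes `Base` and `Careless`. CPython happens to turn this particular infinite loop into a `RecursionError`, but only at call time; nothing objects when the class is defined.

```python
class Base:
    def orange_method(self):
        # The public stage "reserves the right" to call green_method().
        return self.green_method() + 1

    def green_method(self):
        raise NotImplementedError()


class Careless(Base):
    def green_method(self):
        # Violates the documented staging: green calls back up into orange.
        return self.orange_method()


# Defining and instantiating Careless raises no complaint at all.
careless = Careless()

try:
    careless.orange_method()
    outcome = "returned"
except RecursionError:
    outcome = "RecursionError"

print(outcome)  # RecursionError
```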
Composition Fails At Fault Tolerance
Doing the illegal is impossible because composition substitutes unidirectional
references for bidirectional references.
Extending classes can't call a method that they're not allowed
to call because they don't have handles to anything that they're not
allowed to call! Next is "make illegal states unrepresentable"
"Make Illegal States Unrepresentable"
(Yaron Minsky)
This is a great idea and it drives very good design in data representation
and value types. We're arguing something analogous which is: [transition to next
slide]
Make Illegal Behavioral Interactions Impossible
"Good code invariably has small methods and small objects. I get lots of resistance
to this idea, especially from experienced developers, but no one thing I do to
systems provides as much help as breaking it into more pieces."
—Kent Beck
This is not a tautology. You can't just rip your code up any which way.
The right way to break code into small methods and small objects
- Break it so that relationships are minimized among the resulting pieces.
- Break it so that unidirectional relationships dominate and bidirectional
relationships are absent.
Examples: breaking a system into a provider and client, caller and callee,
or even a master and two mutually-unaware slaves. NEXT: This is what motivates our premise
This is what motivates our premise about expressing ourselves structurally.
We make an entire large class of defects impossible for ourselves and
our collaborators to make. NEXT: How your code will change.
How You Can Expect Your Code To Change
What will happen if you adopt this?
You'll define types everywhere.
- Interfaces
- Completely abstract classes
Interfaces in the Java sense, so in Python these are completely
abstract classes.
Your clients will use your interfaces and not your classes.
Your classes will start to stick out in your API.
The only thing they'll offer beyond what is specified by
your interfaces is... construction.
You'll stop providing public classes.
Once you define types to describe the values you pass across
API boundaries, communicating the particular class of the value that
implements the type becomes redundant. A client will request of your
code an object of type Foo, and there will be no need for you to fit
in any "oh and by the way this Foo is a NetworkFoo and not a FileFoo".
Your constructors will turn into factory functions.
Because clients only care about the type of the object that
they get from you and not its class, there will be no need to provide
them with constructors of specific classes. Your constructors will
transmute into factory methods and from there into simple functions.
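A sketch of where that ends up, reusing the Foo, NetworkFoo, and FileFoo names from above. The `fetch` method, the factory function names, and the returned strings are assumptions made up for the example.

```python
from abc import ABC, abstractmethod


class Foo(ABC):
    """The public type; clients program against this and nothing else."""

    @abstractmethod
    def fetch(self): ...


class _NetworkFoo(Foo):
    # Private: the leading underscore keeps the class out of the API.
    def __init__(self, host):
        self._host = host

    def fetch(self):
        return "network:" + self._host


class _FileFoo(Foo):
    def __init__(self, path):
        self._path = path

    def fetch(self):
        return "file:" + self._path


def network_foo(host):
    """Returns a Foo; which class implements it is not the client's concern."""
    return _NetworkFoo(host)


def file_foo(path):
    """Returns a Foo backed by a path."""
    return _FileFoo(path)


def describe(foo):
    # Client code: depends only on the Foo type.
    return foo.fetch()


print(describe(network_foo("example.test")))  # network:example.test
print(describe(file_foo("/tmp/data")))        # file:/tmp/data
```

The module's public surface is now one type and two functions.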
Every one of your modules will transform into collections of types and functions.
NEXT: Carmack quote about sometimes only needing a function.
"Sometimes, the elegant implementation is just a function. Not a
method. Not a class. Not a framework. Just a function."
— John Carmack
Sometimes? How about always? NEXT: Don't be afraid of having
powerful functions.
Don't be afraid of winding up with a few very powerful functions.
That's it. That's all it takes to express what at the beginning
of this talk was a large, complicated, traditional class. And this is
an eloquent expression of how I initially described the problem: I can
give an implementation of the enclosing interface if only clients tell
me how to construct an implementation of the middle given an
implementation of the inner. NEXT: value types may survive.
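One way to sketch such a function, using plain callables for brevity. The names `make_orange`, `shouting_green`, and `_upper_blue` are hypothetical, as is the string behavior.

```python
def _upper_blue(text):
    # Inner stage, supplied by this module.
    return text.upper()


def make_orange(make_green):
    """Returns an implementation of the outer (orange) stage.

    make_green is the client's contribution: a function that takes an
    inner-stage (blue) callable and returns a middle-stage (green) callable.
    """
    green = make_green(_upper_blue)

    def orange(text):
        return "[" + green(text) + "]"

    return orange


def shouting_green(blue):
    # Client code: builds the middle stage from the inner stage it is given.
    def green(text):
        return blue(text) + "!"

    return green


orange = make_orange(shouting_green)
print(orange("hi"))  # [HI!]
```

The signature of `make_orange` states the whole contract: tell me how to build the middle from the inner, and I will hand you the outer.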
Value Types
There may be some "dumb data object" types and "value" types
left over in public class form, or these may get swept up too
depending on your vigilance. After this, transition to
anti-rumsfeldianism.
NEXT: "And now for something completely different."
And now for something completely different...
Remember This Guy? He had a way with words.
"We know where they are. They're in the area around Tikrit and Baghdad and east, west, south, and north somewhat."
"... the absence of evidence is not the evidence of absence."
Anti-Rumsfeldian Software Design
"As you know, you go to war with the army you have, not the army you might want or wish to have at a later time."
Does anything like this apply to software?
We've established that we want to build software modules out of
simple collections of input types, functions, and output types. But
they don't fit together! And without fitting together we just have a
bunch of modules, not a single application. That said, don't
clutter the modules just to make them fit together. Always write the complex,
sophisticated, hardworking parts of your software with the types and
dependencies that you wish you had. Never those that you actually
have. Then write adapter functions between the types to glue the
modules together into a single application.
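As a sketch, suppose the hardworking code wants a `Scores` type while the rest of the application only has rows of (name, score) tuples. Every name and the data below are made up for illustration.

```python
from abc import ABC, abstractmethod


class Scores(ABC):
    """The type the hardworking code wishes it had."""

    @abstractmethod
    def values(self): ...


def average(scores):
    # The sophisticated part: written only against Scores.
    vals = scores.values()
    return sum(vals) / len(vals)


class _RowScores(Scores):
    """Presents the rows we actually have as a Scores."""

    def __init__(self, rows):
        self._rows = rows

    def values(self):
        return [score for _, score in self._rows]


def scores_from_rows(rows):
    """Stateless adapter function from (name, score) rows to Scores."""
    return _RowScores(rows)


rows = [("ann", 90), ("bob", 70)]
print(average(scores_from_rows(rows)))  # 80.0
```

`average` never learns that rows exist, and the glue lives in one small, easily replaced function.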
Stateless adapters are cheap, and provide clearer higher-level code.
You read code a lot more than you write it. Having your
high-level logic not be full of asking a lower level for something
complex is valuable. Look at the popularity of ORMs. How many of you
have written SQL-using code without an ORM? It's a mess. Adapters let
you hide the mess. NEXT: Modules assembled into an application.
NEXT: "but I can't define the translation layer!"
...but I can't define the translation layer well!
This probably means you haven't thought the problem space
through enough. In our experience, once the problem is thought through
adequately, the translation layer becomes quite obvious. Note that we
don't always write our code with the adapter in place immediately - we
sometimes defer it until we've figured out the final shape.
Summary
Types, functions, & modules make applications.
Types, Functions, and Modules get stitched together into Applications. This is how
we build big software, and we think you should too.