The End Of Object Inheritance & The Beginning Of A New Modularity

Augie is a contributor to Mercurial and various Python libraries. Nathaniel is a contributor to Pylint and Tech Lead of Melange, the application that runs Google Summer of Code and Google Code-in.

Three Premises About Software

These should be completely uncontroversial, but I want to state them clearly and succinctly since I'm going to rely on them heavily.

 1. We use types for nouns.

Of course they overlap, because at different times we wish to speak precisely about different aspects of the same things (and at different granularities of their properties).

 2. We express ourselves structurally in code.

We prefer to program as structurally as possible. When we say structurally, we mean in the shape of the code. We've found that our intent is most explicit when it's in the _actual code_, not merely in a docstring or comment. After all, how many people ignore comments? They've got a good chance of being stale, even in the presence of regular code review. NEXT: Two parts to this structure, namespacing via directories and visibility via various annotations.

Directory listing screenshot

We organize code in directories. As further proof of the power of structure, note how frustrating all those IDEs that try to completely remove directory structure from your view end up being. Namespaces are good!

public hello();
protected sekrit();
private cantTouchThis();

 


def hello(self):
def _sekrit(self):
def __cant_touch_this(self):
We use visibility annotations to reveal only part of our code. Of course Java's visibility and Python's visibility are different - one is machine-enforced and the other is half mostly-machine-enforced (dunder) and half socially-enforced (under). But both are fundamentally different from visibility specified only by documentation (such as in a hypothetical language that says "please use the accessor method rather than altering this field directly").

goto


if


for

 3. Most programming is parametric programming.

The behavior of our classes depends on the behaviors of their members. The behavior and outputs of our functions depend on their arguments. Our stateful algorithms depend on system state. Almost all the programming that we do is dependent, partial, or abstract in some way. Leaf nodes do exist - sometimes an operation really is given a timeout of sixty seconds, full stop, or the message echoed to the user really is a given string, and certainly pi is pi. But all of the hard work of programming is done partially. These premises are VIRTUES.

A Note On Visual Terminology

This is a software architecture talk and we're going to be talking about how to build complex behaviors out of simple behaviors. We want you to have no trouble following along, so we're going to introduce a simple visual grammar for our illustrations.

Open And Closed Symbols

An open symbol indicates an abstract description of some behavior - like an interface or an abstract method. A closed or filled-in symbol indicates a concrete implementation of that specification.

Different Symbols

We're going to use different tokens to represent different unrelated behaviors.

So the other day I'm programming...

... and because it's a day ending in "y", my program is abstract, partial, parameterized programming.

Problem Statement

My Solution

Orthodox Inheritance Solution

Now orthodox Object Orientation, at least the way we learned it, looks at this situation and says "that's easy! one class!", and lays it out like this: we have a single class, and it implements that enclosing interface, and our one class has public methods conforming to the enclosing interface, and the implementations of those methods call these abstract methods, and then of course there are the methods of that innermost stage. If you're using Python, there are probably underscores here and here. If you're using C++, there's probably "public" here and "friend" here and here. If you're using Java, there's probably "public" here and "protected" here and here.

Orthodox Inheritance Solution (Subclass)

This is a pretty picture but it has both theoretical and practical problems. Let's take on the practical problems first. The picture is only pretty when you've gotten it right and it's hard to get right.

Orthodox Inheritance Solution

in our case, we've got to export to our clients our three-stage implementation. And that only happens in annotation, not in the code itself. Visibility helps a little...

def GreenMethod(self):
  """
  ... Do not call OrangeMethod() in
  your implementation ...
  """
  raise NotImplementedError()
...but the rest happens in documentation, if you're lucky. It's more than likely there will be public methods your code must not call. Whatever internal staging you have, you've got to both successfully communicate it to the authors of extending classes and hope that they'll respect it. How many of you have ever used language like "reserve the right" in your documentation? As in "these methods reserve the right to call those other methods, and so those other methods should not contain calls to these methods"? Next is explaining is losing.

Butterfly ballot

There's an expression about contemporary American electoral politics: "Even if you're right, if you're explaining, you're losing.". Between separating and documenting your stages of abstraction and communicating what must be overridden, and how, and what may be overridden, and how, properly designing a class for inheritance is made up entirely of explaining. It's true that the practical issues can be overcome with experience, mentoring, and establishment of conventions, but there's also a theoretical problem...

We're violating premises 1 and 2 as we fulfill premise 3.

Who remembers our three premises? One: using types to organize, two: expressing ourselves primarily structurally, three: most programming is parametric

The implementation stages are nouns!

Did you notice that we keep saying "the" when talking about them? They're things! We describe them. We refer to them. They deserve types.

Give those nouns some types.

Nouns love types.

The implementation stages are structure!

... and we said "we prefer to express ourselves as structurally as possible". Not with visibility annotations. Not with doc strings and javadoc and design docs outside of the code that say what is and isn't our intended, safe, or guaranteed. DO NOT COVER COMPOSITION YET

Orthodox Inheritance Solution

This may have looked contrived or arbitrary, but I actually claim that our example here applies to all nondegenerate cases of object inheritance. Every class meant for inheritance has at least two of these three levels - maybe the abstract part is on top, maybe on the bottom, maybe there's some bleed between levels (particularly if a method both appears in the public interface *and* is suitable for use by other methods in the public interface), but there are no classes in which inheritance is justifiably used that have only one internal stage of abstraction. Even if all methods are at the same stage, chances are that the object's fields constitute the second, inner stage.

So what's a way that is compatible with our premises to accomplish the same task?

Composition

"composition over inheritance" is a popular saying, but we keep seeing inheritance based code. It's an explicitly one-way relationship. Traditional OO features superclasses calling abstract methods that will be filled in by concrete subclasses. Or their subclasses, etc. Instead, we require clients of our class to provide the bundle of behavior (some type that meets our stated requirements) at compile time. In most languages, this is trivial to check at compile time, and in Python you still get the benefits of simpler, easier-to-comprehend code.

Interfaces

Our Implementation

Client Implementation

Note how it's impossible for green to accidentally depend on orange. Also note how if someday, it turns out blue is wrong for a customer, we can just pass in something else that implements the interface.

Sometimes interfaces overlap.

Often one is a superset of the other. If a component part (from composition) already offers the right behavior, just delegate to it. Inheritance feels like it gets this for "free", but that's only true as measured in code on the page. Composition is _right_, and it's also not explaining, which is winning. Next is "Fault Tolerance."

Fault Tolerance

They have very different fault tolerance properties - and I'm talking about fault tolerance in their architecture, not fault tolerance of the coded system itself.

Fault (In)Tolerance For Software Architecture

  1. Calling immediate attention to the error is good.
  2. Allowing the error to silently pass unremarked is bad.
This is the inversion of fault tolerance virtue in the coded system itself. In an executing system, we think it's good if a small problem in one small subsystem doesn't crash (or even escalate to) the full system itself. In a software architecture, we think it's a good thing if a programming defect *does* crash the whole system as early and eagerly as possible (ideally parse-time or compile-time).

Object Inheritance Fails At Fault Intolerance

(3) could be one of those happens-to-work-but-is-not-supported-to-work cases that are so derided in short-term development and so crucial in long-term development.

Composition Fails At Fault Tolerance

Doing the illegal is impossible because composition substitutes unidirectional references for bidirectional references.

Extending classes can't call a method that they're not allowed to call because they don't have handles to anything that they're not allowed to call! Next is "make illegal states unrepresentable"

"Make Illegal States Unrepresentable"

(Yaron Minsky)

This is a great idea and it drives very good design in data representation and value types. We're arguing something analogous which is: [transition to next slide]

Make Illegal Behavioral Interactions Impossible

"Good code invariably has small methods and small objects. I get lots of resistance to this idea, especially from experienced developers, but no one thing I do to systems provides as much help as breaking it into more pieces."

—Kent Beck

This is not a tautology. You can't just rip your code up any which way.

The right way to break code into small methods and small objects

  1. Break it so that relationships are minimized among the resulting pieces.
  2. Break it so that unidirectional relationships dominate and bidirectional relationships are absent.
Examples: breaking a system into a provider and client, caller and callee, or even a master and two mutually-unaware slaves. NEXT: This is what motivates our premise

before

after

This is what motivates our premise about expressing ourselves structurally.

We make an entire large class of defects impossible for ourselves and our collaborators to make. NEXT: How your code will change.

How You Can Expect Your Code To Change

What will happen if you adopt this?

You'll define types everywhere.

Interfaces in the java sense, so in python these are completely abstract classes.

Your clients will use your interfaces and not your classes.

Your classes will start to stick out in your API.

The only things they'll offer beyond what is specified by your interfaces is... construction.

You'll stop providing public classes.

Once you define types to describe the values you pass across API boundaries, communicating the particular class of the value that implements the type becomes redundant. A client will request of your code an object of type Foo, and there will be no need for you to fit in any "oh and by the way this Foo is a NetworkFoo and not a FileFoo".

Your constructors will turn into factory functions.

Because clients only care about the type of the object that they get from you and not its class, there will be no need to provide them with constructors of specific classes. Your constructors will transmute into factory methods and from there into simple functions.

Every one of your modules will transform into collections of types and functions.

NEXT: Carmack quote about sometimes only needing a function.

"Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."

— John Carmack

Sometimes? How about always? NEXT: Don't be afraid of having powerful functions.

Don't be afraid of winding up with a few very powerful functions.

A Very Powerful Function

That's it. That's all it takes to express what at the beginning of this talk was a large, complicated, traditional class. And this is an eloquent expression of how I initially described the problem: I can give an implementation of the enclosing interface if only clients tell me how to construct an implementation of the middle given an implementation of the inner. NEXT: value types may survive.

Value Types

There may be some "dumb data object" types and "value" types left over in public class form, or these may get swept up too depending on your vigilance. After this, transition to anti-rumsfeldianism.

Sea Of Self-Contained, Disconnected Modules

NEXT: "And now for something completely different."

And now for something completely different...

Donald Rumsfeld

Remember This Guy? He had a way with words.

Donald Rumsfeld

"We know where they are. They're in the area around Tikrit and Baghdad and east, west, south, and north somewhat."

Donald Rumsfeld

"... the absence of evidence is not the evidence of absence."

Anti-Rumsfeldian Software Design

Donald Rumsfeld

"As you know, you go to war with the army you have, not the army you might want or wish to have at a later time."

Does anything like this apply to software?

We've established that we want to build software modules out of simple collections of input types, functions, and output types. But they don't fit together! And without fitting together we just have a bunch of modules, not a single application. That said, don't just clutter the modules to just fit together. Always write the complex, sophisticated, hardworking parts of your software with the types and dependencies that you wish you had. Never those that you actually have. Then write adapter functions between the types to glue the modules together into a single application.

Adapter

Stateless adapters are cheap, and provide clearer higher-level code.

You read code a lot more than you write it. Having your high-level logic not be full of asking a lower level for something complex is valuable. Look at the popularity of ORMs. How many of you have written SQL-using code without an ORM? It's a mess. Adapters let you hide the mess. NEXT: Modules assembled into an application.

Interadapted Modules

NEXT: "but I can't define the translation layer!"

...but I can't define the translation layer well!

This probably means you haven't thought the problem space through enough. In our experience, once the problem is thought through adequately, the translation layer becomes quite obvious. Note that we don't always write our code with the adapter in place immediately - we sometimes defer it until we've figured out the final shape.

Summary

Types, Functions, & Modules Assembled Into Applications Types, functions, & modules make applications.

Types, Functions, and Modules get stiched together into Applications. This is how we build big software and think you should too.

qr code

code.google.com/a/google.com/p/end-of-object-inheritance/

Thank you!