Saturday, August 1, 2015

With great powers ..

[This is a reprint of an article I wrote in 2011 for AltDevBlogADay, which has since ceased to exist, so I felt like putting it on my blog. No doubt I would write this article completely differently today, but I think it is actually kind of nice to preserve the perspective I had back then by not altering the article.]

Once, James Gosling (inventor of Java) was asked what he'd change if he could do Java over again. He replied: "I'd leave out classes". I read about this in a somewhat controversial article by Allen Holub: Why extends is evil.

But to be clear: I don't want to start the same discussion here that the article provoked ("he's so wrong, 'extends' rulez!" vs. "he's absolutely right, worship 'implements'!"), and I'm pretty sure that wasn't Holub's intention either. Anyway, Gosling explained right after his statement that he actually considers implementation inheritance to be the problem, not classes in general.

Now, when I recap my own programming education, I remember that object-orientation was always taught as something that is strongly connected to the mechanism of inheritance (which is not necessarily wrong, but only part of the truth).

And, talking to my students nowadays highlights the same issues I had back then: It is hard for novices to differentiate between implementation inheritance (as a reuse mechanism) and interface inheritance (as a software design mechanism), especially when you learn OO with Java or C++, where implementation inheritance always comes with interface inheritance automatically (reusing a class' implementation by extending it implicitly means that you inherit its interface).
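A small sketch may help to tell the two kinds of inheritance apart. The names Greeter, PlainGreeter, LoudGreeter and InheritanceKindsDemo are invented for this illustration and do not come from any framework.

```java
// Interface inheritance only: a contract, no code is reused.
interface Greeter {
    String greet();
}

// Fulfils the contract with its own implementation.
class PlainGreeter implements Greeter {
    @Override
    public String greet() {
        return "hello";
    }
}

// Implementation inheritance: reuses PlainGreeter's code via super
// and inherits the Greeter interface along with it, without asking.
class LoudGreeter extends PlainGreeter {
    @Override
    public String greet() {
        return super.greet().toUpperCase();
    }
}

class InheritanceKindsDemo {
    public static void main(String[] args) {
        // Legal although LoudGreeter never says "implements Greeter".
        Greeter g = new LoudGreeter();
        System.out.println(g.greet()); // prints HELLO
    }
}
```

LoudGreeter only wanted to reuse code, yet it is now also a Greeter for every client. That automatic coupling of the two mechanisms is what makes them hard to keep apart.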

So, soon you get statements like: "Why should I use explicit interfaces anyway?"… or, "I don't get the idea of interfaces, I use inheritance instead". What's more, other important aspects of object-orientation, like polymorphism, are also intertwined with inheritance in statically-typed languages (nothing to blame them for, it's just how it works).

My point here is that this leads, often in the minds of programming novices and often enough in the minds of veterans, to a simplified relationship: "inheritance is object-orientation"… which we could display in UML like this:

[Figure: UML diagram equating inheritance with object-orientation. Caption: Beware! Not true!]

In this post, I would like to introduce and discuss the fragile base class problem (FBCP). I think it is a very good showcase of why an explicit interface concept was introduced in Java and C#, but first and foremost I hope that it will illustrate how tricky your code can get when you use implementation inheritance (strong coupling). I also hope that this is not only interesting for the novices among us ;).

Note that the examples are dead simple and not good quality code, but intended to highlight the basics of the FBCP. If you are interested in getting a deeper insight, I recommend the paper A Study of The Fragile Base Class Problem.

Let us imagine the following classes, where the Collection class is part of a framework (base system) and the CountingCollection class is part of an extension somewhere else (sub system):

// Version 1
// Collection.java (base system)
import java.util.ArrayList;

public class Collection {

 ArrayList<Object> data = new ArrayList<>();

 public void addItem (Object item) {
  data.add(item);
 }

 public void addRange (ArrayList<Object> items) {
  for (Object o : items) {
   // self-call: dispatches to addItem of the actual (sub)class
   this.addItem(o);
  }
 }
}

// CountingCollection.java (sub system)
public class CountingCollection extends Collection {
 int n = 0; // number of items added so far

 @Override
 public void addItem (Object item) {
  n++;
  super.addItem(item);
 }

 public int getSize() {
  return n;
 }
}

The Collection class represents a collection of items, and you can add either a single item or a range of them. The extension, CountingCollection, adds a counter variable to keep track of the number of added items. Everything works as intended.

Now, after a revision of the base system, the base class got changed.


// Version 2
// Collection.java (base system, revised)
import java.util.ArrayList;

public class Collection {
 ArrayList<Object> data = new ArrayList<>();

 public void addItem (Object item) {
  data.add(item);
 }

 public void addRange (ArrayList<Object> items) {
  // revised: bulk insert, no longer calls addItem
  data.addAll(items);
 }
}

This change is valid from the base system's point of view, since it does not change the externally observable behavior of objects of type Collection. However, it breaks the sub system: CountingCollection relies on the self-call to addItem inside addRange in the first version of the base system, which means that it relies on the internal behavior (the implementation) of Collection. After the revision, addRange no longer passes through the overridden addItem, so the counter misses every item added as a range. Here we face the FBCP.
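To make the breakage concrete, here is a minimal sketch that puts both revisions side by side in one file. The names CollectionV1, CollectionV2, CountingV1, CountingV2 and FragileDemo are made up for this purpose, and the parameter type is loosened to List so the demo stays short; the subclass code is identical for both versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Version 1 of the base class: addRange is built on a self-call to addItem.
class CollectionV1 {
    protected final List<Object> data = new ArrayList<>();

    public void addItem(Object item) { data.add(item); }

    public void addRange(List<?> items) {
        for (Object o : items) {
            this.addItem(o); // dispatches to the subclass override
        }
    }
}

// Version 2: same externally observable behavior, but no self-call.
class CollectionV2 {
    protected final List<Object> data = new ArrayList<>();

    public void addItem(Object item) { data.add(item); }

    public void addRange(List<?> items) { data.addAll(items); }
}

// The extension, written once against each base version.
class CountingV1 extends CollectionV1 {
    private int n = 0;
    @Override public void addItem(Object item) { n++; super.addItem(item); }
    public int getSize() { return n; }
}

class CountingV2 extends CollectionV2 {
    private int n = 0;
    @Override public void addItem(Object item) { n++; super.addItem(item); }
    public int getSize() { return n; }
}

class FragileDemo {
    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");

        CountingV1 before = new CountingV1();
        before.addRange(items);
        System.out.println(before.getSize()); // prints 3

        CountingV2 after = new CountingV2();
        after.addRange(items);
        System.out.println(after.getSize()); // prints 0, the counter is silently wrong
    }
}
```

Note that nothing fails to compile and nothing throws. The subclass simply reports a wrong number, which is part of what makes the problem hard to notice.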

Having a more general look at this circumstance, it means that "any open system applying code inheritance and self-recursion in an ad-hoc manner is vulnerable to this problem."

The fact that the immediate cause and the observable effect of the FBCP can spread between different systems makes it hard to track down, though the goal should be to avoid its occurrence in the first place. But how?

Well, in their above-mentioned study, the authors introduce a flexibility property that programmers must not violate if the FBCP is to be avoided. In short, the property states that a modification M to a class C (the actual extension) must remain a valid refinement of C when applied to a refined version of C (C' in the figure below; mod reads "modifies").


[Figure: Flexibility Property to avoid the FBCP]

This is a bit theoretical, but in essence it means that it is the duty of the programmer to ensure that everything is coded properly; after all, we can all easily google things like the Open-Closed Principle, can't we?
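One common way for a programmer to take on that duty is to avoid extending the framework class at all and to wrap it behind an interface instead. The following is only a sketch of that idea under my own assumptions, not something the study prescribes; ItemSink, SimpleCollection, CountingSink and CompositionDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface: the only thing clients and wrappers depend on.
interface ItemSink {
    void addItem(Object item);
    void addRange(List<?> items);
}

// Stand-in for the framework class. Whether addRange calls addItem
// internally is its own business and may change between revisions.
class SimpleCollection implements ItemSink {
    private final List<Object> data = new ArrayList<>();

    @Override public void addItem(Object item) { data.add(item); }

    @Override public void addRange(List<?> items) { data.addAll(items); }
}

// Counts by wrapping another ItemSink (composition) instead of extending it.
class CountingSink implements ItemSink {
    private final ItemSink inner;
    private int n = 0;

    CountingSink(ItemSink inner) { this.inner = inner; }

    @Override public void addItem(Object item) {
        n++;
        inner.addItem(item);
    }

    @Override public void addRange(List<?> items) {
        n += items.size();
        inner.addRange(items);
    }

    int getSize() { return n; }
}

class CompositionDemo {
    public static void main(String[] args) {
        CountingSink counting = new CountingSink(new SimpleCollection());
        counting.addItem("x");
        counting.addRange(Arrays.asList("a", "b", "c"));
        System.out.println(counting.getSize()); // prints 4
    }
}
```

Because self-calls inside the wrapped object are dispatched on that object and never back on the wrapper, the count stays correct whether or not the wrapped addRange uses addItem. The price is that the wrapper has to forward every method explicitly.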

Let's take a more cynical or maybe naive perspective while looking at the upcoming example. It is also borrowed from the mentioned study, and it is only one of five examples that show orthogonal aspects of the FBCP, making it more than a trivial thing.


// Original base class
public class BaseClass {

 int x = 0;

 public void aMethod() {
  x = x+1;
 }

 public void anotherMethod() {
  x = x+1;
 }
}

// Extension: implements anotherMethod in terms of the inherited aMethod
public class SubClass extends BaseClass {

 @Override
 public void anotherMethod() {
  this.aMethod();
 }
}

// New base class: aMethod now calls anotherMethod
public class BaseClass {

 int x = 0;

 public void aMethod() {
  this.anotherMethod();
 }

 public void anotherMethod() {
  x = x+1;
 }
}

This example highlights the aspect of "Unanticipated Mutual Recursion": with the new base class, anotherMethod in the subclass calls aMethod, which calls anotherMethod again, and so on without end. It could make us ask "Why do modern languages even allow these problems to arise?", or in other words "Why don't we have languages that eliminate such issues by definition?"
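The effect can be observed directly with a small sketch that keeps both base versions in one file. The names OldBase, NewBase, SubOfOld, SubOfNew and MutualRecursionDemo are made up for the demo, and the helper overflows is just a convenience that catches the StackOverflowError the JVM raises on runaway recursion.

```java
// Original base class: the two methods are independent of each other.
class OldBase {
    int x = 0;
    public void aMethod() { x = x + 1; }
    public void anotherMethod() { x = x + 1; }
}

// Revised base class: aMethod now delegates to anotherMethod.
class NewBase {
    int x = 0;
    public void aMethod() { this.anotherMethod(); }
    public void anotherMethod() { x = x + 1; }
}

// The same extension, written once against each base version.
class SubOfOld extends OldBase {
    @Override public void anotherMethod() { this.aMethod(); }
}

class SubOfNew extends NewBase {
    @Override public void anotherMethod() { this.aMethod(); }
}

class MutualRecursionDemo {
    // Returns true if running the action exhausts the call stack.
    static boolean overflows(Runnable action) {
        try {
            action.run();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Old base: anotherMethod -> aMethod -> x + 1, then returns.
        System.out.println(overflows(() -> new SubOfOld().anotherMethod())); // prints false
        // New base: anotherMethod -> aMethod -> anotherMethod -> ... forever.
        System.out.println(overflows(() -> new SubOfNew().anotherMethod())); // prints true
    }
}
```

Neither class looks wrong when read in isolation; the cycle only exists in the combination of the two.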

Well, on the one hand, there are code validation and checking tools that already support us programmers in writing good quality code. But I don't think that, especially considering the last example, tools are able to detect fragile base classes automatically.

On the other hand, these questions address something that has accompanied the history of programming from the very beginning. Take pointers, for example. In the hands of an expert they are a powerful weapon, but amateurs can do horrible things with them (while having good intentions!). And every one of us knows a guy who still swears that Algol 60 is the best language ever.

So, maybe there will be a new language in the near future that explicitly separates implementation inheritance and interface inheritance (and maybe no one will consider it useful), but until then, we, as lecturers and senior programmers, need to make sure that the upcoming generation of programmers is aware of the dangers in implementation inheritance and that they understand object-orientation more like this:


[Figure: Object-orientation as it should be considered]

In the end, it is just like Stan Lee once said: "With great power there must also come — great responsibility!"
