Wednesday, August 09, 2006

Automatic Overriding of Parameterized Methods in Java 5

I thought I was going crazy for a while because I seemed to be getting all kinds of odd behavior, both at compile-time and at run-time, while trying to play with generics. Thanks to the help of the "javap" command, I think I've figured out a key part of how methods with parameterized types get overridden, as opposed to how normal methods get overloaded.

Consider the following two classes:

class CompareAgainst<T>
{
    public int compareAgainst ( T a ) { return 1; }
}

public class Main2 extends CompareAgainst<Main2>
{
    public static int testClassComparable ( Object[] a )
    {
        Main2 x = (Main2)a[0];
        Object y = a[1];
        return x.compareAgainst( y ); // Compiler error!
    }

    public static void main (String[] argv)
    {
        Main2 m1 = new Main2( );
        Main2 m2 = new Main2( );
        Main2[] mArray = new Main2[] { m1, m2 };
        System.out.println( testClassComparable( mArray ) );
    }
}


This seems to be a pretty straightforward piece of code that demonstrates the correct use of generics. We have a base class CompareAgainst that contains a method compareAgainst() that takes a parameterized type as its argument. Just as we'd expect, when we have Main2 extend CompareAgainst and fill in the parameterized type as Main2, the method compareAgainst(T a) effectively becomes compareAgainst(Main2 a). So in testClassComparable, when we declare one of the objects to be an Object, the compiler checks to see if compareAgainst(Object a) exists; it doesn't, so there's a compiler error:

Main2.java:23: compareAgainst(Main2) in CompareAgainst cannot be applied to (java.lang.Object)

However, if we change the implementation of the testClassComparable method so that instead of casting the first parameter to Main2, it instead casts it to a CompareAgainst instance, surprisingly the code compiles and runs just fine!

public class Main2 extends CompareAgainst<Main2>
{
    public static int testClassComparable ( Object[] a )
    {
        CompareAgainst x = (CompareAgainst)a[0]; // Change cast
        Object y = a[1];
        return x.compareAgainst( y ); // works fine!
    }

    public static void main (String[] argv)
    {
        Main2 m1 = new Main2( );
        Main2 m2 = new Main2( );
        Main2[] mArray = new Main2[] { m1, m2 };
        System.out.println( testClassComparable( mArray ) );
    }
}

You might be wondering, as I was, how the compiler suddenly found compareAgainst(Object a) when previously it thought the method didn't exist. The key is to understand that the compiler can only use the type that we specify. In the first case, we declared x to be of type Main2. What methods does Main2 contain? The compiler looks at Main2 and sees only compareAgainst(Main2 a), because extending CompareAgainst<Main2> filled in the T of the compareAgainst(T a) that was defined in the superclass CompareAgainst. So the compiler reports an error.

In the second case though, inside the testClassComparable() method we told the compiler that the type of the object was CompareAgainst. The compiler has no choice but to listen to us and go look at CompareAgainst to see what methods are available. You might be wondering what the compiler sees when it looks at CompareAgainst, since the class defines compareAgainst() to take a parameterized type T. But "javap" will settle that question pretty quickly:

javap CompareAgainst
class CompareAgainst extends java.lang.Object{
public int compareAgainst(java.lang.Object);
}

It looks like compareAgainst takes an Object as its parameter! This is our good old friend type erasure rearing its head. When CompareAgainst is compiled down to binary form, the type parameter T is erased, leaving just Object as the method parameter. Because we declared x as plain CompareAgainst, with no type argument (a "raw" type), the compiler works with that erased signature. So when it compiles testClassComparable, it sees that compareAgainst(Object a) exists, and thus compiles the code. This code will also run just fine.
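
If you'd rather check this from inside a program than with javap, reflection reports the same erased signature. This is only a minimal sketch: ErasureDemo and its helper method are names I made up for illustration, and CompareAgainst is a copy of the class from above.

```java
import java.lang.reflect.Method;

// Hypothetical demo class (not part of the original example).
class ErasureDemo
{
    // Returns the runtime parameter type of CompareAgainst.compareAgainst.
    static String erasedParameterType()
    {
        for ( Method m : CompareAgainst.class.getDeclaredMethods() )
        {
            if ( m.getName().equals( "compareAgainst" ) )
            {
                // T has been erased, so this is the erased parameter type.
                return m.getParameterTypes()[0].getName();
            }
        }
        return null;
    }

    public static void main( String[] argv )
    {
        System.out.println( erasedParameterType() ); // prints java.lang.Object
    }
}

// Copy of the base class from the post.
class CompareAgainst<T>
{
    public int compareAgainst( T a ) { return 1; }
}
```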

All of this makes sense so far once you understand type erasure and how the compiler resolves methods in other classes during compilation. What's really surprising, though, is what happens if you then add what looks like an overload of compareAgainst() to Main2:

public class Main2 extends CompareAgainst<Main2>
{
    public int compareAgainst ( Main2 m ) { return -1; }

    public static int testClassComparable ( Object[] a )
    {
        CompareAgainst x = (CompareAgainst)a[0];
        Object y = a[1];
        return x.compareAgainst( y ); // Still looks like CompareAgainst.compareAgainst(Object)
    }

    public static void main (String[] argv)
    {
        Main2 m1 = new Main2( );
        Main2 m2 = new Main2( );
        Main2[] mArray = new Main2[] { m1, m2 };
        System.out.println( testClassComparable( mArray ) );
    }
}

Here we have explicitly defined a compareAgainst( Main2 a ) method in Main2. We might assume that this is a case of method overloading, since we know that in the base class it's really a compareAgainst(Object o) that's defined. In that case, we know that method overloading is determined at compile-time based on the types that we have told the compiler about. Here, the compiler still thinks it has a CompareAgainst instance, not a Main2 instance, and the parameter being sent in still looks like an Object, not a Main2 instance, so the compiler should still call the superclass's compareAgainst method, not our newly defined method. If this were regular method overloading, this is exactly the behavior we would get.
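
For contrast, here is a small sketch of what regular overloading does when no generics are involved. PlainBase, PlainDerived and OverloadDemo are hypothetical names of my own; the base class declares compareAgainst(Object) directly, so the method added in the subclass really is just an overload.

```java
// Hypothetical demo class (not part of the original example).
class OverloadDemo
{
    static int callThroughBase()
    {
        PlainBase x = new PlainDerived( );
        Object y = new PlainDerived( );
        // Resolved at compile time against PlainBase, which only has
        // compareAgainst(Object). PlainDerived never overrides that method.
        return x.compareAgainst( y );
    }

    public static void main( String[] argv )
    {
        System.out.println( callThroughBase( ) ); // prints 1
    }
}

// Non-generic base class: the parameter really is declared as Object.
class PlainBase
{
    public int compareAgainst( Object a ) { return 1; }
}

class PlainDerived extends PlainBase
{
    // An overload, not an override: the parameter type differs.
    public int compareAgainst( PlainDerived a ) { return -1; }
}
```

The base class version wins here, even though both objects are PlainDerived instances at runtime.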

However, this is not the way the compiler sees the situation. From the compiler's point of view, when we specialized the base class with Main2, we created a compareAgainst(Main2) method automatically IN THE BASE CLASS. Again, this is not what the byte code says, but rather what the compiler thinks. When we then define our own compareAgainst(Main2) method, the compiler assumes we want to override the superclass's implementation. Thus, it proceeds to hide the superclass's implementation and make our explicitly defined implementation the one that is called. Running the code above confirms that this is what happens: our new compareAgainst(Main2) method is called, not the version in the base class.

But how does the compiler make the overridden method get called when all it knows is that a compareAgainst(Object o) is being called on an instance of a CompareAgainst object? Here is where the compiler does a bit of magic. The compiler understands that even though we have specialized CompareAgainst with Main2, and thus conceptually created a compareAgainst(Main2) in the base class, in reality all that will be left in the base class when it gets compiled is compareAgainst(Object o). So if the compiler wants to enforce that the overridden method is called every time, what it actually has to do is override not compareAgainst(Main2) from the base class, but compareAgainst(Object o)!

If we use "javap -c" to examine the compiled bytecode for Main2, we see that that is exactly what has happened:

public int compareAgainst(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: checkcast #4; //class Main2
5: invokevirtual #9; //Method compareAgainst:(LMain2;)I
8: ireturn

public int compareAgainst(Main2);
Code:
0: iconst_m1
1: ireturn

Here, we see that the compiler has inserted a compareAgainst(Object o) automatically into our Main2 class. This causes the compareAgainst(Object o) from the superclass to be overridden. The compiler implements compareAgainst(Object o) so that it simply casts the argument to a Main2, and then calls our compareAgainst(Main2) method.
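
The same thing can be confirmed with reflection, since java.lang.reflect.Method has an isBridge() method that flags compiler-generated methods of this kind. What follows is a rough sketch only: BridgeDemo and its helpers are hypothetical names, and Main2 is trimmed down to the one method that matters.

```java
import java.lang.reflect.Method;

// Hypothetical demo class (not part of the original example).
class BridgeDemo
{
    // True if Main2 declares a compiler-generated compareAgainst(Object).
    static boolean hasBridgeTakingObject()
    {
        for ( Method m : Main2.class.getDeclaredMethods() )
        {
            if ( m.getName().equals( "compareAgainst" )
                    && m.isBridge()
                    && m.getParameterTypes()[0] == Object.class )
            {
                return true;
            }
        }
        return false;
    }

    // Same call shape as testClassComparable: raw receiver, Object argument.
    @SuppressWarnings( { "rawtypes", "unchecked" } )
    static int callThroughRawType()
    {
        CompareAgainst x = new Main2( );
        Object y = new Main2( );
        return x.compareAgainst( y );
    }

    public static void main( String[] argv )
    {
        System.out.println( hasBridgeTakingObject( ) ); // prints true
        System.out.println( callThroughRawType( ) );    // prints -1
    }
}

class CompareAgainst<T>
{
    public int compareAgainst( T a ) { return 1; }
}

// Trimmed copy of Main2 from the post, keeping only the overriding method.
class Main2 extends CompareAgainst<Main2>
{
    public int compareAgainst( Main2 m ) { return -1; }
}
```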

Now if we go back and check the code again, we can understand how our new compareAgainst() gets called even though the compiler only thinks it has a handle to a CompareAgainst instance, and even though the parameter to compareAgainst() looks like an Object. At runtime, the JVM actually does invoke compareAgainst(Object), but there is a compareAgainst(Object) defined in Main2, namely the one the compiler defined for us. Thus that version gets called, and that version in turn calls our compareAgainst(Main2) method explicitly. Note that this mechanism will work across multiple levels of inheritance. If you were to create a class called Main3 that extended Main2, and then defined your own compareAgainst(Main2) in that Main3 class, the compiler would again insert the overriding compareAgainst(Object) into the Main3 class to force the Main3 definition of the method to be called instead of any of the superclass versions.
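
Here is a quick sketch of the multi-level case. It only checks the behavior you can observe at runtime, and makes no claim about which class ends up holding the compiler-generated compareAgainst(Object). Main3's return value of -2 and the MultiLevelDemo class are my own additions for illustration.

```java
// Hypothetical demo class (not part of the original example).
class MultiLevelDemo
{
    // Raw receiver and Object argument, as in testClassComparable.
    @SuppressWarnings( { "rawtypes", "unchecked" } )
    static int callThroughRawType()
    {
        CompareAgainst x = new Main3( );
        Object y = new Main3( );
        return x.compareAgainst( y );
    }

    public static void main( String[] argv )
    {
        System.out.println( callThroughRawType( ) ); // prints -2
    }
}

class CompareAgainst<T>
{
    public int compareAgainst( T a ) { return 1; }
}

class Main2 extends CompareAgainst<Main2>
{
    public int compareAgainst( Main2 m ) { return -1; }
}

// One more level of inheritance, overriding Main2's method.
class Main3 extends Main2
{
    @Override
    public int compareAgainst( Main2 m ) { return -2; }
}
```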

This is the exact mechanism that enables Comparable to work in things like Arrays.sort(). Within the implementation of Arrays.sort(Object[]), the method casts each object to an instance of Comparable, and then calls compareTo(Object) on it. But because the compiler has inserted a compareTo(Object) into the definition of every class that implements Comparable, the correct compareTo(T) is called and the sorting algorithm works properly.
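
To see the Comparable case in action, here is a small sketch. Version and ComparableDemo are hypothetical classes I made up for this; the only library calls used are Arrays.sort(Object[]), Arrays.toString(Object[]) and Integer.compare.

```java
import java.util.Arrays;

// Hypothetical demo class (not part of the original example).
class ComparableDemo
{
    // Calls compareTo(Object) through the raw Comparable type.
    @SuppressWarnings( { "rawtypes", "unchecked" } )
    static int compareThroughRawComparable()
    {
        Comparable c = new Version( 1 );
        Object other = new Version( 2 );
        return c.compareTo( other ); // ends up in Version.compareTo(Version)
    }

    // Sorts an Object[] whose elements all happen to be Versions.
    static String sorted()
    {
        Object[] versions = { new Version( 3 ), new Version( 1 ), new Version( 2 ) };
        Arrays.sort( versions );
        return Arrays.toString( versions );
    }

    public static void main( String[] argv )
    {
        System.out.println( compareThroughRawComparable( ) ); // prints -1
        System.out.println( sorted( ) );                      // prints [v1, v2, v3]
    }
}

// Hypothetical class that only declares compareTo(Version).
class Version implements Comparable<Version>
{
    final int number;

    Version( int number ) { this.number = number; }

    public int compareTo( Version other )
    {
        return Integer.compare( number, other.number );
    }

    public String toString( ) { return "v" + number; }
}
```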

One last note: This mechanism works because the compiler is attempting to ensure that method overriding works correctly. This does not have anything to do with method overloading. If we define a compareAgainst(Integer) in any of our classes, then we are overloading compareAgainst(), not overriding it. As with any overloaded method, the compiler will only call that method if the types match up at compile-time. So unless we explicitly cast something to Integer and pass it into compareAgainst(Integer), that overloaded method will not be called.
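
A short sketch of that last point follows. The compareAgainst(Integer) overload, its return value of 42, and the IntegerOverloadDemo class are all hypothetical additions of mine. The last helper shows something the note above implies: an Integer passed as a plain Object through the raw type never reaches the overload. It goes to the compiler-generated compareAgainst(Object) instead, where the cast to Main2 fails.

```java
// Hypothetical demo class (not part of the original example).
class IntegerOverloadDemo
{
    // The argument's compile-time type is Integer, so the overload is chosen.
    static int callWithIntegerType()
    {
        Main2 m = new Main2( );
        return m.compareAgainst( Integer.valueOf( 5 ) );
    }

    // The argument's compile-time type is Main2, so the override is chosen.
    static int callWithMain2Type()
    {
        Main2 m = new Main2( );
        return m.compareAgainst( new Main2( ) );
    }

    // An Integer typed as Object goes to compareAgainst(Object), which
    // casts its argument to Main2 and therefore fails.
    @SuppressWarnings( { "rawtypes", "unchecked" } )
    static boolean objectTypedIntegerIsRejected()
    {
        CompareAgainst x = new Main2( );
        Object y = Integer.valueOf( 5 );
        try
        {
            x.compareAgainst( y );
            return false;
        }
        catch ( ClassCastException e )
        {
            return true;
        }
    }

    public static void main( String[] argv )
    {
        System.out.println( callWithIntegerType( ) );          // prints 42
        System.out.println( callWithMain2Type( ) );            // prints -1
        System.out.println( objectTypedIntegerIsRejected( ) ); // prints true
    }
}

class CompareAgainst<T>
{
    public int compareAgainst( T a ) { return 1; }
}

class Main2 extends CompareAgainst<Main2>
{
    public int compareAgainst( Main2 m ) { return -1; }   // override
    public int compareAgainst( Integer i ) { return 42; } // overload
}
```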
