Tuesday, 14 May 2013

How are Java generics implemented under the hood?

Introduction


Today at lunch we had a chat about generics and their implementation in Java. Generics are available since Java 5. To stay byte code compatible between Java 1.4 and 5 generics are implemented by using type erasure. That means that at compile time all generic types are striped out and the resulting byte code only contains raw types like:

List a = ArrayList();
instead of
List<MyType> a = new ArrayList<MyType>();

This means that when we read elements of the list at runtime there needs be a cast added to check that both sides of the assignment have a compatible type:

MyType a = (MyType)list.get(0);
instead of
MyType a = list.get(0);

Java Class File Disassembler

To analyse how this cast is actually implemented in Java byte code one can use the Java Class File Disassembler. This command line tool ships with every Oracle JDK distribution and is called javap.

To inspect the byte code of a given class file use the command: javap -c <classfile>
For example:

javap -c org.henrikeichenhardt.javabytecode.Main

The output can get quite verbose and it makes sense to boil down the problem you want to look at to a few lines of code - just enough so you can create and compile you scenario.

Under the hood

For our scenario the Java code would look like this:

package org.henrikeichenhardt.javabytecode;

import java.util.ArrayList;
import java.util.List;

public class Main {
   public static void main(final String[] args) {
    final List<Main> list = new ArrayList<Main>();
    list.add(new Main());
    
    final Main value = list.get(0);
   }
}

We simply create an java.util.ArrayList with type of Main and add one element to this list. Then we retrieve this element again from the list. This few lines of code result in the following output of javap:

Compiled from "Main.java"
public class org.henrikeichenhardt.javabytecode.Main extends java.lang.Object{
public org.henrikeichenhardt.javabytecode.Main();
  Signature: ()V
  Code:
   0:   aload_0
   1:   invokespecial   #8; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Signature: ([Ljava/lang/String;)V
  Code:
   0:   new     #16; //class java/util/ArrayList
   3:   dup
   4:   invokespecial   #18; //Method java/util/ArrayList."<init>":()V
   7:   astore_1
   8:   aload_1
   9:   new     #1; //class org/henrikeichenhardt/javabytecode/Main
   12:  dup
   13:  invokespecial   #19; //Method "<init>":()V
   16:  invokeinterface #20,  2; //InterfaceMethod java/util/List.add:(Ljava/lang/Object;)Z
   21:  pop
   22:  aload_1
   23:  iconst_0
   24:  invokeinterface #26,  2; //InterfaceMethod java/util/List.get:(I)Ljava/lang/Object;
   29:  checkcast       #1; //class org/henrikeichenhardt/javabytecode/Main
   32:  astore_2
   33:  return

}


The first block of information beginning with
public org.henrikeichenhardt.javabytecode.Main();
is the code to generate the (implicit) constructor of the Main object. We can ignore this for now. The second block of information is the byte code representing the code of the static main method
public static void main(java.lang.String[]);
On the first few lines the instances of the ArrayList is created and a reference is stored in on the stack. Then an instance of the Main object is created and added to the list. Now lets focus on reading an element from the list:

final Main value = list.get(0);

This resulting Java byte code looks like this:


   24:  invokeinterface #26,  2; //InterfaceMethod java/util/List.get:(I)Ljava/lang/Object;
   29:  checkcast       #1; //class org/henrikeichenhardt/javabytecode/Main
   32:  astore_2

Without going into to much detail here what basically happens is that invokeinterface calls the method get of our list and puts the result on top of the stack. Then the command checkcast is executed. It takes only one argument which is a reference to the type which needs to be checked (Main). This type is then compared with the top of the stack (where we previously put the result of list.get(0)).

How does checkcast work?


The documentation of the checkcast instruction reveals how this check is done:
  • If the stack reference is null nothing happens.
  • If the stack reference can be cast to the type provided nothing happens.
  • Otherwise a ClassCastException is thrown.
That means that checkcast is very similar to an instanceOf test. If the checkcast test is passed execution continues normally and the top of the stack is stored in a local variable.

So what the problem boils down to is how does checkcast know if it can cast a value to a type? The following is copied from [1]:

The following rules are used to determine whether an objectref that is not null can be cast to the resolved type: if S is the class of the object referred to by objectref and T is the resolved class, array, or interface type, checkcast determines whether objectref can be cast to type T as follows:
  • If S is an ordinary (nonarray) class, then:
    • If T is a class type, then S must be the same class as T, or S must be a subclass of T;
    • If T is an interface type, then S must implement interface T.
  • If S is an interface type, then:
    • If T is a class type, then T must be Object.
    • If T is an interface type, then T must be the same interface as S or a superinterface of S.
  • If S is a class representing the array type SC[], that is, an array of components of type SC, then:
    • If T is a class type, then T must be Object.
    • If T is an interface type, then T must be one of the interfaces implemented by arrays (JLS §4.10.3).
    • If T is an array type TC[], that is, an array of components of type TC, then one of the following must be true:
      • TC and SC are the same primitive type.
      • TC and SC are reference types, and type SC can be cast to TC by recursive application of these rules.
It is now up to the JVM implementation how efficient this checks are executed and if any profiling or heuristics are applied. In our example S is an ordinary class and T is a class type and both are the same which satisfies the first case above.

[1] Oracle checkcast

5 comments:

  1. Replies
    1. Hi, Great.. Tutorial is just awesome..It is really helpful for a newbie like me.. I am a regular follower of your blog. Really very informative post you shared here. Kindly keep blogging. If anyone wants to become a Java developer learn from Java Training in Chennai. or learn thru Java Online Training India . Nowadays Java has tons of job opportunities on various vertical industry.

      Delete
  2. It is really a great work and the way in which u r sharing the knowledge is excellent.Thanks for helping me to understand basic concepts. As a beginner in java programming your post help me a lot.Thanks for your informative article.java training in chennai | chennai's no.1 java training in chennai

    ReplyDelete
  3. perfect explanation about java programming .its very useful.thanks for your valuable information.java training in chennai | java training in velachery

    ReplyDelete
  4. Given so much info in it, These type of articles keeps the users interest in the website, and keep on sharing more .
    java training in chennai | Best java training in chennai | java training | java training in chennai with placement

    ReplyDelete