Thursday, September 29, 2011

Reference Types In Java - Part 1


Lately, I have been learning a thing or two about the JVM internals. And one of the most interesting things that I came to know about was, the existence of different types of references in Java.

To go about it, there are actually 4 kinds of reference types in Java.

  1. Strong references.
  2. Soft references.
  3. Weak references.
  4. Phantom references.

These references are different solely because of the existence of a garbage collection mechanism in the JVM. That is because, the decision of reclaiming memory from the object heap depends not only on the fact that there are active references to an object, but also on the type of reference to the object.

Lets try to see how each of them differ from one another.




Strong references

As a normal programmer(that is, if you consider programmers to be 'normal'), we are most likely to encounter only the most ubiquitous form of references - Strong references. The name of this reference type should by itself give you an idea of the importance of the reference in the opinion of the garbage collector. Any object with an active strong reference to it, will never be garbage collected(except in rare circumstances, such as cyclical references).

Say, for example, when you create a class Employee, and you create a reference to a new employee object like this

Employee emp = new Employee();

you are actually creating a strong reference. An object to which a reference is created in this way will be eligible for garbage collection only when the reference is to it is pointed to null.
So, if you write this

emp=null;


and there are no more references created to this object until the garbage collector runs again, this object will become eligible for garbage collection, and is most likely to be garbage collected.

So, strong references are pretty simple to understand, probably because most of the code that you commonly write mainly consists of strong references. And we let the smart garbage collector take care of cleaning up the memory for us.

However, there might be cases where you need more control over the lifetime of an object. What if you want to keep some kind of a pool of objects, but still want it to be garbage collected if your JVM is running out of memory? In such cases although you may want a reference to the object most of the time, you are willing to let go of a memory reference for a better JVM performance, when times are crucial. This is where the other kinds of references come into the picture.

The 3 remaining Reference types that I discuss here are subclasses of an abstract class Reference.




Soft references

The soft reference class is used to create a soft reference to an existing object that is already referred to by a strong reference. The actual object on the heap that is being referred to is called the referent.

In order to create a SoftReference to an object, you first need to create a strong reference to that object and then pass that object as a parameter to the constructor of the SoftReference object, as shown in the code below.

Employee emp = new Employee(); //1

SoftReference<Employee> softRef = new SoftReference<employee>(emp);

In the above code, a few interesting things have happened.

  • In line #1,a (referent)object is created and allocated memory on the heap that represents the Employee object.
  • Again, in line #1, a strong reference is created to the newly created employee object. We call this reference 'emp'.
  • In line #2, a new SoftReference object is created and allocated memory on the heap. This is a special object because it contains an internal reference to the (referent) object passed to it in the constructor, i.e. in our case, the actual Employee object on the heap.
  • Again in line #2, a strong reference is created to the SoftReference object on the heap. We call this reference 'softRef'.


So, in total, we have 2 strong references created. And the object that represents the soft reference, internally holds a reference to the actual employee object.

So what is it that makes soft references useful? Soft references are considered to be a special kind of reference by the garbage collector. Let us assume that at a later point of time, we nullify the 'emp' reference as follows and no new strong references were created to this emp object.


emp=null;


Now lets assume that the garbage collector decides to run at this point of time. What it will see is that the current employee object only has a soft reference pointing to it, and no strong references. The garbage collector may optionally choose to reclaim the memory occupied by the employee object. But what makes soft references even more special is the fact that the garbage collector will always reclaim the memory of objects that have only soft references pointing to them before it throws an OutOfMemory Error. This gives the garbage collector some sort of a last chance to save the life of the JVM.

At any given point of time, if the garbage collector has not yet reclaimed the memory of the actual referent object, you can get a strong reference to the referent via the get method.


Employee resurrectedEmp = softRef.get();


The above code resurrects the employee object in the sense that the garbage collector will not consider it a candidate for garbage collection because it now has a Strong Reference pointing to it.

This makes soft references highly useful in creating object pools, where the size of the pool needs to be dynamic. If you do not know the size of a pool when you begin, or choose not to set a minimum or a maximum size on the object pool, instead you want it to grow dynamically, and at the same time, you want to give the JVM a chance to cleanup unused objects, in that case SoftReferences are a perfect fit.




Weak References

Weak references are even more awesome. Thats because seemingly the garbage collector has no regard for an object that only has a week reference. What that means is that the garbage collector will make it eligible for garbage collection because object only has a week reference pointing to it. Not only is that awesome and useful, but desirable as well in some scenarios.

For example, let  us assume that you need to maintain some metadata related to a user per database connection. In such a case you will be tempted to use a hashmap where you can store the database connection as the key and the user metadata as the value. However, this approach has one drawback. When the database connection has been cleaned up by some other part of the application, then you need to ensure the removal of the connection from the hashmap as well. If you forget to do such a thing, a reference to the connection will remain in the hashmap thereby preventing it from being garbage collected. This means that over a period of time, you are bound to end up with a very large hashmap, a clear indication of a memory leak. And the JVM will eventually spit out a OutOfMemoryError.

So, what do you do in such cases? Oh, of course, Weak referencnes to the rescue!

You can simply create a weak reference to the object, in the same way that we created a soft reference.


DBConnection emp = new Employee(); //1

WeakReference<DBConnection> weakRef = new WeakReference<DBConnection>(emp);



This creates a weak reference to the DBConnection object. This means that if at any future point in of time during the execution of the program, if the garbage collector detects that the only reference to the actual DBConnection object is a Weak reference, then it will mark the object for garbage collection.

Weak references are primarily used in conjuction with a WeakHashMap. This is a special kind of hashmap where the keys are all made of weak references. So, in our database example, we could effectively create a weak reference of the Database connection and store it in the WeakHashMap and the metadata of the user as the value in the hashmap. In this way, when the application no longer holds a strong reference to the Database connection, the only reference to the database connection object will be the one that we have via the WeakReference entry in the WeakHashMap. When the garbage collector detects this, it will mark the object for garbage collection. When the object is garbage collected, the entry from the WeakHashMap will be removed. And then, finally, we can all go home and rest in peace.

So, colloquially speaking, this is what the Soft reference and the Weak reference tell us about themselves.

Soft Reference : Hey! I am a soft reference. Ill take your shit as long as the JVM has patience. When it begins running out of patience(i.e. about to throw an OutOfMemoryError), i take no more. Your object will be gone. And then you simply have to create a new object.

Weak Reference : Hey! You know what. I am even cooler than the WeakReference. Coz I wont take your shit at all. The moment you lose me, if the JVM detects that you're no longer around, am gonna punch you in the face and run away! (i.e. Marked for garbage collection). Can you dig it sucka!

As you see, our two awesome friends, Softy and Weaky, certainly have some ego there. But they are pretty useful, at crucial times.

Before I proceed any further, there is one more puny lil thing that you might need to know. Oh! Did i just say 'might'. Oh no.. I meant, you should know. And that is ReferenceQueues. ReferenceQueues are some sort of a queue where the JVM can store objects of type reference once it has decided to take some action on the objects to which they refer.

What I mean to say is, let us suppose you have a weak reference which points to an object in the heap. And that object has been left lonely and desolate by the rest of the application.i.e. No strong references to it. And the garbage collector detects this object during its garbage collection cycle. Since this object only has a weak reference to it, the garbage collector will mark it for garbage collection. But at the same time, it looks if there is a reference queue associated with the weak reference that points to this object. If yes, it puts this weak reference in the reference queue to indicate that the object has been marked for garbage collection. The subtle point to be noted here is that even though the object has been marked for garbage collection, garbage collection may not have happened yet. This may be because the object has a finalize method, which the JVM needs to execute before reclaiming memory.

This also means that you can, but should not,unless deemed necessary, resurrect an object in the finalize method and create a strong reference to it. But when you do that, the weak reference still remains en-queued in the ReferenceQueue. Overriding the finalize method to resurrect an object is a rare case, but since it is one of the options that the JVM supports, it is therefore something that needs to be considered. Nevertheless, when you do such things, its almost equivalent to artificially manipulating the life of an object. That's because the second time when the object becomes eligible for garbage collection, the finalize method wont run, which is a good thing because if you run it again, its simply gonna revive itself. So, practically speaking, an object in the JVM has only one spare life. You get just one medical kit at the max. And thats it. You screw up again, and ur doomed. Your object will be on mars, having a boss fight with the garbage collecting thread of the JVM, which will eventually win, and reclaim the memory of the object.

The same facts about the reference queue hold true for Soft references as well.

In order to associate a weak or a soft reference with a reference queue, you can use the 2 argument constructor as shown below.


Employee emp1 = new Employee();
Employee emp2 = new Employee();

ReferenceQueue softQueue = new ReferenceQueue();
ReferenceQueue weakQueue = new ReferenceQueue();

SoftReference softRef = new SoftReference(emp1, softQueue);
WeakReference softRef = new WeakReference(emp2, weakQueue);




Now, aint that simple? Yes of course it is.
Then again, you haven't met the Phantom yet, have ya?

The phantom reference is a place where it gets all the more interesting.




Phantom References

Phantom references tell a long tale themselves and its a topic that warrants a blog post of its own. However, in this blog post, ill give a brief idea about what they are. Phantom references are quite similar to Strong and Weak references in the sense that the garbage collector will collect the referent object if the only reference to it is the phantom reference. But that's where the similarity ends. 

What makes Phantom references are unique is the way in which they are used along with a reference queue. A phantom reference is always associated with a references queue during construction time. This is because phantom references are enqueued in the queue only when the the object is about to be garbage collected, after the finalize method(if any) has been executed on it. Calling a get() on the Phantom reference always returns null. And that is quite appropriate because the finalize function has already run on the referent object. So, there should be no 'legal' way of resurrecting the object (resurrecting i.e. creating a strong reference). This may at first seem to make no sense, since, what use is a phantom reference if we cant extract the referent from it? But on giving it a deep thought, you would realize that this is not the reason why phantom references are meant to be useful in the first place. It is the time at which the JVM puts a phantom reference in the reference queue that makes its use so intuitively amazing.

That's something that I will be discussing in more detail in the next post. So stay tuned, because, it might just turn out to be a weird crazy experiment. And of course, its possible that some so called 'laws' will be violated. I am still working on it right now. Ill put it up sooner or later.

So folks, see you on the other side! (Hint : That's what phantom references say too).

Happy Programming :)
Signing Off
Ryan