Saturday, August 7, 2010

Memory and collection: performance and garbage collection

Lately, I have been working on an application that requires me to load a large number of objects into memory.
Let me define large: around 25 lakh (2.5 million) objects in memory, all distributed across around 5 HashMaps.

On top of that, I need to refresh this data every 15 minutes, which basically means loading a completely new set of data and removing the existing data.
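
One way to picture such a refresh is to build the replacement map on the side and then swap a single reference, so that the old data becomes unreachable all at once. This is only a minimal sketch under my own assumptions: the class name RefreshableCache, the String keys and the Double values are hypothetical and are not the actual application code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for one of the maps that gets refreshed periodically.
class RefreshableCache {
    // volatile so that reader threads see the new map as soon as it is swapped in
    private volatile Map<String, Double> data = new HashMap<>();

    // Copies the freshly loaded data into a new map, then swaps the reference.
    void refresh(Map<String, Double> freshData) {
        Map<String, Double> next = new HashMap<>(freshData);
        data = next; // the previous map is now unreachable and eligible for collection
    }

    Double get(String key) {
        return data.get(key);
    }

    int size() {
        return data.size();
    }
}
```

Note that with this approach the old and the new data sit on the heap together while the new map is being built, which is worth keeping in mind when sizing the heap.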

I faced the following problems:

1) OutOfMemoryError
2) Performance

Let me tell you how I handled each:
1) This error occurs when the Java heap, the space allocated for objects, runs out of memory. In my case that happened because my refresh interval was very short (15 minutes) and the garbage collector runs at its own leisure.

It does not have to be that way, though. You can push the garbage collector to behave in a manner that solves a lot of problems. In my case, I did the following:
I instructed the garbage collector to run incrementally, so that it runs every once in a while instead of waiting for a large number of objects to become eligible for collection.

The JVM option you need to set when starting your server is

-Xincgc
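
To check which collectors the JVM is actually running with and how often they have run, something like the following can help. It is a rough sketch that uses the standard java.lang.management API; the class name GcReport is made up for this example, and the collector names it prints depend on the JVM version and the options you pass.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the collectors exposed by the running JVM.
class GcReport {
    // Returns one line per collector with its run count and accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Calling describeCollectors() before and after a data refresh gives a rough idea of how much collection work the refresh triggered.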

Of course, you also need to allocate enough heap space so that you don't run out of memory.

I suggest you keep the Xms (initial heap size) and Xmx (maximum heap size) values the same:

-Xms1024m -Xmx1024m

This ensures that the heap is sized to its full extent at startup, so cycles of incremental heap growth are avoided.
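
To confirm that the heap options were picked up, the JVM can be asked for its heap figures at runtime. This is a small sketch using the standard Runtime API; the class name HeapInfo is hypothetical, and the exact numbers reported can differ a little from the values passed on the command line.

```java
// Hypothetical helper: prints the heap sizes the running JVM reports.
class HeapInfo {
    private static final long MB = 1024L * 1024L;

    // Upper limit the heap may grow to (governed by the maximum heap option).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    // Heap currently reserved by the JVM (starts near the initial heap option).
    static long committedHeapMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("committed heap (MB): " + committedHeapMb());
    }
}
```

When the initial and maximum values are set to the same number, the two figures should be close to each other right from startup.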

2)
The performance problem had many causes and, evidently, there are many ways to solve it.

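The fastest map when dealing with a large number of objects is generally HashMap, for all kinds of objects. Its lookups take constant time on average, whereas a sorted map like TreeMap needs time proportional to the logarithm of its size. So, unless you need sorted keys, use HashMap.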

I was dealing with Double objects and storing them in a HashMap. Believe me, changing the operations from Double to double can more than double the performance. However, you cannot store a double primitive in a HashMap directly. So, I made a wrapper object and stored that in the HashMap.

The wrapper looked something like this:

class DoubleWrapper { private double value; }

I stored the DoubleWrapper objects in the HashMap and used the double value directly in all calculations. The performance increased multifold.
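
Here is a small sketch of the idea, contrasting boxed Double values with a mutable wrapper. The accessor names get and add, the class WrapperDemo and the String keys are my own assumptions for illustration. The likely reason for the gain is that Double is immutable, so every update of a boxed value creates a new object, while the wrapper is updated in place.

```java
import java.util.Map;

// Mutable holder around a primitive double, stored as the map value.
class DoubleWrapper {
    private double value;

    DoubleWrapper(double value) {
        this.value = value;
    }

    double get() {
        return value;
    }

    // Updates the primitive in place; no new object is created.
    void add(double delta) {
        value += delta;
    }
}

class WrapperDemo {
    // Boxed version: each update unboxes the value and boxes the result into a new Double.
    static double addToAllBoxed(Map<String, Double> map, double delta) {
        double sum = 0.0;
        for (Map.Entry<String, Double> entry : map.entrySet()) {
            double updated = entry.getValue() + delta;
            entry.setValue(updated);
            sum += updated;
        }
        return sum;
    }

    // Wrapper version: the arithmetic works on the primitive field directly.
    static double addToAllWrapped(Map<String, DoubleWrapper> map, double delta) {
        double sum = 0.0;
        for (DoubleWrapper wrapper : map.values()) {
            wrapper.add(delta);
            sum += wrapper.get();
        }
        return sum;
    }
}
```

Both methods compute the same result; the difference is only in how many short-lived objects they leave behind for the garbage collector.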

Basically, use primitives wherever possible. The results will be unbelievable.


So, use incremental garbage collection, HashMaps and primitive variables, and your application is golden :D

See you later
