Wednesday, 23 September 2015

Collection resize and merge policy

Collection like ArrayList/Vector are extension of plain array with no size restriction, these collection has dynamic grow/shrink behavior.

Growing/shrinking array is expensive operation and to amortized that cost different types of policy are used.

In this blog i will share some of the policy that can be used if you are building such collection.

Growing
To decided best policy to extend we need to answer question
 - When to extend collection
-  By how much

When part is easy to answer this is definitely when element are added, but main trick is in how much.
Java collection(i.e ArrayList) grow it by 50% every time, which is good option because it reduces frequent allocation & array copy, only problem with this approach is that most of the time collection is not fully filled.

There are some option to avoid wastage of space
 - If you know how much element will be required then create collection with that initial size, so that no re-sizing is required
 - Have control on how much it grows , this is easy to implement by making growth factor as parameter to collection.

Below is code snippet for Growing list


Shrink
Remove operation on collection is good hook to decided when to reduce the size and for how much some factor can be used for e.g 25%, if after remove collection is 25% free then shrink to current size.

Java ArrayList does not shrink automatically when element are removed , so this should be watched when collection is reused because it will never shrink and in may production system collections are only half filled. It will be nice to have control on shrinking when elements are deleted.

Code snippet for auto shrinking



Hybrid Growth/Shrink
Above allocation/de-allocation has couple of issue which can have big impact on performance for e.g.
 - Big array allocation
 - Copy element

Custom allocation policy can be used to overcome this problem by having one big container that has multiple slots and element are stored in it.

Element are always appended to last pre allocated slot and when it has no capacity left then create new slot and start using it, this custom allocation has few nice benefit for e.g

 - No big allocation is required
 - Zero copy overhead
 - Really big list(i.e greater than MaxInteger) can be created
 - Such type of collection is good for JDK8 collectors, because merge cost is very low.

Nothing is free this comes with some trade off
 - Access element by index can be little slow because it has to work out which slot has item

Code snippet for slots based collection.


Code used in blog is available @ github

No comments:

Post a Comment