Are you ready: Lock Less Java Object Pool

Sunday 19 May 2013

Lock Less Java Object Pool

It has been a while since I wrote anything. I have been busy with my new job, which involves some interesting performance tuning work. One of the challenges is to reduce object creation in the critical parts of the application.

Garbage collection hiccups have been a major pain point in Java for some time, although Java's GC algorithms have improved over the years. Azul is the market leader in pauseless GC, but the Azul JVM is not free as in speech!

Creating too many temporary (garbage) objects doesn't work well because it creates work for the GC, which hurts latency. Too much garbage also doesn't work well on multi-core systems because it causes cache pollution.

So how should we fix this?

Garbage less coding
This is only possible if you know upfront how many objects you need and pre-allocate them. In reality that number is very difficult to find, and even if you manage to find it, you still have to worry about two other issues:

You might not have enough memory to hold all the objects you need
You have to handle concurrency

So what is the solution to the above problems?

The Object Pool design pattern can address both of the above issues: it lets you specify the number of objects you need in the pool, and it handles concurrent requests for those objects.

Object pools have been the basis of many applications with low latency requirements. A related flavor of the object pool is the Flyweight design pattern.

Both patterns help us avoid object creation. That is great: GC work is reduced, and in theory application performance should improve. In practice it doesn't happen that way, because an Object Pool/Flyweight has to handle concurrency, and whatever advantage you gain by avoiding object creation is lost to contention.

What are the most common ways to handle concurrency?

An object pool is a typical producer/consumer problem, and it can be solved using the following techniques:

Synchronized - This was the only way to handle concurrency before JDK 1.5. Apache has written a wonderful object pool API based on synchronized.

Locks - Java has had excellent support for concurrent programming since JDK 1.5, and there has been some work on using Locks to build an object pool, e.g. furious-objectpool.

Lock Free - I could not find any implementation built using a fully lock-free technique, but furious-objectpool uses a mix of ArrayBlockingQueue and ConcurrentLinkedQueue.
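To make the simplest of these approaches concrete, here is a minimal sketch of a synchronized pool. It is not the Apache implementation; the class name SynchronizedPool, its method names, and the use of a Java 8 Supplier for pre-allocation are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical example class, not taken from any library.
final class SynchronizedPool<T> {
    private final ArrayDeque<T> free;

    SynchronizedPool(int size, Supplier<T> factory) {
        free = new ArrayDeque<>(size);
        // Pre-allocate every object up front so no allocation happens later.
        for (int i = 0; i < size; i++) {
            free.push(factory.get());
        }
    }

    // Returns a pooled object, or null when the pool is empty.
    synchronized T take() {
        return free.poll();
    }

    // Hands an object back to the pool.
    synchronized void release(T obj) {
        free.push(obj);
    }

    // Number of objects currently available.
    synchronized int available() {
        return free.size();
    }
}
```

Every take and release goes through the same monitor, so all threads queue up behind one another, which is the kind of contention the measurements in this post are about.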

Let's measure performance

In this test I created a pool of 1 million objects, and those objects are accessed through different pool implementations: objects are taken from the pool and then returned to it.

The test starts with 1 thread, and then the number of threads is increased to measure how the different pool implementations perform under contention.
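A rough sketch of what such a test loop could look like is shown below. This is not the benchmark code behind the chart: the class PoolBenchmark, the nested QueuePool (a plain ArrayBlockingQueue wrapper standing in for the pool under test), and the choice to split 1 million take/release pairs evenly across threads are all assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical harness, for illustration only.
final class PoolBenchmark {

    // Pool under test: a bounded pool backed by ArrayBlockingQueue.
    static final class QueuePool {
        private final BlockingQueue<Object> queue;

        QueuePool(int size) {
            queue = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                queue.add(new Object());
            }
        }

        Object take() throws InterruptedException { return queue.take(); }
        void release(Object o) { queue.offer(o); }
        int available() { return queue.size(); }
    }

    // Runs totalOps take/release pairs split across the given threads; returns elapsed ms.
    static long run(QueuePool pool, int threads, int totalOps) {
        int opsPerThread = totalOps / threads;
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await();                    // all workers begin together
                    for (int i = 0; i < opsPerThread; i++) {
                        Object o = pool.take();
                        pool.release(o);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (System.nanoTime() - begin) / 1_000_000;
    }

    public static void main(String[] args) {
        QueuePool pool = new QueuePool(1_000_000);
        for (int threads = 1; threads <= 12; threads++) {
            System.out.println(threads + " threads: " + run(pool, threads, 1_000_000) + " ms");
        }
    }
}
```

A loop like this has no warm-up phase, so the first runs include JIT compilation time; a more careful measurement would use a harness such as JMH.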

X Axis - Number of threads

Y Axis - Time in ms (lower is better)

This test includes the pool from Apache, Furious Pool, and an ArrayBlockingQueue-based pool.

The Apache pool is the worst, and its performance degrades further as the number of threads increases. The reason is that the Apache pool relies heavily on "synchronized".

The other two, Furious and the ArrayBlockingQueue-based pool, perform better, but both also slow down as contention increases.

The ArrayBlockingQueue-based pool takes around 1000 ms for 1 million items when 12 threads are accessing the pool. Furious pool, which internally uses ArrayBlockingQueue, takes around 1975 ms for the same workload.

I have to do a more detailed investigation to find out why Furious takes twice as long, given that it is also based on ArrayBlockingQueue.

The performance of ArrayBlockingQueue is decent, but it is a lock-based approach. What kind of performance can we get if we implement a lock-free pool?

Lock free pool

Implementing a lock-free pool is not impossible, but it is a bit difficult because you have to handle multiple producers and consumers.

I will implement a hybrid pool that uses a lock on the producer side and a non-blocking technique on the consumer side.

Let's have a look at some numbers

I ran the same test with the new implementation (FastPool), and it is almost 30% faster than the ArrayBlockingQueue-based pool.

A 30% improvement is not bad; it can definitely help in meeting latency goals.

What makes Fast Pool fast?

I used a couple of techniques to make it fast:

Producers are lock based - Multiple producers are managed using locks. This is the same as ArrayBlockingQueue, so nothing great about this.

Immediate publication of released items - A released element is published before the lock is released, using a cheap memory barrier. This gives some gain.

Consumers are non-blocking - CAS is used to achieve this, so consumers are never blocked by producers. ArrayBlockingQueue blocks consumers because it uses the same lock for producers and consumers.

Thread Local to maintain value locality - A ThreadLocal is used to re-acquire the last value the thread used, which reduces contention to a great extent.
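The first three points can be sketched roughly as follows. This is not FastObjectPool.java: the class name HybridPool is hypothetical, AtomicLongArray.lazySet stands in for the Unsafe-based ordered write, and per-slot sequence numbers (in the style of a bounded multi-producer/multi-consumer ring buffer) are swapped in so that the consumer-side CAS is safe when the ring wraps around. The thread-local optimization is left out to keep the sketch short.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch: locked producers, CAS-based consumers.
final class HybridPool<T> {
    private final Object[] items;
    private final AtomicLongArray sequences;      // per-slot sequence numbers
    private final int mask;
    private final AtomicLong takePointer = new AtomicLong();
    private long releasePointer;                  // guarded by lock
    private final ReentrantLock lock = new ReentrantLock();

    HybridPool(int size, Supplier<T> factory) {
        // Round up to a power of two (minimum 2) so indexing can use a bit mask.
        int capacity = Integer.highestOneBit(Math.max(size, 2) - 1) << 1;
        items = new Object[capacity];
        sequences = new AtomicLongArray(capacity);
        mask = capacity - 1;
        for (int i = 0; i < capacity; i++) {
            sequences.set(i, i);                  // slot i is writable at position i
        }
        for (int i = 0; i < size; i++) {
            release(factory.get());
        }
    }

    // Producer side: serialized by a lock. Returns false if the ring is full.
    boolean release(T obj) {
        lock.lock();
        try {
            long pos = releasePointer;
            int cell = (int) pos & mask;
            if (sequences.get(cell) != pos) {
                return false;                     // slot has not been handed back yet
            }
            items[cell] = obj;
            sequences.lazySet(cell, pos + 1);     // ordered publish, before the unlock
            releasePointer = pos + 1;
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Consumer side: never takes the lock. Returns null if the pool is empty.
    @SuppressWarnings("unchecked")
    T take() {
        while (true) {
            long pos = takePointer.get();
            int cell = (int) pos & mask;
            long diff = sequences.get(cell) - (pos + 1);
            if (diff < 0) {
                return null;                      // nothing published at this position
            }
            if (diff == 0 && takePointer.compareAndSet(pos, pos + 1)) {
                T obj = (T) items[cell];
                items[cell] = null;
                sequences.set(cell, pos + items.length); // hand the slot back to producers
                return obj;
            }
            // Another consumer claimed this position first; retry with a fresh pointer.
        }
    }
}
```

In this sketch a consumer only ever spins on a CAS against other consumers, while producers contend on the lock among themselves, so a slow producer does not block a take the way a shared lock would.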

If you are interested in having a look at the FastPool code, it is available @ FastObjectPool.java

Resources
java-object-tutorial

14 comments:

Chris Engelbert (noctarius) 10 July 2013 at 12:16
Would it be possible to license the FastObjectPool under Apache License 2? :-)
undernet 25 August 2013 at 15:37
Is there any chance of modifying your FastObjectPool so that it behaves more like a ConcurrentLinkedQueue, with take and release returning and consuming type T instead of holder objects, so users of the library don't have to manage holder objects everywhere?

It would also be great if you didn't have to set the size at creation, so that the pool grows on demand as objects are released to it.

The problem with ConcurrentLinkedQueue is that its offer method creates garbage at line 327:
final Node newNode = new Node(e);

Your FastObjectPool does not seem to have this problem, which, along with its speed, makes it better suited for an object pool.
Ashkrit 25 August 2013 at 23:55
Your idea would make it much cleaner, but the reason I did it that way was to maintain state on the object, i.e. whether it is used or free.
If I start returning T objects, then I also have to find a better way to do the thread-local optimization that I used to reduce contention.

Another option that came to mind during implementation was a dynamic proxy, but that would be over-engineering for a simple problem, so I chose this trade-off.

Regarding flexible size: most of the object pools I have seen or used are bounded in size, and that is a good way to keep your system scalable.
If the load is more than the pool size, it is better to have a proper waiting strategy or to adjust the pool size declaratively.

You are right that the ConcurrentLinkedQueue-based implementation creates garbage and is unbounded, due to which your system can behave very differently under burst traffic.
Another reason for not using ConcurrentLinkedQueue is that it does a random memory walk because it is based on a linked list, and random memory walks are not good for low latency systems.
MANOJ KUMAR 21 September 2013 at 13:20
1. Performance

Object pooling provides better application performance because object creation is not done at the moment the client needs the object. Instead, objects are already created in the pool and readily available at any time. Since object creation happens well in advance, it helps achieve better run-time performance.

2. Object sharing:

Object pooling encourages sharing. Objects in the pool can be shared among multiple worker threads: one thread uses an object and returns it to the pool, and then another worker thread can use it. Once created, objects are not destroyed, so repeated destruction and creation is not required. That again helps produce better performing code.

3. Control over object instances:
By declaring the size of the object pool, we can control the number of instances created. A finite number of objects is created, chosen according to the required application capacity, scalability, or peak load.

4. Memory conservation:
A finite number of instances is created, so it helps with memory management. Too many instances are not

Read through extensive details here :

http://efectivejava.blogspot.in/2013/09/8-reasons-why-object-pooling-is.html
Unknown 2 September 2015 at 10:56
Nice article, Ashkrit. I would like to see a comparison with the disruptor pattern, which solves the same issue using a ring buffer that holds references to the objects. I believe this might perform better, as it needs a CAS between producers, a CAS between consumers, and a volatile write for publishing. No lock anywhere (whereas FastPool, as you say, has one among the producers).
Anonymous 9 November 2016 at 17:22
//if(holder!=null && THE_UNSAFE.compareAndSwapObject(objects, (index*INDEXSCALE)+BASE, holder, null))
if (holder != null && THE_UNSAFE.compareAndSwapObject(objects, (index << ASHIFT) + BASE, holder, null)) {

Could you please explain how you replaced the multiplication with a shift, especially the ASHIFT part?
Jason Marshall 14 December 2019 at 23:24
Admiring the time and effort you put into your blog and detailed information you offer!