Logging framework is heart of every application, it helps in troubleshooting production issue,knowing how your application is being use,what are bottle neck in application and may more.
Using right logging framework is key, Wikipedia has list of famous one in java world.
So logging has lot of benefit but it also brings lot of overhead ,so you do trade-off for benefit that you get. It is interesting to measure cost of overhead and we all know it has big overhead because it is I/O bound, so we come with best strategy on what to log, how much to log, which one is best framework etc.
One problem with current logging framework is that it , not much work has been done on performance improvement, especially if you talk about taking advantage of multi core architecture. Application try to do more work to take advantage of multiple cores and that may result into more..... log message and since log framework are not their yet, so they don't scale.
Cost Of Log Message
Lets measure time spent in logging.
I wrote java program that sums the number of element in array and it is done in parallel. Array is divided in chunk and each task sums that chunk and log message before & after it process chunk. I have taken simple computation problem because we want to measure cost of logging.
For bench marking purpose , i do calculation with/without logging to check how much time we spend on logging and numbers are crazy.
Java JDK 1.4 logging framework(java.util.logging) is used for benchmark, i only took java one because all other like log4j , apache logging , SLFJ etc are almost same when they log message to file.
Details of Machine
OS : Windows 7, 64 bit
Processor : Intel i5, 2.40 GHz
No Of Cores : 4
Using right logging framework is key, Wikipedia has list of famous one in java world.
So logging has lot of benefit but it also brings lot of overhead ,so you do trade-off for benefit that you get. It is interesting to measure cost of overhead and we all know it has big overhead because it is I/O bound, so we come with best strategy on what to log, how much to log, which one is best framework etc.
One problem with current logging framework is that it , not much work has been done on performance improvement, especially if you talk about taking advantage of multi core architecture. Application try to do more work to take advantage of multiple cores and that may result into more..... log message and since log framework are not their yet, so they don't scale.
Cost Of Log Message
Lets measure time spent in logging.
I wrote java program that sums the number of element in array and it is done in parallel. Array is divided in chunk and each task sums that chunk and log message before & after it process chunk. I have taken simple computation problem because we want to measure cost of logging.
For bench marking purpose , i do calculation with/without logging to check how much time we spend on logging and numbers are crazy.
Java JDK 1.4 logging framework(java.util.logging) is used for benchmark, i only took java one because all other like log4j , apache logging , SLFJ etc are almost same when they log message to file.
Details of Machine
OS : Windows 7, 64 bit
Processor : Intel i5, 2.40 GHz
No Of Cores : 4
Numbers are really crazy, once you add logging, performance drops by 6X to 10X times, it is not worth it to use logging.
More time is spent in logging than doing real work, but fact of life is we still need logging and if it is not there then it will be nightmare to troubleshoot production issue.
Why it is so slow!
To find out why is super slow, we have to dive into the code, so lets have look at default handler(FileHandler) that java logging framework provides.
FileHandler has synchronized function that perform core logging and we know what type of performance degradation you can get in multi threaded env, so that is number 1 reason for slowness.
public synchronized void publish(LogRecord record) {
if (!isLoggable(record)) {
return;
}
super.publish(record);
flush();
.........
..........
...........
}
Other reason for slowness is that I/O operation is performed for each and every call of log message, there is no buffering done, not sure why it is done like that may be to keep it simple and all these simplicity adds to performance cost.
All of these techniques result in lot of contention and we see big degrade in performance by harm-less looking code.
What can we do now ?
So we have to find some alternate framework that does't make our application 10X times slow!
We have to do couple of quick things.
- Get rid of synchronized
- Try to add some I/O buffering
- Use Async for logging
So i did add these stuff and created simple logging util and measured the cost of same.
So the light green one is the logger that has some of improvement and it is amazing to see that we are back to same performance and it is not adding any significant overhead, it is almost as there is no logging in code.
So by just switching logging framework we can get up to 6X to 10X benefit.
What gives up these performance improvement
- ConcurrentLinkedQueue is used to store all the log message, ConcurrentLinkedQueue is lock free queue but it is not bounded. Bouded queue can be used to reduce risk where producer is faster than consumer.
- Buffering is added for I/O operation, these reduces I/O operation
- Prealocated bytebuffer is used to keep all the message that needs to writen to file.
- Lock based wait strategy is used when there is no message in queue, better waiting strategy can be used based on requirement, but for logs i think lock based are fine.
Conclusion
So time has come to re-think logging framework used, just using by using correct logging framework we can get big performance boost. To make application more responsive, we have to have to look for alternate. This is very important in low latency & high throughput space.
Example that i used is very basic, it does't have all the features, but can be easily extended to add those stuff.
Link To Code
Code available @ github
No comments:
Post a Comment